
Univariate Time Series Modelling 2

OlaOluwa S. Yaya

2022-12-04

Bibliography
• Box, G. E. P., and G. M. Jenkins. 1976. Time Series Analysis: Forecasting and Control. 2nd ed. San Francisco: Holden-Day.
• Racine, J. S. 2019. Reproducible Econometrics Using R. New York: Oxford University Press.
• Schwarz, G. 1978. "Estimating the Dimension of a Model." The Annals of Statistics 6: 461–64.

Model Selection Criteria, Trends, and Stationarity


• A trend stationary process is said to contain a deterministic trend, while a difference stationary process such as a random walk with drift is said to contain a stochastic trend
• Thus, a trend stationary process is nonstationary but can be made stationary by removing the trend. Likewise, a difference stationary process can be made stationary by differencing the series, as discussed earlier
• Consider the practical issue of how one might distinguish between a non-stationary random walk with drift (the difference stationary case) and a non-stationary autoregressive process containing a deterministic trend (the trend stationary case)

• One way would be to estimate a set of candidate models and allow some model selection criteria to
determine which model is the best
• In what follows we consider the Bayes Information Criterion (Schwarz 1978) and use the R command BIC() to compute the criterion function value for each of the candidate models (smaller values are preferred)
• We consider two data generating processes for this exercise, one a non-stationary AR(1) containing a
time trend given by
yt = ayt−1 + bt + ϵt , |a| < 1

• The other is a non-stationary random walk with drift given by

yt = yt−1 + d + ϵt

• The former is trend stationary while the latter is difference stationary

• We simulate two series plotted in the figure that follows, one from each process, and compute four
candidate models:
– A classical AR(1) model (‘Model yt ’ in what follows)
– A classical first-differenced AR(1) model (‘Model (1 − B)yt ’ in what follows)

– An augmented AR(1) model with a linear trend (‘Model yt − a − bt’ in what follows)
– An augmented first-differenced AR(1) model with a linear trend (‘Model (1 − B)(yt − ct)’ in what
follows)
• We present the BIC values for these models for each series in the table that follows

Figure 1: Trend Stationary AR(1) and Random Walk with Drift.

Table 1: BIC Model Selection Criterion Values for Four Candidate Models Based on a Trend Stationary AR(1) Process and a Random Walk with Drift (smaller values are better).

Model                        AR(1)       RWD
Model yt                     2392.883    2322.317
Model (1 − B)yt              2366.227    2306.183
Model yt − a − bt            2326.711    2320.778
Model (1 − B)(yt − ct)       2371.132    2311.251

• For the trend stationary AR(1) process, we observe that the BIC criterion selected Model yt − a − bt
(BIC = 2326.711)

• For the non-stationary random walk with drift, we can see that the BIC criterion selected Model
(1 − B)yt (BIC = 2306.183)
• The model selection criteria that we study later in the course therefore appear to have the ability to
distinguish between these two cases
• Note, however, that these selection procedures are based on a random sample and are not guaranteed
to deliver the true model in applied settings, even in the unlikely event that it was contained in the set
of candidate models
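
A minimal R sketch of this exercise follows; the seed, sample size, parameter values, and the exact specification of the four candidate models are illustrative assumptions, so the resulting BIC values will not reproduce Table 1 exactly:

## Minimal sketch (assumed parameter values; results will differ from Table 1)
set.seed(42)
n <- 250
e <- rnorm(n)
## Trend stationary AR(1) with deterministic trend: y_t = a*y_{t-1} + b*t + e_t
a <- 0.5; b <- 1
y.ts <- numeric(n)
for (t in 2:n) y.ts[t] <- a * y.ts[t - 1] + b * t + e[t]
## Random walk with drift: y_t = y_{t-1} + d + e_t
d <- 1
y.rw <- cumsum(d + e)
## BIC for the four candidate models applied to each series
bic.values <- sapply(list(AR1 = ts(y.ts), RWD = ts(y.rw)), function(y) {
  trend <- seq_along(y)
  c("y_t"           = BIC(arima(y, order = c(1, 0, 0))),
    "(1-B)y_t"      = BIC(arima(diff(y), order = c(1, 0, 0))),
    "y_t - a - bt"  = BIC(arima(y, order = c(1, 0, 0), xreg = trend)),
    "(1-B)(y_t-ct)" = BIC(arima(diff(y), order = c(1, 0, 0), xreg = trend[-1])))
})
round(bic.values, 3)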

Model Selection via auto.arima()


• The function auto.arima() returns the best ARIMA(p, d, q) model according to either its AIC, AICc
or BIC value by conducting a search over possible models within the order constraints provided
• The function auto.arima() is not guaranteed to deliver the best model according to the selection
criterion used
• You might therefore wish to consider the model produced to be a candidate model and then proceed
with some of the diagnostics that follow to see whether you might improve upon the candidate
• We will study model selection via these criteria later in this course
• The following figure plots the series and an h=10 step-ahead prediction for the Canadian Lynx data
(i.e., a sequence of 1-10 year horizon forecasts)

Example - Canadian Lynx Data (auto.arima())


require(forecast)  ## auto.arima(), forecast(), and fitted() for Arima objects are in the forecast package
data(lynx)
model <- auto.arima(lynx,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=10))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=10)"),lty=c(1,2,1),col=c(1,2,4),bty="n")

Figure: Forecasts from ARIMA(4,0,0) with non-zero mean (lynx data, fitted values, and h=10 step-ahead forecasts).

Diagnostics for ARIMA(p, d, q) Models


• If a model provides an adequate fit to the data, the residuals from the model ought to be white noise
• There is a useful function in base R called tsdiag() that you can apply to a candidate model
• This is a generic function that plots
– The residuals, often standardized
– The autocorrelation function of the residuals
– The p-values of a portmanteau test for all lags up to gof.lag
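
The portmanteau (Ljung-Box) p-values plotted by tsdiag() can also be computed directly with Box.test(); a minimal sketch for the lynx model fitted above, where the choice of lag and the fitdf adjustment are our own illustrative choices:

## Ljung-Box portmanteau test on the residuals of the fitted lynx model;
## fitdf = 4 adjusts the degrees of freedom for the 4 estimated AR coefficients
Box.test(residuals(model), lag = 10, type = "Ljung-Box", fitdf = 4)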

tsdiag() Illustration For the Canadian Lynx Data


tsdiag(model)

Figure: tsdiag(model) output for the lynx model, showing the standardized residuals, the ACF of the residuals, and p values for the Ljung-Box statistic.

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


• If we were selling ceiling fans then we might predict this July’s sales using last July’s sales (lag 12)
• This relationship of predicting using last year’s data would hold for any month of the year
• We might also expect that this June’s sales would also be useful (lag 1)
• The non-seasonal ARIMA(p, d, q) model does not, as specified, accommodate seasonal dynamics
• It can, however, be overloaded to model seasonal data
• We can overload the non-seasonal ARIMA(p, d, q) model to include not only first differences, second differences and so on, but also seasonal differences (see the brief illustration below)
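
A brief illustration of ordinary versus seasonal differencing using the monthly UKDriverDeaths series (our choice of example series; any monthly series would do):

## Ordinary (first) difference versus seasonal difference for monthly data (m = 12)
d1  <- diff(UKDriverDeaths)            ## (1 - B) y_t
d12 <- diff(UKDriverDeaths, lag = 12)  ## (1 - B^12) y_t
frequency(UKDriverDeaths)              ## m = 12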

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


• We overload a non-seasonal ARIMA(p, d, q) model by including additional seasonal parts
• The seasonal parts of the model consist of terms that are analogous to the non-seasonal terms, but
instead involve simple backshifts of the seasonal period, denoted m
• m represents the number of periods in the season (see the R function frequency())
• These models are denoted ARIMA(p, d, q)(P, D, Q)m
• The (P, D, Q)m component denotes the seasonal part of the model
• You specify a seasonal ARIMA(0, 1, 3)(2, 0, 0)12 model in R as follows:
Arima(UKDriverDeaths,order=c(0,1,3),seasonal=list(order=c(2,0,0),period=12))

Seasonal ARIMA(p, d, q)(P, D, Q)m Models
Example - ARIMA(1, 1, 1)(1, 1, 1)4
• An ARIMA(1, 1, 1)(1, 1, 1)4 model (without a constant) constructed for quarterly data (m = 4) can be expressed as

(1 − ϕ1 B)(1 − Φ1 B^4)(1 − B)(1 − B^4)yt = (1 − θ1 B)(1 − Θ1 B^4)ϵt

• The terms involving B^4 are the seasonal components
– ϵt ∼ (0, σϵ^2) is i.i.d.
– The first term on the left hand side, (1 − ϕ1 B), is the non-seasonal AR(1) component
– The second term (1 − Φ1 B^4) is the seasonal AR(1) component
– The third term (1 − B) is the non-seasonal difference component
– The fourth term (1 − B^4) is the seasonal difference component

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - ARIMA(1, 1, 1)(1, 1, 1)4
• The first term on the right hand side, (1 − θ1 B), is the non-seasonal MA(1) component
• The second term (1 − Θ1 B^4) is the seasonal MA(1) component
• Letting wt = ∆yt = (1 − B)yt = yt − yt−1 , we can see that

(1 − ϕ1 B)(1 − Φ1 B^4)(1 − B)(1 − B^4)yt = wt − ϕ1 wt−1 − (1 + Φ1 )(wt−4 − ϕ1 wt−5 ) + Φ1 (wt−8 − ϕ1 wt−9 )

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - ARIMA(1, 1, 1)(1, 1, 1)4
• Similarly,

(1 − θ1 B)(1 − Θ1 B^4)ϵt = ϵt − θ1 ϵt−1 − Θ1 (ϵt−4 − θ1 ϵt−5 )

• So,

wt = ϕ1 wt−1 + (1 + Φ1 )(wt−4 − ϕ1 wt−5 ) − Φ1 (wt−8 − ϕ1 wt−9 ) + ϵt − θ1 ϵt−1 − Θ1 (ϵt−4 − θ1 ϵt−5 )

• This is simply an ARIMA(9, 0, 5) model in wt with a large number of parameter constraints, or equivalently an ARIMA(9, 1, 5) model in yt
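
The expansion of the AR side can be checked numerically by multiplying the backshift polynomials; a small sketch with assumed values for ϕ1 and Φ1 (the helper polymul() is our own):

## Multiply backshift polynomials to verify the expansion; coefficients are in
## increasing powers of B, i.e. c(1, -phi1) represents (1 - phi1*B)
polymul <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) for (j in seq_along(b))
    out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
  out
}
phi1 <- 0.5; Phi1 <- 0.3   ## assumed illustrative values
ar.w <- polymul(polymul(c(1, -phi1), c(1, 0, 0, 0, -Phi1)), c(1, 0, 0, 0, -1))
round(ar.w, 3)  ## nonzero coefficients at lags 0, 1, 4, 5, 8, 9 of w_t, matching the expansion above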

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Estimation, Properties, Forecasts, and Forecast Error Variances
• ARIMA(p, d, q)(P, D, Q)m models are essentially ARIMA(p′ , d, q ′ ) models with a large number of
constraints (zero parameters and so forth)

• Therefore, there is nothing new about these models when it comes to properties, estimation, forecasting,
forecast errors, and forecast error variances
• We can leverage existing results for identification (white noise tests etc.)
• Here we might use both ndiffs() and nsdiffs() in addition to the acf() and pacf() functions
• We can also use auto.arima() for identifying a candidate model
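
For example, applied to the euretail series used in the next example (this assumes the fpp and forecast packages are installed):

require(fpp)       ## euretail data
require(forecast)  ## ndiffs() and nsdiffs()
data(euretail)
ndiffs(euretail)   ## suggested number of ordinary differences
nsdiffs(euretail)  ## suggested number of seasonal differences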

Seasonal ARIMA(p, d, q)(P, D, Q)m Models


Example - Modeling and Forecasting European Quarterly Retail Trade
• You will need to install the R packages fpp and forecast to run the following examples
• We will use the function auto.arima() to determine a candidate ARIMA(p, d, q)(P, D, Q)m model,
compare the series with the fitted values, and conduct h=10 step-ahead prediction
• The function auto.arima() returns the best ARIMA(p, d, q)(P, D, Q)m model according to either the
AIC, AICc or BIC values by conducting a search over possible models within the order constraints
provided
• The auto.arima() function is in the R package forecast, while the euretail data is in the R package
fpp so these packages need to be installed prior to running this code

Example - Modeling and Forecasting European Quarterly Retail Trade


## Loading required package: fpp
## Loading required package: fma
## Loading required package: expsmooth
## Loading required package: lmtest

Example - Modeling and Forecasting European Quarterly Retail Trade


## The euretail data is in the fpp package, auto.arima() in the forecast package
require(fpp)
require(forecast)
data(euretail)
model <- auto.arima(euretail,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=10))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=10)"),lty=1:2,col=c(1,2,4),bty="n")

Figure 2: ggseasonplot() of European Quarterly Retail Trade.

Figure: Forecasts from ARIMA(0,1,3)(0,1,1)[4] (euretail data, fitted values, and h=10 step-ahead forecasts).

Example - Modeling Monthly Corticosteroid Drug Sales


• We consider a data set from the fpp package, h02
• h02 is a series of monthly corticosteroid drug sales in Australia from 1991 to 2008
• The following table tabulates h=12 step-ahead forecasts (horizon 1 through 12, monthly data) using the call forecast(model,h=12)

• Make sure that the fpp package is installed prior to running this code
• We also present a figure with the time series, fitted values, and predictions along with prediction
intervals for h=24 step-ahead forecasts

Example - Modeling Monthly Corticosteroid Drug Sales

Figure 3: ggseasonplot() of Monthly Corticosteroid Drug Sales.

Example - Modeling Monthly Corticosteroid Drug Sales


require(fpp)
require(forecast)
data(h02)
model <- auto.arima(h02,stepwise=FALSE,approximation=FALSE)
plot(forecast(model,h=24))
lines(fitted(model),col=2,lty=2)
legend("topleft",c("Data","fitted(model)","forecast(model,h=24)"),lty=1:2,col=c(1,2,4),bty="n")

Figure: Forecasts from ARIMA(3,1,1)(0,1,1)[12] (h02 data, fitted values, and h=24 step-ahead forecasts).

Example - Modeling Monthly Corticosteroid Drug Sales


## The h02 data is in the fpp package, auto.arima() in the forecast package
require(fpp)
require(forecast)
data(h02)
model <- auto.arima(h02,stepwise=FALSE,approximation=FALSE)
knitr::kable(data.frame(forecast(model,h=12)),
caption="Australian Monthly Corticosteroid Drug Sales Forecasts.")

Table 2: Australian Monthly Corticosteroid Drug Sales Forecasts.

Month       Point.Forecast   Lo.80       Hi.80       Lo.95       Hi.95
Jul 2008    1.0166464        0.9481087   1.0851840   0.9118270   1.1214657
Aug 2008    1.0566558        0.9878375   1.1254740   0.9514073   1.1619042
Sep 2008    1.0981460        1.0248512   1.1714409   0.9860513   1.2102408
Oct 2008    1.1617772        1.0838991   1.2396553   1.0426729   1.2808814
Nov 2008    1.1685904        1.0894889   1.2476919   1.0476152   1.2895657
Dec 2008    1.2000699        1.1186714   1.2814684   1.0755816   1.3245582
Jan 2009    1.2467725        1.1638749   1.3296701   1.1199916   1.3735535
Feb 2009    0.7093491        0.6253626   0.7933355   0.5809029   0.8377952
Mar 2009    0.7130936        0.6279837   0.7982036   0.5829292   0.8432580
Apr 2009    0.7525096        0.6665585   0.8384607   0.6210587   0.8839605
May 2009    0.8225565        0.7358643   0.9092487   0.6899722   0.9551407
Jun 2009    0.8340600        0.7467027   0.9214174   0.7004585   0.9676616

External Predictors
• Sometimes you may have additional predictor variables that you believe affect yt but are not themselves lagged
values of either yt or ϵt

• These predictors (often called external predictors) turn out to be straightforward to incorporate in the
ARIMA(p, d, q)(P, D, Q)m framework
• They simply appear as additional explanatory variables on the right hand side of the model
• In R, the arima(), Arima() and auto.arima() functions support the argument xreg=
• Note that xreg is an optional vector or matrix of external predictors, which must have the same number of
rows as y
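
A minimal, purely illustrative sketch of the mechanics (the series y and the predictor x below are simulated for this example and are not data from the lecture):

require(forecast)
set.seed(1)
x <- rnorm(100)                                  ## hypothetical external predictor
y <- ts(2 * x + arima.sim(list(ar = 0.5), n = 100))
fit <- Arima(y, order = c(1, 0, 0), xreg = x)    ## x enters the right hand side via xreg=
summary(fit)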

Example - UK Driver Deaths and Compulsory Seat Belt Laws


• UKDriverDeaths is a time series giving the monthly totals of car drivers in Great Britain killed or seriously injured from January 1969 to December 1984
• Compulsory wearing of seat belts was introduced on 31 January 1983
• The figure that follows presents the ggseasonplot for this series
• You can see a secular decline in fatalities from 1983 onward

Example - UK Driver Deaths and Compulsory Seat Belt Laws

Figure 4: ggseasonplot() of Monthly Totals of Car Drivers in Great Britain Killed or Seriously Injured, January 1969 to December 1984.

Example - UK Driver Deaths and Compulsory Seat Belt Laws


• We wish to introduce an external predictor (a dummy variable for compulsory wearing of seat belts) into a
seasonal ARIMA(p, d, q)(P, D, Q)m model
• We use the auto.arima() function to locate a candidate model, then use this candidate model and fit a model
with the external predictor seatbelt using the Arima() function
• The following table presents a summary of the model

• Note that the external predictor seatbelt accounts for an average decrease in deaths or serious injuries per month
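
A minimal sketch of how such a model might be fit; the coding of the seatbelt dummy (1 from February 1983 onward) and the use of arimaorder() to carry the candidate specification over to Arima() are our own assumptions rather than the lecture's exact code:

require(forecast)
## 0/1 dummy: compulsory seat belt wearing, assumed in effect from February 1983 onward
seatbelt <- as.numeric(time(UKDriverDeaths) > 1983.01)
## locate a candidate seasonal specification, then refit it with the external predictor
candidate <- auto.arima(UKDriverDeaths, stepwise = FALSE, approximation = FALSE)
ord <- arimaorder(candidate)   ## returns c(p, d, q, P, D, Q, m) for a seasonal fit
model <- Arima(UKDriverDeaths, order = ord[1:3], seasonal = ord[4:6], xreg = seatbelt)
summary(model)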
