Chapter 12 Part 2 - Arima Model Estimation - 2023

The document discusses the procedure for identifying and estimating ARIMA models. It describes taking differences of data to achieve stationarity, identifying the AR and MA components by examining autocorrelation and partial autocorrelation functions, and estimating model parameters. Special cases like multiplicative seasonality and detrending are also addressed.

11/30/2023

Chapter 11. ARIMA Model Estimation

Nguyen VP Nguyen, Ph.D.


Department of Industrial & Systems Engineering, HCMUT
Email: [email protected]

“Full” ARIMA Model


• The Box-Jenkins approach uses an iterative approach to identify a model and then estimate the model's parameters
• Incorporate elements from both the
autoregressive and moving average models
ARIMA(p, d, q) = AR + I + MA, where
p is the number of AR terms,
d is the number of differences (the order of differencing), and
q is the number of MA terms.
All data in ARIMA analysis is assumed to be "stationary".
Step 1. Model Identification


Step 1.1 Analyze the data: Plot the time series chart and check the data features, including the presence of
a. Trend, seasonality, or both
b. Outliers (identify them and find out their causes)
c. Variance of the data across time: is it multiplicative or additive?
Tools to use:
1. Time Series Plot
2. Use the correlograms (the autocorrelation function ACF and
the partial autocorrelation function PACF ) to suggest an
appropriate model.
What patterns exist?
1. Is the data stationary? Double-check with a unit root test, a formal test for non-stationarity such as the Dickey-Fuller test.
2. Not stationary? Go to Step 1.2 "Methods to make the data stationary"
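The slow-decay pattern that signals non-stationarity can be seen directly from the sample autocorrelations. The sketch below is a minimal numpy illustration (the helper `sample_acf` is a name introduced here for this example, not a library function); in practice a package such as statsmodels provides `plot_acf`, `plot_pacf`, and the Dickey-Fuller test via `adfuller`.

```python
import numpy as np

def sample_acf(x, nlags):
    # Sample autocorrelations r_0..r_nlags of a series.
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    c0 = np.sum(d * d)
    return np.array([np.sum(d[k:] * d[:len(x) - k]) / c0
                     for k in range(nlags + 1)])

trend = np.arange(100.0)   # a series with a strong linear trend
r = sample_acf(trend, 5)
# for a trending (non-stationary) series the ACF decays very slowly:
# r[1] stays close to 1 instead of dying out quickly
```

A stationary series would show autocorrelations that drop toward zero within a few lags instead.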

The procedure of model identification


Step 1.2 Make the data stationary
a. If the time series has a trend: take the 1st-order difference of the data series in order to make the data stationary.
b. If the time series has seasonality, after taking the 1st difference of the data:
a. Take the 2nd difference of the data if there is still a trend or seasonality. If it is monthly data we could use a seasonal difference of 12, and if it is quarterly data we could use a seasonal difference of 4.
b. Data still non-stationary? Take the logarithm of the data if the seasonal variance changes over time.
c. Still not stationary? Take the difference of the log data to make the data stationary.
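The differencing steps above can be sketched with numpy. The series below is a made-up monthly example with a trend and multiplicative seasonality, used only to show the order of operations (log first, then a regular difference, then a seasonal difference of 12):

```python
import numpy as np

# toy monthly series (hypothetical): trend times multiplicative seasonality
t = np.arange(48)
y = (50.0 + 2.0 * t) * (1.0 + 0.3 * np.sin(2 * np.pi * t / 12))

log_y = np.log(y)            # log turns multiplicative seasonality additive
d1 = np.diff(log_y)          # 1st difference removes the trend
d1_12 = d1[12:] - d1[:-12]   # seasonal difference of 12 for monthly data
# after both differences the series varies in a narrow, stable band
```

Each difference shortens the series (one observation lost per regular difference, 12 per seasonal difference), which is one reason not to difference more than necessary.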


The procedure of model identification


d. Most series should not require more than two difference operations (orders).
e. Be careful not to overdifference:
a. If spikes in the ACF die out quickly, there is no need for more differencing.
b. A sign of an overdifferenced series is a first autocorrelation close to -0.5 with small values elsewhere.
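The -0.5 signature is easy to demonstrate: differencing a series that is already stationary (white noise) produces a lag-1 autocorrelation near -0.5. A small numpy sketch on synthetic data with a fixed seed:

```python
import numpy as np

rng = np.random.default_rng(0)
wn = rng.standard_normal(2000)   # already stationary: white noise
over = np.diff(wn)               # one difference too many

d = over - over.mean()
r1 = np.sum(d[1:] * d[:-1]) / np.sum(d * d)
# r1 comes out close to -0.5: the signature of overdifferencing
```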

THE BOTTOM LINE IS TO MAKE THE DATA STATIONARY! Non-stationary data still contains information yet to be extracted, and is therefore not good for forecasting.

Step 2. Model Estimation


• A trial model is proposed:
ARIMA(p, d, q), e.g. ARIMA(0, 1, 2)
• Model for seasonal data:
ARIMA(p, d, q)x(P, D, Q)s model
where p, d, q are the regular terms,
P = number of seasonal autoregressive terms,
D = number of seasonal differences,
Q = number of seasonal moving average terms
Model parameters are estimated,
and the statistical output includes
1. Parameter estimates
2. Test statistics
3. Goodness of fit measures
4. Residuals
5. Diagnostics
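To give the flavour of the estimation step, the sketch below fits an AR(1) coefficient by conditional least squares on simulated data. This shows only the idea behind parameter estimation; real ARIMA software (e.g. statsmodels' SARIMAX) uses full maximum likelihood and also reports the test statistics, goodness-of-fit measures, residuals, and diagnostics listed above.

```python
import numpy as np

# simulate an AR(1) process y_t = 0.7 * y_{t-1} + e_t (synthetic data)
rng = np.random.default_rng(1)
n, phi = 2000, 0.7
e = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + e[t]

# conditional least-squares estimate of the AR(1) coefficient:
# regress y_t on y_{t-1} without an intercept
phi_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
# phi_hat lands close to the true value 0.7
```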

How to estimate AR and MA components


• As a rule of thumb, we can check for the following indications:
 If the ACF tapers down gradually and approaches 0 while the PACF cuts off abruptly, perhaps after 1 or 2 lags, then it could be an AR process.
 If the ACF cuts off abruptly while the PACF tapers down gradually and approaches 0, then it could be an MA process.
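This rule of thumb can be checked on simulated data: for an MA(1) process the theoretical autocorrelation is θ/(1 + θ²) at lag 1 and zero beyond, so the sample ACF should cut off after lag 1. A minimal numpy sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(2)
e = rng.standard_normal(5001)
y = e[1:] + 0.6 * e[:-1]   # MA(1) process: y_t = e_t + 0.6 * e_{t-1}

d = y - y.mean()
c0 = np.sum(d * d)
acf = [np.sum(d[k:] * d[:-k]) / c0 for k in (1, 2, 3)]
# lag-1 autocorrelation is near 0.6 / (1 + 0.6**2) ≈ 0.44;
# lags 2 and 3 are near zero, i.e. the ACF "cuts off" after lag 1
```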

How to estimate AR and MA components


• In both processes, we need to consider clearly at which lags the significant correlation is detected; typically that many AR or MA components are to be included.
• Example (right below):
 It is an MA process, certainly not an AR process.
 The I term is 1 since we took the first difference.
 We get:
͟ AR could be 0, I could be 1, MA = 1, and a seasonality of 12.
͟ The model could be ARIMA(0,1,1)(0,1,1)12.
 This may get tricky at times.


Examples

AR(1, 0) Model
MA(0, 1) Model

Examples

Mixed ARMA(1, 1) Model

In practice, the values of p and q each seldom exceed 2.


Special comments

• ARIMA models are not designed for models with multiplicative seasonality. In such cases:
 Use log transforms.
 De-seasonalize and use ARIMA on the de-seasonalized data.
• Models with persistent trends can be de-trended and ARIMA applied to the de-trended series.
• Automatic fitting programs do a good job of fitting ARIMA models.


Special comments
• Parsimony is desirable – use models with as few terms as possible.
 The AIC and BIC criteria penalize the number of terms in the model.
 Theoretical result – any high-order MA model can be closely approximated by a low-order AR model and vice versa; e.g. an MA(6) can be closely represented by an AR(1) or AR(2) model.
• Key point – the above approach to model selection is based on in-sample fitting.
• We also need to compare all models on the basis of out-of-sample forecasts on holdout data.
 Simpler ARIMA models often work better out of sample even though they may not give the best fit.
 Recall from earlier slides that fitting is different from forecasting.
• ARIMA model forecasts can be pooled with those from other models.

Time Series Plot of LnSales_Diff1

[Figure: time series plot of LnSales_Diff1 (values roughly -0.6 to 0.3) against index 1-171, with the Autocorrelation Function and Partial Autocorrelation Function for LnSales_Diff1 (with 5% significance limits), lags 1-45.]


Step 3. Model Checking

• Determines whether the model fits the data adequately.
 The goal is to extract all information and ensure that the residuals are white noise.
• Key measures:
 ACF of residuals
 Ljung-Box-Pierce Q statistic (Portmanteau test)
͟ Tests whether a set of residual autocorrelations is significantly different from zero.
͟ See the next slide for details.
• If the model is deemed adequate, proceed with forecasting; otherwise try a new model.

Qm = n(n + 2) Σ(k = 1..m) rk² / (n − k), with m − p − q degrees of freedom

where:
 n = the number of observations in the time series
 k = the particular time lag to be checked
 m = the number of time lags to be tested
 rk = the sample autocorrelation of the residuals at lag k
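The Q statistic is straightforward to compute directly from its formula; a numpy sketch (the function name `ljung_box_q` is ours for this example; statsmodels offers the same test as `acorr_ljungbox`):

```python
import numpy as np

def ljung_box_q(resid, m):
    # Q_m = n(n + 2) * sum_{k=1}^{m} r_k^2 / (n - k)
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    d = resid - resid.mean()
    c0 = np.sum(d * d)
    q = 0.0
    for k in range(1, m + 1):
        r_k = np.sum(d[k:] * d[:-k]) / c0   # sample autocorrelation at lag k
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(1)
q_wn = ljung_box_q(rng.standard_normal(500), 10)   # white-noise residuals
q_tr = ljung_box_q(np.arange(200.0), 10)           # strongly correlated "residuals"
# under H0 (white noise) Q_m is roughly chi-square with m - p - q d.f.,
# so q_wn stays small while q_tr is far out in the rejection region
```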

Step 3. Model Checking


We need to check the model with two tests:
1) Examine the autocorrelation function for white noise:
a. Visually examine the autocorrelations of the resulting residuals (errors).
b. Does the autocorrelation function exhibit statistically significant spikes?
2) Use the Ljung-Box statistic to test the correctness of the model:
a. Calculate the LBQ statistic.
b. Perform a chi-square test on the autocorrelations of the residuals (error terms).
If either test indicates anything other than white noise, try a
new model.

[Figure: scatter plot showing non-constant variance (heteroscedasticity) and a strong increasing relationship, remedied by transforming y.]

Step 3. Model Checking

3) Transformation – taking the logarithm enables us to stabilize the variance across time.
4) Difference the log data – we can difference the log data to attain stationarity.
a. When the properties of the time series do not depend on time, the data are considered stationary. Thus a time series with trend or seasonality is not stationary.
b. However, a time series with cyclical behaviour but without a trend or seasonality is considered stationary (stabilized mean).


A Box-Cox transformation

• A Box-Cox transformation
 is a way to transform non-normal dependent variables into a normal shape, because normality is an important assumption for many statistical techniques;
 if the data aren't normal, applying a Box-Cox transformation lets us run a broader range of tests;
 the Box-Cox transformation is a popular power transformation method developed by George E. P. Box and David Cox.

Box-Cox Transformation Formula

The formula of the Box-Cox transformation is:
y(λ) = (y^λ − 1) / λ for λ ≠ 0, and y(λ) = ln(y) for λ = 0
Where:
y(λ) is the transformation result,
y is the variable under transformation, and
λ is the transformation parameter.
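The two-branch formula can be coded directly. This is a minimal sketch (the function name `box_cox` is ours); in practice `scipy.stats.boxcox` both transforms the data and estimates the optimal λ.

```python
import numpy as np

def box_cox(y, lam):
    # y(lam) = (y**lam - 1) / lam for lam != 0; y(lam) = ln(y) for lam = 0
    y = np.asarray(y, dtype=float)   # requires strictly positive data
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam
```

Note that λ = 1 gives y − 1, a simple shift of the original data, which is why a confidence interval for λ that contains 1 means no transformation is needed.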

A Box Cox transformation

• Stat > Control Charts > Box-Cox Transformation.
 Use optimal λ: use the optimal lambda, which should produce the best-fitting transformation. Minitab rounds the optimal lambda to 0.5 or the nearest integer (e.g. an optimal λ of 0.19 is rounded to 0).
 A λ value of 1 is equivalent to using the original data, so if the confidence interval for the optimal λ includes 1, no transformation is necessary.
 If your data are not collected in subgroups, enter 1.
 If all subgroups are the same size, you can enter the subgroup size.
• The goal is to determine which lambda value approximately stabilizes the variance; if the interval for λ includes 1, no transformation is necessary.

Box-Cox power transformation in Minitab: (W = Y^λ)


Use optimal λ: Use the optimal lambdathe best fitting
transformation.
Note: To use an exact value instead of a rounded value for optimal λ,
choose File > Options > Control Charts and Quality Tools > Other and
deselect Use rounded values for Box-Cox transformations when possible.
λ = 0 (ln): Use the natural log of your data.
λ = 0.5 (square root): Use the square root of your data.
Other (enter a value between -5 and 5):
Use a specified value for lambda. Other common transformations are
square (λ = 2), inverse square root (λ = −0.5), and inverse (λ = −1).
In most cases, you should not use a value outside the range of −2 and 2.


Box-Cox Plot of Sales

[Figure: Box-Cox plot of StDev versus λ, using 95.0% confidence: Estimate −0.02, Lower CL −0.55, Upper CL 0.42, Rounded Value 0.00.]

The 95% confidence interval for λ (−0.55 to 0.42) does not include 1, so a transformation is appropriate; we should transform the data using λ = 0.

A transformation that uses λ =?



Example 4 Atron Corporation

ARIMA(0, 0, 2)

ARIMA(1, 0, 0)
Good? Check the mean square errors, the residuals plot, and the Ljung-Box Q statistic.


Example 9 Keytron Corporation

The series shows a seasonal pattern along with a general upward trend. The autocorrelations at the small lags appeared to cut off after lag 1, although there was a small blip (noise) at lag 3. The autocorrelations at the seasonal lags—12, 24, and 36—were large and failed to die out quickly.

She decided to difference the series with respect to the seasonal lag to see if she could convert the nonstationary series to a stationary one.

The seasonally differenced data are stationary and seem to vary about a level of roughly 100.

The ACFs have one significant spike at lag 12 (they cut off).


The sample partial autocorrelations have significant spikes at lags 12 and 24 that get progressively smaller (die out).

 ARIMA(0, 0, 0) (0, 1, 1)12 model
