ARIMA MODELS
AUTOREGRESSIVE INTEGRATED
MOVING AVERAGE MODELS
BOX-JENKINS MODEL
A technique that tries to model the
underlying process (stochastic) of the
time series
Forecasting by concentrating only on
the past patterns of the time series
Find a model that accurately represents
the past and future patterns of a time
series
$Y_t = \text{Pattern} + e_t$
The pattern can be random, seasonal,
trend, cyclical or a combination of
patterns
MODEL
Similar to a machine that takes the
observed time series and turns them into
forecasts and white noise errors
Actual series -> Machine (Black Box) -> Accurate forecast and white noise residuals
AIM OF ARIMA = DESIGN THE RIGHT
MACHINE (IDENTIFY THE RIGHT
PATTERN)
CONFIRMATION THAT CORRECT
PATTERN HAS BEEN IDENTIFIED
REQUIRES WHITE NOISE RESIDUALS
Meaning:
ERRORS ~ normally and independently
distributed (NID)
Errors have no pattern and cannot be
predicted using past values
Errors have zero mean
Steps in ARIMA modeling
1. Model Identification:
Use graphs, statistics, ACF, PACF, etc. to
identify pattern and model
components
2. Parameter Estimation
Determine model coefficients through
software applications
3. Model Diagnostics
Use graphs, statistics, ACF, PACF of
residuals to determine if the model is
valid.
If valid, then use the model. If not, repeat
steps 1, 2 and 3
4. Forecast
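The four steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a full Box-Jenkins implementation: the series is simulated, a zero-mean AR(1) pattern is taken as already identified, the coefficient is estimated by least squares rather than by a statistics package, and the residual check uses the rule-of-thumb bound 2/sqrt(n). The helper names and the value 0.7 are hypothetical.

```python
import math
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate a zero-mean AR(1) series: each value is phi * previous + noise."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y

def fit_ar1(y):
    """Step 2: least-squares estimate of phi (no intercept, zero mean assumed)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation, used here as a simple residual check."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    return sum(d[t] * d[t - 1] for t in range(1, len(d))) / sum(v * v for v in d)

# Step 1 (identification) is assumed done: we treat the series as AR(1).
y = simulate_ar1(phi=0.7, n=5000)

# Step 2: parameter estimation.
phi_hat = fit_ar1(y)

# Step 3: diagnostics on the residuals (should look like white noise).
residuals = [y[t] - phi_hat * y[t - 1] for t in range(1, len(y))]
r1 = lag1_autocorr(residuals)
bound = 2 / math.sqrt(len(residuals))  # rough 95% band for a white noise series
model_ok = abs(r1) < bound

# Step 4: forecast 1, 2 and 3 periods ahead (phi_hat ** h times the last value).
forecasts = [phi_hat ** h * y[-1] for h in range(1, 4)]

print("phi_hat:", round(phi_hat, 3))
print("residual lag-1 autocorrelation:", round(r1, 4), "within band:", model_ok)
print("forecasts:", [round(f, 3) for f in forecasts])
```

If the residual check fails, the loop goes back to step 1 with a different candidate pattern.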
Breaking Down ARIMA
Model 1
AUTOREGRESSIVE MODEL AR(p)
The time series is predicted using its
own previous values
$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + e_t$
AR(p) = Autoregressive of order p
i.e. using p past periods to forecast
AR(1)
$Y_t = \phi_1 Y_{t-1} + e_t$
How do we determine p?
Check ACF and PACF
ACF should decline rapidly to
insignificant values (tend to decrease
toward zero)
The number of statistically significant
spikes in PACF is your p (order of
autoregression)
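A short sketch of how the ACF and PACF can be computed and read. It assumes a simulated AR(2) series with hypothetical coefficients 0.5 and 0.3, computes the PACF with the Durbin-Levinson recursion, and flags a spike as significant when it exceeds the rule-of-thumb bound 2/sqrt(n). The function names are illustrative.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations for lags 1 .. max_lag (Durbin-Levinson recursion)."""
    r = acf(x, max_lag)
    prev, out = [], []  # prev holds the coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

# Simulated AR(2) series (coefficients 0.5 and 0.3 are assumptions for the demo).
rng = random.Random(42)
y = [0.0, 0.0]
for _ in range(5000):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + rng.gauss(0.0, 1.0))
y = y[2:]  # drop the two starting zeros

r = acf(y, 10)
p = pacf(y, 10)
bound = 2 / math.sqrt(len(y))  # rough 95% significance band
significant_pacf_lags = [k for k, v in enumerate(p, start=1) if abs(v) > bound]

print("ACF  lags 1-5:", [round(v, 3) for v in r[1:6]])
print("PACF lags 1-5:", [round(v, 3) for v in p[:5]])
print("significant PACF lags:", significant_pacf_lags)
```

For an AR(2) series the ACF should decay gradually while the PACF shows large spikes at lags 1 and 2 only. Occasional small spikes at higher lags can cross the band by chance.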
MOVING AVERAGE MODEL MA(q)
A model that predicts the time series based on
past forecast errors.
$Y_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \dots + \theta_q e_{t-q}$
Similar to weighted average model
q = order of MA
How do you determine q?
If PACF gradually declines to zero, then
the number of significant spikes in ACF
= q
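A small sketch of the ACF cut-off for a moving average series. It assumes a simulated MA(1) series in which each value is the current error plus 0.8 times the previous error. The weight 0.8 and the helper name are hypothetical. For such a series the theoretical lag-1 autocorrelation is 0.8 / (1 + 0.8^2), and all higher lags are zero.

```python
import random

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return sum(d[t] * d[t - k] for t in range(k, n)) / sum(v * v for v in d)

theta = 0.8  # assumed MA(1) weight, chosen only for the demo
rng = random.Random(7)
e = [rng.gauss(0.0, 1.0) for _ in range(5001)]          # white noise errors
y = [e[t] + theta * e[t - 1] for t in range(1, len(e))]  # MA(1) series

r1, r2, r3 = (autocorr(y, k) for k in (1, 2, 3))
theoretical_r1 = theta / (1 + theta ** 2)

print("lag 1:", round(r1, 3), "(theory:", round(theoretical_r1, 3), ")")
print("lag 2:", round(r2, 3))
print("lag 3:", round(r3, 3))
```

One clear spike at lag 1 followed by values near zero points to q = 1.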
AUTOREGRESSIVE MOVING AVERAGE
MODELS ARMA(p,q)
Predicting time series using past
values of the series and past values of
the forecast errors
Mixed model
$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \dots + \theta_q e_{t-q} + e_t$
To define an ARMA model, check the ACF
and PACF
Both the ACF and PACF should gradually
fall to zero
Count the number of AR and MA terms
significantly different from zero
For AR count PACF, for MA count ACF
STATIONARITY
Autocorrelation patterns dominate non-stationary series.
So to determine the correct model and
pattern, we'd have to stationarize the data
One of the ways to achieve stationarity is
differencing
$Y'_t = Y_t - Y_{t-1}$
Sometimes we'd have to take 2nd
differences to stationarize
Number of differences = order of
Integration = d
When we use differencing to achieve
stationarity the resulting model =
ARIMA(p,d,q)
Seasonal differencing: ARIMA(p,d,q)(P,D,Q),
where (P,D,Q) are the seasonal orders
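Differencing is easy to illustrate with a few lines of Python. The sketch below uses made-up numbers: a quadratic trend that needs two differences (d = 2) to become constant, and a hypothetical series with a period of 4 that is flattened by one seasonal difference. The function name is an assumption for the example.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Quadratic trend: the first difference still trends, the second is constant.
quadratic = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
first = difference(quadratic)           # [1, 3, 5, 7, 9]
second = difference(first)              # [2, 2, 2, 2]

# Hypothetical seasonal series with period 4: subtract the value one season back.
seasonal = [10, 20, 30, 40, 12, 22, 32, 42]
seasonal_diff = difference(seasonal, lag=4)   # [2, 2, 2, 2]

print(first, second, seasonal_diff)
```

Each difference shortens the series by `lag` observations, which is worth remembering with short seasonal data.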
DIAGNOSTIC CHECK OF MODEL
To check if the model is adequate (valid), check
whether the errors generated by the model are
white noise:
1. Create the ACF for the residuals and check
whether it has any significant spikes. If the residuals
are white noise, there are no significant spikes
2. Check the Ljung-Box Q-statistics.
This is a chi-square test on the residuals
with m-p-q degrees of freedom
m = number of time lags to be tested
The calculated Q-statistic is compared to the
chi-square value from tables.
If the calculated Q is less than the table value,
the errors are white noise
If the ACF plot of the residuals or the Q-statistic
shows that the errors are not white noise, the
model must be redefined.
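The Q-statistic can be computed by hand as Q = n(n+2) * sum over lags k = 1..m of r_k^2 / (n - k), where r_k is the lag-k autocorrelation of the residuals. The sketch below is a hedged illustration on made-up residuals: it takes m = 10 and, for simplicity, assumes p = q = 0 so that the degrees of freedom equal m. The 5% table value for 10 degrees of freedom is 18.307. The function and variable names are hypothetical.

```python
import random

def ljung_box_q(residuals, m):
    """Ljung-Box Q-statistic over lags 1 .. m."""
    n = len(residuals)
    mean = sum(residuals) / n
    d = [v - mean for v in residuals]
    c0 = sum(v * v for v in d)
    total = 0.0
    for k in range(1, m + 1):
        r_k = sum(d[t] * d[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

CHI2_95_DF10 = 18.307  # chi-square table value, 5% level, 10 degrees of freedom

# Made-up residuals: random noise versus a strongly patterned series.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
alternating = [(-1) ** t for t in range(100)]  # +1, -1, +1, ... clearly not white noise

q_noise = ljung_box_q(noise, m=10)
q_alt = ljung_box_q(alternating, m=10)

print("Q for random noise:", round(q_noise, 2), "white noise:", q_noise < CHI2_95_DF10)
print("Q for alternating series:", round(q_alt, 2), "white noise:", q_alt < CHI2_95_DF10)
```

When the residuals come from a fitted ARMA(p,q) model, the table value should be looked up with m-p-q degrees of freedom instead of m.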
If two models yield white noise errors, pick
the one with the lower AIC or BIC
AIC = Akaike Information Criterion
BIC = Bayesian Information Criterion
AIC = n ln(SSE) + 2k
k = number of parameters that are fitted in the
model
SSE = sum of the squared errors
n = number of observations in the series
BIC = n ln(SSE) + k ln(n)
You cannot compare the AIC or BIC of
one series with another series
You cannot compare AIC to BIC
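A quick sketch of the comparison, using the AIC and BIC formulas as given on the slide. The two candidate models and their SSE values are invented for illustration: both are fitted to the same series of n = 100 observations, and the larger model reduces SSE only slightly.

```python
import math

def aic(sse, n, k):
    """AIC = n ln(SSE) + 2k."""
    return n * math.log(sse) + 2 * k

def bic(sse, n, k):
    """BIC = n ln(SSE) + k ln(n)."""
    return n * math.log(sse) + k * math.log(n)

n = 100  # same series for both models
model_a = {"sse": 250.0, "k": 2}  # hypothetical smaller model
model_b = {"sse": 245.0, "k": 4}  # hypothetical larger model, slightly better fit

aic_a, aic_b = aic(model_a["sse"], n, model_a["k"]), aic(model_b["sse"], n, model_b["k"])
bic_a, bic_b = bic(model_a["sse"], n, model_a["k"]), bic(model_b["sse"], n, model_b["k"])

print("AIC:", round(aic_a, 2), round(aic_b, 2))
print("BIC:", round(bic_a, 2), round(bic_b, 2))
print("pick by AIC:", "A" if aic_a < aic_b else "B")
print("pick by BIC:", "A" if bic_a < bic_b else "B")
```

Here the small drop in SSE does not pay for the two extra parameters, so both criteria favor the smaller model. AIC is compared with AIC and BIC with BIC, never across the two.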