Assignment
The Box-Jenkins three-stage method can be used to build an ARIMA (p,d,q) model for
estimating and forecasting a univariate time series. The three stages are identification,
estimation and diagnostic checking, described in turn below.
(a) Identification
In the identification stage, the time plot of the series and its autocorrelation function (ACF)
and partial autocorrelation function (PACF) are examined visually. This stage includes the
following steps: (1) Calculate the ACF and PACF of the raw data and check whether the series
is stationary. If the series is stationary, go to step 3; if not, go to step 2. (2) Difference the
series and recalculate the ACF and PACF. (3) Examine the graphs of the ACF and PACF and
determine which models would be good starting points.
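The ACF and PACF that EViews plots in a correlogram can also be computed by hand. The sketch below is a minimal illustration in plain Python, not the EViews implementation: the function names `sample_acf` and `pacf_from_acf` are hypothetical, and the PACF is obtained with the Durbin-Levinson recursion, which is one common way to compute it.

```python
import math


def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf_from_acf(acf_values):
    """Return partial autocorrelations from r_1 .. r_m (Durbin-Levinson)."""
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(acf_values) + 1):
        r_k = acf_values[k - 1]
        if k == 1:
            phi_kk = r_k
            phi_curr = [phi_kk]
        else:
            num = r_k - sum(phi_prev[j] * acf_values[k - 2 - j]
                            for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * acf_values[j]
                            for j in range(k - 1))
            phi_kk = num / den
            phi_curr = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                        for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi_curr
    return pacf


def significance_band(n_obs):
    """Approximate 95 percent band (+/-) for a single autocorrelation."""
    return 1.96 / math.sqrt(n_obs)


# Theoretical ACF of an AR(1) process with coefficient 0.5: r_k = 0.5 ** k.
ar1_acf = [0.5 ** k for k in range(1, 6)]
# The PACF shows one spike at lag 1 and is zero afterwards.
print(pacf_from_acf(ar1_acf))
```

A spike is usually treated as significant when it lies outside the band returned by `significance_band`; a single PACF spike followed by zeros, as in this example, is the textbook signature of an AR (1) process.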
(b) Estimation
In the estimation stage, each of the tentative models is estimated and the results of these
models are recorded for the next stage.
(c) Diagnostic checking
For each of the estimated models: (1) Check whether the parameter of the longest lag is
significant. If not, the model probably has too many parameters and the order of p and/or q
should be decreased. (2) Check the ACF and PACF of the residuals. If the model has enough
parameters, all residual ACFs and PACFs will be insignificant. (3) Compare the AIC and SBC,
together with the adjusted R2, of the estimated models to find the most parsimonious one (i.e.
the one that minimizes AIC and SBC and has the highest adjusted R2).
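To make the role of AIC and SBC concrete, the sketch below computes both criteria from a log-likelihood. It assumes the per-observation scaling AIC = (-2l + 2k)/T and SBC = (-2l + k ln T)/T, where l is the log-likelihood, k the number of estimated coefficients and T the number of observations; this scaling is an assumption here, although it is consistent with the gap between SBC and AIC reported in Table 6. The function names and the log-likelihood value are hypothetical.

```python
import math


def aic_per_obs(loglik, k, n_obs):
    """Akaike criterion scaled by the number of observations (assumed form)."""
    return (-2.0 * loglik + 2.0 * k) / n_obs


def sbc_per_obs(loglik, k, n_obs):
    """Schwarz criterion scaled by the number of observations (assumed form)."""
    return (-2.0 * loglik + k * math.log(n_obs)) / n_obs


# Hypothetical log-likelihood; k = 4 coefficients and T = 37 observations
# (33 degrees of freedom plus 4 estimated coefficients).
loglik, k, n_obs = 100.0, 4, 37
gap = sbc_per_obs(loglik, k, n_obs) - aic_per_obs(loglik, k, n_obs)
print(round(gap, 4))  # the gap depends only on k and T, not on loglik
```

Because ln T exceeds 2 once T is 8 or more, SBC penalizes each extra coefficient more heavily than AIC, so it tends to favour the more parsimonious model.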
2. HOW TO BUILD ARIMA MODEL IN E-VIEWS
We will try to identify the ARIMA (p,d,q) model for the Gross Domestic Product (GDP) of
Pakistan from 1976 to 2014. The data have been taken from the Ministry of Finance, Pakistan.
First, plot the data, because the time plot of a variable gives important information about the
nature of the series, i.e. whether the series is stationary and whether there are structural breaks,
missing observations, etc. The time plot of the series is presented in Figure 1.
The time plot of LGDP shows an increasing trend in the data, i.e. LGDP tends to increase over
time. The time plot therefore suggests that the series may be non-stationary.
In the identification stage, we visually examine the time plot of the series together with its
autocorrelation function (ACF) and partial autocorrelation function (PACF). The combined plot
of the ACF and PACF is called a correlogram. The correlogram of LGDP is given below.
Figure 2 Correlogram of LGDP at Level
The correlogram of LGDP at level shows that the ACF does not die down quickly as the lag
increases, which indicates that the data are non-stationary at level; this is in line with the
time plot of LGDP. Because the Box-Jenkins method requires a stationary time series, we next
check the correlogram at first difference.
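Taking the first difference, as done here before re-examining the correlogram, is the operation counted by d in ARIMA (p,d,q). A minimal sketch follows; the function name `difference` and the sample numbers are hypothetical and are not the GDP data.

```python
def difference(series, d=1):
    """Apply first differencing d times: y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Illustrative values only: a series with an upward trend.
levels = [1.0, 3.0, 6.0, 10.0]
print(difference(levels))        # [2.0, 3.0, 4.0]
print(difference(levels, d=2))   # [1.0, 1.0]
```

Each round of differencing shortens the series by one observation, which is why the degrees of freedom of an ARIMA model with d = 1 are counted from one observation fewer than the original sample.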
Figure 3 Correlogram of LGDP at 1st Difference
The correlogram shows that the data are now stationary, so we can identify the potential
candidate models. From Figure 3, we can see that there are two spikes in the ACF, after which
all autocorrelations are close to zero, while there is one spike in the PACF, which then dies
down to zero quickly. Therefore, the identification stage suggests AR (1) and MA (2)
specifications. The potential candidate models are ARIMA (1, 1, 2), ARIMA (1, 1, 1),
ARIMA (1, 1, 0), ARIMA (0, 1, 2) and ARIMA (0, 1, 1).
In the diagnostic checking stage, we examine the goodness of fit of the model. We must be
careful here to avoid both overfitting and underfitting the model. Summarized results of all five
specifications are provided in the following table.
Table 6 Summary of Estimation Stage
                      Model 1        Model 2        Model 3        Model 4        Model 5
                      ARIMA(1,1,2)   ARIMA(1,1,1)   ARIMA(1,1,0)   ARIMA(0,1,2)   ARIMA(0,1,1)
Degrees of freedom    33             34             35             34             35
AR(1) coefficient     -0.5695        0.7190         0.4447         -              -
                      (0.0000)       (0.0057)       (0.0057)
MA(1) coefficient     1.4292         -0.3648        -              0.4208         0.3449
                      (0.0000)       (0.2634)                      (0.0135)       (0.0316)
MA(2) coefficient     0.9531         -              -              0.2631         -
                      (0.0000)                                     (0.1139)
AIC                   -5.3357        -5.0690        -5.1057        -5.0852        -5.0655
SBC                   -5.1615        -4.9384        -5.0187        -4.9559        -4.9793
Adjusted R2           0.3768         0.1664         0.1761         0.1604         0.1225
Probability values are given in parentheses.
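The comparison across models can be made mechanical. The sketch below copies the AIC, SBC and adjusted R2 values from Table 6 into a dictionary and picks the best model under each criterion; the helper name `best_by` is hypothetical, and the code only ranks the reported numbers, it does not re-estimate anything.

```python
# Values copied from Table 6.
models = {
    "ARIMA(1,1,2)": {"aic": -5.3357, "sbc": -5.1615, "adj_r2": 0.3768},
    "ARIMA(1,1,1)": {"aic": -5.0690, "sbc": -4.9384, "adj_r2": 0.1664},
    "ARIMA(1,1,0)": {"aic": -5.1057, "sbc": -5.0187, "adj_r2": 0.1761},
    "ARIMA(0,1,2)": {"aic": -5.0852, "sbc": -4.9559, "adj_r2": 0.1604},
    "ARIMA(0,1,1)": {"aic": -5.0655, "sbc": -4.9793, "adj_r2": 0.1225},
}


def best_by(table, key, highest=False):
    """Return the model name with the lowest (or highest) value of `key`."""
    pick = max if highest else min
    return pick(table, key=lambda name: table[name][key])


print(best_by(models, "aic"))                   # ARIMA(1,1,2)
print(best_by(models, "sbc"))                   # ARIMA(1,1,2)
print(best_by(models, "adj_r2", highest=True))  # ARIMA(1,1,2)
```

All three criteria point to the same specification here; when they disagree, the significance of the coefficients and the residual diagnostics have to break the tie.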
The first model, ARIMA (1, 1, 2), shows that the AR (1) component as well as the MA (1) and
MA (2) components are statistically significant even at the 1 percent level of significance.
The adjusted R2 is 0.3768, showing that about 38 percent of the variation in the dependent
variable is jointly explained by the AR(1), MA(1) and MA(2) components, and the model is also
a good fit according to the F-statistic.
The second candidate model, ARIMA (1, 1, 1), shows that the MA (1) component is statistically
insignificant, with a probability value of 0.2634. The adjusted R2 decreases to 0.1664, whereas
the Schwarz info criterion (SIC, reported as SBC in Table 6) and the Akaike info criterion (AIC)
increase from -5.1615 and -5.3357 to -4.9384 and -5.0690 respectively. The models
ARIMA (1, 1, 0), ARIMA (0, 1, 2) and ARIMA (0, 1, 1) also have lower adjusted R2 values and
higher SIC and AIC values than ARIMA (1, 1, 2). Therefore, the evidence suggests that
ARIMA (1, 1, 2) is probably the most appropriate model. The selected model is then tested for
autocorrelation. The result of the Breusch-Godfrey serial correlation test is given below.
Since both the F probability and the Chi-Square probability are greater than 0.05, we are
unable to reject the null hypothesis of no autocorrelation.
2.4 Forecasting with the ARIMA (1, 1, 2)
Once a particular ARIMA model is fitted, we can use it for forecasting. There are two types of
forecast: static and dynamic. In static forecasts, we use the actual current and lagged values of
the forecast variable, whereas in dynamic forecasts, after the first-period forecast, we use the
previously forecast values of the forecast variable. Results of both dynamic and static forecasts
are given below.
The static forecasts of GDP are much closer to the actual values of GDP than the dynamic
forecasts.
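The difference between the two forecast types can be seen in a small sketch. For brevity it uses a plain AR (1) equation, y_t = c + phi * y_{t-1}, rather than the fitted ARIMA (1, 1, 2); the function names, the coefficients and the data are hypothetical and chosen only for illustration.

```python
def static_forecast(actual, const, phi):
    """One-step-ahead forecasts: each uses the ACTUAL previous value."""
    return [const + phi * actual[t - 1] for t in range(1, len(actual))]


def dynamic_forecast(actual, const, phi):
    """Multi-step forecasts: only the first one uses an actual value;
    every later forecast is built on the previous forecast."""
    forecasts = []
    previous = actual[0]
    for _ in range(1, len(actual)):
        previous = const + phi * previous
        forecasts.append(previous)
    return forecasts


# Illustrative values only, not the GDP series.
actual = [2.0, 2.5, 1.5, 2.0]
print(static_forecast(actual, const=1.0, phi=0.5))   # [2.0, 2.25, 1.75]
print(dynamic_forecast(actual, const=1.0, phi=0.5))  # [2.0, 2.0, 2.0]
```

Because the static forecast is corrected by the observed value every period, its errors do not accumulate, whereas the dynamic forecast carries each forecast error forward. This is one reason static forecasts typically track the actual series more closely.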