Assignment

The document describes how to build an ARIMA model using the Box-Jenkins three-stage method of identification, estimation, and diagnostic checking. It provides an example of identifying the best ARIMA model for GDP data of Pakistan from 1976 to 2014, finding that an ARIMA(1,1,2) model fits the data best.

Uploaded by

akashayosha

1. HOW TO BUILD ARIMA MODEL

To build an ARIMA (p,d,q) model, the Box-Jenkins three-stage method can be used for
estimating and forecasting a univariate time series. The three stages are:

(a) Identification,

(b) Estimation, and

(c) Diagnostic checking

1.1 Identification Stage

In the identification stage, the time plot of the series and its autocorrelation function (ACF)
and partial autocorrelation function (PACF) are examined visually. This stage includes the
following steps: (a) Calculate the ACF and PACF of the raw data and check whether the series is
stationary. If the series is stationary, go to step (c); if not, go to step (b). (b) Difference
the series and repeat the check in step (a) until the series is stationary. (c) Examine the
graphs of the ACF and PACF and determine which models would be good starting points.
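
The ACF and PACF used in these steps can be computed by hand. The following is a minimal sketch in Python (standard library only), not the routine EViews uses: the function names and the toy series are illustrative assumptions, and the PACF is obtained from the ACF with the Durbin-Levinson recursion. Spikes are usually judged against an approximate band of plus or minus 2 divided by the square root of the number of observations.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def pacf(r):
    """Partial autocorrelations from autocorrelations r (Durbin-Levinson)."""
    phi_prev, out = [], []
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
            phi = [phi_kk]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
        phi_prev = phi
    return out

# Hypothetical trending series (not the GDP data): its ACF dies down slowly.
toy = [float(t) for t in range(1, 40)]
r = acf(toy, 5)
band = 2.0 / math.sqrt(len(toy))   # approximate significance band
for lag, (a, p) in enumerate(zip(r, pacf(r)), start=1):
    print(lag, round(a, 3), round(p, 3), abs(a) > band)
```

A slowly decaying ACF, as in this toy trend, is the pattern that sends the analyst from step (a) to differencing.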

1.2 Estimation Stage

In this stage, each of the tentative models is estimated and the results are recorded for the
next stage.
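
As an illustration of what estimating a tentative model involves, the sketch below fits the simplest candidate, an AR(1) on the first differences, by ordinary least squares. It is a simplified stand-in written for this note, not the EViews procedure: the function name and the toy data are assumptions, and models with MA terms need nonlinear estimation that is not shown here.

```python
def fit_ar1_on_differences(y):
    """Fit dy_t = a + phi * dy_{t-1} + e_t by ordinary least squares."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    x, z = dy[:-1], dy[1:]                 # lagged difference, current difference
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxz = sum((v - mx) * (w - mz) for v, w in zip(x, z))
    phi = sxz / sxx                        # slope: AR(1) coefficient
    a = mz - phi * mx                      # intercept
    resid = [w - a - phi * v for v, w in zip(x, z)]
    return a, phi, resid

# Hypothetical noise-free series built so that dy_t = 0.02 + 0.5 * dy_{t-1}.
dy_toy = [0.10]
for _ in range(20):
    dy_toy.append(0.02 + 0.5 * dy_toy[-1])
y_toy = [10.0]
for d in dy_toy:
    y_toy.append(y_toy[-1] + d)

a_hat, phi_hat, resid = fit_ar1_on_differences(y_toy)
print(round(a_hat, 6), round(phi_hat, 6))
```

The residuals returned by the function are what the diagnostic checking stage would inspect.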

1.3 Diagnostic Checking

For each of the estimated models: (a) Check whether the parameter at the longest lag is
significant. If not, the model probably has too many parameters, and the order of p and/or q
should be reduced. (b) Check the ACF and PACF of the residuals. If the model has enough
parameters, all residual autocorrelations and partial autocorrelations will be insignificant.
(c) Compare the AIC and SBC, together with the adjusted R2, of the estimated models to find the
most parsimonious one (i.e. the one that minimizes AIC and SBC and has the highest adjusted R2).
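
The three selection statistics in step (c) can be written out explicitly. The sketch below assumes the per-observation forms of AIC and SBC and a sample of 37 usable observations. Neither is stated in the text: the sample size is inferred from the Breusch-Godfrey output in Section 2, where Obs*R-squared divided by R-squared is about 37. Under those assumptions the functions reproduce the figures reported in Table 1 and in the Breusch-Godfrey auxiliary regression.

```python
import math

def adjusted_r2(r2, n_obs, n_params):
    """Adjusted R-squared; n_params counts the constant as well."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params)

def aic(loglik, n_obs, n_params):
    """Akaike information criterion, per observation."""
    return (-2.0 * loglik + 2.0 * n_params) / n_obs

def sbc(loglik, n_obs, n_params):
    """Schwarz Bayesian criterion, per observation."""
    return (-2.0 * loglik + n_params * math.log(n_obs)) / n_obs

N_OBS = 37  # assumption: inferred from the reported output, not stated in the text

# Table 1: R-squared 0.428729 with 4 parameters (C, AR(1), MA(1), MA(2)).
print(round(adjusted_r2(0.428729, N_OBS, 4), 6))

# Breusch-Godfrey auxiliary regression: log likelihood 103.5551, 6 parameters.
print(round(aic(103.5551, N_OBS, 6), 6), round(sbc(103.5551, N_OBS, 6), 6))
```

Both criteria penalize extra parameters, SBC more heavily than AIC once the sample exceeds about eight observations, which is why they are used to guard against overfitting.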
2. HOW TO BUILD ARIMA MODEL IN E-VIEWS

We will try to identify the ARIMA (p,d,q) model for the Gross Domestic Product (GDP) of Pakistan
from 1976 to 2014. The data have been taken from the Ministry of Finance, Pakistan. First, plot
the data, because the time plot of a variable gives important information about the nature of
the series, i.e. whether the series is stationary and whether there are structural breaks,
missing observations, etc. The time plot of the series is presented in Figure 1.

Figure 1. Source: Ministry of Finance

The time plot of LGDP shows an increasing trend, i.e. LGDP tends to increase over time. The time
plot therefore suggests that the series may be non-stationary.

2.1 Identification Stage:

In the identification stage, we visually examine the autocorrelation function (ACF) and the
partial autocorrelation function (PACF) of the series. The plot of the ACF and PACF against the
lag is called a correlogram. The correlogram of LGDP is given below.
Figure 2 Correlogram of LGDP at Level

The correlogram of LGDP at level shows that the ACF does not die down quickly as the lag
increases, which indicates that the series is non-stationary at level; this is in line with the
time plot of LGDP. Therefore, we move on to the correlogram at first difference, because the
Box-Jenkins method is based on stationary time series.
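
The differenced series used from here on, DLGDP, can be built as follows. This is a minimal sketch that assumes LGDP is the natural logarithm of GDP, which the text does not state explicitly. The five GDP values are the 2000 to 2004 actual figures from the forecast comparison table in Section 2.4, and the function name is illustrative.

```python
import math

def log_first_difference(levels):
    """Return log(x_t) - log(x_{t-1}) for t = 1 .. n-1."""
    logs = [math.log(v) for v in levels]
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]

# GDP (actual), 2000-2004, from the forecast comparison table.
gdp = [5945211.62, 6063068.18, 6258569.17, 6561879.52, 7045355.90]

dlgdp = log_first_difference(gdp)   # one observation is lost by differencing
for year, d in zip(range(2001, 2005), dlgdp):
    print(year, round(d, 4))
```

Each value is approximately the annual growth rate of GDP, so differencing the log series removes the upward trend seen in the time plot.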
Figure 3 Correlogram of LGDP at 1st Difference

The correlogram shows that the data are now stationary, so we can identify the potential
candidate models. From Figure 3, we can see that there are two spikes in the ACF, after which
the autocorrelations are close to zero, while there is one spike in the PACF, which then dies
down to zero quickly. Therefore, the first stage of identification suggests that we may have
AR (1) and MA (2) specifications. The potential candidate models are ARIMA (1, 1, 2),
ARIMA (1, 1, 1), ARIMA (1, 1, 0), ARIMA (0, 1, 2) and ARIMA (0, 1, 1).

2.2 Estimation Stage:

In the estimation stage, we estimate ARIMA (1,1,2), ARIMA (1,1,1), ARIMA (1,1,0),
ARIMA (0,1,2) and ARIMA (0,1,1). The results of these models are given in Tables 1 to 5.
Table 1. Regression Result of ARIMA (1, 1, 2)
Dependent Variable: DLGDP

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.047840     0.005640      8.482320   0.0000
AR(1)          -0.569518     0.119404     -4.769689   0.0000
MA(1)           1.429225     0.029266      48.83620   0.0000
MA(2)           0.953138     0.017434      54.67117   0.0000

R-squared            0.428729   Akaike info criterion   -5.335684
Adjusted R-squared   0.376795   Schwarz criterion       -5.161531
F-statistic          8.255301   Prob(F-statistic)        0.000309

Table 2. Regression Result of ARIMA (1, 1, 1)
Dependent Variable: DLGDP

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.046901     0.007061      6.641975   0.0000
AR(1)           0.719026     0.243888      2.948176   0.0057
MA(1)          -0.364805     0.320782     -1.137236   0.2634

R-squared            0.212710   Akaike info criterion   -5.069006
Adjusted R-squared   0.166399   Schwarz criterion       -4.938391
F-statistic          4.593067   Prob(F-statistic)        0.017151

Table 3. Regression Result of ARIMA (1, 1, 0)
Dependent Variable: DLGDP

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.048118     0.005434      8.854779   0.0000
AR(1)           0.444733     0.150838      2.948421   0.0057

R-squared            0.198960   Akaike info criterion   -5.105745
Adjusted R-squared   0.176073   Schwarz criterion       -5.018669
F-statistic          8.693185   Prob(F-statistic)        0.005657
Table 4. Regression Result of ARIMA (0, 1, 2)
Dependent Variable: DLGDP

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.047569     0.004961      9.587981   0.0000
MA(1)           0.420795     0.161694      2.602417   0.0135
MA(2)           0.263059     0.162260      1.621223   0.1139

R-squared            0.205795   Akaike info criterion   -5.085154
Adjusted R-squared   0.160412   Schwarz criterion       -4.955871
F-statistic          4.534627   Prob(F-statistic)        0.017734

Table 5. Regression Result of ARIMA (0, 1, 1)
Dependent Variable: DLGDP

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.047586     0.004067      11.69954   0.0000
MA(1)           0.344881     0.154177      2.236911   0.0316

R-squared            0.146234   Akaike info criterion   -5.065470
Adjusted R-squared   0.122519   Schwarz criterion       -4.979281
F-statistic          6.166139   Prob(F-statistic)        0.017817

2.3 Diagnostic Test:

In the diagnostic checking stage, we examine the goodness of fit of the model. We must be
careful here to avoid both overfitting and underfitting of the model. Summarized results of all
five specifications are provided in the following table.
Table 6. Summary of Estimation Stage (probability values in parentheses)

                     Model 1         Model 2         Model 3         Model 4         Model 5
                     ARIMA (1,1,2)   ARIMA (1,1,1)   ARIMA (1,1,0)   ARIMA (0,1,2)   ARIMA (0,1,1)
Degrees of freedom   33              34              35              34              35
AR(1) coefficient    -0.5695         0.7190          0.4447          -               -
                     (0.0000)        (0.0057)        (0.0057)
MA(1) coefficient    1.4292          -0.3648         -               0.4208          0.3449
                     (0.0000)        (0.2634)                        (0.0135)        (0.0316)
MA(2) coefficient    0.9531          -               -               0.2631          -
                     (0.0000)                                        (0.1139)
AIC                  -5.3357         -5.0690         -5.1057         -5.0852         -5.0655
SBC                  -5.1615         -4.9384         -5.0187         -4.9559         -4.9793
Adjusted R2          0.3768          0.1664          0.1761          0.1604          0.1225
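
The comparison in Table 6 can be checked mechanically. The sketch below copies the AIC, SBC and adjusted R2 values from the table and picks the model that minimizes each criterion or maximizes the adjusted R2. The dictionary layout and variable names are illustrative choices.

```python
# AIC, SBC and adjusted R2 copied from Table 6.
models = {
    "ARIMA(1,1,2)": {"aic": -5.3357, "sbc": -5.1615, "adj_r2": 0.3768},
    "ARIMA(1,1,1)": {"aic": -5.0690, "sbc": -4.9384, "adj_r2": 0.1664},
    "ARIMA(1,1,0)": {"aic": -5.1057, "sbc": -5.0187, "adj_r2": 0.1761},
    "ARIMA(0,1,2)": {"aic": -5.0852, "sbc": -4.9559, "adj_r2": 0.1604},
    "ARIMA(0,1,1)": {"aic": -5.0655, "sbc": -4.9793, "adj_r2": 0.1225},
}

best_aic = min(models, key=lambda m: models[m]["aic"])      # smaller is better
best_sbc = min(models, key=lambda m: models[m]["sbc"])      # smaller is better
best_r2 = max(models, key=lambda m: models[m]["adj_r2"])    # larger is better

print(best_aic, best_sbc, best_r2)
# prints: ARIMA(1,1,2) ARIMA(1,1,2) ARIMA(1,1,2)
```

All three criteria agree here; when they disagree, the significance of the individual coefficients and the residual diagnostics help to decide.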
The first model, ARIMA (1, 1, 2), shows that the AR (1) component as well as the MA (1) and
MA (2) components are highly statistically significant, even at the 1 percent level of
significance. The adjusted R2 is 0.3768, showing that about 38 percent of the variation in DLGDP
is jointly explained by the AR (1), MA (1) and MA (2) components, and the model is also a good
fit according to the F-statistic.
The second candidate model, ARIMA (1, 1, 1), shows that the MA (1) component is statistically
insignificant, with a probability value of 0.2634. The adjusted R2 decreases to 0.1664, whereas
the Schwarz criterion (SBC) and the Akaike info criterion (AIC) increase from -5.1615 and
-5.3357 to -4.9384 and -5.0690 respectively. The models ARIMA (1, 1, 0), ARIMA (0, 1, 2) and
ARIMA (0, 1, 1) also have lower adjusted R2 values and higher SBC and AIC values than
ARIMA (1, 1, 2). Therefore, the evidence suggests that ARIMA (1, 1, 2) is probably the most
appropriate model. The selected model is tested for autocorrelation. The result of the
Breusch-Godfrey serial correlation test is given below:

Breusch-Godfrey Serial Correlation LM Test:

F-statistic      0.724327   Prob. F(2,31)         0.4927
Obs*R-squared    1.640040   Prob. Chi-Square(2)   0.4404

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.000109     0.005694      0.019226   0.9848
AR(1)           0.183570     0.196388      0.934728   0.3572
MA(1)           0.015044     0.032228      0.466788   0.6439
MA(2)           0.008849     0.019097      0.463345   0.6464
RESID(-1)      -0.310694     0.264332     -1.175393   0.2488
RESID(-2)       0.141756     0.218592      0.648497   0.5214

R-squared             0.044325   Mean dependent var       0.000275
Adjusted R-squared   -0.109816   S.D. dependent var       0.015278
S.E. of regression    0.016095   Akaike info criterion   -5.273248
Sum squared resid     0.008030   Schwarz criterion       -5.012018
Log likelihood        103.5551   Hannan-Quinn criter.    -5.181152
F-statistic           0.287564   Durbin-Watson stat       1.868743
Prob(F-statistic)     0.916278

Since both the F probability and the Chi-Square probability are greater than 0.05, we are unable
to reject the null hypothesis of no autocorrelation in the residuals.
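
The Chi-Square version of the test can be reproduced from the auxiliary regression. The sketch below assumes 37 observations, a number that is inferred from the reported output rather than stated in the text. It also uses the fact that a chi-square variable with 2 degrees of freedom has the survival function exp(-x/2), so the closed form applies only to a test with two lagged residuals.

```python
import math

def chi2_sf_2df(x):
    """P(X > x) for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)

n_obs = 37          # assumption: inferred from Obs*R-squared / R-squared
r2_aux = 0.044325   # R-squared of the auxiliary regression (reported above)

lm_stat = n_obs * r2_aux          # Obs*R-squared
p_value = chi2_sf_2df(lm_stat)    # compare with Prob. Chi-Square(2)

print(round(lm_stat, 4), round(p_value, 4))
reject_null = p_value < 0.05      # False means no evidence of serial correlation
```

The small gap between the computed statistic and the reported 1.640040 comes from using the rounded R-squared.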
2.4 Forecasting with the ARIMA (1, 1, 2)
Once a particular ARIMA model is fitted, we can use it for forecasting. There are two types of
forecast: static and dynamic. In static forecasts, we use the actual current and lagged values
of the forecast variable, whereas in dynamic forecasts, after the first-period forecast, we use
the previously forecast values of the forecast variable. The results of both the dynamic and the
static forecasts are given below:

Figure 4 Dynamic Forecasts of ARIMA (1, 1, 2)

Figure 5 Static Forecasts of ARIMA (1, 1, 2)
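
The difference between the two forecast types can be sketched in a few lines. To keep the example short it uses an AR (1) rule on the log differences instead of the selected ARIMA (1, 1, 2), so its output will not match Figures 4 and 5. The coefficients are taken from Table 3 under the assumption that C is the mean of DLGDP, which the text does not state. The function name is hypothetical, and the data are the 2000 to 2005 actual GDP values from the comparison table that follows.

```python
import math

def forecast_levels(actual, mu, phi, dynamic):
    """Forecast levels with dlog_t = mu + phi * (dlog_{t-1} - mu).

    Static: every step uses actual lagged data (one-step-ahead forecasts).
    Dynamic: after the first step, earlier forecasts replace the actual data.
    """
    logs = [math.log(v) for v in actual]
    prev_log, prev_dlog = logs[1], logs[1] - logs[0]   # start-up values
    out = []
    for t in range(2, len(actual)):
        dlog_hat = mu + phi * (prev_dlog - mu)
        log_hat = prev_log + dlog_hat
        out.append(math.exp(log_hat))
        if dynamic:
            prev_log, prev_dlog = log_hat, dlog_hat            # feed forecasts back
        else:
            prev_log, prev_dlog = logs[t], logs[t] - logs[t - 1]  # use actual data
    return out

MU, PHI = 0.048118, 0.444733   # Table 3 values; interpretation of C is an assumption
gdp = [5945211.62, 6063068.18, 6258569.17, 6561879.52, 7045355.90, 7585588.27]

static_fc = forecast_levels(gdp, MU, PHI, dynamic=False)
dynamic_fc = forecast_levels(gdp, MU, PHI, dynamic=True)
for year, s, d in zip(range(2002, 2006), static_fc, dynamic_fc):
    print(year, round(s, 2), round(d, 2))
```

The first forecast is identical in both modes; the paths separate from the second period onward, where the dynamic forecast starts to carry its own earlier errors forward.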


On the basis of the Theil coefficient, the dynamic forecast does not perform as well as the
static forecast, because the static forecast uses the actual lagged values to make each
one-step-ahead forecast. A comparison of some forecasted and actual values is given in the
following table.

obs    GDP (Actual)   GDP (Dynamic Forecast)   GDP (Static Forecast)
2000     5945211.62               5426370.15              5946598.54
2001     6063068.18               5692276.57              6215142.13
2002     6258569.17               5971214.35              6236967.84
2003     6561879.52               6263820.14              6503191.29
2004     7045355.90               6570764.85              6997449.40
2005     7585588.27               6892750.45              7428180.90
2006     8216160.00               7230514.39              8131390.09
2007     8613232.00               7584829.60              8762840.74
2008     8759778.00               7956507.30              8906479.37
2009     9007825.00               8346398.20              8984466.93
2010     9152553.00               8755394.86              9441901.51
2011     9404102.00               9184433.47              9374860.62
2012     9733907.00               9634496.17              9733516.86
2013    10159011.00              10106613.19             10320123.56
2014    10640381.00              10601865.26             10450524.77

The static forecasts of GDP are, on the whole, closer to the actual values of GDP than the
dynamic forecasts.
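
The comparison can be quantified from the table. The sketch below computes the root mean squared error and the Theil inequality coefficient in its common form, which lies between 0 and 1 with 0 meaning a perfect fit. This is assumed to be the statistic referred to above. Because only the 2000 to 2014 rows are used, the results need not equal the values shown in Figures 4 and 5.

```python
import math

actual = [5945211.62, 6063068.18, 6258569.17, 6561879.52, 7045355.90,
          7585588.27, 8216160.00, 8613232.00, 8759778.00, 9007825.00,
          9152553.00, 9404102.00, 9733907.00, 10159011.00, 10640381.00]
dynamic = [5426370.15, 5692276.57, 5971214.35, 6263820.14, 6570764.85,
           6892750.45, 7230514.39, 7584829.60, 7956507.30, 8346398.20,
           8755394.86, 9184433.47, 9634496.17, 10106613.19, 10601865.26]
static = [5946598.54, 6215142.13, 6236967.84, 6503191.29, 6997449.40,
          7428180.90, 8131390.09, 8762840.74, 8906479.37, 8984466.93,
          9441901.51, 9374860.62, 9733516.86, 10320123.56, 10450524.77]

def rmse(forecast, observed):
    """Root mean squared forecast error."""
    n = len(observed)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, observed)) / n)

def theil_u(forecast, observed):
    """Theil inequality coefficient: RMSE scaled to lie between 0 and 1."""
    n = len(observed)
    root_f = math.sqrt(sum(f * f for f in forecast) / n)
    root_a = math.sqrt(sum(a * a for a in observed) / n)
    return rmse(forecast, observed) / (root_f + root_a)

u_dynamic = theil_u(dynamic, actual)
u_static = theil_u(static, actual)
print("dynamic:", round(rmse(dynamic, actual)), u_dynamic)
print("static: ", round(rmse(static, actual)), u_static)
```

A smaller coefficient means a better forecast, so on these rows the static forecast should show the lower value.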
