Application of ARIMAX Model

This document describes applying an ARIMAX model to forecast annual paddy production in Trincomalee District, Sri Lanka. It uses time series data on monthly rainfall and annual paddy production from 1970 to 2010. The ARIMAX model incorporates rainfall data as an input variable to forecast paddy production. There is a significant positive correlation between paddy production and rainfall in both the Maha and Yala seasons. The study aims to validate the ARIMAX model using various selection criteria to accurately forecast paddy production based on rainfall patterns.
B. Yogarajah(1), C. Elankumaran(2) and R. Vigneswaran(3)

Application of ARIMAX Model for Forecasting Paddy Production in Trincomalee District in Sri Lanka

(1) Department of Physical Science, Vavuniya Campus, University of Jaffna, Sri Lanka.
(2) Department of Economics, University of Jaffna, Jaffna, Sri Lanka.
(3) Department of Mathematics and Statistics, University of Jaffna, Sri Lanka.
(e-mail: [email protected])

Abstract: In the post-war period, the government's mandate focuses on reviving Trincomalee district, one of the paddy-producing regions of Sri Lanka, to meet the growing demands of the nation. Such a rehabilitation programme requires an understanding of how the paddy-producing industry has fared over time. This understanding is essential for preparing development plans for the rice sector.

When an ARIMA model includes other time series as input variables, the model is referred to as an ARIMAX model (Pankratz, 1991). In this paper, an ARIMAX model is applied to forecast annual paddy production, with the rainfall time series as the input variable, for both seasons in this district. The validity of the model is verified with various model selection criteria, such as the adjusted R², the minimum AIC and SBC, and the lowest MAPE values.

Keywords: ARIMAX model; Forecasting; AIC; SBC; MAPE

Introduction

Rice is the most extensively cultivated crop in Trincomalee district in Sri Lanka. Owing to the unsettled situation in this district over the last two decades, the rice sector has entered a difficult stage of development and faces adjustment problems (IRI). There is room for further improvement by finding ways and means of fully utilizing the general cultivable area in this district.

In this paper, rainfall and annual paddy production are treated as time series processes. There are two cropping seasons in the district, corresponding to the north-east monsoon (the Maha season) and the south-west monsoon (the Yala season). This dry-zone district shows a significant relationship between annual paddy production and rainfall in both the Maha and Yala seasons (Table 3) (Yoshino, 1984b: 95). With regard to his Sri Lankan data, Yoshino confirmed the generality of the relationship between rainfall and the planted area.

ARIMA time series techniques have been used to forecast agricultural products. Examples include Hossian et al. (2006), who forecasted the prices of three varieties of pulses (motor, mash and mung) in Bangladesh with monthly data from January 1998 to December 2000; Wankhade et al. (2010), who forecasted pigeon pea production in India with annual data from 1950-1951 to 2007-2008; Mandal (2005), who forecasted sugarcane production in India; Iqbal et al. (2005), who forecasted the area and production of wheat in Pakistan; Masuda and Goldsmith (2009), who forecasted world soybean production; and Cooray (2006), who forecasted Sri Lanka's monthly total production of tea beyond Sept 1988 using monthly data from January 1988 to September 2004. With these exceptions, there is a paucity of studies applying ARIMA models to the forecasting of agricultural products.

When an ARIMA model includes other time series as input variables, the model is referred to as an ARIMAX model (Pankratz, 1991). In this paper, an ARIMAX model is applied to forecast annual paddy production, with the rainfall time series as the input variable, for both seasons in this district. This paper applies the Autoregressive Integrated Moving Average (ARIMA) approach, one of the most

Proceedings of the Third International Symposium,
SEUSL: 6-7 July 2013, Oluvil, Sri Lanka

popular and widely used forecasting models for univariate time series data. Although it is applied across various functional areas, its application in agriculture is very limited, mainly because the required data are unavailable and because agricultural production typically depends on monsoonal rain and other factors that ARIMA models fail to incorporate.

Materials and Methods

The present study applies the Box-Jenkins (1970) forecasting model, popularly known as the ARIMA model. ARIMA is an extrapolation method that requires historical time series data on the underlying variable. The ARIMA approach was first popularized by Box and Jenkins, and ARIMA models are often referred to as Box-Jenkins models. The general transfer function model employed by the ARIMA procedure was discussed by Box and Tiao (1975). When an ARIMA model includes other time series as input variables, the model is sometimes referred to as an ARIMAX(p,d,q) model. Pankratz (1991) refers to the ARIMAX model as dynamic regression.

$$W_t = \mu + \frac{\theta(B)}{\phi(B)}\, a_t$$

where $t$ indexes time; $W_t$ is the response series or a difference of the response series; $\mu$ is the mean term; $B$ is the backshift operator, that is, $B X_t = X_{t-1}$; $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ is the autoregressive operator, represented as a polynomial in the backshift operator; $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is the moving-average operator, represented as a polynomial in the backshift operator; and $a_t$ is the independent disturbance, also called the random error.

ESACF and SCAN Methods

The Extended Sample Autocorrelation Function (ESACF) and the Smallest CANonical (SCAN) methods can tentatively identify the orders of a stationary or nonstationary ARMA process based on iterated least squares estimates of the autoregressive parameters. Tsay and Tiao (1984) proposed the technique, and Choi (1990) provides useful descriptions of the algorithm.

Data Collection and Arrangement

Monthly rainfall and yearly paddy production data for the period 1970 to 2010 are used. They were collected by the Trincomalee meteorological station and taken from the Statistical Abstract releases of 1992, 1996, 2000, 2004, 2008 and 2010 published by the Department of Census and Statistics. These data are used to explore the annual trend of paddy production and the seasonal variation of rainfall in the district, and to build the Autoregressive Integrated Moving Average with exogenous input (ARIMAX) models with rainfall as the input variable for both seasons. The collected data are divided into two sets, calibration data and validation data, in order to test the performance of the suggested model.

Result and Discussion

From Table 1, the correlation between production in the Maha season (Mprd) and rainfall of that season (MRfl) is 0.48207, and the correlation between production in the Yala season (Yprd) and rainfall of that season (YRfl) is 0.42209. Therefore, this auxiliary rainfall information can be used in the model building process.

Figure 1 and Figure 2 show the relationship between paddy production and rainfall, season by season, in this district.

Table 1: Pearson Correlation Coefficients, N = 30 (Prob > |r| under H0: Rho=0)

Variable    TotalPrd   Mprd    Yprd    MRfl    YRfl
TotalPrd    1.000      0.926   0.809   0.377   0.160
Mprd        0.926      1.000   0.612   0.482   0.101
Yprd        0.809      0.612   1.000   0.091   0.423
MRfl        0.377      0.482   0.091   1.000   0.011
YRfl        0.160      0.101   0.423   0.011   1.000

Descriptive statistics

A preliminary look at the nature of the data (Figure 1 and Figure 2) shows that the output follows a highly volatile pattern, with ups and downs over time; some of the trends may be non-linear and non-normal. In time series terms, the variables are non-stationary: their mean and variance are not constant but vary with time, which means that the values of these series are widely dispersed around their means. This is also reflected in the Conclusion.
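As a rough check on Table 1, the significance of a Pearson correlation under H0: Rho=0 can be assessed with the statistic t = r * sqrt((N - 2) / (1 - r^2)) on N - 2 degrees of freedom. The sketch below is only an illustration and not the authors' procedure; the function names are hypothetical. It plugs in the two correlations quoted in the text with N = 30 and compares |t| with the two-sided 5% critical value of Student's t for 28 degrees of freedom (2.048).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Correlations quoted in the text, with N = 30 as stated in Table 1.
N = 30
T_CRIT_5PCT_DF28 = 2.048  # two-sided 5% critical value, 28 degrees of freedom

for label, r in [("Mprd vs MRfl", 0.48207), ("Yprd vs YRfl", 0.42209)]:
    t = t_statistic(r, N)
    significant = abs(t) > T_CRIT_5PCT_DF28
    print(f"{label}: r = {r:.3f}, t = {t:.2f}, significant at 5%: {significant}")
```

With these inputs the two statistics are roughly 2.9 and 2.5, both above 2.048, which is in line with the statement that seasonal rainfall carries usable information about seasonal production. `pearson_r` is included only to show how r itself would be obtained from two raw series.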


Model Building and Analysis

[Flowchart of the analysis routine, with steps labelled: Decode rainfall data; Arrange time series data; Analyze variance; Variance not stable; Differencing; Obtain stationary data; ACF, PACF; SCAN, EACF; ARIMA model; Test of significance; Residual diagnosis; Validate; Forecast; Conclusions]

Figure 3 gives the scatter plot and the ACF, PACF and IACF graphs at autoregressive order p=1. This shows the stationarity of rainfall in the Maha season.

[Figure 3 panels, titled "Trend and Correlation Analysis for MRfl": MRfl against Observation; ACF, PACF and IACF against Lag]

Figure 3: Trend and Correlation Analysis
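The sample autocorrelation function shown in plots such as Figure 3 can be computed directly from a series. The following is a minimal sketch in Python, not the software used in the paper; `sample_acf` is a hypothetical helper, the input series is a toy example rather than the study's rainfall data, and the ±1.96/sqrt(n) band is the usual large-sample approximation for white noise.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


# Toy alternating series (illustrative only, not the study's data).
x = [5.0, 7.0, 4.0, 8.0, 3.0, 9.0, 4.0, 8.0, 5.0, 7.0]
band = 1.96 / math.sqrt(len(x))  # approximate 95% bounds under white noise

for k, r in enumerate(sample_acf(x, 3), start=1):
    position = "outside" if abs(r) > band else "inside"
    print(f"lag {k}: r = {r:+.3f} ({position} the +/-{band:.3f} band)")
```

Autocorrelations that die out quickly and stay inside the band are what one would expect from a stationary series; slowly decaying values suggest that differencing is needed first.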

Stationary vs. non-stationary


The ARIMAX model is generally applied to stationary time series data. Whether the series are stationary or non-stationary is checked with the Augmented Dickey-Fuller (ADF) test (Dickey and Fuller, 1979). The ADF test results are estimated at level and at first difference for both seasons. The results show that the variables production and rainfall are non-stationary at level but stationary at first difference: the null hypothesis of non-stationarity is not rejected for the level data but is rejected for the first-differenced data.
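The first-differencing step referred to here corresponds to applying the operator (1 - B) to the series. The sketch below illustrates only that step; it is not an ADF test, the numbers are a toy trending series rather than the study's data, and `difference` is a hypothetical helper. Comparing the means of the two halves before and after differencing gives an informal feel for why the differenced series behaves more like a stationary one.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): return x_t - x_{t-lag} for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Toy series with a linear trend plus a small alternating term
# (illustrative numbers only, not the study's data).
level = [10.0 + 2.0 * t + (1.0 if t % 2 else -1.0) for t in range(12)]
first_diff = difference(level)

half = len(level) // 2
print("level: mean of first half =", mean(level[:half]),
      "| mean of second half =", mean(level[half:]))

half_d = len(first_diff) // 2
print("first difference: mean of first half =", mean(first_diff[:half_d]),
      "| mean of second half =", mean(first_diff[half_d:]))
```

In an ARIMA(p,1,q) specification this differencing is the "I" part with d = 1; a formal unit-root test such as the ADF test would still be needed to justify it.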

Figure 1: Plot of Paddy and Rainfall data in the Yala Season

Figure 2: Plot of Paddy and Rainfall data in the Maha season

Figure 4: Plot of Forecast for Paddy Production in the Maha Season

Now the question may arise: how do we know whether the identified model is appropriate or not? One simple way to answer it is diagnostic checking of the residual term obtained from the ARIMA model, applying


the same ACF and PACF functions. Obtain the ACF and PACF of the residual term of the estimated ARIMAX model, up to a certain number of lags, and then check whether the coefficients are statistically significant using the Box-Pierce Q and Ljung-Box LB statistics, respectively (Figure 4). If the residuals obtained from the model are purely random, then the estimated ARIMA model is correct; otherwise we have to look for an alternative specification of the model. Similarly, diagnostic checking can also be done through the adjusted R² and the minimum of the Akaike Information Criterion (AIC) and the Schwarz Bayesian Criterion (SBC). Table III reports the estimated results.

Diagnostic checking:

[Figure 6 panels, titled "Residual Correlation Diagnostics for YRfl(1)": ACF, PACF, IACF and White Noise Prob, each against Lag]

Figure 6: Residual Correlation Diagnostic

Figure 7: Residual Normality Diagnostic
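The residual checks described in this section can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code: the residuals and hold-out values are hypothetical, the function names are made up for the example, and the AIC and SBC are written in the common least-squares form n·ln(SSE/n) + penalty, which may differ by a constant from the values a particular package reports. The Ljung-Box statistic is Q = n(n+2) Σ r_k²/(n−k), compared with a chi-square critical value whose degrees of freedom are the number of lags minus the number of estimated ARMA parameters.

```python
import math


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    m = sum(residuals) / n
    dev = [e - m for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def aic_sbc(residuals, n_params):
    """AIC and SBC from the residual sum of squares (constants dropped)."""
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    base = n * math.log(sse / n)
    return base + 2 * n_params, base + n_params * math.log(n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical residuals from a model with 2 ARMA parameters (illustration only).
resid = [0.4, -0.6, 0.1, 0.5, -0.3, -0.2, 0.7, -0.5, 0.2, -0.3, 0.1, -0.1]
q = ljung_box_q(resid, max_lag=6)
CHI2_CRIT_5PCT_DF4 = 9.488  # 6 lags minus 2 estimated parameters = 4 df
print("Ljung-Box Q =", round(q, 3),
      "| no evidence against white noise:", q < CHI2_CRIT_5PCT_DF4)

aic, sbc = aic_sbc(resid, n_params=2)
print("AIC =", round(aic, 3), "| SBC =", round(sbc, 3))

# Hypothetical hold-out values and forecasts.
print("MAPE (%) =", round(mape([100.0, 120.0], [90.0, 126.0]), 2))
```

When several candidate orders are fitted to the same calibration data, the one with the smaller AIC and SBC and the lower MAPE on the validation data would be preferred, provided its residuals pass the white-noise check.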

Figure 5: Plot of Forecasting for Paddy Production in the Yala Season

As per the findings, the best ARIMA models for Trincomalee District are:

Yala season: ARIMA(1,1,1)

Maha season: ARIMA(1,1,0)

Forecasting: Once the three previous steps of ARIMAX modelling are complete, forecasted values can be obtained by estimating the appropriate model, which is free from problems. The forecasted values obtained from the ARIMA model are reported in Table 3. Forecasts are reported for a maximum of 5 years, as forecasting too far ahead might not be appropriate.

Conclusion

While applying various quantitative and qualitative models for forecasting, it is essential to understand the underlying factors affecting it. Thus, forecasting agricultural productivity is not an exception to it. In this paper, the ARIMA model has been applied to a few selected agricultural products in India. As the model requires a large number of data points, and considering the availability of the required annual data, 34 different agricultural products have been selected. Annual data from 1950 or 1957 onwards to 2010, as the case may be, have been used. All the necessary steps of the ARIMA model have been applied systematically for forecasting 5 periods ahead from 2011 onwards. Among these items, tea provides the lowest MAPE value, whereas cardamom provides the lowest AIC value. Similarly, the highest MAPE is obtained for papaya and the highest AIC value for sugarcane. Now the question may arise: since agricultural productivity depends upon many factors such as rainfall, irrigation facilities, monsoon, climate, soil and fertilizer, the forecasted values might be accurate only under the ceteris paribus assumption. However, generally all the factors do not go well every time and in the right direction; therefore the reliability of these forecasted values might be questionable. In this context one needs to rethink other forecasting models that could incorporate more information for forecasting agricultural products. This could be one of the limitations of the paper.

In this model fitting process for the monthly rainfall data from 1952 to 2010 in Trincomalee district, a more appropriate model than the others is ARIMA(0,0,1)(0,0,1)12.

The seasonal ARIMA model for monthly data has the following mathematical form:

$$(1 - B)(1 - B^{12})\,\text{Monthly Rainfall}_t = (1 - 0.903\,B)(1 - 0.999\,B^{12})\,a_t$$

where $B$ is the backshift operator, that is, $B X_t = X_{t-1}$, and $a_t$ is the independent disturbance, also called the random error.

Key findings indicate that the rainfall patterns in the study area are non-autoregressive, that is, they do not depend on the past history of rainfall, but depend predominantly on a nonlinear trend and a seasonal pattern of order 12. This indicates that, in order to arrive at a comprehensive forecasting model for rainfall in Trincomalee, there is a need to focus on the influences of non-endemic and regional-to-global climatic phenomena.

However, the model does not account for variations in precipitation due to cyclonic activity, which is a significant factor of variation in the local climate. This shortcoming should be addressed in future research projects of this nature to enable a much clearer picture of climatic changes in the district.

The findings of this work can be applied in decision making for paddy cultivation and related water resources management.

Future work

Modelling paddy production in the district and determining whether the annual rainfall of the district has an impact on paddy production in the district.

References

Box, G.E.P., Jenkins, G.M., and Reinsel, G.C. "Time Series Analysis: Forecasting and Control". Third Edition, Englewood Cliffs, NJ: Prentice Hall, pp.197-199. 1994.
Box, G.E.P. and Tiao, G.C. "Intervention Analysis with Applications to Economic and Environmental Problems". JASA, 70, pp.70-79. 1975.
Yogarajah, B. "A time-series modeling approach for understanding the behavior of rainfall patterns in the Anuradhapura District". Journal of the Sri Lanka Association of Geographers, pp.82-92. 2010.
Brocklebank, J.C. and Dickey, D.A. "SAS System for Forecasting Time Series". Edition, Cary, North Carolina: SAS Institute Inc. 1986.
Chatfield, C. "The Analysis of Time Series: An Introduction". 5th Edn., Chapman and Hall, UK. 1996.
Domroes, M. "Aspects of aridity and drought in the monsoon climate of Sri Lanka". Indian Journal. Hydrology and geophysics. 21(1 and 2): 384-394. 1978.
Domroes, M. and Ranatunge, E. "A statistical approach towards a regionalization of daily rainfall in Sri Lanka". Article first published online: DOI: 10.1002/joc.3370130704. 2006.
Manobavan, M. "The Response of Terrestrial Vegetation to El Niño Southern Oscillation". Unpublished PhD Thesis. Faculty of Science, Kingston University, Surrey, United Kingdom. 2003.
Meyler, A., Kenny, G., and Quinn, T. "Forecasting Irish Inflation Using ARIMA Models". Technical Paper (3/RT/98). Economic Analysis, Research and Publication Department, Central Bank of Ireland. 1998.
Singh, S.M. "Statistical Modeling of Climate Parameters". Asian Journal of Current Engineering and Maths 1: 2 March - April (2012) 29.
Thambyahpillay, G. "Climatic changes in Ceylon". MA Thesis, University of Cambridge. 1958.
