Machine learning approach for forecasting crop yield based on climatic parameters

Conference Paper · January 2014


DOI: 10.1109/ICCCI.2014.6921718

Veenadhari Suraparaju
Rabindranath Tagore University


2014 International Conference on Computer Communication and Informatics (ICCCI -2014), Jan. 03 –
05, 2014, Coimbatore, INDIA

Machine learning approach for forecasting crop yield based on climatic parameters
S. Veenadhari, Ph.D. Scholar, MGCGV, Chitrakoot, Madhya Pradesh
Dr. Bharat Misra, Associate Professor, MGCGV, Chitrakoot, Madhya Pradesh
Dr. CD Singh, Senior Scientist, CIAE, Bhopal, Madhya Pradesh
[email protected]

Abstract: Climate change has adversely affected the performance of most agricultural crops in India over the last two decades. Predicting crop yield well ahead of harvest would help policy makers and farmers take appropriate measures for marketing and storage. Such predictions would also help the associated industries plan the logistics of their business. Several methods of predicting and modeling crop yields have been developed in the past with varying rates of success, as these do not take into account the characteristics of the weather and are mostly empirical. In the present study, a software tool named 'Crop Advisor' has been developed as a user-friendly web page for predicting the influence of climatic parameters on crop yields. The C4.5 algorithm is used to find the most influential climatic parameter on the yields of selected crops in selected districts of Madhya Pradesh. The software provides an indication of the relative influence of different climatic parameters on crop yield. Other agro-input parameters responsible for crop yield are not considered in this tool, since the application of these inputs varies with individual fields in space and time.

Key words: Climate, agricultural productivity, C4.5 algorithm, prediction

I. INTRODUCTION

Crop production is a complex phenomenon that is influenced by agro-climatic input parameters. Agricultural input parameters vary from field to field and from farmer to farmer, and collecting such information over a large area is a daunting task. However, the climatic information collected in India at every 1 sq.m area in different parts of the district is tabulated by the Indian Meteorological Department. These large data sets can be used to predict the influence of climate on the major crops of a particular district or place.

Researchers all over the world have developed and evaluated different forecasting methodologies in agriculture and associated sciences. Some of these studies are summarized here. Agricultural researchers in Pakistan have shown that attempts at crop yield maximization through pro-pesticide state policies have led to dangerously high pesticide usage. These studies reported a negative correlation between pesticide usage and crop yield [1]. They showed how data mining of integrated agricultural data, including pest scouting, pesticide usage and meteorological data, is useful for optimizing pesticide usage. Thematic information related to agriculture that has spatial attributes was reported in one study [6], which aimed at discerning trends in agricultural production with reference to the availability of inputs. The k-means method was used to forecast atmospheric pollution [4], the k-nearest-neighbor method was applied to simulate daily precipitation and other weather variables [11], and different possible changes in weather scenarios were analyzed using SVMs [13]. Data mining techniques are often used to study soil characteristics. As an example, the k-means approach is used for classifying soils in combination with GPS-based technologies [14]. Apples have been checked using different approaches before being sent to market: [9] uses a k-means approach to analyze color images of fruits as they run on conveyor belts, and [12] uses X-ray images of apples to monitor the presence of water cores, with a neural network trained to discriminate between good and bad apples. Spatial data mining, in particular a decision tree algorithm, was applied to agricultural land grading [15]. The author combined spatial data mining techniques with expert system techniques to establish an intelligent agricultural land grading information system, adopting the C4.5 decision tree algorithm and implementing it with Mo2.0 and VC++6.0. The study showed the advantages of this method in addressing problems in land grading. A decision tree classifier for agricultural data was proposed in [5]. This new classifier uses a new data expression and can deal with both complete and incomplete data. In the experiment, 10-fold cross validation

978-1-4799-2352-6/14/$31.00 ©2014 IEEE


method is used to test dataset, horse-colic dataset and soybean dataset. Their results showed that the proposed decision tree is capable of classifying all kinds of agricultural data. A data mining technique for the evolution of association rules for droughts and floods in India was applied using climate inputs [2]. In that study, a data mining algorithm using the concepts of minimal occurrences with constraints and time lags was used to discover association rules between extreme rainfall events and climatic indices. Rainfall events were forecast using data mining techniques [7]. The occurrence of a prolonged dry period or heavy rain at the critical stages of crop growth and development may lead to a significant reduction in crop yield. Sugarcane yield was estimated in Brazil using 10-day periods of SPOT vegetation NDVI images and meteorological data [3]. A data mining approach based on spatio-temporal data was used to forecast irrigation water demand [8]. Data sets were prepared containing attributes obtained from meteorological data, remote sensing images and water delivery statements. To make the prepared data sets useful for demand forecasting and pattern extraction, they were processed using a novel approach based on a combination of irrigation and data mining knowledge. Decision tree techniques were applied to forecast future water requirements.

II. METHODOLOGY

The present study aimed to develop a web site for finding out the influence of climatic parameters on crop production in selected districts of Madhya Pradesh. Districts were selected based on the area under a particular crop: the top five districts with the maximum area under the selected crop were chosen. The crops were selected based on the predominant crops in the selected districts and include soybean, maize, paddy and wheat. The yield of these crops was tabulated for 20 continuous years using information from secondary sources. Climatic parameters for the corresponding years, such as rainfall, maximum and minimum temperature, potential evapotranspiration, cloud cover and wet day frequency, were also collected from secondary sources. The methodology adopted for analysis is the C4.5 decision tree algorithm, in which attribute values above a threshold are treated as one child and the remaining values as another child. It also handles missing attribute values. In pseudo code, the general algorithm for building decision trees is:

• Check for base cases
• For each attribute a: find the normalized information gain from splitting on a
• Let a_best be the attribute with the highest normalized information gain
• Create a decision node that splits on a_best
• Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node.

Let $S$ be a set of training samples, where the class label of each sample is known. Each sample is in fact a tuple. One attribute is used to determine the class of the training samples. Suppose that there are $m$ classes. Let $S$ contain $s_i$ samples of class $C_i$ for $i = 1, \dots, m$. An arbitrary sample belongs to class $C_i$ with probability $s_i/s$, where $s$ is the total number of samples in set $S$. The expected information needed to classify a given sample is

$$I(s_1, s_2, \dots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s}$$

An attribute $A$ with values $\{a_1, a_2, \dots, a_v\}$ can be used to partition $S$ into the subsets $\{S_1, S_2, \dots, S_v\}$, where $S_j$ contains those samples in $S$ that have value $a_j$ of $A$. Let $S_j$ contain $s_{ij}$ samples of class $C_i$. The expected information based on this partitioning by $A$ is known as the entropy of $A$. It is the weighted average:

$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \dots + s_{mj}}{s} \, I(s_{1j}, \dots, s_{mj})$$

The information gain obtained by this partitioning on $A$ is defined by

$$\mathrm{Gain}(A) = I(s_1, s_2, \dots, s_m) - E(A)$$

In this approach to relevance analysis, we can compute the information gain for each of the attributes defining the samples in $S$. The attribute with the highest information gain is considered the most discriminating attribute of the given set. By computing the information gain for each attribute, we therefore obtain a ranking of the attributes. This ranking can be used for relevance analysis to select the attributes to be used in concept description.
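The expected information, entropy of a partition and information gain used in the methodology can be sketched in Python as follows. This is an illustrative sketch only, not the Crop Advisor implementation (which the paper describes as C# on .NET). The function names and the toy attribute values and labels are hypothetical, the gain ratio uses the standard split-information normalization, and the threshold search simply tries midpoints between sorted values and ranks them by plain information gain.

```python
import math
from collections import Counter


def expected_information(labels):
    """I(s_1, ..., s_m): entropy of the class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def partition_entropy(subsets):
    """E(A): size-weighted average entropy of the subsets S_1..S_v."""
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * expected_information(s)
               for s in subsets if s)


def information_gain(labels, subsets):
    """Gain(A) = I(s_1, ..., s_m) - E(A)."""
    return expected_information(labels) - partition_entropy(subsets)


def gain_ratio(labels, subsets):
    """Normalized gain: Gain(A) divided by the split information."""
    total = len(labels)
    split_info = -sum(len(s) / total * math.log2(len(s) / total)
                      for s in subsets if s)
    return information_gain(labels, subsets) / split_info if split_info else 0.0


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return (threshold, gain)."""
    distinct = sorted(set(values))
    best = (None, 0.0)
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        # Values above the threshold form one child, the rest the other.
        below = [c for v, c in zip(values, labels) if v <= t]
        above = [c for v, c in zip(values, labels) if v > t]
        gain = information_gain(labels, [below, above])
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical toy data: one continuous attribute and a two-class label.
values = [10.0, 20.0, 30.0, 40.0]
labels = ["low", "low", "high", "high"]

threshold, gain = best_threshold(values, labels)
print(threshold, gain)
```

On this toy data the midpoint 25.0 separates the two classes perfectly, so the gain equals the full one bit of expected information in the labels. Ranking several attributes would amount to calling `gain_ratio` once per attribute and sorting the results.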
III. RESULTS AND DISCUSSION


A web-based software tool has been developed in C# on the .NET platform, with SQL Server 2008 as the back end. The home page of the web site (www.cropadvisor.in) presents the methodology adopted in the study, contact information of the administrator, new user registration and registered user login. For registered users the window is displayed as:

[Screenshot of the registered user window not reproduced]

This website is designed as an interactive software tool for predicting the influence of climatic parameters on crop yields. The C4.5 algorithm is used to find the most influential climatic parameter on the yields of selected crops in selected districts of Madhya Pradesh. The software provides an indication of the relative influence of different climatic parameters on crop yield. Other agro-input parameters responsible for crop yield are not considered in this tool, since the application of these inputs varies with individual fields in space and time. Based on the C4.5 algorithm, a decision tree and decision rules have been developed, which are displayed when the decision tree icon is selected. The screenshot of the same appears as:

[Screenshot of the decision tree window not reproduced]

Using the developed software, the influence of climatic parameters on crop productivity in selected districts of Madhya Pradesh was analyzed for the predominant crops. For soybean in all the selected districts, the most influential parameter was found to be cloud cover; for paddy it was rainfall, for maize it was maximum temperature, and for wheat it was minimum temperature.

In the present study only the climatic parameters were considered in predicting crop yield, although yield is influenced by many other input parameters such as irrigation, fertilizer application and pesticide application. Owing to the paucity of such information at the district level, a model was developed that can approximately predict crop yields from the climatic parameters alone; this will help policy makers decide on the buffer stock of grains, the minimum support price, etc. The decision rules framed by the developed software were therefore used to validate it, by predicting the yields of the selected crops in all the selected districts and comparing them with the observed values. The prediction accuracy was worked out by comparing the predicted yields with the observed yields. Validation was carried out for each crop; however, only the decision rules and validation accuracy for one crop (soybean) are presented in this paper.

The decision rules developed based on the model for soybean crop in Dewas district are:
i) if Cloud Cover < 34.11 days & Rainfall < 888.23 mm then Yield < 1022.24 kg/ha
ii) if Cloud Cover < 34.11 days & Rainfall > 888.23 mm then Yield > 1022.24 kg/ha
iii) if Cloud Cover > 34.11 days & Minimum Temperature < 19.68 °C then Yield < 1022.24 kg/ha
iv) if Cloud Cover > 34.11 days & Minimum Temperature > 19.68 °C then Yield > 1022.24 kg/ha

Based on the above decision rules, the observed values of the most influential parameters of this district are presented in Table 1.0.
Table 1.0. Prediction accuracy of developed model for soybean crop in Dewas district

Cloud cover   Rainfall   Min. temp.   Observed yield   Predicted yield   Prediction
(days)        (mm)       (°C)         (kg/ha)          (kg/ha)           accurate?
34.96         833.19     19.72        1092.7           >1022.24          Yes
33.96         1161.41    19.25        1060.8           >1022.24          Yes
34.04         938.17     19.83        1060.8           >1022.24          Yes
32.6          1033.26    20.06        1007.8           >1022.24          No
34.86         1067.67    19.27        1022.2           <1022.24          Yes
34.7          833.15     20.24        1247.1           >1022.24          Yes
35.17         962.4      19.77        1146.6           >1022.24          Yes
31.76         693.67     19.93        1009.8           <1022.24          Yes
35.23         694.35     19.7         1160.1           >1022.24          Yes
33.14         692.42     20.08        985.8            <1022.24          Yes
34.33         1168       18.98        1099             <1022.24          No
33.81         824        19.24        925              <1022.24          Yes
34.13         878        19.76        1275             >1022.24          Yes
33.86         851        19.49        912              <1022.24          Yes
33.85         802        20.1         907              <1022.24          Yes
34.49         915        20.14        1023             >1022.24          Yes
34.53         687        19.5         880              <1022.24          Yes
34.98         1293       19.31        780              <1022.24          Yes
34.02         709        19.65        920              <1022.24          Yes
33.96         728        19.66        930              <1022.24          Yes
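The four decision rules for soybean in Dewas district can be written as a small function and checked against the rows of Table 1.0. The sketch below is illustrative and is not the authors' software; the function and variable names are hypothetical. Rule ii is taken here with Rainfall > 888.23 mm, which is the reading consistent with the predictions listed in Table 1.0.

```python
# Thresholds as reported in the decision rules for soybean in Dewas district.
CLOUD_COVER_DAYS = 34.11
RAINFALL_MM = 888.23
MIN_TEMP_C = 19.68
YIELD_KG_HA = 1022.24


def predicts_high_yield(cloud_cover, rainfall, min_temp):
    """Return True if the rules predict a yield above 1022.24 kg/ha."""
    if cloud_cover < CLOUD_COVER_DAYS:
        # Rules i and ii (rule ii assumed to read Rainfall > 888.23 mm).
        return rainfall > RAINFALL_MM
    # Rules iii and iv.
    return min_temp > MIN_TEMP_C


# (cloud cover, rainfall, min. temp., observed yield) copied from Table 1.0.
TABLE_1 = [
    (34.96, 833.19, 19.72, 1092.7), (33.96, 1161.41, 19.25, 1060.8),
    (34.04, 938.17, 19.83, 1060.8), (32.6, 1033.26, 20.06, 1007.8),
    (34.86, 1067.67, 19.27, 1022.2), (34.7, 833.15, 20.24, 1247.1),
    (35.17, 962.4, 19.77, 1146.6), (31.76, 693.67, 19.93, 1009.8),
    (35.23, 694.35, 19.7, 1160.1), (33.14, 692.42, 20.08, 985.8),
    (34.33, 1168, 18.98, 1099), (33.81, 824, 19.24, 925),
    (34.13, 878, 19.76, 1275), (33.86, 851, 19.49, 912),
    (33.85, 802, 20.1, 907), (34.49, 915, 20.14, 1023),
    (34.53, 687, 19.5, 880), (34.98, 1293, 19.31, 780),
    (34.02, 709, 19.65, 920), (33.96, 728, 19.66, 930),
]

# A prediction is correct when the predicted side of the yield threshold
# matches the side on which the observed yield falls.
correct = sum(
    predicts_high_yield(cloud, rain, tmin) == (observed > YIELD_KG_HA)
    for cloud, rain, tmin, observed in TABLE_1
)
accuracy = correct / len(TABLE_1)
print(correct, accuracy)  # 18 0.9
```

Under these rules the two misclassified years are the rows with observed yields of 1007.8 and 1099 kg/ha, matching the two "No" entries in the table.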

Out of 20 years of data, the predictions were correct in 18 years and incorrect in two, indicating a prediction accuracy of 90 per cent for soybean in Dewas district. Similar analyses were carried out for all the selected crops and districts, and the resulting overall accuracy of the developed model is presented in Table 2.0.

The web-based software developed for predicting crop yield from climatological inputs indicated a clear trend of each crop being predominantly influenced by a particular climatic parameter. The accuracies obtained for a particular crop in different districts were averaged, and the resulting prediction accuracy of the developed model for different crops is presented in Table 2.0.

Table 2.0. Prediction accuracy of developed model for different crops.

S.No.   Name of the crop   Average prediction accuracy, %
1       Soybean            87
2       Paddy              85
3       Maize              76
4       Wheat              80

The prediction accuracy of the developed model varied from 76 to 90 per cent for the selected crops and selected districts. Based on these observations the overall prediction accuracy of the developed model is 82.00 per cent. With this high prediction accuracy, the developed model can be used by policy makers in arriving at a policy decision well in advance, i.e., before the harvest of the crop.
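The overall figure of 82.00 per cent is consistent with the unweighted mean of the four crop-wise accuracies in Table 2.0. The short check below assumes equal weighting of the crops, which the paper does not state explicitly; the variable names are hypothetical.

```python
# Average prediction accuracy (%) per crop, as listed in Table 2.0.
crop_accuracy = {"Soybean": 87, "Paddy": 85, "Maize": 76, "Wheat": 80}

# Unweighted mean over the four crops (equal weighting is an assumption).
overall = sum(crop_accuracy.values()) / len(crop_accuracy)
print(f"{overall:.2f}")  # 82.00
```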
IV. CONCLUSIONS

The present study demonstrated the potential use of data mining techniques in predicting crop yield based on climatic input parameters. The developed web page is user friendly, and the accuracy of predictions is above 75 per cent for all the crops and districts selected in the study, indicating a high accuracy of prediction. The web page developed for predicting crop yield can be used by any user for the crop of their choice by providing the climatic data of that place.

ACKNOWLEDGEMENTS

The first author would like to extend her heartfelt gratitude to the Vice Chancellor, MGCGV, Chitrakoot for giving admission to pursue the Doctoral program at the university. Thanks are also due to the Director, CIAE, Bhopal for extending the facilities to carry out the research activities in the Institute. All the help received from the staff of the University and the Institute is duly acknowledged.

REFERENCES

[1]. Abdullah, A., Brobst, S., Pervaiz, I., Umer, M., and A. Nisar. 2004. Learning dynamics of pesticide abuse through data mining. Proceedings of Australian Workshop on Data mining and Web Intelligence, New Zealand, January.
[2]. Dhanya, C.T. and D. Nagesh Kumar, 2009. Data mining for evolution of association rules for droughts and floods in India using climate inputs. J. of Geo. Phy. Res. 114:1-15.
[3]. Fernandes F.L., Jansle V.R., Rubens Augusto and Camargo L. (2011). Sugarcane yield estimates using time series analysis of spot vegetation images. Sci. Agric. (Piracicaba, Braz.) vol.68 no.2
[4]. Jorquera H, Perez R, Cipriano A, Acuna G (2001). Short term forecasting of air pollution episodes. In. Zannetti P (eds) Environmental Modeling 4. WIT Press, UK.
[5]. Jun Wu, Anastasiya Olesnikova, Chi-Hwa Song, Won Don Lee (2009). The Development and Application of Decision Tree for Agriculture Data. IITSI: 16-20.
[6]. Kiran Mai, C., Murali Krishna, I.V., and A. Venugopal Reddy, 2006. Data Mining of Geospatial Database for Agriculture Related Application. Proceedings of Map India. New Delhi. (http://www.gisdevelopment.net/proceedings/mapindia/2006/agriculture/mi06agri_124.htm).
[7]. Kannan, M. Prabhakaran S and P. Ramachandran (2011). Rainfall forecasting using data mining technique. International Journal of Engineering and Technology Vol.2 (6), 2010, 397-401.
[8]. Khan Mohammad A., Md. Zahid-ul Islam and Mohsin Hafeez (2011). Evaluating the Performance of Several Data Mining Methods for Predicting Irrigation Water requirement. Proceedings of the Tenth Australasian Data Mining conference (AusDM2012), Sydney, Australia. 199-207.
[9]. Leemans, V., Destain, M.F., 2004. A real-time grading method of apples based on features extracted from defects. J. Food Eng. 61, 83–89.
[10]. Quinlan, J.R. (1985b). Decision trees and multi-valued attributes. In J.E. Hayes & D. Michie (Eds.), Machine intelligence 11. Oxford University Press (in press).
[11]. Rajagopalan B. Lall U (1999) A k- nearest-neighbor daily precipitation and other weather variables. Wat Res Research 35(10):3089–3101.
[12]. Shahin, M.A., Tollner, E.W., McClendon, R.W., Arabnia, H.R., (2002). Apple classification based on surface bruises using image processing and neural networks. Trans. ASAE 45, 1619–1627.
[13]. Tripathi S, Srinivas VV, Nanjudiah RS (2006). Down scaling of precipitation for climate change scenarios: a support vector machine approach, J. Hydrology 330-337.
[14]. Verheyen, K., Adrianens, M. Hermy and S. Deckers (2001). High resolution continuous soil classification using morphological soil profile descriptions. Geoderma, 101:31-48.
[15]. Zelu Zia (2009). An Expert System Based on Spatial Data Mining used Decision Tree for Agriculture Land Grading. Second International Conference on Intelligent Computation Technology and Automation. Oct 10-11, China
