3 authors, including:
Veenadhari Suraparaju
Rabindranath Tagore University
All content following this page was uploaded by Veenadhari Suraparaju on 16 July 2016.
Abstract: With the impact of climate change in India, the majority of agricultural crops have been badly affected in terms of their performance over the last two decades. Predicting the crop yield well ahead of its harvest would help policy makers and farmers take appropriate measures for marketing and storage. Such predictions will also help the associated industries plan the logistics of their business. Several methods of predicting and modeling crop yields have been developed in the past with varying rates of success, as these do not take into account the characteristics of the weather and are mostly empirical. In the present study a software tool named ‘Crop Advisor’ has been developed as a user-friendly web page for predicting the influence of climatic parameters on crop yields. The C4.5 algorithm is used to find the most influential climatic parameter on the yields of selected crops in selected districts of Madhya Pradesh. This software provides an indication of the relative influence of different climatic parameters on the crop yield. Other agro-input parameters responsible for crop yield are not considered in this tool, since the application of these input parameters varies with individual fields in space and time.
Key words: Climate, agricultural productivity, C4.5 algorithm, prediction

I. INTRODUCTION
Crop production is a complex phenomenon that is influenced by agro-climatic input parameters. Agricultural input parameters vary from field to field and from farmer to farmer. Collecting such information over a larger area is a daunting task. However, the climatic information collected in India at every 1 sq. m area in different parts of a district is tabulated by the Indian Meteorological Department. Such large data sets can be used for predicting the influence of climate on the major crops of a particular district or place. Different forecasting methodologies have been developed and evaluated by researchers all over the world in the field of agriculture and associated sciences. Some of these studies are summarized here.

Agricultural researchers in Pakistan have shown that attempts at crop yield maximization through pro-pesticide state policies have led to dangerously high pesticide usage. These studies have reported a negative correlation between pesticide usage and crop yield [1]. Their study showed how data mining of integrated agricultural data, including pest scouting, pesticide usage and meteorological data, is useful for optimizing pesticide usage. Thematic information related to agriculture that has spatial attributes was reported in one study [6], which aimed at discerning trends in agricultural production with reference to the availability of inputs. The k-means method was used to forecast pollution in the atmosphere [4], the k-nearest neighbor method was applied for simulating daily precipitation and other weather variables [11], and different possible changes of the weather scenarios were analyzed using SVMs [13].

Data mining techniques are often used to study soil characteristics. As an example, the k-means approach is used for classifying soils in combination with GPS-based technologies [14]. Apples have been checked using different approaches before being sent to the market: [9] uses a k-means approach to analyze color images of fruits as they run on conveyor belts, and [12] uses X-ray images of apples to monitor the presence of water cores, with a neural network trained to discriminate between good and bad apples.

Spatial data mining, especially the decision tree algorithm, was applied to agricultural land grading [15]. The author combined spatial data mining techniques with expert system techniques and applied them to establish an intelligent agricultural land grading information system. The author adopted the decision tree C4.5 algorithm and implemented it with Mo2.0 and VC++6.0 to build an agricultural land grading expert system. The study showed the advantages of this method in addressing problems in land grading. A decision tree classifier for agricultural data was proposed in [5]. This new classifier uses a new data expression and can deal with both complete and incomplete data. In the experiment, the 10-fold cross validation
method is used to test dataset, horse-colic dataset and soybean dataset. Their results showed that the proposed decision tree is capable of classifying all kinds of agricultural data.

A data mining technique for deriving association rules for droughts and floods in India was applied using climate inputs [2]. In that study, a data-mining algorithm using the concepts of minimal occurrences with constraints and time lags was used to discover association rules between extreme rainfall events and climatic indices. Rainfall events were forecast using data mining techniques [7]. The occurrence of a prolonged dry period or heavy rain at the critical stages of crop growth and development may lead to a significant reduction in crop yield. Sugarcane yield was estimated in Brazil using 10-day periods of SPOT vegetation NDVI images and meteorological data [3]. A data mining approach based on spatio-temporal data was used to forecast irrigation water demand [8]. Data sets were prepared containing attributes obtained from meteorological data, remote sensing images and water delivery statements. To make the prepared data sets useful for demand forecasting and pattern extraction, they were processed using a novel approach based on a combination of irrigation and data mining knowledge. Decision tree techniques were applied to forecast future water requirements.

II. METHODOLOGY

The present study aimed to develop a web site for finding out the influence of climatic parameters on crop production in selected districts of Madhya Pradesh. The districts were selected based on the area under each particular crop: for each selected crop, the top five districts with the largest area under that crop were chosen. The crops were selected based on the predominant crops in the selected districts and include Soybean, Maize, Paddy and Wheat. The yield of these crops was tabulated for 20 continuous years by collecting the information from secondary sources. Similarly, for the corresponding years, climatic parameters such as Rainfall, Maximum and Minimum temperature, Potential Evapotranspiration, Cloud cover and Wet day frequency were also collected from secondary sources. The methodology adopted for analysis includes [...] values above the threshold were considered as one child and the remaining as another child. It also handles missing attribute values. In pseudo code, the general algorithm for building decision trees is:

• Check for base cases
• For each attribute a: find the normalized information gain from splitting on a
• Let a_best be the attribute with the highest normalized information gain
• Create a decision node that splits on a_best
• Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node.

Let S be a set of training samples, where the class label of each sample is known. Each sample is in fact a tuple. One attribute is used to determine the class of the training samples. Suppose that there are m classes. Let S contain $s_i$ samples of class $C_i$ for i = 1, ..., m. An arbitrary sample belongs to class $C_i$ with probability $s_i/s$, where s is the total number of samples in set S. The expected information needed to classify a given sample is

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s}$$

An attribute A with values $\{a_1, a_2, \ldots, a_v\}$ can be used to partition S into the subsets $\{S_1, S_2, \ldots, S_v\}$, where $S_j$ contains those samples in S that have value $a_j$ of A. Let $S_j$ contain $s_{ij}$ samples of class $C_i$. The expected information based on this partitioning by A is known as the entropy of A. It is the weighted average:

$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} \, I(s_{1j}, \ldots, s_{mj})$$

The information gain obtained by this partitioning on A is defined by

$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$

In this approach to relevance analysis, we can compute the information gain for each of the attributes defining the samples in S. The attribute with the highest information gain is considered the most discriminating attribute of the given set. By computing the information gain for each attribute, we therefore obtain a ranking of the attributes. This ranking can be used for relevance analysis to select the attributes to be used in concept description.
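The information-gain ranking described in this section can be illustrated with a minimal Python sketch. It implements the expected information I, the entropy E(A) and Gain(A) as defined by the formulas of this section, and uses the plain gain rather than the normalized gain mentioned in the pseudo code. The function names, the toy data and the two threshold values (850 mm and 19.5 °C) are hypothetical and chosen only for illustration; they are not taken from the Crop Advisor tool.

```python
import math
from collections import Counter, defaultdict


def expected_information(labels):
    """I(s1, ..., sm): information needed to classify a sample of S."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def entropy_of_attribute(values, labels):
    """E(A): weighted average of I over the subsets S_j induced by attribute A."""
    subsets = defaultdict(list)
    for value, label in zip(values, labels):
        subsets[value].append(label)
    total = len(labels)
    return sum(len(s) / total * expected_information(s) for s in subsets.values())


def gain(values, labels):
    """Gain(A) = I(s1, ..., sm) - E(A)."""
    return expected_information(labels) - entropy_of_attribute(values, labels)


def above(values, threshold):
    """Binary split of a continuous attribute: above the threshold vs. the rest."""
    return [v > threshold for v in values]


# Hypothetical toy data (not from the paper): four seasons, two yield classes.
yield_class = ["low", "low", "high", "high"]
attributes = {
    "rainfall > 850 mm": above([700, 800, 900, 1000], 850),
    "min. temp > 19.5 C": above([19.0, 20.0, 19.0, 20.0], 19.5),
}

# Rank the attributes by information gain, most discriminating first.
ranking = sorted(
    ((name, gain(values, yield_class)) for name, values in attributes.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, g in ranking:
    print(f"{name}: gain = {g:.3f}")
```

In this toy set the rainfall split separates the two yield classes perfectly, so it carries the full one bit of information, while the temperature split leaves each child as mixed as the parent and gains nothing. A tree builder along the lines of the pseudo code would therefore split on rainfall first.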
2014 International Conference on Computer Communication and Informatics (ICCCI -2014), Jan. 03 –
05, 2014, Coimbatore, INDIA
Table 1.0 Prediction accuracy of developed model for soybean crop in Dewas district
Cloud cover, days  Rainfall, mm  Min. temp, °C  Observed yield, kg/ha  Predicted yield, kg/ha  Is prediction accurate
34.96 833.19 19.72 1092.7 >1022.24 Yes
33.96 1161.41 19.25 1060.8 >1022.24 Yes
34.04 938.17 19.83 1060.8 >1022.24 Yes
32.6 1033.26 20.06 1007.8 >1022.24 No
34.86 1067.67 19.27 1022.2 <1022.2 Yes
34.7 833.15 20.24 1247.1 >1022.24 Yes
35.17 962.4 19.77 1146.6 >1022.24 Yes
31.76 693.67 19.93 1009.8 <1022.24 Yes
35.23 694.35 19.7 1160.1 >1022.24 Yes
33.14 692.42 20.08 985.8 <1022.24 Yes
34.33 1168 18.98 1099 <1022.24 No
33.81 824 19.24 925 <1022.24 Yes
34.13 878 19.76 1275 >1022.24 Yes
33.86 851 19.49 912 <1022.24 Yes
33.85 802 20.1 907 <1022.24 Yes
34.49 915 20.14 1023 >1022.24 Yes
34.53 687 19.5 880 <1022.24 Yes
34.98 1293 19.31 780 <1022.24 Yes
34.02 709 19.65 920 <1022.24 Yes
33.96 728 19.66 930 <1022.24 Yes
Out of 20 years of data, the predictions were correct in 18 years and incorrect in two years, indicating a prediction accuracy of 90 per cent for the developed model in the case of soybean in Dewas district. Similar analyses were carried out for all the selected crops and districts, and the resulting overall accuracy of the developed model is presented in Table 2.0.

Table 2.0. Prediction accuracy of developed model for different crops.
S.No.  Name of the Crop  Average prediction accuracy, %
1  Soybean  87
2  Paddy  85
3  Maize  76
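The 90 per cent figure reported for soybean in Dewas district can be re-derived from Table 1.0. The following Python sketch transcribes the observed yield and the predicted side of the threshold for each of the 20 years and counts the agreements. It assumes that every row uses the same split value of 1022.24 kg/ha, including the one row where the table prints "<1022.2"; the names `THRESHOLD`, `ROWS` and `is_correct` are illustrative only.

```python
THRESHOLD = 1022.24  # kg/ha, the yield split value shown in Table 1.0

# (observed yield in kg/ha, predicted side of the threshold), from Table 1.0.
# Assumption: the row printed as "<1022.2" uses the same 1022.24 threshold.
ROWS = [
    (1092.7, ">"), (1060.8, ">"), (1060.8, ">"), (1007.8, ">"), (1022.2, "<"),
    (1247.1, ">"), (1146.6, ">"), (1009.8, "<"), (1160.1, ">"), (985.8, "<"),
    (1099.0, "<"), (925.0, "<"), (1275.0, ">"), (912.0, "<"), (907.0, "<"),
    (1023.0, ">"), (880.0, "<"), (780.0, "<"), (920.0, "<"), (930.0, "<"),
]


def is_correct(observed, predicted_side, threshold=THRESHOLD):
    """A prediction is correct when the observed yield falls on the predicted side."""
    observed_side = ">" if observed > threshold else "<"
    return observed_side == predicted_side


correct = sum(is_correct(obs, side) for obs, side in ROWS)
accuracy = 100.0 * correct / len(ROWS)
print(f"{correct} of {len(ROWS)} predictions correct, accuracy = {accuracy:.0f}%")
```

The two disagreements are the years with observed yields of 1007.8 kg/ha (predicted above the threshold) and 1099 kg/ha (predicted below it), which matches the two "No" entries in the table.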