978-1-5386-5051-6/18/$31.00 ©2018 IEEE
The authors of [9] review the use of machine learning techniques for Indian rice crop prediction. They discuss the experimental results obtained by applying a sequential minimal optimization (SMO) classifier, using the WEKA tool, to a dataset covering 27 districts of Maharashtra state, India. The parameters considered in the study were precipitation, minimum temperature, average temperature, maximum temperature, reference crop evapotranspiration, area, production and yield for the Kharif season (June to November) for the years 1998 to 2002.

The trends in paddy production in Sri Lanka have been analyzed in [5]. The objectives of the study were to identify the past, present and future trends of paddy production in Sri Lanka and to develop a time series model to detect the long-term trend and predict paddy production for three leading years. The authors identified ARIMA (2, 1, 0) as the model with the minimum AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion), and chose it as the model that best fitted the data set.

Unlike a traditional neural network, which assumes that all inputs are independent of each other, an RNN makes use of the sequential information among the input data. RNNs are called recurrent because the same task is performed on every element of the sequence, while the output depends on the values computed at previous steps.

An RNN uses its internal state (memory) to process sequences of related inputs. It retains these relationships during training, so the relation among all the previous inputs helps in predicting a better output. RNNs are applicable to tasks such as connected handwriting recognition and speech recognition. As shown in Figure 1 [13], a block of the neural network, A, takes x_t as the input and outputs the value h_t. Information is passed from one step of the network to the next via a loop. A recurrent neural network can be considered as multiple copies of the same network, where each one passes a message to its successor. This is known as output feedback.
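The recurrence described for the RNN block A can be illustrated with a minimal sketch. It assumes a scalar Elman-style update, h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The paper does not state the update rule, so the tanh form, the function name `rnn_forward` and the weight values are hypothetical choices made only for illustration.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a scalar Elman-style RNN over a sequence and return every hidden state.

    The weights are hypothetical illustration values, not taken from the paper.
    """
    h = 0.0  # initial internal state (memory)
    states = []
    for x_t in inputs:
        # The same block A is applied at every step; h carries information
        # from all previous inputs forward through the loop.
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states


sequence = [1.0, 0.0, 0.0]
states = rnn_forward(sequence)
print(states)
```

Only the first input is non-zero, yet the later hidden states stay above zero. This is the memory effect: the output at each step depends on values computed at earlier steps.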
LSTMs also form a chain-like structure, but the repeating module is organized differently. In an LSTM network, a module may have four neural network layers with three gates. The structure of a standard LSTM cell is shown in Figure 4.

(the respondent agreed with the system, but relies on alternative methods for the predictions), (3) Not suitable (the respondent disagreed, considering the proposed system unnecessary). The results of the survey conducted with farmers and rice mill owners are summarized in Table I.

According to the summary, 56.7% of farmers rated the system as the most suitable solution, 32.5% rated it as suitable and 17.5% as not suitable. Among rice mill owners, nearly 50.0% on average rated the system as most suitable, 38.9% as suitable and only 11.1% as not suitable.
and Development Institute, Bathalagoda, were analyzed. Finally, to identify how regional factors affect paddy harvesting practices, purchasing prices and consumer behaviour, the different Govi Jana Kendra centers were consulted.

C. Harvest prediction module

The main goal of the harvest prediction module was to identify the model that best fits the harvest patterns with a higher accuracy level.

We started the process with a data set that reports the paddy harvest in each season (yala and maha) for 12 years in the Kurunegala, Anuradhapura and Matara districts of Sri Lanka. The data included the year, the paddy harvest, and the factors that affect the paddy harvest, including paddy yield area and rainfall. These parameters were selected based on the coefficient value of each variable, which gives the degree of change in paddy production for every one-unit change in the input parameter.

The complete list of raw data fields is as follows:
1. id: row number
2. district: district of paddy yield
3. season: season of paddy yield
4. year: year of data in the row
5. product: harvest amount reported in a season
6. area: paddy yield area of a given district for the given season
7. rain: average rainfall reported during the crop season

When preparing the data set, we first used the year as an index in Pandas (the Python library used for data analysis). Then all null values in the data were replaced with “0”. Next, all fields that were not important for the prediction were dropped based on their coefficient values. Finally, as shown in Table II, our data set was ready for the experiments.

We used this data to frame a forecasting problem: given the paddy cultivated area and rainfall for the prior year, forecast the harvest production for the next year.

TABLE II HARVEST PREDICTION - DATA SET
Year  Paddy production (Mt)  Area cultivated (ha)  Rainfall (mm)
2006  77257                  16944                 57
2007  56492                  11759                 62
2008  170500                 38786                 58
2009  63275                  14297                 51
2010  89625                  18340                 42

The next step was fitting the LSTM to the problem. This involved preparing the dataset as a supervised learning problem and normalizing the input variables. We framed the supervised learning problem as predicting the harvest at the current year (t) given the harvest production and affecting factors at the prior time step. Once all features are normalized, the dataset is transformed into a supervised learning problem.

To fit an LSTM on the multivariate input data, we first split the prepared dataset into two sets named “Train set” and “Test set”. Then the train and test sets were split into input (X) and output (Y) vectors. Finally, the inputs were reshaped into the 3D format expected by LSTM, namely [samples, time steps, features].

We defined the LSTM network with 50 neurons in the first hidden layer and one neuron in the output layer for predicting the harvest. We used the Mean Absolute Error (MAE) as the loss function and the efficient Adam version of stochastic gradient descent as the optimizer. The model was fit for 200 training epochs with a batch size of 72. The number of iterations was selected by trial and error as the value that gives the highest accuracy for this scenario.

We kept track of both the training and test losses during training by setting the validation_data argument in the fit function.

Finally, we combined the forecast with the test dataset and inverted the scaling. With forecasts and actual values in their original scale, we calculated an error score for the model. In this case, we calculated the Root Mean Squared Error (RMSE), which gives the error in the same units as the variable itself.

D. Demand prediction module

The main goal of the demand prediction module was to identify the model that best fits the demand patterns with a higher accuracy level. The demand prediction module followed the same steps as the harvest prediction module, and the same data preparation methods and evaluation metrics were used. The complete list of raw data fields used for demand prediction is:
1. id: row number
2. year: year of data in a row
3. income: per capita income on district basis
4. substitute: average consumption amounts of substitute foods
5. population: population per year on district basis
6. consumption: consumption rates on district basis

The formatting of the data followed similar steps to the harvest prediction module. A sample of the final dataset is given in Table III.

TABLE III DEMAND PREDICTION - DATA SET
Year  Population by district  Per capita income (Rs)  Substitute cereal consumption (Mt)  Consumption of rice (Mt)
1989  12689                   2600                    2.3                                 80.3
1990  14846                   3549                    2.6                                 95.2
1991  18728                   3549                    2.6                                 117.5
1992  19500                   3800                    2.2                                 118.6
1993  19600                   3540                    2.5                                 119.9

Next, the data were feature normalized, with input set X: {income, substitute, population} and output set Y: {consumption}. To fit an LSTM on the multivariate input data, the data were split into a “Train set” (70%) and a “Test set” (30%). Finally, the data were fed into a Keras LSTM network with 100 initial neurons and 1 output node. Training ran for 90 epochs (iterations) with a batch size of 10. Again, the number of iterations was chosen by trial and error to get the optimum value.
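The data preparation and evaluation steps shared by the harvest and demand modules can be sketched as follows, using only the Python standard library and the sample rows of Table II. Several details are assumptions, because the paper does not specify them: min-max scaling stands in for "normalized", the 3/1 chronological split is chosen only to suit the five sample rows, and a persistence forecast (repeat the previous year's production) stands in for the trained LSTM so that the example runs without Keras. All helper names are hypothetical.

```python
import math

# Sample rows from Table II: (year, production Mt, area ha, rainfall mm)
rows = [
    (2006, 77257, 16944, 57),
    (2007, 56492, 11759, 62),
    (2008, 170500, 38786, 58),
    (2009, 63275, 14297, 51),
    (2010, 89625, 18340, 42),
]


def scale(value, lo, hi):
    """Min-max scale a value into [0, 1] (assumed normalization)."""
    return (value - lo) / (hi - lo)


def unscale(value, lo, hi):
    """Invert the min-max scaling back to the original units."""
    return value * (hi - lo) + lo


def rmse(actual, predicted):
    """Root Mean Squared Error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Normalize each feature column (production, area, rainfall).
columns = list(zip(*[r[1:] for r in rows]))
bounds = [(min(col), max(col)) for col in columns]
scaled = [[scale(v, lo, hi) for v, (lo, hi) in zip(r[1:], bounds)] for r in rows]

# Supervised framing: all features at year t-1 predict production at year t.
X = [scaled[i - 1] for i in range(1, len(scaled))]
y = [scaled[i][0] for i in range(1, len(scaled))]

# Chronological train/test split (hypothetical sizes for this tiny sample).
n_train = 3
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# Reshape inputs to [samples, time steps, features] with one time step.
X_train_3d = [[sample] for sample in X_train]
X_test_3d = [[sample] for sample in X_test]

# Stand-in forecast (NOT the paper's LSTM): repeat last year's production.
y_pred = [sample[0][0] for sample in X_test_3d]

# Invert the scaling before scoring so the error is expressed in Mt.
prod_lo, prod_hi = bounds[0]
actual_mt = [unscale(v, prod_lo, prod_hi) for v in y_test]
pred_mt = [unscale(v, prod_lo, prod_hi) for v in y_pred]
error_mt = rmse(actual_mt, pred_mt)
print(error_mt)
```

In the paper's setup, the nested lists would be NumPy arrays and `y_pred` would come from the fitted Keras model. The order of operations (normalize, frame, split, reshape, predict, invert scaling, compute RMSE) is the part this sketch is meant to show.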
V. RESULTS AND DISCUSSION
A. Harvest prediction
REFERENCES

[1] M. Dhanapala, "The Wet Zone Rice Culture in Sri Lanka: A Rational Look", Journal of the National Science Foundation of Sri Lanka, vol. 33, no. 4, p. 277, 2005.
[2] "Agriculture in Sri Lanka", En.wikipedia.org, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Agriculture_in_Sri_Lanka. [Accessed: 14-Aug-2018].
[3] "Crop forecast", Doa.gov.lk, 2018. [Online]. Available: https://doa.gov.lk/index.php/en/18-english-news/307-crop-forecast-4. [Accessed: 14-Aug-2018].
[4] "Understanding The Recurrent Neural Network – Mindorks – Medium", Medium, 2018. [Online]. Available: https://medium.com/mindorks/understanding-the-recurrent-neural-network-44d593f112a2. [Accessed: 14-Aug-2018].
[5] V. Sivapathasundaram and C. Bogahawatte, "Forecasting of Paddy Production in Sri Lanka: A Time Series Analysis using ARIMA Model", Tropical Agricultural Research, vol. 24, no. 1, p. 21, 2015.
[6] A. Razmy Mohamed and A. Ahmed Naseer, "Trends in paddy production in Sri Lanka", Ir.lib.seu.ac.lk, 2005. [Online]. Available: http://ir.lib.seu.ac.lk/123456789/47. [Accessed: 30-Oct-2018].
[7] P. Ponweera and S. Premaratne, "Information and decision support system to enrich paddy cultivation in Sri Lanka", Dl.lib.mrt.ac.lk, 2018. [Online]. Available: http://dl.lib.mrt.ac.lk/handle/123/8438.
[8] D. Pivoto, P. D. Waquil et al., "Scientific development of smart farming technologies and their application in Brazil", Information Processing in Agriculture, vol. 5, no. 1, p. 31, 2018.
[9] N. Gandhi and L. Armstrong et al., "Rice Crop Yield Prediction in India using Support Vector Machines", in International Joint Conference on Computer Science and Software Engineering, Khon Kaen, Thailand, July 2016, pp. 1-5.
[10] "Recurrent neural network", En.wikipedia.org, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Recurrent_neural_network. [Accessed: 14-Aug-2018].
[11] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory", Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[12] "Understanding LSTM Networks -- colah's blog", Colah.github.io, 2018. [Online]. Available: http://colah.github.io/posts/2015-08-Understanding-LSTMs/. [Accessed: 15-Aug-2018].
[13] "An Introduction to Recurrent Neural Networks", Towards Data Science, 2018. [Online]. Available: https://towardsdatascience.com/an-introduction-to-recurrent-neural-networks-72c97bf0912. [Accessed: 14-Aug-2018].
[14] M. Mitchell, An Introduction to Genetic Algorithms, Cambridge, MA, USA: MIT Press, 1998.
[15] A. Ishigaki and S. Takaki, "Iterated Local Search Algorithm for Flexible Job Shop Scheduling", 2017.
[16] W. Rankothge, F. Le, A. Russo, J. Lobo, "Experimental results on the use of genetic algorithms for scaling virtualized network functions", Proc. IEEE SDN/NFV, pp. 47-53, 2015.
[17] W. Rankothge, J. Ma, F. Le, A. Russo, J. Lobo, "Towards making network function virtualization a cloud computing service", Proc. IEEE IM, pp. 89-97, 2015.
[18] W. Rankothge, F. Le, A. Russo, J. Lobo, "Optimizing Resource Allocation for Virtualized Network Functions in a Cloud Center Using Genetic Algorithms", Proc. IEEE IM, pp. 89-97, 2015.