BALKAN JOURNAL OF ELECTRICAL & COMPUTER ENGINEERING, Vol. 7, No. 1, January 2019
observed that the regression techniques provide better performance compared to the time series analysis approaches. In this study, it was demonstrated that building an end-to-end sales prediction system on Azure ML Studio is an easy and very efficient task and that there are many algorithms to apply.

The remainder of this paper is organized as follows: Section II presents the related work. Section III describes the methodology, and Section IV explains the initial results of this system. Section V presents the conclusion and future work.

II. RELATED WORK

There are many studies on the development of sales forecasting models, but so far none of them has evaluated many models in one study or used the Azure Machine Learning Studio platform to build an end-to-end sales prediction system.

Kuo [1] applied a fuzzy neural network to sales forecasting and demonstrated that this model's performance is superior to that of traditional neural networks. Chen and Ou [2] used a new approach called Gray extreme learning machine with the Taguchi method. The system performance was better than the performance of artificial neural networks. Zhao et al. [3] utilized clustering, regression, and time series analysis techniques for electricity sales forecasting. Tian et al. [4] applied seasonal time series analysis to auto sales in China. The seasonal effects were calculated by using the exponentially weighted moving average. Later, the calculated effects and the counted frequency were combined in the linear regression technique. Zhang [5] combined ARIMA and neural network models for forecasting. Pandey and Somani [6] implemented a cloud computing based sales forecasting system and applied time series analysis with the moving average method. They deployed the system on the Azure cloud and used a MySQL database and the PHP programming language. Moving average methods have long been applied to sales forecasting in the literature [7]. Vijayalakshmi et al. [8] implemented a sales forecasting engine based on genetic algorithms. Yeo et al. [9] developed a new customer model, which uses customer-browsing behavior, and tested the model on an e-commerce website. Choi et al. [10] combined the SARIMA and wavelet transform methods for sales forecasting. They demonstrated that the new hybrid model provides better performance than the single methods. Chang et al. [11] designed a hybrid model, which combines k-means clustering and a fuzzy neural network, for sales prediction of circuit boards. Wong and Guo [12] implemented a new model, which uses an extreme learning machine and the harmony search algorithm, for sales prediction in retail supply chains and showed that the new model provides better performance than the ARIMA models. Katkar et al. [13] used fuzzy logic and a Naive Bayes classifier for sales forecasting. Müller-Navarra et al. [14] applied Recurrent Neural Networks to forecast sales. Gao et al. [15] used the extreme learning machine algorithm.

Omar and Liu [16] examined the Back Propagation Neural Network for sales forecasting. Lu et al. [17] proposed a hybrid method based on MARS and SVR techniques for sales prediction of information technology products. They demonstrated that the new model provides better performance than the single SVR. Stojanović et al. [18] used several features such as fuel price, holiday, unemployment, temperature, store, and date to forecast the weekly sales in Walmart and showed that the Support Vector Machine provides the best performance.

As we see in these studies, each study suggests a single model, but we need a comparative assessment of machine learning models to evaluate their performance on the same public dataset. In this study, we performed our experiments to satisfy this goal.

III. METHODOLOGY

In this study, several regression algorithms in machine learning and time series analysis methods were applied for sales forecasting. In this section, these methods are introduced. First, the regression algorithms are explained, and then the time series analysis methods are introduced.

A. Regression

Linear Regression

Linear Regression is used to create a mathematical equation that describes the relation between independent variables (x) and a dependent variable (y).

B: Coefficients
E: Residue

The linear regression method uses Formula 1:

$Y_i = B_0 + B_1 X_{i1} + B_2 X_{i2} + \dots + B_n X_{in} + E_i$    (1)

Here, $B_0$ is the intercept and the remaining B's give the slope of the line with respect to each predictor. Y is the response variable, which is also called the dependent variable; the B's are the weights that form the model parameters; the X's are the values of the predictor variables; and E is the error term signifying the random sampling noise.

Bayesian Linear Regression

Unlike linear regression, the Bayesian approach uses Bayesian inference. The normal distribution in the Bayesian approach is calculated based on Equations 2-5. Since w is a continuous-valued random variable in $R^d$, Bayes' rule gives the posterior distribution of w given y.

$P(w \mid y) \propto P(y \mid w)\, P(w)$    (2)

$P(w \mid y) \sim N(\mu, S)$    (3)

$S^{-1} = S_0^{-1} + \frac{1}{\sigma^2} X^T X$    (4)

$\mu = S \left( S_0^{-1} \mu_0 + \frac{1}{\sigma^2} X^T y \right)$    (5)

Here, $\mu_0$ and $S_0$ are the mean and covariance of the prior on w, X is the matrix of predictor values, and $\sigma^2$ is the noise variance.

In Bayesian linear regression, the predictive distribution is calculated based on Equation 6:

$P(y_0 \mid x_0, y, X) = \int P(y_0 \mid x_0, w)\, P(w \mid y, X)\, dw$    (6)
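As a hedged illustration of the posterior update and the predictive distribution in Bayesian linear regression, the following minimal sketch assumes the simplest possible setting: a single weight w (d = 1), a Gaussian prior with mean `mu0` and variance `s0`, and a known noise variance `sigma2`. These names, the function names, and the toy data are assumptions made for this example and are not taken from the study.

```python
def posterior_1d(xs, ys, mu0=0.0, s0=1.0, sigma2=1.0):
    """Posterior mean and variance of a single weight w in y = w * x + noise."""
    # Scalar form of S^{-1} = S0^{-1} + (1 / sigma^2) X^T X
    precision = 1.0 / s0 + sum(x * x for x in xs) / sigma2
    s = 1.0 / precision
    # Scalar form of mu = S (S0^{-1} mu0 + (1 / sigma^2) X^T y)
    mu = s * (mu0 / s0 + sum(x * y for x, y in zip(xs, ys)) / sigma2)
    return mu, s


def predictive_1d(x0, mu, s, sigma2=1.0):
    """Mean and variance of the Gaussian predictive distribution of y0 at x0."""
    # Integrating over w adds the weight uncertainty x0^2 * s to the noise variance.
    return mu * x0, sigma2 + x0 * x0 * s


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]  # toy inputs (hypothetical, not data from the study)
    ys = [2.0, 4.0, 6.0]  # toy targets
    mu, s = posterior_1d(xs, ys)
    print("posterior mean and variance:", mu, s)
    print("predictive mean and variance at x0 = 2:", predictive_1d(2.0, mu, s))
```

With a prior centred at zero, the posterior mean is pulled slightly below the least-squares slope, and the predictive variance grows with the distance of x0 from zero because the uncertainty about w is scaled by x0.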
The predictive distribution evaluates the likelihood of a value y0 given x0 for a particular w, weighted by the current belief about w given the data (y, X), and finally sums over all possible values of w.

Neural Network Regression

The neural network regression uses Equation 7:

$o = w \cdot x$    (7)

In Neural Networks, a perceptron takes a vector of real-valued inputs and calculates a linear combination of these inputs. If the output is greater than some threshold, it outputs 1; otherwise, it outputs -1. The weights have to be calculated according to the perceptron training rule shown in Equations 8 and 9:

$w_i \leftarrow w_i + \Delta w_i$    (8)

$\Delta w_i = \eta\,(t - o)\,x_i$    (9)

The symbol $\eta$ refers to the neural network learning rate, t is the target output, and o is the output produced by the perceptron.

Decision Forest Regression

In the Random Forest algorithm, the primary aim is to make a classification by using several trees. In Random Forest, the Gini value is used to obtain the final class of the tree. The Gini value is calculated based on Formula 10:

The ARIMA (autoregressive integrated moving average) model uses the following Equations 14-15 and describes the autocorrelations in the data.

$X_t - \Phi_1 X_{t-1} - \dots - \Phi_p X_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$    (14)

$\left(1 - \sum_{i=1}^{p} \Phi_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t$    (15)

The parameter L is the lag operator, p is the order that represents the number of time lags of the autoregressive model, and q is the order of the moving-average model. The θ are the parameters of the moving average part and the Φ are the parameters of the autoregressive part of the model. $X_t$ is the value of the series at time t and $\varepsilon_t$ is the error term.

Seasonal ARIMA

Seasonal ARIMA (SARIMA) is similar to ARIMA, but it has different elements. SARIMA uses Formulas 16-18.

Φ(B) = θ(B) (16)

( ) = ( ) (17)

Φ( ( ) = θ( ) ( ) (18)

The SARIMA model has the same structure as the non-seasonal (ARIMA) model: it may have an AR factor, an MA factor (corresponds to α), and/or an order of differencing.
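The lag operator and the order of differencing mentioned for ARIMA and SARIMA can be illustrated with a short sketch. It assumes the conventional reading in which one differencing step applies (1 - L) to the series and one seasonal differencing step applies (1 - L^s) for a season of length s, and it assumes the sign convention X_t = Σ Φ_i X_{t-i} + ε_t + Σ θ_i ε_{t-i} for the ARMA part. The function names and the toy values are hypothetical and are not taken from the study.

```python
def difference(series, lag=1):
    """Apply (1 - L^lag) to a series, i.e. series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def arma_one_step(history, residuals, phi, theta):
    """One-step-ahead ARMA prediction, with the unknown future error set to 0.

    phi[i - 1] multiplies the value i steps back (autoregressive part);
    theta[i - 1] multiplies the residual i steps back (moving average part).
    """
    ar_part = sum(p * history[-i] for i, p in enumerate(phi, start=1))
    ma_part = sum(q * residuals[-i] for i, q in enumerate(theta, start=1))
    return ar_part + ma_part


if __name__ == "__main__":
    sales = [1, 3, 6, 10]                      # toy series (hypothetical)
    print(difference(sales))                   # first-order differencing
    print(difference(difference(sales)))       # second-order differencing
    print(difference(sales, lag=2))            # seasonal differencing, s = 2
    print(arma_one_step([1.0, 2.0, 3.0], [0.0, 0.5], [0.5, 0.2], [0.4]))
```

In practice, a library such as statsmodels would estimate the Φ and θ parameters from data; the sketch only shows how the operators act on a series once the parameters are given.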
Mean Absolute Error (MAE): It takes the average of the absolute errors. Equation 25 is used to calculate this parameter. MAE calculates the average absolute difference between $y_i$ and $x_i$, which are the coordinates of point i.

$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i|$    (25)

Coefficient of Determination ($R^2$): Equation 26 is used to calculate this parameter.

$R^2 = \frac{\sum (y - \bar{y})^2 - \sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$    (26)

Initial experiments were performed for only one store and one department of the Walmart Company because of the time series analysis techniques. Test results were evaluated by several evaluation parameters such as Mean Absolute Error and Root Mean Squared Error. Decision Forest Regression was the best approach based on the RMSE and MAE values, as shown in Figure 1.

Studio. This time, the best approach is Boosted Decision Tree Regression. The results are shown in Table 2.

TABLE 2
ALL THE DEPARTMENTAL RESULTS

Method                            MAE        RMSE       Coefficient of Determination
Bayesian Linear Regression        2469.54    4361.40    0.96
Linear Regression                 2480.12    4365.04    0.96
Neural Network Regression         14951.09   22499.33   0.00
Boosted Decision Tree Regression  1669.10    3696.59    0.97
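The evaluation parameters used above (MAE, RMSE, and the coefficient of determination) can be computed as in the following minimal sketch. RMSE is taken in its usual form, the square root of the mean squared error, which is an assumption because its formula does not appear in this part of the text. The function names and the toy values are hypothetical and are not results from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def r_squared(actual, predicted):
    """Coefficient of determination: (SS_total - SS_residual) / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return (ss_total - ss_residual) / ss_total


if __name__ == "__main__":
    actual = [3.0, 5.0, 7.0]      # toy weekly sales (hypothetical)
    predicted = [2.0, 5.0, 9.0]   # toy model output (hypothetical)
    print("MAE :", mean_absolute_error(actual, predicted))
    print("RMSE:", root_mean_squared_error(actual, predicted))
    print("R^2 :", r_squared(actual, predicted))
```

Because RMSE squares the errors before averaging, it penalizes large individual errors more strongly than MAE, which is why both values are usually reported side by side.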
Fig. 2. The coefficient of determination results

V. CONCLUSION

It is crucial for enterprises to predict sales amounts as close as possible to the actual sales amounts in order to increase their profits [20]. Unless an accurate forecasting model is built, cash flow problems are inevitable. Therefore, building this kind of prediction model for sales forecasting has a high priority for organizations. In this study, we investigated the effect of Regression and Time Series Analysis methods on the sales forecasting problem. Our experiments show that the regression techniques provide higher performance and accuracy compared to the time series analysis techniques. The Boosted Decision Tree Regression algorithm was the best predictor for sales forecasting, with a coefficient of determination of 0.97. Prediction results were obtained for weekly sales quantities.

In the future, a hybrid model using ARIMA and Boosted Decision Tree Regression techniques will be investigated to solve this problem. In addition, the implementation of another hybrid model with an SVM model is planned. New experiments will be performed if new public data is obtained. Also, deep learning algorithms [22, 23] can be investigated for the sales prediction problem.

ACKNOWLEDGMENT

The authors thank Wageningen University and Istanbul Kültür University for the infrastructure support to complete this scientific research.

REFERENCES

[1] Kuo, R. J. (2001). A sales forecasting system based on fuzzy neural network with initial weights generated by genetic algorithm. European Journal of Operational Research, 129(3), 496-517.
[2] Chen, F. L., & Ou, T. Y. (2011). Sales forecasting system based on Gray extreme learning machine with Taguchi method in retail industry. Expert Systems with Applications, 38(3), 1336-1345.
[3] Zhao, J., Tang, W., Fang, X., Wang, J., Liu, J., Ouyang, H., ... & Qiang, J. (2015, September). A Novel Electricity Sales Forecasting Method Based on Clustering, Regression and Time Series Analysis. In Proceedings of the 2015 International Conference on Artificial Intelligence and Software Engineering.
[4] Tian, Y., Liu, Y., Xu, D., Yao, T., Zhang, M., & Ma, S. (2012, April). Incorporating Seasonal Time Series Analysis with Search Behavior Information in Sales Forecasting. In Proceedings of the 21st International Conference on World Wide Web (pp. 615-616).
[5] Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
[6] Pandey, A., & Somani, R. K. (2013). A Cloud Computing Based Sales Forecasting System for Small and Medium Scale Textile Industries. computing, 3(4).
[7] Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages. Management Science, 6(3), 324-342.
[8] Vijayalakshmi, M., Menezes, B., Menon, R., Divecha, A., Ravindran, R., & Mehta, K. (2010, October). Intelligent sales forecasting engine using genetic algorithms. In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 1669-1672). ACM.
[9] Yeo, J., Kim, S., Koh, E., Hwang, S. W., & Lipka, N. (2016, April). Browsing2purchase: Online Customer Model for Sales Forecasting in an E-Commerce Site. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 133-134). International World Wide Web Conferences Steering Committee.
[10] Choi, T. M., Yu, Y., & Au, K. F. (2011). A hybrid SARIMA wavelet transform method for sales forecasting. Decision Support Systems, 51(1), 130-140.
[11] Chang, P. C., Liu, C. H., & Fan, C. Y. (2009). Data clustering and fuzzy neural network for sales forecasting: A case study in printed circuit board industry. Knowledge-Based Systems, 22(5), 344-355.
[12] Wong, W. K., & Guo, Z. X. (2010). A hybrid intelligent model for medium-term sales forecasting in fashion retail supply chains using extreme learning machine and harmony search algorithm. International Journal of Production Economics, 128(2), 614-624.
[13] Katkar, V., Gangopadhyay, S. P., Rathod, S., & Shetty, A. (2015, January). Sales forecasting using data warehouse and Naïve Bayesian classifier. In Pervasive Computing (ICPC), 2015 International Conference on (pp. 1-6). IEEE.
[14] Müller-Navarra, M., Lessmann, S., & Voß, S. (2015, January). Sales Forecasting with Partial Recurrent Neural Networks: Empirical Insights and Benchmarking Results. In System Sciences (HICSS), 2015 48th Hawaii International Conference on (pp. 1108-1116). IEEE.
[15] Gao, M., Xu, W., Fu, H., Wang, M., & Liang, X. (2014, July). A novel forecasting method for large-scale sales prediction using extreme learning machine. In Computational Sciences and Optimization (CSO), 2014 Seventh International Joint Conference on (pp. 602-606). IEEE.
[16] Omar, H. A., & Liu, D. R. (2012, January). Enhancing sales forecasting by using neuro networks and the popularity of magazine article titles. In 2012 Sixth International Conference on Genetic and Evolutionary Computing (ICGEC) (pp. 577-580).
[17] Lu, C. J., Lee, T. S., & Lian, C. M. (2010, December). Sales forecasting of IT products using a hybrid MARS and SVR model. In 2010 IEEE International Conference on Data Mining Workshops (pp. 593-599).
[18] Stojanović, N., Soldatović, M., & Milićević, M. (2014, June). Walmart Recruiting–Store Sales Forecasting. In Proceedings of the XIV International Symposium Symorg 2014: New Business Models and Sustainable Competitiveness (p. 135). Fon.
[19] Kaggle.com | Kaggle Datasets | Open Datasets for Any Project, “https://www.kaggle.com/” Access date: 25.02.2019.
[20] Bohanec, M., Borštnar, M. K., & Robnik-Šikonja, M. (2017). Explaining machine learning models in sales predictions. Expert Systems with Applications, 71, 416-428.
[21] Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
[22] Tsoumakas, G. (2018). A survey of machine learning techniques for food sales prediction. Artificial Intelligence Review, 1-7.
[23] Luce, L. (2019). Deep Learning and Demand Forecasting. In Artificial Intelligence for Fashion (pp. 155-166). Apress, Berkeley, CA.
BIOGRAPHIES

CAGATAY CATAL received the B.S. and M.S. degrees in computer engineering from Istanbul Technical University, Istanbul, and the Ph.D. degree in computer engineering from Yildiz Technical University, Istanbul, in 2008. He worked for 8 years at the Scientific and Technological Research Council of Turkey as a Senior Researcher. Later, he worked for 6 years at Istanbul Kültür University, Department of Computer Engineering, as an Associate Professor and was the Head of the Department for the last 3 years. In January 2018, he joined the Information Technology Group of Wageningen University in the Netherlands. His research interests are data science, machine learning, software engineering, and experimental software engineering.