
BALKAN JOURNAL OF ELECTRICAL & COMPUTER ENGINEERING, Vol. 7, No. 1, January 2019

Benchmarking of Regression and Time Series Analysis Techniques for Sales Forecasting

C. CATAL, K. ECE, B. ARSLAN and A. AKBULUT

CAGATAY CATAL is with the Information Technology Group, Wageningen University, Wageningen, The Netherlands (e-mail: [email protected]). https://orcid.org/0000-0003-0959-2930
KAAN ECE is with the Department of Computer Engineering, Istanbul Kültür University, Istanbul, Turkey (e-mail: [email protected]). https://orcid.org/0000-0001-7225-047X
BEGUM ARSLAN is with the Department of Computer Engineering, Istanbul Kültür University, Istanbul, Turkey (e-mail: [email protected]). https://orcid.org/0000-0002-4794-0791
AKHAN AKBULUT is with the Department of Computer Engineering, Istanbul Kültür University, Istanbul, Turkey (e-mail: [email protected]). https://orcid.org/0000-0001-9789-5012

Manuscript received December 10, 2018; accepted January 17, 2019.
DOI: 10.17694/bajece.494920
Copyright © BAJECE ISSN: 2147-284X http://dergipark.gov.tr/bajece

Abstract: Predicting the sales amount as closely as possible to the actual sales amount can provide many benefits to companies. Since the fashion industry is not easily predictable, it is not straightforward to make an accurate prediction of sales. In this study, we applied not only regression methods in machine learning but also time series analysis techniques to forecast the sales amount based on several features. We applied our models on Walmart sales data in the Microsoft Azure Machine Learning Studio platform. The following regression techniques were applied: Linear Regression, Bayesian Regression, Neural Network Regression, Decision Forest Regression and Boosted Decision Tree Regression. In addition to these regression techniques, the following time series analysis methods were implemented: Seasonal ARIMA, Non-Seasonal ARIMA, Seasonal ETS, Non-Seasonal ETS, Naive Method, Average Method, and Drift Method. It was shown that Boosted Decision Tree Regression provides the best performance on this sales data. This project is a part of the development of a new decision support system for the retail industry.

Index Terms: Machine learning, regression, sales forecasting, time series analysis.

I. INTRODUCTION
THE IDENTIFICATION of the number of stocks and the replenishment strategy are significant activities for many companies in the retail industry. If the number of products is insufficient at a given time, the customer demand cannot be satisfied at that time, which causes the company to lose the customer. If there is a sufficient number of products, but there is no potential customer to buy them, then the products stay in stock. In addition, the fashion industry is volatile. Trends change quickly. Fashion must be closely followed to increase the sales amount.
A company which is out of the trends is not preferred by the customers. Therefore, features which affect fashion can be examined to increase the sales. In this study, our goal is to predict the actual sales amount accurately by using different machine learning algorithms. Machine learning is a very active research field that helps to learn from the data and uses the data to make a prediction for the future. There are many application areas of machine learning algorithms. For example, Facebook’s News Feed feature, which applies the EdgeRank algorithm, can be used for the personalization of the feeds. The algorithm identifies the user interests by using statistical and predictive analysis.
Walmart, one of the best retailers in the world, dramatically increased its online sales and revenue by using advanced data mining techniques. The data prior to the sales and after the sales have been extensively analyzed by the data scientists to change the e-commerce strategy of this retail company. Also, Walmart has changed its shipping policy for the products based on the data analysis performed on big data. According to Walmart's new shipping policy, the minimum amount for free shipping was raised from $45 to $50.
In this study, Walmart’s public data was analyzed with different regression algorithms in Azure Machine Learning (ML) Studio. In addition to these algorithms, several time series analysis methods were implemented by using R packages which are available from Azure ML Studio. Since there is no way to add time series analysis methods into the experiment screen graphically, these methods were implemented in the R programming language manually. Later, the best model was transformed into a web service and this web service was deployed on the Azure cloud platform. A client application was implemented to consume this web service. Azure sends the results in JSON format.
The following regression algorithms were applied: Linear Regression, Bayesian Regression, Neural Network Regression, Decision Forest Regression, and Boosted Decision Tree Regression. Also, the following time series analysis techniques were applied: Seasonal ARIMA, Non-Seasonal ARIMA, Seasonal ETS, Non-Seasonal ETS, Naive Method, Average Method, and Drift Method. According to the experimental results, the best method was identified. It was
observed that the regression techniques provide better performance compared to the time series analysis approaches. In this study, it was demonstrated that building an end-to-end sales prediction system on Azure ML Studio is an easy and very efficient task and there are many algorithms to apply.
The remainder of this paper is organized as follows: Section II presents the related work. Section III shows the methodology and Section IV explains the initial results of this system. Section V shows the conclusion and future work.

II. RELATED WORK
There are many studies on the development of sales forecasting models, but so far they have neither evaluated many models in one study nor used the Azure Machine Learning Studio platform to build an end-to-end sales prediction system.
Kuo [1] applied the fuzzy-neural network for sales forecasting and demonstrated that this model’s performance is superior to traditional neural networks. Chen and Ou [2] used a new approach called Gray extreme learning machine with Taguchi method. The system performance was better than the performance of artificial neural networks. Zhao et al. [3] utilized clustering, regression, and time series analysis techniques for electricity sales forecasting. Tian et al. [4] applied seasonal time series analysis for auto sales in China. The seasonal effects were calculated by using the exponential weighted moving average. Later, the calculated effects and the counted frequency were combined for the linear regression technique. Zhang [5] combined ARIMA and neural network for forecasting. Pandey and Somani [6] implemented a cloud computing based sales forecasting system and applied time series analysis with the moving average method. They deployed the system on Azure cloud and used MySQL database and PHP programming language. Moving average methods have been applied to forecast sales for a long time in the literature [7]. Vijayalakshmi et al. [8] implemented a sales forecasting engine based on genetic algorithms. Yeo et al. [9] developed a new customer model which uses customer-browsing behavior and tested the model on an e-commerce website. Choi et al. [10] combined the SARIMA and wavelet transform method for sales forecasting. They demonstrated that the new hybrid model provides better performance than the single methods. Chang et al. [11] designed a hybrid model, which combines k-means clustering and fuzzy neural network for sales prediction of circuit boards. Wong et al. [12] implemented a new model, which uses extreme learning machine and harmony search algorithm for sales prediction of retail supply chains and showed that the new model provides better performance than the ARIMA models. Katkar et al. [13] used fuzzy logic and Naive Bayes classifier for sales forecasting. Müller-Navarra et al. [14] applied Recurrent Neural Networks to forecast the sales. Gao et al. [15] used extreme learning machine algorithm.
Omar et al. [16] examined the Back Propagation Neural Network for sales forecasting. Lu et al. [17] proposed a hybrid method based on MARS and SVR techniques for sales prediction of information technology products. They demonstrated that the new model provides better performance than the single SVR. Stojanović et al. [18] used several features such as fuel price, holiday, unemployment, temperature, store, and date to forecast the weekly sales in Walmart and showed that Support Vector Machine provides the best performance.
As we see in these studies, each study suggests a single model, but we need a comparative assessment of machine learning models to evaluate their performance on the same public dataset. In this study, we performed our experiments to satisfy this goal.

III. METHODOLOGY
In this study, several regression algorithms in machine learning and time series analysis methods were applied for sales forecasting. In this section, these methods are introduced. First, regression algorithms will be explained and then, the time series analysis methods will be introduced.

A. Regression

Linear Regression
Linear Regression is used to create a mathematical equation that describes the relation between independent variables (x) and a dependent variable (y). The linear regression method uses Formula 1:

$$Y_i = B_0 + B_1 X_{i1} + B_2 X_{i2} + \dots + B_n X_{in} + E_i \tag{1}$$

B: Coefficients
E: Residue

B_0 is the intercept and each remaining B is the slope associated with one predictor; E is the residue, not the intercept. Therefore, Y is the response variable which is also called the dependent variable, B’s are the weights that are the model parameters, the values of the predictor variables are represented with X’s, and finally E is the error term signifying the random sampling noise.

Bayesian Linear Regression
Unlike linear regression, Bayesian Inference is used in the Bayesian approach. The normal distribution in the Bayesian approach is calculated based on Equations 2-3-4-5. Since w is a continuous-valued random variable in R^d, Bayes rule says that the posterior distribution of w given y is:

$$P(w \mid y) \propto P(y \mid w)\, P(w) \tag{2}$$

$$P(w \mid y) \sim N(\mu, S) \tag{3}$$

In standard form, for a Gaussian prior w ∼ N(µ_0, S_0) and Gaussian noise with variance σ², the posterior covariance S and mean µ are:

$$S^{-1} = S_0^{-1} + \frac{1}{\sigma^2} X^{T} X \tag{4}$$

$$\mu = S \left( S_0^{-1} \mu_0 + \frac{1}{\sigma^2} X^{T} y \right) \tag{5}$$

In Bayesian linear regression, the predictive distribution is calculated based on Equation 6:

$$P(y_0 \mid y, X, x_0) = \int P(y_0 \mid w, x_0)\, P(w \mid y, X)\, dw \tag{6}$$

The predictive distribution evaluates the likelihood of a value y_0 given x_0 for a particular w, weights that likelihood by the current belief about w given the data (y, X), and finally sums over all possible values of w.

Neural Network Regression
The neural network regression uses Equation 7:

$$o = w \cdot x \tag{7}$$

In Neural Networks, a perceptron takes a vector of real-valued inputs and calculates a linear combination of these inputs. If the output is greater than some threshold, then it outputs 1; otherwise it produces -1. The weights have to be calculated according to the perceptron training rules shown in Equations 8 and 9:

$$w_i \leftarrow w_i + \Delta w_i \tag{8}$$

$$\Delta w_i = \eta\,(t - o)\,x_i \tag{9}$$

The symbol η refers to the neural network learning rate, t is the target output, and o is the output produced by the perceptron for the current training example.

Decision Forest Regression
In the Random Forest algorithm, the primary aim is to make a classification by using several trees. In Random Forest, the Gini value is used to obtain the final class of a tree. The Gini value is calculated based on Formula 10, where p_j is the relative frequency of class j in T:

$$Gini(T) = 1 - \sum_{j=1}^{n} p_j^{2} \tag{10}$$

The dataset T is split into subsets T_1 and T_2 with sizes N_1 and N_2 (N = N_1 + N_2), then the Gini split value is calculated based on Formula 11:

$$Gini_{split}(T) = \frac{N_1}{N}\,Gini(T_1) + \frac{N_2}{N}\,Gini(T_2) \tag{11}$$

Boosted Decision Tree Regression
Boosted decision trees use the gradient boosting algorithm. This algorithm optimizes a differentiable loss function L by using a weighted sum of functions. In standard form, F(x) is calculated based on Equation 12, where h_m is the tree added at step m:

$$F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma) \quad \text{and} \quad F_m(x) = F_{m-1}(x) + \gamma_m h_m(x) \tag{12}$$

The predictions are calculated based on Equation 13:

$$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L\big(y_i,\, F_{m-1}(x_i) + \gamma\, h_m(x_i)\big) \tag{13}$$

B. Time Series Analysis

ARIMA
The ARIMA (autoregressive integrated moving average) model uses the following Equations 14-15 and describes the autocorrelations in the data. In standard form, with X_t the observed series and ε_t the error terms:

$$X_t - \Phi_1 X_{t-1} - \dots - \Phi_p X_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \tag{14}$$

$$\left(1 - \sum_{i=1}^{p} \Phi_i L^{i}\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^{i}\right) \varepsilon_t \tag{15}$$

The parameter L is the lag operator, p is the order which represents the number of time lags of the autoregressive model, and q is the order of the moving-average model. θ are the parameters of the moving average part and Φ are the parameters of the autoregressive part of the model.

Seasonal ARIMA
Seasonal ARIMA (SARIMA) is similar to ARIMA, but it has additional seasonal elements. SARIMA uses Formulas 16-17-18, given here in standard backshift notation, where B is the backshift operator, s is the seasonal period, and ∇^d and ∇_s^D denote non-seasonal and seasonal differencing:

$$\Phi(B)\,\nabla^{d} y_t = \theta(B)\,\varepsilon_t \tag{16}$$

$$\Phi_s(B^{s})\,\nabla_s^{D} y_t = \Theta_s(B^{s})\,\varepsilon_t \tag{17}$$

$$\Phi(B)\,\Phi_s(B^{s})\,\nabla^{d}\nabla_s^{D} y_t = \theta(B)\,\Theta_s(B^{s})\,\varepsilon_t \tag{18}$$

The SARIMA model has the same structure as the non-seasonal (ARIMA) model: it may have an AR factor, an MA factor (corresponds to α), and/or an order of differencing.

Seasonal ETS
The Seasonal Exponential Smoothing (Seasonal ETS) applies three low-pass filters recursively with special exponential window functions. In the simple moving average, past observations are weighted equally, whereas the exponential window functions assign weights that decrease over time. The simplest exponential smoothing formula is shown in Formula 19, where α is the smoothing factor, x_t is the observation at time t, and s_t is the smoothed value:

$$s_t = \alpha \cdot x_t + (1 - \alpha)\, s_{t-1} \tag{19}$$

Non-Seasonal ETS
Non-seasonal time series includes a trend component. To estimate the trend component, the simple moving average function over the last n observations is used as shown in Equation 20:

$$SMA = \frac{x_t + x_{t-1} + \dots + x_{t-n+1}}{n} \tag{20}$$

Naive Method
This method works quite well for economic and financial time series. This approach sets each prediction to be equal to the last observed value of the same season. Equation 21 is used for the calculation.

$$\hat{Y}_{t+h|t} = y_{t+h-m(k+1)} \tag{21}$$

Here m is the seasonal period and k is the integer part of (h−1)/m. The naive forecast is the model that calculates in the simplest way, using the actual demand for the past period as the expected demand for the future period, with the assumption that the past will repeat.

Average Method
Estimates of all the future values are equal to the average of the historical data. This approach can be used with any kind of data where historical data is available. Formula 22 shows this simple approach.

$$\hat{Y}_{t+h|t} = \frac{y_1 + y_2 + \dots + y_t}{t} \tag{22}$$

To make a forecast using averaging, Formula 22 simply takes the average of selected periods of the past data by summing each period and dividing the result by the number of periods. Therefore, the forecast of all future values ($\hat{Y}_{t+h|t}$) is equal to the mean of the historical data. It is worth pointing out that this technique is very effective and useful for short term forecasts.

Drift Method
This method is a variation of the Naive method, but it allows the forecasts to increase or decrease over time; this change is called drift and is estimated from the historical data. Equation 23 shows how the method works.

$$\hat{Y}_{t+h|t} = y_t + h\,\frac{y_t - y_1}{t - 1} \tag{23}$$

Forecasts are equal to the last value plus the average change, which is equivalent to extrapolating a line drawn between the first and last observations.

IV. EXPERIMENTAL RESULTS
Datasets were obtained from the Walmart Recruiting Store Sales Competition page of the Kaggle website [19]. There are three different files called features.csv, stores.csv and train.csv. In features.csv, there are several features such as temperature, fuel price, and unemployment. In stores.csv, there are features such as store id, store type, and store size. Finally, train.csv has some historical data and the real sales amount. The first step is to create a new experiment. Then, the features.csv, stores.csv and train.csv documents are joined to create a single dataset. The next step is to perform feature selection and feature extraction.
The features.csv file includes the store, date, temperature, fuel_price, MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5, CPI, Unemployment, IsHoliday features. The stores.csv file includes the store, type, size features, and train.csv includes the store, department, date, weekly_sales, IsHoliday features. These three files were imported into MS SQL Server Management Studio and were combined with an inner join operation. The features used in the system are the store, department, date, isHoliday, size, fuel_price. The class label is weekly_sales. As a result of the feature selection, the following features were identified: week of the month, monthOfYear, prevWeekSales, season, store size, tempCategory. Also, economical features such as PPI, TreInf, and TNF were investigated and used for the prediction. These economical features were added to the previous feature set. Therefore, the following features were created in the dataset: Store, department, week of the month, MonthOfYear, PrevWeekSales, isHoliday, period, season, size, PPI, TNF, TreInf, fuel_price.
PPI is the acronym of Producer Price Index and measures the inflation rate incurred by the purchase of goods and services. TreInf indicates the amount of income per household. TNF is the acronym of Total Nonfarm Payrolls, which is a monthly report showing how many employers provide employment in private or government sectors for the previous month, or how much reduction occurs in the employment in the United States. Season and period were extracted from the date feature. 1 is used to represent the autumn season, 2 for the winter season, 3 for the spring season and 4 for the summer season. The Period feature is selected based on the company’s yearly plan. Most of the companies start their yearly plan in January. Therefore, 1 represents the first three months, 2 is used for the next 3 months. The week of the month and the month of the year were also extracted from the date feature. The store size feature was extracted from the size feature and the temperature category was extracted from temperature. 1 represents below 15 °C, 2 is between 15 °C and 25 °C, and 3 means over 25 °C. Also, the previous week sales feature was extracted from the weekly_sales feature by using previous week sales. IsHoliday includes the following days: Super Bowl, Christmas, Labor Day, Thanksgiving. The WeekNo feature was extracted from the date feature. Dates in the training dataset were numbered as weeks. Temperature, MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5, CPI, Unemployment, store size, TempCategory were observed to have negative effects on the models. Therefore, these features were removed. The data up to 2012-01-01 were used for training and the remaining four months were used for testing.
During the experiments, parameters of the regression algorithms were optimized by using the “tune hyper model parameters” feature of the Azure ML Studio platform. After the best model was selected, the “Set Up Web Service” button with the “Predictive Web Service” option was used to produce the web service. The web service was deployed by selecting the “Deploy Web Service” button. Web services are displayed on the “Web Services” tab of Azure ML Studio. The web service C# code for a client application is displayed on the REQUEST/RESPONSE page and its API key is displayed on the dashboard screen of the selected web service. The API key is needed to authenticate the user. In this study, MAE and RMSE error parameters were used.
Root Mean Squared Error (RMSE): It is the square root of the average squared error. The root mean squared error is calculated based on Equation 24.

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{obs,i} - X_{model,i}\right)^{2}} \tag{24}$$

Xobs: observed values at time/place i.
Xmodel: predicted values at time/place i.


Mean Absolute Error (MAE): It takes the average of absolute errors. Equation 25 is used to calculate this parameter. MAE calculates the average absolute difference between y_i and x_i, which are the coordinates of point i.

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - x_i\right| \tag{25}$$

Initial experiments were performed for only one store and one department of the Walmart Company because of the time series analysis techniques. Test results were evaluated by several evaluation parameters such as Mean Absolute Error and Root Mean Squared Error. Decision Forest Regression was the best approach based on RMSE and MAE values as shown in Figure 1.

Fig. 1. Evaluation of experiments

Table 1 shows that four methods provide better performance. The difference between the accuracy of Decision Forest Regression, Non-Seasonal ARIMA, Boosted Decision Tree Regression, and Seasonal ARIMA is small. Decision Forest Regression provides the highest performance.

TABLE 1.
RMSE AND MAE RESULTS

Method                             RMSE      MAE
Seasonal ARIMA                     3650.08   2923.35
Non-Seasonal ARIMA                 3635.49   2916.03
Seasonal ETS                       8274.58   7582.88
Bayesian Linear Regression         4771.36   3980.12
Linear Regression                  4613.01   3666.66
Decision Forest Regression         3439.48   2750.21
Boosted Decision Tree Regression   3637.61   2907.69
Neural Network Regression          9622.62   7745.15

To extend the experiment to the entire dataset, all the departments were tested with these regression methods because the use of time series analysis techniques is not appropriate for the entire dataset. Parameters were optimized with the Tune Model Hyper Parameters module in Azure ML Studio. This time, the best approach is Boosted Decision Tree Regression. In Table 2, results are shown.

TABLE 2
ALL THE DEPARTMENTAL RESULTS

Method                             MAE        RMSE       Coefficient of Determination
Bayesian Linear Regression         2469.54    4361.40    0.96
Linear Regression                  2480.12    4365.04    0.96
Neural Network Regression          14951.09   22499.33   0.00
Boosted Decision Tree Regression   1669.10    3696.59    0.97

The coefficient of determination, which is also known as R² (Equation 26), represents the predictive power of the model as a value between 0 and 1.

$$R^{2} = \frac{\sum (y - \bar{y})^{2} - \sum (y - \hat{y})^{2}}{\sum (y - \bar{y})^{2}} \tag{26}$$

The coefficient of determination, R², signifies the proportion of the total sample variation in y (measured by the sum of squares of deviations of the sample y values about their mean ȳ) that is attributed to the linear relationship between x and y. It is a standard way of evaluating how well the model fits the data. It can be interpreted as the proportion of variation explained by the model. A higher proportion is better, where 1 indicates a perfect fit. We used the coefficient of determination to express our accuracy. Moreover, the highest prediction success in estimates where all stores are used is 0.97 with the Boosted Decision Tree Regression method as shown in Table 2.
As explained in the Related Work section, several studies applied time series analysis approaches [4, 6] and artificial neural network algorithms [14, 16, 20] for predicting the sales amount. In this study, we demonstrated that regression algorithms (i.e., Boosted Decision Tree Regression) work much better than these algorithms used in the literature and do not require much optimization to provide high performance. We developed our NN model using a standard topology, and therefore this might affect the performance of the model. This can be considered as one of the threats to the validity of this study. Also, our experiments were performed on a specific dataset and hence, the performance might be different on other kinds of datasets. We suggest practitioners apply the Boosted Decision Tree Regression algorithm for sales forecasting.
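
The coefficient of determination of Equation 26 can be illustrated with a short Python sketch. It is a plain transcription of the formula under stated assumptions: the function name and the sample values are hypothetical, and it is not the routine that Azure ML Studio uses to produce the reported figures.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: (SS_total - SS_residual) / SS_total."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Total variation of the observed values about their mean.
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    # Variation left unexplained by the model.
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    if ss_total == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return (ss_total - ss_residual) / ss_total


# Hypothetical values, used only to exercise the function.
actual_sales = [1.0, 2.0, 3.0, 4.0]

print(r_squared(actual_sales, actual_sales))   # perfect predictions
print(r_squared(actual_sales, [2.5] * 4))      # always predicting the mean
```

Perfect predictions give 1, and a model that always predicts the mean of the observed values gives 0. Computed this way, the value becomes negative when a model fits worse than the mean, so the range of 0 to 1 holds only for models that are at least as good as that baseline.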


Fig. 2. The coefficient of determination results

V. CONCLUSION
It is crucial for enterprises to predict the sales amounts as closely as possible to the actual sales amounts to increase their profits [20]. Unless an accurate forecasting model is built, cash flow problems are inevitable. Therefore, building this kind of prediction model for sales forecasting has a high priority for organizations. In this study, we investigated the effect of Regression and Time Series Analysis methods on the sales forecasting problem. Our experiments show that the regression techniques provide higher performance and accuracy compared to the time series analysis techniques. The Boosted Decision Tree Regression algorithm was the best predictor for sales forecasting with a coefficient of determination of 0.97. Prediction results were obtained for weekly sales quantities.
In the future, a hybrid model using ARIMA and Boosted Decision Tree Regression techniques will be investigated to solve this problem. In addition, the implementation of another hybrid model with an SVM model is planned. New experiments will be performed if new public data is obtained. Also, deep learning algorithms [22, 23] can be investigated for the sales prediction problem.

ACKNOWLEDGMENT
The authors thank Wageningen University and Istanbul Kültür University for the infrastructure support to complete this scientific research.

REFERENCES
[1] Kuo, R. J. (2001). A sales forecasting system based on fuzzy neural network with initial weights generated by genetic algorithm. European Journal of Operational Research, 129(3), 496-517.
[2] Chen, F. L., & Ou, T. Y. (2011). Sales forecasting system based on Gray extreme learning machine with Taguchi method in retail industry. Expert Systems with Applications, 38(3), 1336-1345.
[3] Zhao, J., Tang, W., Fang, X., Wang, J., Liu, J., Ouyang, H., ... & Qiang, J. (2015, September). A Novel Electricity Sales Forecasting Method Based on Clustering, Regression and Time Series Analysis. In Proceedings of the 2015 International Conference on Artificial Intelligence and Software Engineering.
[4] Tian, Y., Liu, Y., Xu, D., Yao, T., Zhang, M., & Ma, S. (2012, April). Incorporating Seasonal Time Series Analysis with Search Behavior Information in Sales Forecasting. In Proceedings of the 21st International Conference on World Wide Web (pp. 615-616).
[5] Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
[6] Pandey, A., & Somani, R. K. (2013). A Cloud Computing Based Sales Forecasting System for Small and Medium Scale Textile Industries. computing, 3(4).
[7] Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages. Management Science, 6(3), 324-342.
[8] Vijayalakshmi, M., Menezes, B., Menon, R., Divecha, A., Ravindran, R., & Mehta, K. (2010, October). Intelligent sales forecasting engine using genetic algorithms. In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 1669-1672). ACM.
[9] Yeo, J., Kim, S., Koh, E., Hwang, S. W., & Lipka, N. (2016, April). Browsing2purchase: Online Customer Model for Sales Forecasting in an E-Commerce Site. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 133-134). International World Wide Web Conferences Steering Committee.
[10] Choi, T. M., Yu, Y., & Au, K. F. (2011). A hybrid SARIMA wavelet transform method for sales forecasting. Decision Support Systems, 51(1), 130-140.
[11] Chang, P. C., Liu, C. H., & Fan, C. Y. (2009). Data clustering and fuzzy neural network for sales forecasting: A case study in printed circuit board industry. Knowledge-Based Systems, 22(5), 344-355.
[12] Wong, W. K., & Guo, Z. X. (2010). A hybrid intelligent model for medium-term sales forecasting in fashion retail supply chains using extreme learning machine and harmony search algorithm. International Journal of Production Economics, 128(2), 614-624.
[13] Katkar, V., Gangopadhyay, S. P., Rathod, S., & Shetty, A. (2015, January). Sales forecasting using data warehouse and Naïve Bayesian classifier. In Pervasive Computing (ICPC), 2015 International Conference on (pp. 1-6). IEEE.
[14] Müller-Navarra, M., Lessmann, S., & Voß, S. (2015, January). Sales Forecasting with Partial Recurrent Neural Networks: Empirical Insights and Benchmarking Results. In System Sciences (HICSS), 2015 48th Hawaii International Conference on (pp. 1108-1116). IEEE.
[15] Gao, M., Xu, W., Fu, H., Wang, M., & Liang, X. (2014, July). A novel forecasting method for large-scale sales prediction using extreme learning machine. In Computational Sciences and Optimization (CSO), 2014 Seventh International Joint Conference on (pp. 602-606). IEEE.
[16] Omar, H. A., & Liu, D. R. (2012, January). Enhancing sales forecasting by using neuro networks and the popularity of magazine article titles. In 2012 Sixth International Conference on Genetic and Evolutionary Computing (ICGEC) (pp. 577-580).
[17] Lu, C. J., Lee, T. S., & Lian, C. M. (2010, December). Sales forecasting of IT products using a hybrid MARS and SVR model. In 2010 IEEE International Conference on Data Mining Workshops (pp. 593-599).
[18] Stojanović, N., Soldatović, M., & Milićević, M. (2014, June). Walmart Recruiting–Store Sales Forecasting. In Proceedings of the XIV International Symposium Symorg 2014: New Business Models and Sustainable Competitiveness (p. 135). Fon.
[19] Kaggle.com | Kaggle Datasets | Open Datasets for Any Project, “https://www.kaggle.com/” Access date: 25.02.2019.
[20] Bohanec, M., Borštnar, M. K., & Robnik-Šikonja, M. (2017). Explaining machine learning models in sales predictions. Expert Systems with Applications, 71, 416-428.
[21] Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
[22] Tsoumakas, G. (2018). A survey of machine learning techniques for food sales prediction. Artificial Intelligence Review, 1-7.
[23] Luce, L. (2019). Deep Learning and Demand Forecasting. In Artificial Intelligence for Fashion (pp. 155-166). Apress, Berkeley, CA.


BIOGRAPHIES
CAGATAY CATAL received the B.S. and M.S. degrees in computer engineering from Istanbul Technical University, Istanbul, and the Ph.D. degree in computer engineering from Yildiz Technical University, Istanbul, in 2008. He worked for 8 years at the Scientific and Technological Research Council of Turkey as a Senior Researcher. Later, he worked for 6 years at Istanbul Kültür University, Department of Computer Engineering, as an Associate Professor and was the Head of the Department for the last 3 years. In January 2018, he joined the Information Technology Group of Wageningen University in the Netherlands. His research interests are data science, machine learning, software engineering, and experimental software engineering.

KAAN ECE received the B.S. degree in computer engineering from Istanbul Kültür University and has been working at the Eteration software company for 2 years. His research interests are machine learning and software engineering.

BEGÜM ARSLAN received the B.S. degree in computer engineering from Istanbul Kültür University. Her research interests are machine learning, data science, and software engineering.

AKHAN AKBULUT received the B.S. and M.S. degrees in computer engineering from Istanbul Kültür University and the Ph.D. degree in computer engineering from Istanbul University, Istanbul. Later, he worked for 2 years in the software industry. In 2004, he joined the department of computer engineering at Istanbul Kültür University. He has been working at the department of computer science at North Carolina State University since 2017 as a Postdoctoral researcher. His research interests are machine learning, cloud computing, software engineering, and internet architectures.

Copyright © BAJECE ISSN: 2147-284X http://dergipark.gov.tr/bajece
