
Application of

Predictive Analytics in
Volume Forecasting
and Resource Planning
1.
Introduction
About the Project, and
the approach taken.

Project Objective:

▶ Building a predictive model that will better forecast order inflows,
to improve shop floor management within the Trade Services Unit.

Project Approach:

▶ Identifying demand projection models that can be used for the task.
▶ Selecting models that can be deployed within the company’s
environment. The tools used will be based on R.
▶ Making sure that the demand prediction models and tools selected
can incorporate restrictions specific to the business case of the bank
and trade services operations.
▶ Taking historical data and testing it with selected models to identify
the one that gives the best results.

2.
The Problem
Understanding the problem, and
the Possible Solutions.

Defining the Problem:

▶ The problem in front of us is a Demand Prediction Problem.
▶ The historical data available with us is Time Series Data.
▶ The goal is to identify models that can be used for modelling the
time series data and forecasting its future evolution.

Time Series Data


• It is discrete-time data obtained by indexing data points at
successive, equally spaced points in time.
• Mathematically, a time series is represented as Y(t), i.e.
observations as a function of time.
• In forecasting, we are interested in estimating Y(t+h) using only
the information available up to time t.
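In this notation, a common way to formalise the h-step-ahead forecast is as the
conditional expectation of the series given everything observed up to time t:

$$\hat{Y}(t+h \mid t) \;=\; \mathbb{E}\big[\,Y(t+h) \mid Y(1), Y(2), \dots, Y(t)\,\big]$$

Each of the models considered below produces its own estimate of this quantity
from the training history.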
Time Series Forecasting Models:

Historically, the following methods have been used for forecasting:


• Naïve, SNaïve
• Moving Average, Weighted Moving Average
• Exponential Smoothing
• ARIMA, SARIMA
But now, more advanced models are being used. These models
incorporate advances in statistics as well as computing, and they make
use of the mathematical models developed for Machine Learning and
Artificial Neural Networks.

The models we will use are: TBATS, Prophet, NNAR, H2O AutoML
based models, and LSTM models.
3.
Forecasting Models
An Overview of the selected Models

The Models and Tools that we will be
looking at for our problem are as follows:

▪ TBATS (Trigonometric, Box-Cox Transform, ARMA errors, Trend and
Seasonal Components)
▪ NNAR (Neural Network AutoRegression)
▪ Prophet
▪ H2O AutoML
▪ LSTM (Long Short Term Memory) Model

A Brief Overview of each of the selected
Models/Tools:
1. TBATS (Trigonometric, Box-Cox Transform, ARMA errors, Trend
and Seasonal Components):

i. This is an advanced statistical model. Of the five models selected
for our solution, this is the only statistical model.
ii. The ability to deal with multiple seasonalities helps set this model
apart from simpler statistical models.
iii. The model achieves this by modelling each seasonality with a
trigonometric representation based on Fourier series.
2. NNAR (Neural Network AutoRegression):

i. The NNETAR model is based on a fully connected neural network.
ii. By virtue of using Artificial Neural Networks, this method allows
complex nonlinear relationships between the response variable and
its predictors.

Both of these models are implemented in R, using the tbats and
nnetar functions in the forecast package, as sketched below.
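A minimal sketch of how these two fits could look in R with the forecast package,
assuming the daily volumes have already been split into numeric vectors
volume_train and volume_test (hypothetical names); msts() is used so that tbats()
sees both the weekly and the yearly seasonal periods:

library(forecast)

# Multiple-seasonality series: weekly (7) and yearly (~365.25) periods
y_train <- msts(volume_train, seasonal.periods = c(7, 365.25))

fit_tbats  <- tbats(y_train)    # trigonometric seasonality, Box-Cox, ARMA errors, trend
fit_nnetar <- nnetar(y_train)   # feed-forward neural network with lagged inputs

fc_tbats  <- forecast(fit_tbats,  h = length(volume_test))
fc_nnetar <- forecast(fit_nnetar, h = length(volume_test))

accuracy(fc_tbats,  volume_test)   # reports MAE and RMSE against the test set
accuracy(fc_nnetar, volume_test)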
A Brief Overview of each of the selected
Models/Tools:
3. Prophet:

i. Prophet is a procedure for forecasting time series data based on an
additive model (a nonparametric regression method) where non-linear
trends are fit with yearly, weekly, and daily seasonality, plus holiday
effects.
ii. Incorporates built-in support for country level holidays for India and 12
other countries
iii. In some instances the seasonality may depend on other factors, such as
a weekly seasonal pattern that is different during the summer than it is
during the rest of the year, or a daily seasonal pattern that is different on
weekends vs. on weekdays. These types of seasonalities can be modeled
using conditional seasonalities.
iv. Best suited for daily data and can handle some irregular gaps in the data

This model is implemented in R, using the prophet package for R
distributed by Facebook. It is an open-source package distributed
through CRAN. A minimal fitting sketch is shown below.
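A minimal fitting sketch in R, assuming the history has already been put into a
data frame df_train with the two columns Prophet expects, ds (date) and y (value);
the country code "IN" used for the built-in India holiday calendar is an assumption:

library(prophet)

m <- prophet(yearly.seasonality = TRUE,
             weekly.seasonality = TRUE,
             daily.seasonality  = FALSE)
m <- add_country_holidays(m, country_name = "IN")    # built-in holiday effects (assumed country code)
m <- fit.prophet(m, df_train)

future   <- make_future_dataframe(m, periods = 365)  # extend the dates over the 2017 test year
forecast <- predict(m, future)
head(forecast[, c("ds", "yhat", "yhat_lower", "yhat_upper")])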
A Brief Overview of each of the selected
Models/Tools:

4. H2O AutoML

i. H2O is an open-source, in-memory, distributed, fast, and scalable
machine learning and predictive analytics tool.
ii. This tool provides a set of Machine Learning algorithms, and models
from all of them are trained on the data. At the end, a leaderboard is
created to identify the model that best fits the provided data.
iii. The current version of AutoML trains and cross-validates the
following algorithms (in the following order): three pre-specified
XGBoost GBM (Gradient Boosting Machine) models, a fixed grid of
GLMs, a default Random Forest (DRF), five pre-specified H2O GBMs,
a near-default Deep Neural Net, an Extremely Randomized Forest
(XRT), a random grid of XGBoost GBMs, a random grid of H2O GBMs,
and a random grid of Deep Neural Nets

This tool is implemented in R; the h2o package is open-source and
distributed through CRAN. It depends on Java SE 13 or lower for its
runtime. A minimal usage sketch is shown below.
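A minimal usage sketch from R, assuming the series has first been converted into a
supervised table (calendar features as predictors, Volume as the response) held in
data frames train_tbl and test_tbl (hypothetical names):

library(h2o)
h2o.init()                                   # starts the local H2O cluster (requires Java)

train_hex <- as.h2o(train_tbl)
test_hex  <- as.h2o(test_tbl)

aml <- h2o.automl(y = "Volume",
                  training_frame   = train_hex,
                  max_runtime_secs = 600,    # cap the search across GLMs, DRF, GBMs, XGBoost, DNNs
                  seed = 1)

print(aml@leaderboard)                       # ranked models; aml@leader is the best one
pred <- h2o.predict(aml@leader, test_hex)    # forecast for the test period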
A Brief Overview of each of the selected
Models/Tools:

5. LSTM (Long Short Term Memory) Method:

i. It is a type of Recurrent Neural Network (RNN). RNNs address the
shortcomings of ordinary neural networks and enable them to work on
time series data.
ii. As the name suggests, it is better at maintaining a balance between
recent data and data from the past. This is because these neural
networks are designed to solve the long-term dependency problem.
iii. Its insensitivity to gap length helps it perform better when there is
inconsistency in time lag duration in the time series data.

This model is implemented in R/Python, using keras, an open-source
package distributed through CRAN/PyPI. It uses the tensorflow backend,
an open-source machine learning tool provided by Google. A minimal
sketch is shown below.
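A minimal sketch of an LSTM network in R with keras, assuming the series has
already been scaled and reshaped into arrays x_train and y_train, with x_train of
shape (samples, lookback, 1) (hypothetical names; lookback is an assumed window
of past days):

library(keras)

lookback <- 30

model <- keras_model_sequential() %>%
  layer_lstm(units = 50, input_shape = c(lookback, 1)) %>%
  layer_dense(units = 1)

model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(x_train, y_train, epochs = 50, batch_size = 32, verbose = 0)

pred <- model %>% predict(x_test)   # one-step-ahead predictions, to be inverse-scaled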
4.
Data, and
Training Method
A look at the sample data being
used, and how the training is done.

Sample Dataset
Time Series Training and Validation

• The Dataset that we have chosen has data from 01-01-2013 to
31-12-2017.
• We divide the data into the following (see the sketch after this list):
• Training Set :- From 01-01-2013 to 31-12-2016 (1461 days)
• Test Set :- From 01-01-2017 to 31-12-2017 (365 days)
• The model is fit onto the training set.
• Then, a forecast for the duration of the test set is generated based
on the fitted model.
• The forecast thus generated is compared to the original values in
the test set.
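A minimal sketch of this split in R, assuming a data frame volumes with a Date
column (as in the sample dataset) already parsed to Date class:

train <- subset(volumes, Date <= as.Date("2016-12-31"))   # 01-01-2013 to 31-12-2016, 1461 days
test  <- subset(volumes, Date >= as.Date("2017-01-01"))   # 01-01-2017 to 31-12-2017, 365 days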

5.
Assessing the
Performance of
Selected Models
Selecting the right tools to
compare our models

Assessing the Performance of
Selected Models
Comparing the fit obtained from the various models by going through
each prediction line by line will not give a true picture.
Therefore, it is important to select a parameter on which to
assess the various models.
For our purpose, we will be using two parameters:
1. MAE ( Mean Absolute Error )
2. RMSE ( Root Mean Square Error )
Of these two, RMSE will be given precedence. The reason
for this has been discussed in the subsequent slides.

MAE ( Mean Absolute Error )

Mean Absolute Error (MAE): MAE measures the average magnitude of
the errors in a set of predictions, without considering their direction.
It is the average over the test sample of the absolute differences
between prediction and actual observation, where all individual
differences have equal weight.
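In symbols, for n test-set observations y_t and the corresponding forecasts ŷ_t:

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|\,y_t - \hat{y}_t\,\right|$$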

RMSE ( Root Mean Square Error )

Root Mean Squared Error (RMSE): RMSE is a quadratic scoring rule
that also measures the average magnitude of the error. It is the
square root of the average of the squared differences between
prediction and actual observation.
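In the same notation as the MAE formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$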

-------------------------------------------------------------------------------------------

For comparison, look at the MAE formula from the previous slide.

Comparing RMSE and MAE

Similarities: Both MAE and RMSE express average model prediction
error in units of the variable of interest. Both metrics can range from
0 to ∞ and are indifferent to the direction of errors. They are
negatively-oriented scores, which means lower values are better.

Differences: While calculating RMSE, since the errors are squared
before they are averaged, RMSE gives a relatively high weight to
large errors. This means the RMSE should be more useful when large
errors are particularly undesirable.
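A small illustration of this difference in R, using two hypothetical error vectors
that share the same MAE but differ in how the error is distributed:

mae  <- function(e) mean(abs(e))
rmse <- function(e) sqrt(mean(e^2))

errors_even  <- c(2, 2, 2, 2)   # four moderate errors
errors_spiky <- c(0, 0, 0, 8)   # one large error

mae(errors_even);  rmse(errors_even)    # 2 and 2
mae(errors_spiky); rmse(errors_spiky)   # 2 and 4 -- RMSE penalises the single large error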

The subsequent slide shows a comparison between RMSE and MAE.

Comparing RMSE and MAE

So, while we will be calculating both MAE and RMSE, priority will be
given to RMSE as the guiding parameter.

6.
TBATS

TBATS Model Fit (2013-2016) and Forecast (2017)

TBATS Forecast 2017 (looking at 3 month Windows)

Above: TBATS is able to keep up with Trend (Yellow) and Yearly
Seasonality (Black)
Below: Though it tracks the trend (Green), it is unable to keep up
with weekly seasonality and daily peaks

TBATS Conclusion:

• MAE = 20.00773
• RMSE = 25.88035
• As we can see from the charts, the Model is able to follow the
trend of the data and navigate its yearly seasonality; however, it
is unable to handle the weekly seasonality.
• It is also unable to get close to the daily peaks and troughs.

7.
NNAR

8.
Prophet

Characteristics of the Data
Identified by Prophet while Fitting the Model
Prophet Model Fit (2013-2016) and Forecast (2017)

Prophet Forecast 2017 (looking at 3 month Windows)

Above: Prophet is able to keep up with Trend (Yellow) and Yearly
Seasonality (Green)
Below: It is able to model weekly seasonality, and gets close to daily
peaks

Prophet Conclusion:

• MAE = 13.52374
• RMSE = 17.01999
• As we can see from the charts, the Model is able to follow
the trend of the data, and also navigate the yearly
seasonality of the data. It is also able to handle weekly
seasonality.
• However, though it gets close to daily peaks and troughs, it
overshoots/undershoots them multiple times.
• When presented with data that has lower daily variance, the fit
should be much tighter.

9.
H2O AutoML

H2O AutoML Leaderboard, and best model
H2O AutoML Leader Model Forecast (2017)

H2O AutoML Leader Model Forecast 2017
(looking at 3 month Windows)

Right: Model is able to keep up with Trend and Yearly Seasonality (Yellow)

Bottom: It is able to model weekly seasonality, and gets close to daily peaks

H2O AutoML Leader Model Conclusion:

• MAE = 13.85142
• RMSE = 18.09316
• As we can see from the charts, the Model is able to follow
the trend of the data, and also navigate the yearly
seasonality of the data. It is also able to handle weekly
seasonality.
• However, though it gets close to daily peaks and troughs, it
overshoots/undershoots them multiple times.
• When presented with data that has lower daily variance, the fit
should be much tighter.

10.
LSTM Model

LSTM Model Forecast (2017)

LSTM Model Forecast 2017
(looking at 3 month Windows)

Right: Model is able to keep up with Trend and Yearly Seasonality (Yellow)

Bottom: It is able to model weekly seasonality, and gets close to daily peaks

LSTM Model Conclusion:

• MAE = 16.7863
• RMSE = 21.65362
• As we can see from the charts, the Model is able to follow
the trend of the data, and also navigate the yearly
seasonality of the data. It is also able to handle weekly
seasonality.
• It overshoots/undershoots the daily peaks and troughs
multiple times.
• When presented with data that has lower daily variance, the fit
should be much tighter.

11.
Comparing Models

Comparing Models:

Model         MAE        RMSE
TBATS         20.00773   25.88035
Prophet       13.52374   17.01999
H2O AutoML    13.85142   18.09316
LSTM          16.7863    21.65362

1. Prophet
2. H2O AutoML
We will be going forward with
these two Tools.

12.
Prophet v/s H2O AutoML
Final Comparison

Modifying the Data to Reflect our business case:

Index  Date      Product        Sub Product  Volume  STP      TSU
1      01-01-13  INWARD_SWIFT   NA           179     STP      Hyderabad
2      02-01-13  INWARD_SWIFT   NA           191     STP      Hyderabad
3      03-01-13  INWARD_SWIFT   NA           209     STP      Hyderabad
4      04-01-13  INWARD_SWIFT   NA           217     STP      Hyderabad
5      07-01-13  INWARD_SWIFT   NA           180     STP      Hyderabad
6      08-01-13  INWARD_SWIFT   NA           174     STP      Hyderabad
7      09-01-13  INWARD_SWIFT   NA           174     STP      Hyderabad
8      10-01-13  INWARD_SWIFT   NA           187     STP      Hyderabad
9      11-01-13  INWARD_SWIFT   NA           198     STP      Hyderabad
10     14-01-13  INWARD_SWIFT   NA           180     STP      Hyderabad
1306   01-01-13  INWARD_SWIFT   NA           90      NON-STP  Hyderabad
1307   02-01-13  INWARD_SWIFT   NA           96      NON-STP  Hyderabad
1308   03-01-13  INWARD_SWIFT   NA           105     NON-STP  Hyderabad
1309   04-01-13  INWARD_SWIFT   NA           109     NON-STP  Hyderabad
1310   07-01-13  INWARD_SWIFT   NA           90      NON-STP  Hyderabad
1311   08-01-13  INWARD_SWIFT   NA           87      NON-STP  Hyderabad
1312   09-01-13  INWARD_SWIFT   NA           87      NON-STP  Hyderabad
1313   10-01-13  INWARD_SWIFT   NA           94      NON-STP  Hyderabad
1314   11-01-13  INWARD_SWIFT   NA           99      NON-STP  Hyderabad
1315   14-01-13  INWARD_SWIFT   NA           90      NON-STP  Hyderabad

Characteristics of the Data
Identified by Prophet while Fitting the Model
Prophet Model Fit (2013-2016) and Forecast (2017)

Prophet Forecast 2017 (looking at 3 month Windows)

Above: Prophet is able to keep up with Trend (Yellow) and Yearly
Seasonality (Green)
Below: It is able to model weekly seasonality, and gets very close to
daily peaks

Prophet Conclusion:

• MAE = 18.41301
• RMSE = 23.65726
• As we can see from the charts, the Model is able to follow
the trend of the data, and also navigate the yearly
seasonality of the data. It is also able to handle weekly
seasonality.
• It gets close to peaks and troughs most of the time.

H2O AutoML Leaderboard, and best model
H2O AutoML Leader Model Forecast (2017)

H2O AutoML Leader Model Forecast 2017
(looking at 3 month Windows)

Right: Model is able to keep up with Trend and Yearly Seasonality (Yellow)

Bottom: It is able to model weekly seasonality, and gets close to daily peaks

H2O AutoML Leader Model Conclusion:

• MAE = 19.50624
• RMSE = 24.99153
• As we can see from the charts, the Model is able to follow
the trend of the data, and also navigate the yearly
seasonality of the data. It is also able to handle weekly
seasonality.
• It gets close to the daily peaks and troughs most of the time

2017 Forecast Prophet v/s H2O AutoML

Comparing Models:

Model         MAE        RMSE
Prophet       18.41301   23.65726
H2O AutoML    19.50624   24.99153

▪ As we can see from the table above, Prophet and H2O AutoML
perform very closely on the comparison parameters, but Prophet
has a slight edge. The charts show a similar picture, with both the
Prophet and H2O models tracking very close to the original data.
▪ Prophet's better performance on the comparison parameters,
together with its feature set, makes it the best solution for
forecasting among the options considered.

13.
Implementation
A look at how it will work

A look at how it will work

1. Dataset
A dataset of the transaction volumes for the past few years. Every day,
or at periodic intervals, the dataset will be updated with the latest data
of the previous day or the previous few days respectively.

2. Prophet Module
The module, along with the Prophet package, will be written in either R
or Python. When called on, the module will take in the dataset and fit a
model onto it. It will then provide a forecast for the period required by
the user or as mentioned in the UI.

3. UI
From his/her dashboard, the user can select the product, the unit, and
the period for which he/she wants the forecast. Based on this selection,
the dataset will be filtered and fed to the module, which will then give
the output for the requested period.

UI Mock Up
What is happening behind the scenes:

1. The Module takes your inputs to filter the primary dataset and to
create the input dataset.

2. It then creates a Prophet model frame with the necessary parameters
and fits the model onto the input dataset (see the sketch below).
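A minimal sketch of these two steps in R, assuming the primary dataset is a data
frame volumes with the columns shown in the sample data (Date, Product, TSU,
Volume), and that the UI passes the user's selections as sel_product and sel_unit
(hypothetical names):

library(dplyr)
library(prophet)

input_df <- volumes %>%
  filter(Product == sel_product, TSU == sel_unit) %>%   # step 1: filter on the user's inputs
  transmute(ds = as.Date(Date, format = "%d-%m-%y"),    # Prophet expects columns ds and y
            y  = Volume)

m <- prophet(input_df)                                  # step 2: create and fit the Prophet model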

What is happening behind the scenes:

3. Then, it uses the model to create a forecast for the period requested
by the user:

periods = 1, for the next day's forecast
periods = 7, for the next week's forecast

4. The prediction will be under yhat in prophet_forecast.

n = 1 will return the last value
n = 7 will return the last seven values

Depending on the periods for which the forecast was run, the tail
function will be used to retrieve the corresponding values, as sketched below.
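Continuing the sketch from the previous steps with the fitted Prophet model m,
and assuming periods comes from the user's selection in the UI:

periods <- 7                                      # 1 = next day, 7 = next week

future           <- make_future_dataframe(m, periods = periods)
prophet_forecast <- predict(m, future)

# The prediction is under yhat; tail() retrieves the newly forecast rows
tail(prophet_forecast$yhat, n = periods)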
Thank You!
- Nallamilli Sandeep Reddy

Project: Application of Predictive Analytics in Volume Forecasting
and Resource Planning

Mentor: Nageshwara Rao

Buddy: G Mahesh

Appendix: A Brief look at the model metrics

