Rose Wine Analysis

This document provides a comprehensive analysis of rose wine sales data from 1980 to 1995, focusing on time series forecasting. It includes exploratory data analysis, model building using various techniques such as linear regression and ARIMA, and evaluation of model performance using RMSE. The findings highlight a declining trend in sales over the years and suggest measures for future sales improvement.

Time Series Forecasting - Rose Wine Analysis

Swathi Anirudh
Table of Contents
Rose Wine Analysis
1. Executive Summary
2. Introduction
3. Data Dictionary
4. Data Description
5. Sample of the dataset
6. Read the data as an appropriate Time Series data and plot the data
7. Time Stamp created from ‘YearMonth’ column
7.1. Resulting dataset after removing the “Year-Month” column and appending Time_Stamp column
8. Renaming the columns of the data frame
9. Checking null values in the dataset
10. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition
11. Descriptive Summary of the Dataset
12. Exploratory Analysis
12.1. Yearly Plot
12.2. Monthly Plot
12.3. Annual Sales
12.4. Quarterly Sales
12.5. Monthly Sales across Different Years
12.6. Empirical Cumulative Distribution Plot
12.7. Monthly Time Series Plot
12.8. Average Wine sales per month & change percentage over each month
12.9. Decomposition of Time Series
12.10. Additive Decomposition
12.11. Multiplicative Decomposition
12.12. Split the data into training and test. The test data should start in 1991
12.13. Build all the exponential smoothing models on the training data and evaluate the model using RMSE on the test data. Other models such as regression, naïve forecast models and simple average models should also be built on the training data and check the performance on the test data using RMSE
12.14. Model 1 – Linear Regression
12.14.1. Linear Regression: Model Evaluation
12.15. Model 2 – Naïve Forecast
12.15.1. Naïve Forecast: Model Evaluation
12.16. Model 3 – Simple Average
12.16.1. Simple Average: Model Evaluation
12.17. Model 4 – Moving Average (MA)
12.18. Moving Average: Model Evaluation
12.19. Model 5 – Simple Exponential Smoothing
12.19.1. Simple Exponential Smoothing: Model Evaluation
12.20. Model 6 – Double Exponential Smoothing (Holt's Model)
12.21. Double Exponential Smoothing: Model Evaluation
12.22. Model 7 – Triple Exponential Smoothing (Holt-Winters Model)
12.23. Triple Exponential Smoothing: Model Evaluation
12.24. Check for the stationarity of the data on which the model is being built on using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05
12.24.1. Checking for Stationarity of Entire Data
12.25. Checking for Stationarity of Training Data
12.26. Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE
12.26.1. Model 8 – Auto-Regressive Integrated Moving Average (ARIMA)
12.27. Automated ARIMA: Model Evaluation
12.28. Model 9 – Seasonal Auto-Regressive Integrated Moving Average (SARIMA)
12.29. Automated SARIMA: Model Evaluation
12.30. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE
12.30.1. Model 10 – Auto-Regressive Integrated Moving Average (ARIMA) – Manual
12.30.2. ACF Plot – Training Data
12.30.3. PACF Plot – Training Data
12.31. Manual ARIMA: Model Evaluation
12.32. Model 11 – Seasonal Auto-Regressive Integrated Moving Average (SARIMA) – Manual
12.32.1. ACF Plot – Seasonally differenced (F=12) Training Data
12.32.2. PACF Plot – Seasonally differenced (F=12) Training Data
12.33. Manual SARIMA: Model Evaluation
12.34. Build a table (create a data frame) with all the models built along with their corresponding parameters and the respective RMSE values on the test data
12.35. Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands
12.36. Optimum Model 1
12.37. Optimum Model 2
Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales
Model Insights
Historical Insights
Forecast Insights
Recommendations

Rose Wine Analysis
1. Executive Summary
ABC Estate Wines, a wine-producing firm, has provided data on its wine sales from the 20th
century. This data is to be examined and used to forecast wine sales.

2. Introduction
The purpose of this report is to explore the dataset through exploratory data analysis, using
measures of central tendency and other summary statistics. The data consists of sales of Rose
wine from the 20th century.

3. Data Dictionary

Variable Name    Description
YearMonth        Year and month in which the sales were recorded
Rose             Number of wine units sold

4. Data Description
• YearMonth: Datetime variable from 1980-01 to 1995-07
• Rose: Continuous, from 28 to 267

5. Sample of the dataset

• Table 1. Sample of first 5 rows of the dataset

• Table 2. Sample of last 5 rows of the dataset

• The dataset has 2 columns: one records the year and month of each observation, and the
other records the number of units sold in that year and month.

6. Read the data as an appropriate Time Series data and plot the data.
• Let us check the types of variables in the data frame and check for missing values in
the dataset.

• Fig.2 Details of the dataset columns

• The dataset has 2 variables and 187 rows in total. The "YearMonth" column can be
dropped once a suitable time stamp column has been created, because it is not needed
for modelling. The Rose column is of float type. The output above also shows that the
Rose column has some missing values; because this is a time series, they must be
imputed rather than dropped.

7. Time Stamp created from ‘YearMonth’ column

 Fig.3 Details of the dataset columns

7.1. Resulting dataset after removing the “Year-Month” column and appending Time_Stamp column

 Fig.4 Details of the dataset columns

Time_Stamp column has been set as index of the dataset and column Rose has been renamed as
Rose_Wine_Sales.
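
The time stamp step described above can be sketched in Python with pandas. This is a minimal illustration rather than the report's actual code: the three sample rows are made-up placeholder values, and it assumes 'YearMonth' is stored as text in the form YYYY-MM, as the data description suggests.

```python
import pandas as pd

# Placeholder rows in the same shape as the dataset (values are illustrative only).
df = pd.DataFrame({
    "YearMonth": ["1980-01", "1980-02", "1980-03"],
    "Rose": [100.0, 110.0, 120.0],
})

# Parse 'YearMonth' into a timestamp for the first day of each month.
df["Time_Stamp"] = pd.to_datetime(df["YearMonth"], format="%Y-%m")

# Drop the original column, index the frame by time, and rename the sales column.
df = (
    df.drop(columns="YearMonth")
      .set_index("Time_Stamp")
      .rename(columns={"Rose": "Rose_Wine_Sales"})
)
print(df)
```

Indexing by a datetime column is what later allows the data to be sliced by year and resampled by month or quarter.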

8. Renaming the columns of the data frame
The following column of the data frame has been renamed as shown.

Original Column Name    Renamed Column Name
Rose                    Rose_Wine_Sales

Fig.5 Details of the dataset columns after renaming

9. Checking null values in the dataset

Fig.6 Null values in the dataset

As can be seen from the above figure, there are 2 null values in the dataset.
Because this is a time series, these rows cannot simply be removed; the values must be imputed.

Fig.7 Graph plot of the Rose wine sales dataset

Observation:

• The dataset contains sales information from January 1980 to July 1995.
• The plot shows that sales have gradually declined over the years. The data also exhibit
seasonality.
• There are 2 missing values, which must be imputed.

10. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition.
• Handling Missing Values

Fig.8 Imputed values of the dataset

As can be seen from Fig.6, values are missing for July and August 1994. Because this is a time
series, the missing values cannot be removed, so we have imputed them using linear interpolation.
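
As a rough sketch of what linear interpolation does here, the example below fills two consecutive gaps in a short monthly series with pandas. The surrounding sales values are placeholders chosen for easy arithmetic, not the actual 1994 figures.

```python
import numpy as np
import pandas as pd

# Five months around the gap; the known values are placeholders, not real data.
idx = pd.date_range("1994-05-01", periods=5, freq="MS")
sales = pd.Series([40.0, 43.0, np.nan, np.nan, 46.0], index=idx, name="Rose_Wine_Sales")

# Linear interpolation places the missing points on a straight line
# between the nearest known neighbours (here: June and September).
filled = sales.interpolate(method="linear")
print(filled)
```

Each missing month moves one equal step from the June value towards the September value, so the imputed points preserve the local level of the series without introducing a jump.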

Fig.9 Null values after imputation

11. Descriptive Summary of the Dataset

Fig.10 Descriptive Summary of Rose_Wine_Sales column

Observation:

• On average, about 90 units of Rose wine are sold each month.
• About half of the monthly sales figures lie between 62 and 111 units.
• The lowest monthly sales figure is 28 units, and the highest is 267 units.

12. Exploratory Analysis


 Let us analyze the wine sales across different years and months using boxplots

12.1. Yearly Plot

Fig.11 Yearly plot of Rose wine sales

Observation:

• The figure above shows that sales of Rose wine have been declining over time.
• Median sales peaked in 1980 and 1981 and have been at their lowest levels after 1992.
• The box plots also show some outliers.

12.2. Monthly Plot

Fig.11 Monthly plot of Rose wine sales

Observation:

• In contrast to the decline seen across years in the yearly plot, sales within a year
rise towards the end of the year.
• January has the lowest sales and December the highest. Sales grow modestly from
January to August and then climb sharply.
• The box plots also show some outliers.

12.3. Annual Sales

Fig.12 Line plot – Annual sales

12.4. Quarterly Sales

Fig.13 Line plot – Quarterly sales

12.5. Monthly Sales across Different Years

Fig.14 Line plot – Monthly sales across different years

12.6. Empirical Cumulative Distribution Plot

Fig.15 Line plot – Empirical cumulative distribution function

12.7. Monthly Time Series Plot

Fig.16 Line plot – Monthly time series

Observation:

• Sales fell drastically after 1981. Sales are typically lowest in the first quarter and
highest in the fourth quarter.
• Every year, December has the highest sales, followed by November and October.
January has the lowest sales.
• The cumulative distribution plot shows that monthly sales are below 100 units about 70
to 75 percent of the time and below 150 units about 90% of the time. Only about 15% of
monthly sales figures are below 50 units, so the bulk of monthly sales fall in the
range of 50 to 100 units.

12.8. Average Wine sales per month & change percentage over each month

Fig.17 Line plot – Average and % Change over each month

Observation:

• The average sales and % change plots show a declining trend and seasonality. The
seasonality in the percentage change also appears to be consistent across all the years.

12.9. Decomposition of Time Series

12.10. Additive Decomposition

Fig.18 Additive decomposition of time series

Fig.19 Additive Decomposition - Sample of Trend, Seasonality & Residual values

12.11. Multiplicative Decomposition

Fig.20.1 Multiplicative decomposition of time series

Fig.20.2 Multiplicative Decomposition - Sample of Trend, Seasonality & Residual values

Observation:

• Both decompositions show that the time series has a falling trend and is seasonal.
• After additive decomposition, the residuals still show a seasonal pattern and vary
substantially.
• After multiplicative decomposition, the seasonal fluctuation in the residuals is much
smaller.
• The size of the seasonal component is similar in both decompositions, but the residuals
are much tighter under the multiplicative decomposition. Because the additive residuals
are not independent of seasonality, we treat the series as multiplicative.
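
For reference, the two decompositions compared above differ only in how the components are combined. Writing the observed series as $Y_t$, with trend $T_t$, seasonal component $S_t$ and residual $R_t$ (symbols chosen here for illustration), the standard forms are:

```latex
\begin{aligned}
\text{Additive:} \quad & Y_t = T_t + S_t + R_t \\
\text{Multiplicative:} \quad & Y_t = T_t \times S_t \times R_t
\end{aligned}
```

In the multiplicative form the seasonal swing scales with the level of the series, which is a natural fit when seasonal peaks shrink as overall sales decline.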

12.12. Split the data into training and test. The test data should start in 1991.
The provided dataset is split into train and test sets. The training data contains sales up to
the end of 1990, and the test data contains sales from 1991 through 1995.
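
A minimal sketch of such a split with pandas is shown below. The series is a stand-in with placeholder values over the same date range as the dataset (January 1980 to July 1995); only the date logic matters here.

```python
import numpy as np
import pandas as pd

# Stand-in monthly series over the dataset's date range (values are placeholders).
idx = pd.date_range("1980-01-01", "1995-07-01", freq="MS")
sales = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="Rose_Wine_Sales")

# Everything before 1991 is used for training; 1991 onwards is held out for testing.
train = sales[sales.index.year < 1991]
test = sales[sales.index.year >= 1991]

print(len(train), train.index[-1].date())
print(len(test), test.index[0].date())
```

With 187 monthly rows, a split at January 1991 leaves 11 full years (132 months) for training and the remaining 55 months for testing. Splitting by date rather than at random keeps the test period strictly after the training period, which is what a forecasting evaluation needs.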

Fig.21.1 First and last few rows of Train data

Fig.21.2 First and last few rows of Test data

Fig.22 Count summary on train and test data

Fig.23 Line Plot – Splitting of time series into Train & Test data

12.13. Build all the exponential smoothing models on the training data and evaluate the model using RMSE on the test data. Other models such as regression, naïve forecast models and simple average models should also be built on the training data and check the performance on the test data using RMSE.

12.14. Model 1 – Linear Regression


For this linear regression, we regress the 'Rose_Wine_Sales' variable against the order of
occurrence, that is, the position of each observation in time.

The Linear Regression model below is built using default parameters.
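
The idea of regressing sales on time order, and of scoring the result with RMSE, can be sketched with NumPy as follows. The sales values and the helper name `rmse` are illustrative assumptions, and `np.polyfit` stands in for whichever regression routine the report actually used.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Placeholder sales values (not taken from the dataset).
train_sales = np.array([120.0, 115.0, 112.0, 104.0, 101.0, 95.0])
test_sales = np.array([92.0, 85.0, 83.0])

# Time order: 1..n for the training data, continuing without a gap for the test data.
train_time = np.arange(1, len(train_sales) + 1)
test_time = np.arange(len(train_sales) + 1, len(train_sales) + len(test_sales) + 1)

# Fit sales = intercept + slope * time on the training data only.
slope, intercept = np.polyfit(train_time, train_sales, deg=1)

# Extend the fitted line over the test period and score it.
test_pred = intercept + slope * test_time
test_rmse = rmse(test_sales, test_pred)
print(round(slope, 3), round(test_rmse, 3))
```

Because the only predictor is the time index, the fitted model is a straight line: it can follow a steady decline but has no way to represent a repeating seasonal pattern.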

Fig.24 Rose Wine – Linear regression model

Fig.25 Linear regression on Test data

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The linear regression model captures the trend in both the train and test data, but it
is unable to account for seasonality.
• The root mean squared error (RMSE) of the linear regression model on the test data is
15.269.

12.14.1. Linear Regression: Model Evaluation

Performance Metric    Value
Test RMSE             15.268887

12.15. Model 2 – Naïve Forecast


In the naïve model, the forecast for the next period is the last observed value. The forecast
for the period after that equals the forecast for the next period, which is again the last
observed value. Every forecast over the test period is therefore equal to the last value in the
training data.
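
A minimal sketch of the naïve forecast and its RMSE, using placeholder values rather than the actual sales figures:

```python
import numpy as np

# Placeholder values for illustration only (not taken from the dataset).
train_sales = np.array([110.0, 95.0, 132.0])
test_sales = np.array([60.0, 70.0, 80.0])

# Naive forecast: repeat the last training observation over the whole test horizon.
naive_pred = np.full(len(test_sales), train_sales[-1])

# Root mean squared error of the forecast on the test period.
naive_rmse = float(np.sqrt(np.mean((test_sales - naive_pred) ** 2)))
print(naive_pred, round(naive_rmse, 3))
```

The forecast is a flat line at the final training value, so its error depends heavily on where the training data happens to end. If the last training month is a seasonal peak, every test forecast inherits that high level.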

Fig.26 Naïve forecast on Test data

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The naïve forecast model cannot capture the seasonality or the trend of the time
series.
• The root mean squared error (RMSE) of the naïve forecast model is 79.719, which is
significantly higher than that of the regression model.

12.15.1. Naïve Forecast: Model Evaluation

Performance Metric    Value
Test RMSE             79.718576

12.16. Model 3 – Simple Average


For this particular simple average method, we will forecast by using the average of the
training values.
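
The simple average forecast can be sketched in the same way: every test period receives the mean of the training values. The numbers below are placeholders, not the actual sales figures.

```python
import numpy as np

# Placeholder values for illustration only (not taken from the dataset).
train_sales = np.array([110.0, 95.0, 131.0])
test_sales = np.array([60.0, 70.0, 80.0])

# Simple average forecast: the training mean, repeated over the test horizon.
mean_pred = np.full(len(test_sales), train_sales.mean())

# Root mean squared error of the forecast on the test period.
mean_rmse = float(np.sqrt(np.mean((test_sales - mean_pred) ** 2)))
print(mean_pred, round(mean_rmse, 3))
```

Like the naïve forecast, this is a flat line, but it sits at the average level of the training data instead of at its final value.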

Fig.27 Rose Wine – Simple Average model

Fig.28 Simple Average model predictions on Test data

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The simple average model cannot capture the seasonality or the trend of the time
series.
• The root mean squared error (RMSE) of the simple average model is 53.46, which is
significantly higher than that of the regression model but lower than that of the naïve
forecast model.

12.16.1. Simple Average: Model Evaluation


Performance Metric    Value
Test RMSE             53.460367

12.17. Model 4 – Moving Average (MA)


For the moving average model, we calculate rolling means (moving averages) over windows of
different lengths. The best window length is the one that gives the minimum error (that is,
the maximum accuracy).
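
Trailing moving averages of this kind can be sketched with pandas `rolling`. The series values and the column names `Trailing_2` and `Trailing_4` are illustrative assumptions.

```python
import pandas as pd

# Placeholder series chosen so the averages are easy to check by hand.
sales = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0], name="Rose_Wine_Sales")

trailing = pd.DataFrame({"Rose_Wine_Sales": sales})
for window in (2, 4):
    # Mean of the current value and the (window - 1) values before it.
    trailing[f"Trailing_{window}"] = sales.rolling(window).mean()

print(trailing)
```

The first `window - 1` rows are empty because a full window is not yet available. Note that a trailing mean computed this way includes the current observation, so when it is evaluated against the test data it acts as a smoothed version of the actual values; this is worth keeping in mind when comparing its RMSE with models that forecast the whole test period from the training data alone.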

Fig.29 Rose Wine – Sample of Trailing Moving Averages

Fig.30 Moving Average on Entire data

Fig.31 Individual visualization of moving averages on entire data

Fig.32 Moving averages forecast on test data

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• Moving average models can follow both the trend and the seasonality of the time
series.
• The series becomes smoother as the number of points in the window increases. The
2-point trailing moving average (TMA) tracks the test data more closely than the
9-point TMA.
• The root mean squared error (RMSE) of the 2-point trailing moving average model is
11.529, which is the lowest of all the models built so far.

12.18. Moving Average: Model Evaluation


Model                              Test RMSE
2 Point Trailing Moving Average    11.529278
4 Point Trailing Moving Average    14.451376
6 Point Trailing Moving Average    14.566262
9 Point Trailing Moving Average    14.727596

• Before investigating exponential smoothing methods, let's compare the predictions of
the models we have constructed so far.

Fig.33 Comparison of different models on test data (Regression, Naïve, Simple Average and Moving Average)

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The simple average and naïve forecast models fail to adequately describe the
characteristics of the test data.
• Linear regression captures the trend of the series but misses the seasonality.
• The moving average models account for both trend and seasonality.

12.19. Model 5 – Simple Exponential Smoothing


The simplest of the exponential smoothing methods is simple exponential smoothing (SES).
This method is suitable for forecasting data with no clear trend or seasonal pattern.

In single exponential smoothing, the forecast at time (t + 1) is given by (Winters, 1960):

$$F_{t+1} = \alpha Y_t + (1 - \alpha) F_t$$

The parameter α is called the smoothing constant, and its value lies between 0 and 1. Because
the model uses only one smoothing constant, it is called Single Exponential Smoothing.
The Simple Exponential Smoothing model below is built using optimized parameters.
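
The SES recursion can be written out directly in a few lines of Python. This is a hand-rolled sketch for illustration, not the fitted model from the report; the function name `ses_forecasts` and the choice to initialise the first forecast with the first observation are assumptions.

```python
def ses_forecasts(y, alpha):
    """One-step-ahead SES forecasts using F[t+1] = alpha * Y[t] + (1 - alpha) * F[t].

    Returns a list of len(y) + 1 values: entry t is the forecast for y[t],
    and the final entry is the forecast for the first unseen period.
    Assumption: the first forecast is initialised to the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    forecasts = [float(y[0])]
    for observation in y:
        forecasts.append(alpha * observation + (1 - alpha) * forecasts[-1])
    return forecasts


# Placeholder observations, not taken from the dataset.
print(ses_forecasts([100.0, 80.0, 90.0], alpha=0.5))
```

Beyond the end of the data there are no new observations to smooth, so every future forecast equals the last entry of the list. This flat forecast is why SES cannot follow a trend or a seasonal pattern.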

Fig.34 Rose Wine – Simple Exponential Smoothing Model

Fig.35 Sample of SES predictions

Fig.36 Rose Wine - SES predictions on Test data

The higher the alpha value, the more weight is given to the most recent observations, which
amounts to assuming that recent behaviour will persist. A loop over different alpha values is
run to find which value works best on the test set.

Alpha is varied from 0.1 to 0.95, and the RMSE on the train and test data is calculated for
each value to compare performance.

Fig.37 SES prediction metrics for different alpha values

Fig.38 SES forecast for different Alpha values

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• Simple exponential smoothing is typically used when the time series has neither a trend
nor a seasonal component. For this reason, it is unable to capture the characteristics of
this time series.
• The root mean squared error (RMSE) of the simple exponential smoothing model is 36.796
with Alpha=0.0987 and 36.828 with Alpha=0.1.
• The Simple Exponential Smoothing model with Alpha=0.0987 is taken as the better of the
two because it has the lower test RMSE.

12.19.1. Simple Exponential Smoothing: Model Evaluation

Model                   Test RMSE
SES (Alpha = 0.0987)    36.796036
SES (Alpha = 0.1)       36.827827

12.20. Model 6 – Double Exponential Smoothing (Holt's Model)
This model is an extension of SES, known as the Double Exponential Smoothing model, which
estimates two smoothing parameters. It is applicable when the data has a trend but no
seasonality. Two separate components are considered: level and trend. The level is the local
mean. One smoothing parameter, α, corresponds to the level series, and a second smoothing
parameter, β, corresponds to the trend series.
Double Exponential Smoothing uses two equations to forecast future values of the time series:
one for the short-term average value (the level) and the other for the trend.

The intercept or level equation is given by:

$$L_t = \alpha Y_t + (1 - \alpha) F_t$$

The trend equation is given by:

$$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$$

Here, α and β are the smoothing constants for level and trend, respectively, with
0 < α < 1 and 0 < β < 1.

The forecast at time t + 1 is given by

$$F_{t+1} = L_t + T_t$$

and, more generally, the forecast n periods ahead is

$$F_{t+n} = L_t + n T_t$$

The Double Exponential Smoothing model below is built using optimized parameters.
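
The level and trend equations above translate almost line by line into code. The sketch below is a hand-rolled illustration, not the fitted model from the report; the function name `holt_forecast` and the initial values (level set to the first observation, trend set to the first difference) are assumptions.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's double exponential smoothing; returns forecasts for 1..horizon steps ahead.

    Assumptions: the level starts at y[0] and the trend at y[1] - y[0].
    """
    if len(y) < 2:
        raise ValueError("need at least two observations to initialise the trend")
    level, trend = float(y[0]), float(y[1] - y[0])
    for observation in y[1:]:
        forecast = level + trend                                   # F_t = L_{t-1} + T_{t-1}
        new_level = alpha * observation + (1 - alpha) * forecast   # L_t
        trend = beta * (new_level - level) + (1 - beta) * trend    # T_t
        level = new_level
    # F_{t+n} = L_t + n * T_t
    return [level + n * trend for n in range(1, horizon + 1)]


# Placeholder observations that fall by 5 units per period.
print(holt_forecast([100.0, 95.0, 90.0, 85.0], alpha=0.5, beta=0.5, horizon=3))
```

On a perfectly linear series the one-step forecasts are exact, so the level and trend never need correcting and the forecast simply continues the line. With real data, α and β control how quickly the level and trend react to forecast errors.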

Fig.39 Rose Wine – Double Exponential Smoothing Model

Fig.40 Sample of DES predictions

Fig.41 Rose Wine - DES predictions on Test data

The higher the alpha value, the more weight is given to the most recent observations, which
amounts to assuming that recent behaviour will persist. A loop over different alpha and beta
values is run to find which combination works best on the test set.

Alpha is varied from 0.05 to 1.0, and the RMSE on the train and test data is calculated for
each combination to compare performance.

Fig.42 DES prediction metrics for different alpha, beta values

Fig.43 DES forecast for different Alpha, Beta values

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The double exponential smoothing model performs well when the time series has a trend
but no seasonality. For this reason, it captures only the trend of the data and does not
account for seasonality.
• The root mean squared error (RMSE) of the double exponential smoothing model is 15.269
with Alpha=1.49e-08, Beta=7.389e-09 and 16.329 with Alpha=0.05, Beta=0.35 (auto-tuned
model).
• The Double Exponential Smoothing model with Alpha=1.49e-08, Beta=7.389e-09 is taken as
the better of the two because it has the lower test RMSE. With both parameters this close
to zero, the level and trend are barely updated, so the forecast is effectively a straight
line; this is why its RMSE is almost identical to that of the linear regression model
(15.268887).
• Compared to the simple exponential smoothing model, the double exponential smoothing
model has more than halved the test RMSE (from 36.796 to 15.269).

12.21. Double Exponential Smoothing: Model Evaluation

Model                                   Test RMSE
DES (Alpha=1.49e-08, Beta=7.389e-09)    15.268889
DES (Alpha=0.05, Beta=0.35)             16.328994

12.22. Model 7 – Triple Exponential Smoothing (Holt-Winters Model)
This model is an extension of DES, known as the Triple Exponential Smoothing model, which
estimates three smoothing parameters. It is applicable when the data has both trend and
seasonality. Three separate components are considered: level, trend and seasonality.

One smoothing parameter α corresponds to the level series.

A second smoothing parameter β corresponds to the trend series.

A third smoothing parameter γ corresponds to the seasonality series,

where 0 < α < 1, 0 < β < 1 and 0 < γ < 1.
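
The report does not write out the update equations for this model. For reference, one common additive formulation is sketched below, using the same level and trend notation as Holt's model, with $S_t$ for the seasonal component and $m$ for the seasonal period (12 for monthly data). Whether the fitted model uses additive or multiplicative seasonality is not stated in this section, so the additive form is an assumption made for illustration.

```latex
\begin{aligned}
L_t &= \alpha (Y_t - S_{t-m}) + (1 - \alpha)(L_{t-1} + T_{t-1}) \\
T_t &= \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1} \\
S_t &= \gamma (Y_t - L_t) + (1 - \gamma) S_{t-m} \\
F_{t+n} &= L_t + n T_t + S_{t+n-m} \qquad (1 \le n \le m)
\end{aligned}
```

The level equation removes last year's seasonal effect before smoothing, the trend equation is unchanged from Holt's model, and the seasonal equation updates each month's effect from the gap between the observation and the current level.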

The Triple Exponential Smoothing model below is built using optimized parameters.

Fig.44 Rose Wine – Triple Exponential Smoothing Model

Fig.45 Sample of TES predictions

Fig.46 Rose Wine - TES predictions on Test data

The higher the alpha value, the more weight is given to the most recent observations, which
amounts to assuming that recent behaviour will persist. A loop over different alpha, beta and
gamma values is run to find which combination works best on the test set.

Alpha is varied from 0.1 to 1.0, and the RMSE on the train and test data is calculated for
each combination to compare performance.

Fig.47 TES prediction metrics for different alpha, beta and gamma values

Fig.48 TES forecast for automated model parameters

Fig.49 TES forecast for different model parameters

Observation:

• We can see from the graphs above that the time series has a falling trend and is
seasonal.
• The triple exponential smoothing model works well when the time series has both trend
and seasonality. For this reason, it is able to capture both the trend and the seasonal
characteristics and nearly matches the actual test data plot.
• The root mean squared error (RMSE) of the triple exponential smoothing model is 21.155
with Alpha=0.064, Beta=0.053, Gamma=0.0 and 9.122 with Alpha=0.2, Beta=0.85,
Gamma=0.15 (auto-tuned model).
• The Triple Exponential Smoothing model with Alpha=0.2, Beta=0.85, Gamma=0.15 is taken
as the better of the two because it has the lower test RMSE.
• Compared to the double exponential smoothing model, the triple exponential smoothing
model has reduced the test RMSE by about 40% (from 15.269 to 9.122).

12.23. Triple Exponential Smoothing: Model Evaluation

Model                                       Test RMSE
TES (Alpha=0.064, Beta=0.053, Gamma=0.0)    21.154527
TES (Alpha=0.2, Beta=0.85, Gamma=0.15)      9.121757

Let's compare the RMSE values of the exponential smoothing models we have constructed so far
and plot the best model of each type.

Fig.50 Comparison of Test RMSE values of different exponential smoothing models

Fig.51 Comparison of different models on test data (SES, DES and TES)

Observation:

 We can see from the graphs above that the time series has a falling trend and is
seasonal.
 Simple exponential smoothing is suited to time series that have neither a trend nor a
seasonal component. This is why it is unable to capture the features of this time
series.
 The double exponential smoothing model works effectively when the time series contains
a trend but no seasonality. This explains why it captures only the trend features of the
data and ignores the seasonality.
 The triple exponential smoothing model performs effectively when the time series exhibits
both trend and seasonality. This is why it captures both the trend and the seasonal aspects
and closely follows the test data plot.
 The triple exponential smoothing model is the best model we have built so far, as it has
the lowest RMSE value.

12.24. Check for the stationarity of the data on
which the model is being built using
appropriate statistical tests and also mention
the hypothesis for the statistical test. If the data
is found to be non-stationary, take appropriate
steps to make it stationary. Check the new data
for stationarity and comment. Note: Stationarity
should be checked at alpha = 0.05.

12.24.1. Checking for Stationarity of Entire Data

The Augmented Dickey-Fuller (ADF) test is a unit root test: it determines whether the series
has a unit root and is therefore non-stationary.

Framing the hypothesis:

H0: The Time Series has a unit root and is thus non-stationary.

H1: The Time Series does not have a unit root and is thus stationary.

The series has to be stationary for building ARIMA/SARIMA models, so we
want the p-value of this test to be less than the α value.

Fig.52 Rose Wine – ADF summary


Inference:

We see that at the 5% significance level the Time Series is non-stationary, as the p-value is 0.467,
which is more than the alpha value (0.05). Therefore, we fail to reject the null hypothesis. Let
us take one level of differencing to see whether the series becomes stationary.

Fig.53 Rose Wine – ADF summary with differencing

Inference:

We see that at the 5% significance level the Time Series becomes stationary, as the p-value is
3.015e-11, which is less than the alpha value (0.05). Therefore, we reject the null hypothesis:
the provided time series becomes stationary with one level of differencing.
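
As a rough sketch of the two steps used here, the snippet below shows first-order differencing and the decision rule at alpha = 0.05. It does not compute the ADF statistic itself. The p-values are the ones reported in the ADF summaries of this section, and in practice they would come from an ADF implementation such as `adfuller` in statsmodels. The helper names and the toy series are hypothetical.

```python
def difference(series, lag=1):
    """Return the differenced series: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def adf_decision(p_value, alpha=0.05):
    """Decision rule: reject H0 (unit root) when the p-value is below alpha."""
    return "stationary" if p_value < alpha else "non-stationary"


# Toy series with a steady downward drift (made up for illustration).
trending = [200 - 3 * t for t in range(10)]
print(difference(trending))      # every element is -3: the trend is removed

# p-values reported in this section for the full series.
print(adf_decision(0.467))       # original series -> non-stationary
print(adf_decision(3.015e-11))   # after one level of differencing -> stationary
```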

Fig.54 Time Series Plot of Entire data – With differencing

12.25. Checking for Stationarity of Training Data

Fig.54 Time Series Plot of Train data

Fig.55 Rose Wine – ADF summary on train data

Inference:

We see that at the 5% significance level the Time Series of training data is non-stationary, as the
p-value is 0.756, which is more than the alpha value (0.05). Therefore, we fail to reject the null
hypothesis. Let us take one level of differencing to see whether the series becomes stationary.

Fig.56 Rose Wine – ADF summary on train data with differencing

Inference:

We see that at the 5% significance level the differenced Time Series of training data is stationary,
as the p-value is 3.894e-08, which is less than the alpha value (0.05). Therefore, we reject the null
hypothesis: the provided training time series becomes stationary with one level of differencing.

Fig.57 Time Series Plot of Training data with differencing

Observation

 As per the Augmented Dickey-Fuller test, we observed that the time series data by itself
is not stationary; however, it becomes stationary when differencing is done.
 The same is observed with the Training data. Therefore, the models can be
trained with order of difference d=1.

12.26. Build an automated version of the
ARIMA/SARIMA model in which the parameters
are selected using the lowest Akaike Information
Criteria (AIC) on the training data and evaluate
this model on the test data using RMSE.

12.26.1. Model 8 – Auto-Regressive Integrated Moving Average (ARIMA)

Auto-regression means regression of a variable on itself. A fundamental assumption of an AR
model is that the time series is a stationary process. When the time series is not stationary,
it has to be converted to a stationary time series before an AR model is applied.
ARIMA models may be used to represent any "non-seasonal" time series that has
patterns and isn't just random noise.

An ARIMA model is characterized by 3 terms: p, d, q

where,

p is the order of the Auto Regressive (AR) term

q is the order of the Moving Average (MA) term

d is the number of differences required to make the time series stationary

 To select p, d and q, the ARIMA model below is built using the
automatically chosen parameter combination with the lowest Akaike
Information Criterion (AIC).

Fig.58 Parameter Combinations for ARIMA model

Fig.59 AIC values for different parameter combinations

Fig.60 Sorted AIC values for different parameter combinations

We can see that among all the possible given combinations, the AIC is lowest for the
combination (2,1,3). Hence, the model is built with these parameters to determine the RMSE
value of test data.
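
The selection step can be sketched as below: enumerate the (p, d, q) combinations, score each with AIC = 2k - 2 ln(L), and sort. This is only an outline under stated assumptions. The search range 0 to 3 for p and q, the parameter count k = p + q + 1, and the function names are assumptions, and `fake_fit` is a made-up stand-in for a real model fit, chosen so the toy ranking is easy to verify by hand. Its numbers are not results from the rose wine data.

```python
from itertools import product


def arima_orders(p_values, d, q_values):
    """All (p, d, q) combinations for a fixed differencing order d."""
    return [(p, d, q) for p, q in product(p_values, q_values)]


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def rank_by_aic(orders, fit_log_likelihood):
    """Score each order; return (AIC, order) pairs with the lowest AIC first.

    `fit_log_likelihood` is a caller-supplied function that fits the model
    for one order and returns its maximized log-likelihood.
    """
    scored = []
    for p, d, q in orders:
        k = p + q + 1  # assumed: AR terms + MA terms + noise variance, no constant
        scored.append((aic(fit_log_likelihood((p, d, q)), k), (p, d, q)))
    return sorted(scored)


def fake_fit(order):
    """Hypothetical stand-in for a real fit; returns an invented log-likelihood."""
    p, _, q = order
    return -650.0 + 4.0 * min(p, 2) + 3.0 * min(q, 3)


orders = arima_orders(range(4), 1, range(4))
ranking = rank_by_aic(orders, fake_fit)
for score, order in ranking[:3]:
    print(order, score)
```

With a real fitting routine in place of `fake_fit`, the first entry of `ranking` would be the combination selected for the model.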

Fig.61 Rose Wine – Automated ARIMA model

Fig.62 Automated ARIMA – Diagnostics plot

Observation:

 The optimal parameters are decided based on the lowest Akaike Information Criterion (AIC)
value. The AIC is lowest for the combination (2,1,3), as we see from the above results.
 From the Standardized residual plot above, we can notice that the residuals seem to
fluctuate around a mean of zero and have uniform variance.
 The histogram plus estimated density plot suggests a fairly uniform distribution with
mean zero that is slightly skewed to the right.
 In the Normal Q-Q plot, all the dots fall more or less in line with the red line. A few
deviations are present, implying a slightly skewed distribution.
 The correlogram plot of residuals shows that the residuals are not autocorrelated.

Fig.63 Sample of Automated ARIMA (2,1,3) predictions

Fig.64 Plot of Automated ARIMA (2,1,3) predictions on Test data

12.27. Automated ARIMA: Model Evaluation

For evaluating the model's performance, we look at the root mean squared error (RMSE) and the
mean absolute percentage error (MAPE).

Model                     Test RMSE    Test MAPE
ARIMA (p=2, d=1, q=3)     36.813       75.839
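
For reference, the two metrics can be written out as below. This is a minimal sketch with made-up numbers. It assumes MAPE is expressed in percent, which matches the magnitude of the values reported here but is not stated explicitly in the report.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers for illustration only.
actual = [100, 200]
predicted = [110, 180]
print(round(rmse(actual, predicted), 3))  # 15.811
print(round(mape(actual, predicted), 3))  # 10.0
```

RMSE penalizes large errors more heavily, while MAPE expresses the error relative to the size of each actual value, so the two can rank models differently.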

Observation:

 We can see from the graphs above that the time series has a falling trend and is
seasonal.
 ARIMA models perform well on non-seasonal time series. Because this series is seasonal,
the model is unable to capture all the characteristics of the test data.
 The root mean squared error (RMSE) of test data for the ARIMA model with (p=2, d=1,
q=3) is 36.813.
 Not surprisingly, the RMSE of this ARIMA model is greater than that of the
majority of previously constructed models.

12.28. Model 9 – Seasonal Auto-Regressive Integrated Moving Average (SARIMA)

SARIMA, also known as Seasonal ARIMA, is an extension of ARIMA for time series
data with a defined seasonality. SARIMA models use seasonal differencing, which is
similar to regular differencing but subtracts the value from one seasonal period earlier
rather than the previous observation.
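
The difference between regular and seasonal differencing can be seen on a small made-up series. The helper name `seasonal_difference` and the toy data are hypothetical, and a period of 4 is used only to keep the example short.

```python
def seasonal_difference(series, period):
    """Return y[t] - y[t - period]; period=1 gives regular differencing."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Toy series (made up): a pattern that repeats every 4 observations.
seasonal = [10, 30, 20, 60] * 3

print(seasonal_difference(seasonal, 1))  # regular differencing: the pattern remains
print(seasonal_difference(seasonal, 4))  # seasonal differencing: all zeros
```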

A SARIMA model is characterized by 7 terms: p, d, q, P, Q, D and F

where,

p is the order of the Auto Regressive (AR) term

q is the order of the Moving Average (MA) term

d is the number of differences required to make the time series stationary

P is the order of the Seasonal Auto Regressive (AR) term

Q is the order of the Seasonal Moving Average (MA) term

D is the number of seasonal differences required to make the time series stationary

F is the seasonal frequency of the time series

 To determine the "P" and "Q" values, we examine the PACF and ACF plots,
respectively, at lags that are multiples of "F" and find where they cut
off (for appropriate confidence interval bands).
 By examining the lowest AIC values, we can also estimate "p," "q," "P,"
and "Q" for the SARIMA models.
 The seasonal parameter 'F' can be identified from the ACF plot:
seasonality shows up as spikes in the ACF plot at multiples of "F."

Fig.65 ACF plot of Train data

From the above ACF plot we can observe that every 12th lag is significant, indicating the presence of
seasonality. Hence, for our model building we will use F=12.
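
To illustrate how a seasonal period shows up in the ACF, the sketch below computes the sample autocorrelation of a made-up series with a pure 12-month cycle and compares it with the approximate 95% band of ±1.96/√n. The series and the function name `acf` are hypothetical, so this is not the computation behind Fig.65.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


# Toy monthly series (made up): a pure 12-month cycle over 10 years.
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
r = acf(series, 24)
band = 1.96 / math.sqrt(len(series))  # approximate 95% band for white noise

for lag in (6, 12, 18, 24):
    flag = "significant" if abs(r[lag]) > band else "not significant"
    print(f"lag {lag:2d}: {r[lag]:+.2f} ({flag})")
```

The large positive values at lags 12 and 24 are the kind of spikes at multiples of F that the text refers to.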

To select p, d, q, P, D, Q and F, the SARIMA model below is built using the
automatically chosen parameter combination with the lowest Akaike Information Criterion (AIC).

Fig.66 Parameter Combinations for SARIMA model

Fig.67 AIC values for different parameter combinations

Fig.68 Sorted AIC values for different parameter combinations

We can see that among all the possible given combinations, the AIC is lowest for the
combination (3,1,1) (3,0,2,12). Hence, the model is built with these parameters to determine the
RMSE value of test data.

Fig.69 Rose Wine – Automated SARIMA model

Fig.70 Automated SARIMA – Diagnostics plot

Observation:

 The optimal parameters are decided based on the lowest Akaike Information Criterion (AIC)
value. The AIC is lowest for the combination (3,1,1) (3,0,2,12), as we see from the above
results.
 From the Standardized residual plot above, we can notice that the residuals seem to
fluctuate around a mean of zero and have uniform variance.
 The histogram plus estimated density plot suggests a fairly uniform distribution with
mean zero that is slightly skewed to the right.
 In the Normal Q-Q plot, all the dots fall more or less in line with the red line. A few
deviations are present, implying a slightly skewed distribution.
 The correlogram plot of residuals shows that the residuals are not autocorrelated.

Fig.71 Sample of Automated SARIMA (3,1,1) (3,0,2,12) predictions

Fig.72 Plot of Automated SARIMA (3,1,1) (3,0,2,12) predictions on Test data

12.29. Automated SARIMA: Model Evaluation

For evaluating the model performance, we look at the root mean squared error (RMSE) and the mean
absolute percentage error (MAPE).

Model                                                Test RMSE    Test MAPE
SARIMA (p=3, d=1, q=1) (P=3, D=0, Q=2, F=12)         18.881       36.375

Observation:

 We can see from the graphs above that the time series has a falling trend and is
seasonal.
 SARIMA models perform well on seasonal time series. This is why the model is able to
capture the characteristics of the test data.
 The root mean squared error (RMSE) of test data for the SARIMA model with (p=3, d=1,
q=1) (P=3, D=0, Q=2, F=12) is 18.881.
 Additionally, it should be highlighted that compared to the ARIMA model, the
SARIMA model has almost halved the RMSE value.

12.30. Build ARIMA/SARIMA models based on the
cut-off points of ACF and PACF on the training
data and evaluate this model on the test data
using RMSE.

12.30.1. Model 10 – Auto-Regressive Integrated Moving Average (ARIMA) – Manual

An ARIMA model is characterized by 3 terms: p, d, q where,

p is the order of the Auto Regressive (AR) term

q is the order of the Moving Average (MA) term

d is the number of differences required to make the time series stationary

 Autocorrelation and partial autocorrelation measure the relationship
between present and past series values and indicate which past values are
most useful in forecasting future values. This information can be used to
identify the order of the processes in an ARIMA model.

 The parameters p & q can be determined by looking at the PACF & ACF plots
respectively.

 Autocorrelation function (ACF) - At lag k, this is the correlation
between series values that are k intervals apart.
 Partial autocorrelation function (PACF) - At lag k, this is the correlation
between series values that are k intervals apart, accounting for the values
of the intervals between.

 In ACF and PACF plots, each bar represents the size and direction of the
correlation. Bars that cross the red line are statistically significant.
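
The phrase "accounting for the values of the intervals between" can be made concrete with the Durbin-Levinson recursion, which derives partial autocorrelations from ordinary autocorrelations. The sketch below is an illustration only. The function name is hypothetical, the input is the theoretical ACF of an AR(1) process with coefficient 0.5, and plotting libraries may use a different estimation method by default.

```python
def pacf_from_acf(r):
    """Partial autocorrelations for lags 1..len(r)-1 via Durbin-Levinson.

    `r` is a list of autocorrelations with r[0] == 1.
    """
    pacf = []
    phi_prev = []  # phi_prev[j] holds the lag-(j+1) coefficient of the order k-1 fit
    for k in range(1, len(r)):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.5: r_k = 0.5 ** k.
r = [0.5 ** k for k in range(6)]
print([round(v, 3) for v in r[1:]])              # ACF decays gradually
print([round(v, 3) for v in pacf_from_acf(r)])   # PACF is 0.5 at lag 1, then 0
```

The ACF tails off while the PACF cuts off after lag 1, which is the pattern used to read the AR order p from a PACF plot.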

12.30.2. ACF Plot – Training Data

Fig.73 ACF plot on differenced train data

12.30.3. PACF Plot – Training Data

Fig.74 PACF plot on differenced train data

Observation:

 The Auto-Regressive parameter 'p' of an ARIMA model is taken as the last
significant lag before the PACF plot cuts off, i.e. falls within the confidence interval.
 The Moving-Average parameter 'q' of an ARIMA model is taken as the last
significant lag before the ACF plot cuts off, i.e. falls within the confidence interval.
 By looking at the above plots, we take p=2 and q=2. The value of d is 1, as the
time series becomes stationary with one level of differencing.

Fig.75 Rose Wine – Manual ARIMA model

Fig.76 Manual ARIMA – Diagnostics plot

Observation:

 The model's parameters, p and q, were identified by examining the ACF (q=2) and
PACF (p=2) graphs. Since we differenced the series once to make it stationary, the
parameter d=1.
 From the Standardized residual plot above, we can notice that the residuals seem to
fluctuate around a mean of zero and have uniform variance.
 The histogram plus estimated density plot suggests a fairly uniform distribution with
mean zero that is slightly skewed to the right.
 In the Normal Q-Q plot, all the dots fall more or less in line with the red line. A few
deviations are present, implying a slightly skewed distribution.
 The correlogram plot of residuals shows that the residuals are not autocorrelated.

Fig.77 Sample of Manual ARIMA (2,1,2) predictions

Fig.78 Plot of Manual ARIMA (2,1,2) predictions on Test data

12.31. Manual ARIMA: Model Evaluation

For evaluating the model performance, we look at the root mean squared error (RMSE) and the mean
absolute percentage error (MAPE).

Model                     Test RMSE    Test MAPE
ARIMA (p=2, d=1, q=2)     36.87        76.055

Observation:

 We can see from the graphs above that the time series has a falling trend and is
seasonal.
 ARIMA models perform well on non-seasonal time series. Because this series is seasonal,
the model is unable to capture all the characteristics of the test data.
 The root mean squared error (RMSE) of test data for the ARIMA model with (p=2, d=1,
q=2) is 36.87.
 Not surprisingly, the RMSE of this ARIMA model is greater than that of the
majority of previously constructed models and nearly equal to that of the ARIMA (2,1,3) model.

12.32. Model 11 – Seasonal Auto-Regressive Integrated Moving Average (SARIMA) – Manual

A SARIMA model is characterized by 7 terms: p, d, q, P, Q, D and F

where,

p is the order of the Auto Regressive (AR) term

q is the order of the Moving Average (MA) term

d is the number of differences required to make the time series stationary

P is the order of the Seasonal Auto Regressive (AR) term

Q is the order of the Seasonal Moving Average (MA) term

D is the number of seasonal differences required to make the time series stationary

F is the seasonal frequency of the time series

 To determine the "P" and "Q" values, we examine the PACF and ACF plots,
respectively, at lags that are multiples of "F" and find where they cut
off (for appropriate confidence interval bands).
 The seasonal parameter 'F' can be identified from the ACF plot:
seasonality shows up as spikes in the ACF plot at multiples of "F."
 The parameters P & Q can be determined by looking at the seasonally
differenced PACF & ACF plots respectively.

 Autocorrelation function (ACF) - At lag k, this is the correlation
between series values that are k intervals apart.
 Partial autocorrelation function (PACF) - At lag k, this is the correlation
between series values that are k intervals apart, accounting for the values
of the intervals between.

 In ACF and PACF plots, each bar represents the size and direction of the
correlation. Bars that cross the red line are statistically significant.

12.32.1. ACF Plot – Seasonally differenced (F=12) Training Data

Fig.79 ACF plot on differenced train data

12.32.2. PACF Plot – Seasonally differenced (F=12) Training
Data

Fig.80 PACF plot on differenced train data

Observation:

 In the PACF plot, the early lags are significant up to lag 4 before the plot cuts off,
so the AR term 'p = 4' is chosen. At the seasonal lags, the plot cuts off after the first
seasonal lag of 12, so the seasonal AR term is kept at 'P = 0'.
 In the ACF plot, lags 1 and 2 are significant before the plot cuts off, so the MA term
'q = 2' is chosen. The seasonal lag of 12 is significant, while lags 24, 36 and beyond are
not, so the seasonal MA term is kept at 'Q = 1'.
 The final selected terms for the SARIMA model are (4, 1, 2) (0, 1, 1, 12), as inferred from
the ACF and PACF plots.

Fig.81 Rose Wine – Manual SARIMA model

Fig.82 Manual SARIMA – Diagnostics plot

Observation:

 The model's parameters, p, q, P and Q, were identified by examining the ACF (q=2, Q=1) and
PACF (p=4, P=0) graphs. Since one regular and one seasonal difference were applied to make
the series stationary, d=1 and D=1.
 From the Standardized residual plot above, we can notice that the residuals seem to
fluctuate around a mean of zero and have uniform variance.
 The histogram plus estimated density plot suggests a fairly uniform distribution with
mean zero.
 In the Normal Q-Q plot, all the dots fall more or less in line with the red line. A few
deviations are present, implying a slightly skewed distribution.
 The correlogram plot of residuals shows that the residuals are not autocorrelated.

Fig.83 Sample of Manual SARIMA (4,1,2) (0,1,1,12) predictions

Fig.84 Plot of Manual SARIMA (4,1,2) (0,1,1,12) predictions on Test data

12.33. Manual SARIMA: Model Evaluation

For evaluating the model performance, we look at the root mean squared error (RMSE) and the mean
absolute percentage error (MAPE).

Model                                                Test RMSE    Test MAPE
SARIMA (p=4, d=1, q=2) (P=0, D=1, Q=1, F=12)         15.907       23.712

Observation:

 We can see from the graphs above that the time series has a falling trend and is
seasonal.
 SARIMA models perform well on seasonal time series. This is why the model is able to
capture the characteristics of the test data.
 The root mean squared error (RMSE) of test data for the SARIMA model with (p=4, d=1,
q=2) (P=0, D=1, Q=1, F=12) is 15.907.
 Additionally, it should be highlighted that compared to all the ARIMA/SARIMA
models built so far, this SARIMA model has the lowest RMSE value.

12.34. Build a table (create a data frame) with all the models
built along with their corresponding parameters and the
respective RMSE values on the test data.

Fig.85 RMSE values of all models

Fig.86 Sorted RMSE values of all models

Observation:

o From the above table, we can see that Triple Exponential Smoothing model with
parameters (Alpha=0.2, Beta=0.85, Gamma=0.15) has the lowest RMSE for test
data.
o The naïve forecast model has performed the worst in terms of RMSE.
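
A comparison table of this kind can be assembled and sorted as sketched below. The sketch uses plain Python rather than a data frame and includes only the models whose test RMSE values appear in the evaluation tables of this section. Models built earlier in the report (for example the naïve forecast) are omitted, so the last row here is not the worst model overall.

```python
# Test RMSE values as reported in the evaluation tables of this section.
results = [
    ("TES (Alpha=0.064, Beta=0.053, Gamma=0.0)", 21.154527),
    ("TES (Alpha=0.2, Beta=0.85, Gamma=0.15)", 9.121757),
    ("ARIMA (2,1,3)", 36.813),
    ("SARIMA (3,1,1) (3,0,2,12)", 18.881),
    ("ARIMA (2,1,2)", 36.87),
    ("SARIMA (4,1,2) (0,1,1,12)", 15.907),
]

# Sort ascending so the best (lowest RMSE) model comes first.
ranked = sorted(results, key=lambda row: row[1])

width = max(len(name) for name, _ in ranked)
print(f"{'Model':<{width}}  Test RMSE")
for name, score in ranked:
    print(f"{name:<{width}}  {score:9.3f}")
```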

12.35. Based on the model-building exercise, build the most
optimum model(s) on the complete data and predict 12
months into the future with appropriate confidence
intervals/bands.
 From Fig.86 we observed that the Triple Exponential Smoothing model is the
optimum model for the given data set, as it has the lowest RMSE value.
 However, as SARIMA models tend to perform well on seasonal time series,
we are also considering the SARIMA model for the forecast.

 Let us visually see the time series plots of different models we have built so far on test data

Fig.87 Time Series Plot 1 – Different Model predictions on test data

Fig.88 Time Series Plot 2 – Different Model predictions on test data

Plotting the lowest RMSE models

Fig.89 Time Series Plot 3 – Different Model predictions on test data

12.36. Optimum Model 1:
Triple Exponential Smoothing Model (Alpha=0.2, Beta=0.85, Gamma=0.15)

Fig.90 TES Optimum Model – Line plot of Predictions vs Actual values

Fig.91 TES Optimum Model – Line plot of Predictions vs Actual values on Test data

Fig.92 TES Optimum Model

Fig.93 TES Model – Forecast for next 12 months

Fig.94 TES Optimum Model – Time series plot forecast for next 12 months

Fig.95 TES Optimum Model – Future forecast with confidence intervals

Fig.96 TES Optimum Model – Time series plot forecast with confidence intervals

Fig.97 TES Optimum Model – Time series plot forecast for next 12 months
with confidence intervals

12.37. Optimum Model 2:
Manual SARIMA Model (4, 1, 2) (0, 1, 1, 12)

Fig.98 Manual SARIMA Optimum Model – Line plot of Predictions vs Actual values

Fig.99 Manual SARIMA Optimum Model – Line plot of Predictions vs Actual values on
Test data

Fig.100 Manual SARIMA Optimum Model

Fig.101 Manual SARIMA Model – Forecast for next 12 months with confidence intervals

Fig.102 Manual SARIMA Optimum Model – Time series plot forecast for next 12 months

Fig.103 Manual SARIMA Optimum Model – Time series plot forecast with
confidence intervals

Fig.104 Manual SARIMA Optimum Model – Forecast for next 12 months with
confidence interval

Comment on the model thus built and report your findings and
suggest the measures that the company should be taking for future
sales.
 We needed to construct an optimum model to forecast rose wine sales for the next
12 months. The model information, insights and recommendations are as follows.

Model Insights:
 The time series under consideration exhibits a declining trend and stable seasonality. When
comparing the various models, we can see that the Triple Exponential Smoothing and SARIMA
models deliver the best results. This is because these models are well suited to
time series that show both trend and seasonality. Apart from these, the Double
Exponential Smoothing and Moving Average models also perform moderately well.
 We use the root mean squared error (RMSE) of each forecast model to assess its
performance. The model with the lowest RMSE value and characteristics that match the test
data is regarded as the superior model.
 We observed that the Triple Exponential Smoothing model had the lowest RMSE and the
characteristics that most closely fit the test data. As a result, it is regarded as the best model
for forecasting and can thus be used by the company for forecast analysis.

Historical Insights:
 Rose wine sales have declined over time. Sales peaked in 1980 and 1981 and fell
to their lowest level in 1995 (for which we have data for only the first 7 months).
 The monthly sales trajectory appears to be exactly the opposite of the yearly plot, with a
progressive increase towards the end of each year. January has the lowest wine sales,
while December has the highest. From January to August sales increase gradually, and
after that they increase quickly.
 The average monthly sales of rose wine are about 90 units. Roughly half of the monthly
sales figures fall between 62 and 111 units. The lowest monthly sales were 28 units and the
highest 267 units. Only 20% of the recorded monthly sales were above 120 units.
 Around 70 to 75 percent of monthly sales are below 100 units, and 90% are below
150 units. Only about 15% of monthly sales were below 50 units. Therefore, it is clear
that the bulk of sales were in the range of 50 to 100 units.

Forecast Insights:
 Based on the forecast made by the Triple Exponential Smoothing model previously
presented, the following insights are offered.
 The forecast calls for average sales of 44 units, down by 45 units from the historical
average of 89 units. Thus, we might observe an alarming decrease in average sales of
about 50%.
 The predicted minimum sales volume of 28 units equals the minimum sales volume
in the past. Consequently, there is no change in the minimum quantity sold.
 The projection estimates a maximum sales volume of 70 units, which is 197 units fewer
than the largest sales volume recorded in the past (267 units). This is a decrease of
about 74% in maximum sales.
 The forecast's standard deviation is 10 units, which is 52 units lower than the historical
standard deviation of 62 units, a drop of about 84%. This is not anticipated, because
future data would be expected to be more volatile than historical data.
 We can see from the prediction that the months of October, November, and December
have increased sales. December is often when the sales are at their highest. There is a
startling decline in sales in January following December. The months after January
appear to witness a gradual improvement in sales until October, when it jumps sharply.
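
The percentage changes quoted in the bullets above can be checked with a few lines of arithmetic. The helper name `percent_change` is hypothetical, and the figures are the historical and forecast values stated in this section.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a drop)."""
    return 100 * (new - old) / old


# (historical value, forecast value) pairs quoted in the bullets above.
figures = {
    "average sales": (89, 44),
    "minimum sales": (28, 28),
    "maximum sales": (267, 70),
    "standard deviation": (62, 10),
}

for name, (historical, forecast) in figures.items():
    print(f"{name}: {percent_change(historical, forecast):+.1f}%")
```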

Recommendations:
 Records show that the months of September, October, November, and December
account for 40% of the total sales forecast. Many festivities take place in these months,
and many people travel during this time. One of the most premium types of wine used
during festive and event celebrations is rose wine.
 Wine sales often climb in the final two months of the year as people hurry to buy holiday
beverages. For forthcoming occasions like Thanksgiving, Christmas, and New Year's,
people typically stock up. The majority of individuals also buy in bulk for holiday
gatherings and gift-giving.
 Many individuals choose wine as their go-to gift when it comes to occasions like parties
and gift-giving. Sales of Rose wine rise just before the winter holidays as more collectors
purchase these wines as presents or look for vintages to serve at holiday gatherings.

 This blush wine works nicely with nearly anything, including spicy dishes, sushi, salads,
grilled meats, roasts, and rich sauces. It is well renowned for its outdoor-friendly
drinking style.
 The festival seasons may vary depending on where you are geographically; however,
most of the celebrations take place in the last four months of the year.
o In these months, promotional offers might be implemented to lower costs and
significantly boost revenue.
o To increase sales, we must take advantage of all holiday events and set prices
appropriately.
o Many individuals order in bulk to prepare for upcoming festivities, which may
result in a high shipping expenditure. Businesses may provide significant
discounts or free shipping beyond a certain threshold at these times.
o Giving customers gifts to improve their user experience is one of the greatest
marketing strategies to deploy. In order to attract more consumers and increase
sales, the company might provide free gifts on orders with significant sales.
o To target various client demographics, the proper marketing campaigns must be
run.
o Numerous e-commerce campaigns and competitions may be run to
broaden the product's audience and enhance sales.
 The period from January to June is one of the key challenges for Rose wine sales.
o To identify the elements affecting sales, in-depth market research must be
conducted.
o Because rose wines are a premium category of wine, the company might introduce a
market-friendly version of the existing product to help make up for the drop in
sales. In the long term, this may bring in additional clients.
o The company can rebrand its product to instill a fresh perspective towards the
product and break the declining sales trend.
 There are other key elements that might be driving the sales, despite the present model's
ability to closely track the historical sales trend.
o The forecast might be improved by doing in-depth market research on the factors
that influence sales and incorporating that information into the model used for
projection.
