
Time Series Forecasting for Predicting Store Sales Using Prophet

Last Updated : 15 Jul, 2024

Time series forecasting is a crucial aspect of business analytics, enabling companies to predict future trends based on historical data. Accurate forecasting can significantly impact decision-making processes, inventory management, and overall business strategy. One of the powerful tools for time series forecasting is Prophet, an open-source library developed by Facebook's Core Data Science team. This article will delve into the technical aspects of using Prophet for predicting store sales, providing a comprehensive guide from data preparation to model evaluation.

The Prophet Model

The Prophet model uses a decomposable time series model which is built on the following components:

  1. Trend Component: This component models the overall trend of the data, which can be linear or logistic. Prophet automatically detects changes in trends by selecting changepoints from the data.
  2. Seasonal Component: This component models seasonal patterns in the data using Fourier series. It can capture yearly, weekly, and daily seasonality.
  3. Holiday Component: This component incorporates the impact of holidays on the data. Users can provide a list of important holidays to be included in the model.
  4. Uncertainty Intervals: Alongside these components, Prophet reports a range within which the true values are likely to fall, which indicates how reliable each prediction is.
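In the notation commonly used for Prophet, the first three components combine additively by default, with an error term for whatever the model does not capture. The symbols below follow that usual formulation and are not defined elsewhere in this article: g(t) is the trend, s(t) the seasonality, h(t) the holiday effect, P the period of a seasonal pattern (for example 7 for weekly seasonality when t is measured in days), and N the number of Fourier terms.

```latex
\[
  y(t) = g(t) + s(t) + h(t) + \varepsilon_t
\]
\[
  s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
\]
```

A larger N lets the seasonal curve change more quickly, at the cost of a higher risk of overfitting.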

Steps for Implementing Prophet for Store Sales Forecasting

Installing Prophet

pip install prophet

One can install Prophet using the above command.

Let's walk through the steps to implement a store sales predictor using Prophet. The article then works through two examples:

  • Predicting total store sales from the time series.
  • Predicting store sales for each product category from the time series.

Step 1: Import Necessary Libraries

Let's import the necessary libraries.

Python
import pandas as pd
from prophet import Prophet
import matplotlib.pyplot as plt

Each import serves the following purpose:

  • pandas: Used for data manipulation and analysis.
  • Prophet: A forecasting tool provided by Facebook, used for forecasting time series data.
  • matplotlib: Used for plotting and visualization.

Step 2: Load the Dataset

Here we load the dataset into a pandas DataFrame:

  • file_path: Path to the CSV file containing the sales data.
  • pd.read_csv(file_path): Reads the CSV file into a pandas DataFrame.

Dataset Link: Store_sales

Python
# dataset path
file_path = "https://media.geeksforgeeks.org/wp-content/\
uploads/20240704211146/retail_sales_dataset.csv"
# read data to dataframe
data = pd.read_csv(file_path)

Step 3: Data Analysis

Before implementing Prophet, it's crucial to analyze and understand the data. It involves:

  • Checking for missing values.
  • Identifying the granularity of the data (daily, weekly, etc.).
  • Understanding the overall trends and seasonal patterns.

Let's check for any missing values.

Python
data.info()

Output:

<class 'pandas.core.frame.DataFrame'> 
RangeIndex: 1000 entries, 0 to 999
Data columns (total 9 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   Transaction ID    1000 non-null   int64
 1   Date              1000 non-null   object
 2   Customer ID       1000 non-null   object
 3   Gender            1000 non-null   object
 4   Age               1000 non-null   int64
 5   Product Category  1000 non-null   object
 6   Quantity          1000 non-null   int64
 7   Price per Unit    1000 non-null   int64
 8   Total Amount      1000 non-null   int64
dtypes: int64(5), object(4)
memory usage: 70.4+ KB

Notice that there are no missing values in the dataset: every column has 1000 non-null entries. Let's plot the total amount against the date.

Python
# convert object to date format
data['Date'] = pd.to_datetime(data['Date'])
# plot diagram
data.plot(x='Date', y='Total Amount')

Output:

[Figure: Plot of Total Amount against Date]
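The other checks listed at the start of this step, the granularity of the data and any gaps in the dates, can be sketched with pandas alone. The small `sample` frame below is made up for illustration and only assumes the same 'Date' and 'Total Amount' column names as the dataset.

```python
import pandas as pd

# Small synthetic sample (assumption: same column names as the dataset)
sample = pd.DataFrame({
    "Date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-04", "2023-01-05"],
    "Total Amount": [100, 250, 80, 300, 120],
})
sample["Date"] = pd.to_datetime(sample["Date"])

# Missing values per column
missing_per_column = sample.isna().sum()

# Granularity: the most common gap between consecutive distinct dates
unique_dates = pd.Series(sample["Date"].sort_values().unique())
typical_gap = unique_dates.diff().dropna().mode().iloc[0]

# Calendar days that have no transactions at all
full_range = pd.date_range(unique_dates.min(), unique_dates.max(), freq="D")
missing_dates = full_range.difference(pd.DatetimeIndex(unique_dates))

print(missing_per_column.to_dict())
print(typical_gap)
print(list(missing_dates))
```

Gaps are worth knowing about because, after aggregating by date, a day without any rows is simply absent from the series rather than recorded as zero sales.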

Step 4: Preparing the Data for Prophet

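Prophet requires the data to have two columns: ds (date) and y (value to forecast). The transactions are therefore aggregated to one row per date, and the columns are renamed. If you want to forecast sales for each product category, prepare a separate dataset for each category in the same way.

daily_sales = data.groupby('Date', as_index=False)['Total Amount'].sum()
daily_sales = daily_sales.rename(columns={'Date': 'ds', 'Total Amount': 'y'})

Here we sum 'Total Amount' per date, then rename the 'Date' column to 'ds' and 'Total Amount' to 'y'. Note that rename() returns a new DataFrame, so the result must be assigned.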
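Before fitting, it can help to confirm that the prepared frame really has the shape Prophet expects. The helper below, `check_prophet_frame`, is a hypothetical convenience function written for this article and is not part of Prophet. The duplicate-date check is a sanity check on the aggregation step, not a documented Prophet requirement.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def check_prophet_frame(df: pd.DataFrame) -> list[str]:
    """Return a list of problems found; an empty list means the frame looks ready."""
    problems = []
    for column in ("ds", "y"):
        if column not in df.columns:
            problems.append(f"missing column '{column}'")
    if problems:
        return problems
    if not is_datetime64_any_dtype(df["ds"]):
        problems.append("'ds' is not a datetime column")
    if not is_numeric_dtype(df["y"]):
        problems.append("'y' is not numeric")
    if df["ds"].duplicated().any():
        problems.append("'ds' has duplicate dates (data not aggregated per day?)")
    return problems


# Synthetic transactions (assumption: same column names as the dataset)
raw = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-01-01", "2023-01-02"]),
    "Total Amount": [100, 250, 80],
})
prepared = (
    raw.groupby("Date", as_index=False)["Total Amount"].sum()
       .rename(columns={"Date": "ds", "Total Amount": "y"})
)

print(check_prophet_frame(raw))       # columns not yet renamed
print(check_prophet_frame(prepared))  # aggregated and renamed
```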

Step 5: Training the Prophet Model

To train a Prophet model, initialize it with Prophet() and fit it on the prepared data.

# Initialize the model
model = Prophet()
# Fit the model
model.fit(daily_sales)

Here we pass the prepared data to train the prophet model.

Step 6: Forecasting with Prophet

The next step is to make predictions using the Prophet model.

# Create a dataframe to hold the dates for which we want to make predictions
future = model.make_future_dataframe(periods=365)
# Predict future sales
forecast = model.predict(future)

Here we build a future dataframe that holds the dates for which we want predictions, then call the predict() method on it to generate the forecast.
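To make the shape of the future dataframe concrete, the sketch below builds an equivalent frame with pandas alone. It is an illustration of the result, not Prophet's own implementation, and `history_dates` is a made-up ten-day history.

```python
import pandas as pd

# Hypothetical history standing in for the 'ds' column the model was fitted on
history_dates = pd.date_range("2023-01-01", "2023-01-10", freq="D")
periods = 5

# New dates start one day after the last observed date
last_date = history_dates.max()
new_dates = pd.date_range(last_date + pd.Timedelta(days=1), periods=periods, freq="D")

# With the default include_history=True, the historical dates come first
future_like = pd.DataFrame({"ds": history_dates.append(new_dates)})

print(len(future_like))
print(future_like["ds"].iloc[-1])
```

Because the historical dates are included, the forecast returned by predict() contains fitted values for the past as well as predictions for the new dates.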

Step 7: Plot the Forecast and the Component

The forecast and its components can be plotted using the code below.

# plot forecast
model.plot(forecast)

#plot forecast component
model.plot_components(forecast)

Example 1: Predicting Store Sales from the Time Series

Let's predict total store sales from the time series.

Import Libraries and Load the Dataset

We can import the necessary libraries and load the dataset.

Python
import pandas as pd
from prophet import Prophet
import matplotlib.pyplot as plt

# dataset path
file_path = "https://media.geeksforgeeks.org/wp-content/\
uploads/20240704211146/retail_sales_dataset.csv"
# read data to dataframe
data = pd.read_csv(file_path)
data.shape

Output:

(1000, 9)

Preparing the Data for Prophet

Prophet requires the data to have two columns for time series forecasting: ds (date) and y (value to forecast). The code is as follows:

Python
# Convert the 'Date' column to datetime format
data['Date'] = pd.to_datetime(data['Date'])

# Aggregate total sales by date
daily_sales = data.groupby('Date')['Total Amount'].sum().reset_index()

# Rename columns to fit Prophet's requirements
daily_sales = daily_sales.rename(columns={'Date': 'ds', 'Total Amount': 'y'})

# Display the first few rows of the prepared data
daily_sales.head()

Output:

          ds     y
0 2023-01-01  3600
1 2023-01-02  1765
2 2023-01-03   600
3 2023-01-04  1240
4 2023-01-05  1100

The above code does the following steps.

  • Convert Date to datetime format: Ensures that the 'Date' column is in a format that Prophet can recognize and work with.
  • Aggregate total sales by date: Groups the data by 'Date' and sums up the 'Total Amount' for each date to get daily sales.
  • Rename columns for Prophet: Renames the columns to 'ds' (date) and 'y' (value), which are the expected column names in Prophet.
  • Display the first few rows of the prepared data: Shows the first few rows of the transformed DataFrame to verify the changes.

Training the Prophet Model

To train a Prophet model, initialize it with Prophet() and fit it on the prepared data. The code is as follows:

Python
# Initialize the model
model = Prophet()

# Fit the model
model.fit(daily_sales)

The code steps are as follows:

  • Initialize the model: Creates an instance of the Prophet model.
  • Fit the model: Trains the Prophet model on the prepared data (daily_sales), which includes the historical sales data.

Forecasting with Prophet

The code for forecasting is as follows:

Python
# Create a dataframe to hold the dates for which we want to make predictions
future = model.make_future_dataframe(periods=365)

# Predict future sales
forecast = model.predict(future)
# display the first 5 rows and the first 6 columns
forecast.head(5).iloc[:,0:6]

Output:

   ds          trend               yhat_lower           yhat_upper          trend_lower         trend_upper
0  2023-01-01  1320.24825          -494.5066309379769   2624.543928089163   1320.24825          1320.24825
1  2023-01-02  1320.2615444335206  -62.14814861103264   2971.5126360266177  1320.2615444335206  1320.2615444335206
2  2023-01-03  1320.2748388670414  -141.95277252227478  2925.4439832197586  1320.2748388670414  1320.2748388670414
3  2023-01-04  1320.2881333005616  -301.57031472493316  2821.8639758272507  1320.2881333005616  1320.2881333005616
4  2023-01-05  1320.3014277340822  -380.8603476610306   2646.1291107103552  1320.3014277340822  1320.3014277340822
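The lower bounds in the output above are negative, which cannot happen for sales. One simple option is to floor the reported bounds at zero after prediction. This is a post-processing choice made for this sketch and does not change the fitted model. The `bounds` frame copies the first three rows of the output above, rounded to two decimals.

```python
import pandas as pd

# Interval bounds copied from the first rows of the output above (rounded)
bounds = pd.DataFrame({
    "ds": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
    "yhat_lower": [-494.51, -62.15, -141.95],
    "yhat_upper": [2624.54, 2971.51, 2925.44],
})

# Sales cannot be negative, so floor the reported bounds at zero
value_columns = ["yhat_lower", "yhat_upper"]
clipped = bounds.copy()
clipped[value_columns] = clipped[value_columns].clip(lower=0)

print(clipped)
```

With a real forecast, the same clipping would usually be applied to the yhat column as well.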

Plotting the Forecast

Let's plot the forecast.

Python
# Plot the forecast
fig1 = model.plot(forecast)
plt.title('Total Sales Forecast')
plt.xlabel('Date')
plt.ylabel('Sales')
plt.show()

Output:

[Figure: Prophet Sales Forecast]

The code explanation is as follows:

  • Create a dataframe for future dates: Generates a dataframe with future dates for which predictions will be made (365 days into the future).
  • Predict future sales: Uses the trained Prophet model to predict future sales based on the generated future dates.
  • Plot the forecast: Visualizes the predicted sales, including the historical data and forecasted values, with appropriate titles and labels for clarity.

Plotting the Forecast Components

The code to plot the forecast components is as follows:

Python
# Plot the forecast components
fig2 = model.plot_components(forecast)
plt.show()

Output:

[Figure: Forecast Components]

The above code does the following:

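  • Plot the forecast components: Visualizes the individual components of the forecast, such as the trend and weekly seasonality. Yearly seasonality appears only when Prophet fits it, which by default requires roughly two or more years of history.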
  • Show the plot: Displays the component plots to understand the contributions of each component to the overall forecast.
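To evaluate a forecast, a common approach is to hold out the most recent part of the series, fit on the rest, and compare predictions with the held-out values. The sketch below shows the split and two error metrics on synthetic data. To keep it runnable without fitting a model, `predicted` is a naive placeholder. With Prophet it would be the yhat values predicted for the held-out dates.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (assumption: stands in for daily_sales with 'ds' and 'y')
dates = pd.date_range("2023-01-01", periods=100, freq="D")
series = pd.DataFrame({"ds": dates, "y": 1000.0 + 10.0 * np.arange(100)})

# Time-ordered split: the last 20 days are held out and never shuffled
holdout_days = 20
train = series.iloc[:-holdout_days]
test = series.iloc[-holdout_days:]

# Placeholder prediction: repeat the last training value (naive baseline).
# With Prophet, this would be the 'yhat' column predicted for test['ds'].
predicted = np.full(len(test), train["y"].iloc[-1])

actual = test["y"].to_numpy()
errors = actual - predicted
mae = np.mean(np.abs(errors))                 # mean absolute error
mape = np.mean(np.abs(errors) / actual) * 100  # mean absolute percentage error

print(f"MAE:  {mae:.1f}")
print(f"MAPE: {mape:.2f}%")
```

Comparing the model's error against a naive baseline like this one gives a quick sense of whether the forecast adds value.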

Example 2: Forecasting Sales for Each Product Category

In the previous example, we forecast total sales by date. In this example, we forecast sales for each product category by date. The code is the same as in the previous example except for the data preparation step. Let's look at the process of data preparation:

  • Identify unique product categories: Retrieves the unique product categories from the dataset.
  • Loop through each category:
    • Filter data for the current category: Selects the sales data corresponding to the current product category.
    • Aggregate sales by date: Groups the data by 'Date' and sums the 'Total Amount' for each date to get daily sales for the category.
    • Rename columns for Prophet: Renames the columns to 'ds' (date) and 'y' (value) as required by Prophet.

The complete code is as follows:

Python
import pandas as pd
from prophet import Prophet
import matplotlib.pyplot as plt

# dataset path
file_path = "https://media.geeksforgeeks.org/wp-content/\
uploads/20240704211146/retail_sales_dataset.csv"
# read data to dataframe and parse the dates
data = pd.read_csv(file_path)
data['Date'] = pd.to_datetime(data['Date'])

# Forecast sales for each product category
categories = data['Product Category'].unique()

for category in categories:
    category_data = data[
      data['Product Category'] == category].groupby('Date')[
      'Total Amount'].sum().reset_index()
    category_data = category_data.rename(
      columns={'Date': 'ds', 'Total Amount': 'y'})
    
    model = Prophet()
    model.fit(category_data)
    
    future = model.make_future_dataframe(periods=365)
    forecast = model.predict(future)
    
    fig = model.plot(forecast)
    plt.title(f'Sales Forecast for {category}')
    plt.xlabel('Date')
    plt.ylabel('Sales')
    plt.show()

    model.plot_components(forecast)
    plt.show()

Output:

[Figure: Forecasting Sales for Beauty Category (forecast)]
[Figure: Forecasting Sales for Beauty Category (components)]

[Figure: Forecasting Sales for Clothing Category (forecast)]
[Figure: Forecasting Sales for Clothing Category (components)]

[Figure: Forecasting Sales for Electronic Category (forecast)]
[Figure: Forecasting Sales for Electronic Category (components)]

After the data preparation steps described before the code, each iteration of the loop does the following:

  • Initialize and train the model: Creates a new Prophet model instance and trains it on the category-specific sales data.
  • Create future dataframe and predict: Generates future dates and predicts future sales for the category.
  • Plot the forecast: Visualizes the forecasted sales for the category with appropriate titles and labels.
  • Plot the forecast components: Visualizes the individual components of the forecast for the category, such as the trend and weekly seasonality. Yearly seasonality appears only when Prophet fits it.
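The data preparation inside the loop can also be done in one pass by grouping on category and date together, then splitting the result into one frame per category. This is an alternative sketch on synthetic data, not the code used for the outputs above. The `frames` dictionary is a name introduced here for illustration.

```python
import pandas as pd

# Synthetic transactions (assumption: same column names as the dataset)
data = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02"]
    ),
    "Product Category": ["Beauty", "Beauty", "Beauty", "Clothing", "Clothing"],
    "Total Amount": [100, 50, 30, 200, 75],
})

# Aggregate once by category and date instead of filtering inside the loop
daily_by_category = (
    data.groupby(["Product Category", "Date"], as_index=False)["Total Amount"].sum()
        .rename(columns={"Date": "ds", "Total Amount": "y"})
)

# One Prophet-ready frame (columns 'ds' and 'y') per category
frames = {
    category: group[["ds", "y"]].reset_index(drop=True)
    for category, group in daily_by_category.groupby("Product Category")
}

for category, frame in frames.items():
    print(category, frame["y"].tolist())
```

Each frame in `frames` can then be passed to a fresh Prophet model, exactly as the loop does with category_data.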

Conclusion

Forecasting store sales with Prophet is an effective way to understand and predict sales trends and seasonality. Prophet, developed by Facebook, is particularly adept at handling time series data with strong seasonal effects and several seasons of historical data.

In our analysis, we applied Prophet to historical store sales data to predict future sales. The model successfully captured the overall increasing trend in sales and the weekly seasonal patterns. The forecast plot provided clear visualizations of predicted sales and uncertainty intervals, aiding in strategic decision-making.

