
Case Study-1-Pattern Discovery in Supermarket Sales Transactions Using EDA


Project Title: Pattern Discovery in Supermarket Sales Transactions using EDA

Technology Platform: ML with Azure Studio


Team Size: 4
Technical Domain: Exploratory Data Analysis
Business Domain: Retail Industry

Project Overview
Supermarkets are big business, and they use data on a large scale. Originating in the US in
the 1930s, supermarkets have gradually taken over a growing share of the retail and grocery
market. Giants like Wal-Mart, Aldi and Carrefour are among the largest retailers in the
world, with revenues approaching the hundreds of billions. As such, many have invested
heavily in big data, with analytics and data science forming a core part of their
decision-making.
The number of supermarkets in the most populous cities is growing, and market competition
is intense. Every product purchased, along with its price, is recorded in gargantuan
databases, with tables exceeding hundreds of billions of rows. Loyalty schemes, where
customers accumulate points by scanning their loyalty card at each purchase, allow the
company to stitch together a customer’s entire history of transactions and gain more
valuable insights.

Dataset
The dataset contains the historical sales of a supermarket company, recorded at 3 different
branches over a period of 3 months. Predictive data analytics methods are easy to apply to
this dataset. The following table details the attribute information:

The dataset covers 3 branches, each in a different city in Myanmar:

a) Branch A (Yangon)
b) Branch B (Mandalay)
c) Branch C (Naypyitaw)

Objective
Project teams need to explore and visualize the data to generate insights about customers’
supermarket sales transactions and to draw inferences about customer ratings.

Methodology
The methodology should include the following operations of Exploratory Data Analysis &
Clustering Analysis:
a. Import the dataset
b. Perform Univariate analysis to address the following queries:
   • Question 1: What does the customer rating look like and is it skewed? (Use normal distribution plot)
   • Question 2: Is there any difference in aggregate sales across branches? (Use bar graph)
   • Question 3: Which is the most popular payment method used by customers? (Use bar graph)
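
The three univariate questions come down to a skewness value, a per-branch sum and a frequency count. Below is a minimal sketch of those numbers using only the Python standard library. The sample rows are made up, and the column names (Branch, Total, Payment, Rating) are assumptions that should be checked against the attribute table; the plots themselves would still be drawn with a plotting package.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

# Made-up sample rows. The column names are assumptions, not confirmed
# by the brief; adjust them to match the attribute table.
rows = [
    {"Branch": "A", "Total": 100.0, "Payment": "Cash", "Rating": 6.0},
    {"Branch": "B", "Total": 250.0, "Payment": "Ewallet", "Rating": 7.0},
    {"Branch": "A", "Total": 50.0, "Payment": "Cash", "Rating": 8.0},
    {"Branch": "C", "Total": 300.0, "Payment": "Credit card", "Rating": 10.0},
    {"Branch": "B", "Total": 150.0, "Payment": "Cash", "Rating": 4.0},
]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores.

    A value near 0 suggests a roughly symmetric distribution; a positive
    value suggests a longer right tail, a negative value a longer left tail.
    """
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def total_by(rows, key, value):
    """Sum `value` for each distinct `key` (the bar heights of a bar graph)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


rating_skew = skewness([r["Rating"] for r in rows])    # Question 1
sales_by_branch = total_by(rows, "Branch", "Total")    # Question 2
payment_counts = Counter(r["Payment"] for r in rows)   # Question 3
top_payment = payment_counts.most_common(1)[0][0]
```

The skewness figure complements the distribution plot rather than replacing it: the plot shows the shape, and the number gives a quick check on whether the apparent asymmetry is material.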

c. Perform Bi-variate analysis to address the following queries:
   • Question 4: Does gross income affect the ratings that the customers provide? (Use scatterplot)
   • Question 5: Which branch is the most profitable? (Use Boxplot)
   • Question 6: Is there any relationship between Gender and Gross income? (Use Boxplot)
   • Question 7: Is there any time trend in gross income? (Use line graph)
   • Question 8: Which product line generates most income? (Use Bar plot)
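
The bivariate questions can be backed by a few summary numbers alongside the plots: a Pearson correlation for Question 4, per-group medians (the centre line of each box) for Questions 5 and 6, and daily totals for Question 7. The sketch below is one possible way to compute them with the standard library. The rows are invented, and the column names ("Branch", "Date", "gross income", "Rating") and the month/day/year date format are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from math import sqrt
from statistics import mean, median

# Invented rows for illustration only; column names and the date format
# are assumptions to be checked against the real file.
rows = [
    {"Branch": "A", "Date": "1/1/2019", "gross income": 10.0, "Rating": 9.0},
    {"Branch": "B", "Date": "1/1/2019", "gross income": 30.0, "Rating": 5.0},
    {"Branch": "A", "Date": "1/2/2019", "gross income": 20.0, "Rating": 7.0},
    {"Branch": "B", "Date": "1/2/2019", "gross income": 40.0, "Rating": 3.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def group_values(rows, key, value):
    """Collect the `value` entries of the rows under each distinct `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return groups


# Question 4: linear association between gross income and rating.
income_rating_r = pearson(
    [r["gross income"] for r in rows],
    [r["Rating"] for r in rows],
)

# Question 5: median gross income per branch (the middle line of a boxplot).
median_income_by_branch = {
    branch: median(values)
    for branch, values in group_values(rows, "Branch", "gross income").items()
}

# Question 7: total gross income per day, in calendar order, for a line graph.
daily_income = sorted(
    (datetime.strptime(day, "%m/%d/%Y").date(), sum(values))
    for day, values in group_values(rows, "Date", "gross income").items()
)
```

A correlation near zero would suggest that gross income has little linear relationship with ratings, which is the kind of statement the scatterplot alone leaves to visual judgement.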

d. Prepare a pairwise plot (scatterplot matrix) to visualize all the bi-variate
relationships in the data.
e. Perform correlation analysis using a heatmap.
f. Perform additional analysis to address the following queries:
   • Question 9: What are the spending patterns of females and males, and in which category does each group spend the most? (Use countplot in Seaborn Python package)
   • Question 10: How many products are bought by customers? (Use distribution plot)
   • Question 11: Which day of the week has maximum sales? (Use countplot)
   • Question 12: Which hour of the day is the busiest? (Use line plot)
   • Question 13: Which product line should the supermarket focus on? (Use bar plot)
   • Question 14: Which city should be chosen for expansion and which products should it focus on? (Use bar plot)
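
Questions 11 and 12 both depend on turning the recorded date and time into a weekday and an hour before counting. The following is a minimal sketch of that step, assuming dates are stored as month/day/year and times as 24-hour HH:MM; both formats, and the sample values, are assumptions to be verified against the file. As with a countplot, it counts transactions rather than summing sales amounts.

```python
from collections import Counter
from datetime import datetime

# Invented (date, time) pairs; the "%m/%d/%Y" and "%H:%M" formats are assumed.
transactions = [
    ("1/5/2019", "13:08"),
    ("1/5/2019", "13:45"),
    ("1/7/2019", "10:29"),
    ("1/12/2019", "19:21"),
]

# Explicit names avoid any dependence on the system locale.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def parse(date_text, time_text):
    """Combine the separate date and time strings into one datetime."""
    return datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %H:%M")


stamps = [parse(d, t) for d, t in transactions]

by_weekday = Counter(DAYS[s.weekday()] for s in stamps)   # Question 11
by_hour = Counter(s.hour for s in stamps)                 # Question 12

busiest_day = by_weekday.most_common(1)[0][0]
busiest_hour = by_hour.most_common(1)[0][0]
```

If "maximum sales" is meant as revenue rather than number of transactions, the same grouping applies with the counts replaced by sums of the sale amount.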

Use the dataset supermarket_sales.csv, available under the ‘Files’ section for the project.
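
As a starting point for the import step, the sketch below reads CSV text into a list of dictionaries and converts selected columns to numbers. It uses only the standard library; the helper name load_rows, the column names and the in-memory sample are hypothetical, and a team working in pandas would more likely use its CSV reader instead.

```python
import csv
import io

# Columns to convert to float. These names are assumptions; replace them
# with the numeric attributes listed in the attribute table.
NUMERIC = ("Total", "Rating")


def load_rows(lines, numeric=NUMERIC):
    """Parse CSV input (an open file or any iterable of lines) into dicts."""
    rows = []
    for row in csv.DictReader(lines):
        for name in numeric:
            row[name] = float(row[name])
        rows.append(row)
    return rows


# In-memory stand-in for the real file so that the sketch runs on its own.
sample = io.StringIO("Branch,Total,Rating\nA,100.5,7.4\nB,80.0,9.1\n")
rows = load_rows(sample)

# With the project file, the call would look like this:
# with open("supermarket_sales.csv", newline="") as f:
#     rows = load_rows(f)
```

Accepting any iterable of lines keeps the loader easy to test with a few sample rows before pointing it at the full file.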

Project Outcome(s)
Project teams need to explore the dataset, visualize its hidden patterns, and use
Exploratory Data Analysis (EDA) to produce valuable insights that highlight the key findings.
