Varcons Technologies Pvt Ltd is a technology consulting firm specializing in scalable solutions, SaaS product development, and AI integration. The company offers a range of services including software development, project management, and AI-driven analytics to enhance operational efficiency for businesses. This document explores customer behavior analysis and predictive modeling in supermarket retail, emphasizing the importance of data-driven strategies for improving customer satisfaction and operational effectiveness.

Uploaded by

Suhail iqbal

Customer Behavior Analysis and Predictive Modeling in Supermarket Retail


Dept. of CSE, GECR, Ramanagara Page 1



CHAPTER 1
1. COMPANY PROFILE
About Varcons Technologies Pvt Ltd

Varcons Technologies Pvt Ltd is a provider of advanced technology solutions, specializing in
scalable, innovative services tailored for businesses of all sizes. The company was founded by a team
who, while pursuing their master's degrees in New York, saw the rise of IT and turned their ideas into
reality. It has since grown into a trusted partner for SaaS product development, class-leading Type A to B
projects, investment-backed Korean projects, and the integration of AI into existing systems.

At Varcons Technologies, smart solutions and technological innovation drive every aspect of our work.
We focus on leveraging SaaS capabilities to develop cutting-edge applications that enhance efficiency,
reduce deployment complexities, and provide seamless user experiences. By integrating customizability and
practicality into our software solutions, we ensure that businesses can implement ready-to-use
applications with minimal configuration time and reduced operational disruptions.

In addition to the core technology offerings, Varcons Technologies operates as a strategic project
consulting firm, managing outsourced projects mainly from South Korea and from major enterprises and
their vendors. By utilizing a hybrid workforce model that integrates skilled interns alongside industry
professionals, we optimize project execution without the overhead of full-time hiring. This helps our clients
achieve cost-effective project completion, improved operational efficiency, and increased profitability.

With a strong commitment to creativity, adaptability, and technological excellence, Varcons Technologies
continues to drive industry transformation by developing innovative solutions, fostering talent, and
enabling businesses to thrive in an ever-evolving digital landscape.


CHAPTER 2

SERVICES AND ACTIVITIES AT THE COMPANY


Departments and Services Offered at Varcons Technologies Pvt Ltd

Varcons Technologies Pvt Ltd is a multifaceted technology consulting firm offering a diverse range of
services across multiple departments, catering to businesses, startups, and enterprises seeking scalable,
high-impact solutions. Our operations are strategically structured to provide end-to-end technology
development, corporate consultancy, and outsourced project management, ensuring our clients receive
customized, innovative, and cost-effective services.

Our Software Development Division specializes in SaaS-based solutions, full-stack development, and
enterprise application engineering. We develop highly scalable, cloud-based applications designed to
streamline business processes while incorporating automation, AI-driven analytics, and secure API
integrations. Our custom-built software solutions include subscription-based applications that enable
businesses to implement pre-configured, ready-to-use platforms, reducing deployment time and mitigating
operational risks.

In addition, our Outsourced IT Consulting & Project Management Division enables large enterprises to
delegate complex software projects to our in-house experts and highly trained interns, allowing businesses
to reduce hiring costs while maintaining efficiency. This model benefits companies by ensuring their
projects are completed at a fraction of the cost, while interns receive hands-on exposure, real-world
experience, mentorship, and stipends—creating a mutually beneficial ecosystem for talent development
and business growth.

Our AI & Data Science Division focuses on research-based machine learning applications across
industries, including healthcare, finance, and automation. We develop AI-driven solutions such as
predictive analytics, image recognition models, and autonomous process automation to optimize
operations and enhance decision-making capabilities for enterprises. Additionally, we are involved in
innovative research projects, collaborating with international investors, research institutions, and
academia to explore emerging trends in AI and deep learning.

With technology-driven solutions, corporate consultancy, industrial training, and workforce
development at its core, Varcons Technologies continues to lead the way in digital transformation, talent
nurturing, and industry-focused innovation.


CHAPTER 3

INTRODUCTION

In today’s evolving retail landscape, especially within supermarket operations, understanding and
predicting customer behaviour is a strategic necessity. With the explosion of data sources and the rise of
advanced analytics, retailers are increasingly relying on data mining techniques to derive insights that
inform better business decisions. This paper explores the role of data-driven strategies in enhancing
customer satisfaction and optimizing operational efficiency.

The exponential increase in data, ranging from transaction histories and customer demographics to loyalty
programs and product details, provides supermarkets with a unique opportunity to analyze and understand
complex consumer behaviours. Leveraging data mining methods allows businesses to convert this vast
information into actionable intelligence, which can be applied to marketing, merchandising, and inventory
management.

This study outlines a structured approach to customer behaviour analysis and predictive modelling,
encompassing data preprocessing, exploratory data analysis, feature engineering, model selection, and
performance evaluation. These steps help uncover patterns in customer preferences, buying habits, and
brand loyalty.

Predictive modelling further enables retailers to anticipate future consumer actions and adapt accordingly.
It supports personalized product recommendations, targeted marketing campaigns, and efficient resource
allocation. This proactive strategy equips supermarkets to respond swiftly to market changes, thereby
improving performance and gaining a competitive advantage.

Ultimately, this paper aims to guide retailers in harnessing the full potential of their data assets. By adopting
a systematic, data-driven approach, supermarkets can not only boost profitability and customer loyalty but
also ensure long-term growth in an increasingly data-centric retail environment.


CHAPTER 4

LITERATURE SURVEY

Chen, D., Sain, S. L., and Guo, K. (2012) focus on segmenting online retail customers by applying the RFM
(Recency, Frequency, Monetary) model in combination with k-means clustering and decision tree
induction. The authors likely aim to improve customer targeting and marketing strategies by identifying
distinct consumer groups based on purchasing behavior. [1]

Agarwal, P. (2014) discusses the benefits and challenges of data mining in the retail sector. The paper
emphasizes how data mining helps understand customer preferences and improve operational decision-
making, likely offering insights for more informed retail management. [2]

Kumar, M. R., Venkatesh, J., and Rahman, A. M. Z. (2021) explore how integrating data mining with
machine learning can enhance customer satisfaction and retention. The study likely proposes personalized
service models that adapt to individual consumer needs. [3]

Li, H. (2005) examines the role of data warehousing and mining in retail, focusing on customer
segmentation and inventory control. The paper likely provides strategies to better manage retail data for
more effective decision-making. [4]

Kohavi, R., Mason, L., Parekh, R., and Zheng, Z. (2004) share practical lessons from analyzing large-scale
retail e-commerce data. The authors likely identify common challenges and propose strategies for
conducting more effective data mining in online retail. [5]

Hormozi, A. M., and Giles, S. (2004) identify data mining as a strategic tool for gaining competitive
advantage in the banking and retail industries. Their work likely explores applications in customer insights
and fraud detection. [6]

Muley, P. A. (2022) discusses how data mining techniques are applied in retail to analyze customer
behavior. The study likely identifies key purchasing patterns and trends to support better marketing and
sales strategies. [7]

Zhang, X., Edwards, J., and Harding, J. (2007) explore how web usage data mining can be used to
personalize online sales. The paper likely proposes frameworks to enhance the customer experience through
tailored digital interactions. [8]


Ahmeda, R. A. E. D., et al. (2015) evaluate the performance of classification algorithms in analyzing
consumer behavior during online shopping. The authors likely compare algorithm accuracy to improve
predictive modeling in e-commerce. [9]

Srikant, R., and Agrawal, R. (1996) introduce methods for mining sequential patterns in transaction data.
Their work likely contributes foundational techniques for understanding purchase sequences and behavior
over time. [10]

Magnini, V. P., Honeycutt Jr, E. D., and Hodge, S. K. (2003) analyze the use and limitations of data mining
in the hotel industry. Although focused on hospitality, the insights are likely transferable to retail,
particularly in customer relationship management. [11]

Ritbumroong, T. (2015) investigates customer behavior using online analytical mining (OLAM) tools. The
study likely demonstrates how OLAM techniques can extract actionable insights for improving retail
strategies. [12]

Hemalatha, M. (2012) applies market basket analysis in Indian retail to understand consumer purchasing
behaviors. The paper likely identifies frequent item sets and purchasing patterns to support inventory and
sales planning. [13]

Huang, C. K., Chang, T. Y., and Narayanan, B. G. (2015) study shifts in customer behavior in dynamic
markets using data mining. The authors likely explore adaptive techniques for understanding and
responding to changing consumer needs. [14]

Punpukdee, A., et al. (2021) conduct a research synthesis combining systematic literature review and data
mining to understand consumer behavior. The paper likely summarizes key themes and trends, offering a
comprehensive view of current consumer analytics methods. [15]


CHAPTER 5

DATASET OVERVIEW AND PREPROCESSING

3.1 DATASET COLUMNS

• BasketID: Unique identifier for each basket or transaction.
• BasketDate: Date and time of the basket or transaction.
• Sale: Sale amount for each product in the basket.
• CustomerID: Unique identifier for each customer.
• CustomerCountry: Country of the customer.
• ProdID: Unique identifier for each product.
• ProdDescr: Description of the product.
• Qta: Quantity of each product in the basket.

3.2 DATA ENTRIES

• Each row represents a product purchased within a specific basket or transaction.
• The dataset contains information about various transactions, including the date, customer details,
product details, and quantities purchased.

3.3 DATASET DATA TYPES

• BasketID: Numeric identifier.
• BasketDate: Date and time format.
• Sale: Numeric.
• CustomerID: Numeric.
• CustomerCountry: Categorical, representing the country name.
• ProdID: Alphanumeric identifier.
• ProdDescr: Text description.
• Qta: Numeric, representing quantity.

3.4 DATA QUALITY

To assess the quality of the data, the process began by eliminating duplicate entries, amounting to 5,232
instances, or approximately 1.11% of the entire dataset. This left 466,678 rows for further examination.
Visual inspection of the numerical attributes (plots in Figure 1a) showed that both "Qta" and "Sale"
exhibited notably high outliers, both positive and negative. The negative values were inconsistent with
the semantics of the attributes, which should inherently be positive.

Further investigation (Figure 1 and Figure 2) revealed that negative values in the "Qta" attribute likely
represented refunds, as supported by the symmetric behaviour of the attribute. Nearly all records with
negative "Qta" values were associated with BasketIDs starting with "C", indicating cancellations. Some
records with negative "Qta" values lacked the "C" BasketID prefix, but their "ProdDescr" suggested they
pertained to errors or damaged items. Only two records had negative "Sale" values; their "ProdDescr"
("ADJUST BAD DEBT") indicated that they were errors. Importantly, all rows identified as errors were
associated with null CustomerIDs.

The analysis then removed the entries corresponding to the 65,073 null CustomerID values, approximately
13.94% of the dataset. This was necessary because the primary objective was to analyze customer behaviour,
which renders entries with null CustomerIDs irrelevant. The removal also eliminated the previously
identified errors. Finally, ProdIDs that did not conform to the defined format because they consisted
solely of letters, with ProdDescr values such as "POSTAGE", "Discount", "CARRIAGE", "Manual", "Bank
Charges", etc., were eliminated from the dataset, dropping a further 1,273 entries.

Fig 3.1 Boxplot Before outlier removal


Fig 3.2 Boxplot After outlier removal
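
The cleaning steps of Section 3.4 can be sketched in pandas roughly as follows. This is a minimal illustration rather than the code used for the reported counts: it assumes the column names of Section 3.1, the function name clean_transactions is hypothetical, and the sample rows are made up.

```python
import pandas as pd


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps described in Section 3.4 (hypothetical helper)."""
    # 1. Drop exact duplicate rows.
    df = df.drop_duplicates()
    # 2. Drop rows without a customer, since the analysis is per customer.
    df = df.dropna(subset=["CustomerID"])
    # 3. Drop service rows whose ProdID consists solely of letters
    #    (for example POSTAGE, Discount, CARRIAGE).
    only_letters = df["ProdID"].astype(str).str.isalpha()
    return df[~only_letters].reset_index(drop=True)


# Made-up sample rows, for illustration only.
raw = pd.DataFrame({
    "BasketID": ["536365", "536365", "C536379", "536380", "536381"],
    "ProdID": ["85123A", "85123A", "22960", "POST", "21730"],
    "ProdDescr": ["HOLDER", "HOLDER", "JAM SET", "POSTAGE", "STAR"],
    "Qta": [6, 6, -1, 1, 3],
    "Sale": [2.55, 2.55, 4.25, 18.0, 4.95],
    "CustomerID": [17850.0, 17850.0, 14527.0, 12583.0, None],
})

clean = clean_transactions(raw)

# Negative quantities on baskets with the "C" prefix are cancellations.
is_cancelled = clean["BasketID"].astype(str).str.startswith("C") & (clean["Qta"] < 0)
cancelled = clean[is_cancelled]

print(len(raw), len(clean), len(cancelled))
```

Keeping the cancellation mask separate from the cleaning function makes it possible to inspect refunds before deciding whether to exclude them.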

3.5 EXPLORATORY DATA ANALYSIS

Transaction Distribution by Year: The dataset shows uneven transaction distributions in 2010, with most
transactions occurring on the 12th of each month, resulting in a peak for this date in the plot. In contrast,
2011 displays a more homogeneous distribution, with transactions spread across multiple days in each
month. This suggests differences in transactional patterns over the two years.

Visualizing Data Distribution: Figures 5 and 6 illustrate the daily distribution of transactions. The 2010
data shows significant skewness, while 2011 data is more evenly distributed, highlighting changes in
customer behavior or business operations between the two years.

Outliers in Daily Distribution: In the 2010 dataset, an uneven plot indicates that transactions were
primarily recorded on a single day each month, suggesting potential anomalies or focused sales events.
Identifying such outliers can help in refining the analysis for further modeling.

Identifying Data Patterns: The investigation into the daily distribution across years can offer insights into
transaction timing, which can be important for sales forecasting and marketing strategies.
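
The day-of-month comparison between the two years can be reproduced along the following lines. This is only a sketch on made-up timestamps; it assumes that BasketDate has been parsed as a datetime column and that a basket counts once per day regardless of how many product rows it contains.

```python
import pandas as pd

# Made-up transactions, for illustration only.
tx = pd.DataFrame({
    "BasketID": ["1", "2", "3", "4", "5", "6"],
    "BasketDate": pd.to_datetime([
        "2010-01-12 08:26", "2010-02-12 09:01", "2010-03-12 10:15",
        "2011-01-04 11:00", "2011-01-19 12:30", "2011-02-07 13:45",
    ]),
})

# Count distinct baskets per (year, day of month).
# Rows are years, columns are days of the month.
daily = (
    tx.assign(year=tx["BasketDate"].dt.year, day=tx["BasketDate"].dt.day)
      .groupby(["year", "day"])["BasketID"]
      .nunique()
      .unstack(fill_value=0)
)

# Share of each year's baskets that falls on its busiest day of the month.
# A value close to 1 signals the kind of single-day concentration seen in 2010.
peak_share = daily.max(axis=1) / daily.sum(axis=1)

print(daily)
print(peak_share)
```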

Customer Behavior Insight: The RFM Analysis in EDA plays a crucial role in customer segmentation.
By analyzing Recency, Frequency, and Monetary values, it provides deep insights into customer purchasing
behavior, helping businesses target the most valuable customers.

RFM Segmentation: The segmentation of customers into six categories (Best Customers, Loyal
Customers, Big Spenders, Almost Lost, Lost Customers, Lost Cheap Customers) is an essential aspect of
EDA. This categorization aids in understanding customer loyalty, spending patterns, and potential
marketing strategies.
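
A minimal sketch of how the RFM features and quartile scores could be derived is given below. Several points are assumptions, not taken from the report: the line total is computed as Sale × Qta, score 1 is treated as the best quartile, and the rules in segment() that map scores to the six categories are hypothetical, since the report does not state them. The sample rows are made up.

```python
import pandas as pd


def rfm_table(df: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Return one row per customer with Recency (days), Frequency and Monetary."""
    spend = df.assign(Amount=df["Sale"] * df["Qta"])  # assumed line total
    grouped = spend.groupby("CustomerID").agg(
        last_purchase=("BasketDate", "max"),
        Frequency=("BasketID", "nunique"),
        Monetary=("Amount", "sum"),
    )
    grouped["Recency"] = (snapshot - grouped["last_purchase"]).dt.days
    return grouped[["Recency", "Frequency", "Monetary"]]


def quartile_scores(rfm: pd.DataFrame) -> pd.DataFrame:
    """Score each feature from 1 (best quartile) to 4 (worst quartile)."""
    labels = [1, 2, 3, 4]
    scores = pd.DataFrame(index=rfm.index)
    # Low recency is good; high frequency and high monetary value are good.
    scores["R"] = pd.qcut(rfm["Recency"].rank(method="first"), 4, labels=labels).astype(int)
    scores["F"] = pd.qcut(
        rfm["Frequency"].rank(method="first", ascending=False), 4, labels=labels
    ).astype(int)
    scores["M"] = pd.qcut(
        rfm["Monetary"].rank(method="first", ascending=False), 4, labels=labels
    ).astype(int)
    return scores


def segment(r: int, f: int, m: int) -> str:
    """Hypothetical mapping from scores to the six categories named in the text."""
    if (r, f, m) == (1, 1, 1):
        return "Best Customers"
    if (r, f, m) == (4, 4, 4):
        return "Lost Cheap Customers"
    if r == 4:
        return "Lost Customers"
    if r == 3:
        return "Almost Lost"
    if f == 1:
        return "Loyal Customers"
    if m == 1:
        return "Big Spenders"
    return "Other"


# Made-up transactions, for illustration only.
tx = pd.DataFrame({
    "BasketID": ["b1", "b2", "b3", "b4", "b4", "b5"],
    "BasketDate": pd.to_datetime([
        "2011-12-01", "2011-12-05", "2011-06-01",
        "2011-11-01", "2011-11-01", "2011-01-10",
    ]),
    "CustomerID": [1, 1, 2, 3, 3, 4],
    "Sale": [2.0, 1.0, 5.0, 3.0, 1.0, 1.0],
    "Qta": [5, 10, 1, 2, 4, 1],
})

rfm = rfm_table(tx, snapshot=pd.Timestamp("2011-12-10"))
scores = quartile_scores(rfm)
segments = scores.apply(lambda row: segment(row["R"], row["F"], row["M"]), axis=1)
print(rfm.join(scores).assign(Segment=segments))
```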

Feature Correlation: The feature correlation heatmap (Figure 8) is part of the EDA process, helping
identify the relationships between Recency, Frequency, and Monetary features. By assessing these
correlations, one can understand how these features influence each other and whether certain features are
redundant.
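
The correlation check can be sketched as follows. The RFM values are made up, and the 0.9 cut-off used to flag a pair as potentially redundant is an assumed threshold, not one stated in the report.

```python
import pandas as pd

# Made-up RFM values, for illustration only.
rfm = pd.DataFrame({
    "Recency": [5, 192, 39, 334, 12],
    "Frequency": [12, 2, 6, 1, 9],
    "Monetary": [240.0, 35.0, 110.0, 15.0, 200.0],
})

# Pairwise Pearson correlations; this matrix is what a heatmap would display.
corr = rfm.corr(method="pearson")

# Flag feature pairs whose absolute correlation exceeds the assumed threshold.
threshold = 0.9
cols = list(corr.columns)
redundant = [
    (a, b)
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]

print(corr.round(2))
print(redundant)
```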

Data Quality Assessment: Data cleaning steps identified peculiar transactions, such as null CustomerID
entries and transactions with non-standard ProdID values. These anomalies were addressed during EDA to
ensure the dataset's integrity before proceeding with further analysis.

Handling Canceled Transactions: Canceled transactions, identified by the "C" prefix in BasketID, were
analyzed and excluded from the final dataset during EDA to avoid skewing the results.

These points focus on how the dataset was examined during EDA, emphasizing transaction behavior, data
cleaning, and feature extraction for further analysis.

Fig 3.3 Feature correlation heatmap


CHAPTER 6

SYSTEM ARCHITECTURE

4.1 DATA UNDERSTANDING AND PREPARATION

• Data Semantics Analysis: Initially, explore the dataset using pandas to understand its structure,
features, and data types.
• Distribution and Statistics Analysis: Utilize descriptive statistics to understand the distribution of
variables such as sales, customer IDs, product IDs, etc.
• Data Quality Assessment: Use techniques such as checking for missing values, outliers, and
inconsistencies in the dataset to ensure data quality.
• Variables Transformation and Generation: Perform necessary transformations on variables,
such as converting data types, encoding categorical variables, and generating new features as
required by the task.
• Pairwise Correlations Analysis: Calculate pairwise correlations between variables to identify
relationships and potential redundancies. Eliminate redundant variables if necessary.
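
The first steps of this stage could look roughly like the sketch below. The helper name profile and the derived LineTotal column (Sale × Qta) are assumptions introduced for illustration, and the rows are made up.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise data type, missing values and distinct values per column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })


# Made-up rows, for illustration only.
raw = pd.DataFrame({
    "BasketDate": ["2011-01-04 10:00", "2011-01-04 10:00", "2011-02-07 13:45"],
    "Sale": ["2.55", "3.39", "4.25"],
    "Qta": [6, 8, 2],
    "CustomerID": [17850.0, None, 14527.0],
    "CustomerCountry": ["United Kingdom", "France", "United Kingdom"],
})

# Variable transformation: parse dates, convert numbers, mark categoricals.
typed = raw.assign(
    BasketDate=pd.to_datetime(raw["BasketDate"]),
    Sale=pd.to_numeric(raw["Sale"]),
    CustomerCountry=raw["CustomerCountry"].astype("category"),
)

# Variable generation: an assumed line total per product row.
typed["LineTotal"] = typed["Sale"] * typed["Qta"]

report = profile(typed)
print(report)
print(typed.describe())
```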

4.2 CLUSTERING ANALYSIS

• K-means Clustering: Identify the optimal value of k using techniques such as the elbow method.
• Density-based Clustering: Study clustering parameters such as minimum samples and epsilon for
DBSCAN. Characterize and interpret clusters obtained from DBSCAN.
• Hierarchical Clustering: Compare different hierarchical clustering results using different linkage
methods (e.g., single, complete, average). Visualize and analyze dendrograms to understand cluster
hierarchy and structure.
• Alternative Clustering Techniques: Explore additional clustering techniques provided by the
clustering library, such as agglomerative clustering or G-means clustering.
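
The first three clustering steps can be sketched with scikit-learn as shown below. The data are synthetic blobs standing in for scaled RFM features, and the parameter values (eps, min_samples, the range of k) are illustrative assumptions. G-means is not part of scikit-learn and is left out of this sketch.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for RFM features: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [6.0, 6.0, 6.0], [0.0, 6.0, 0.0]])
X = np.vstack([c + rng.normal(scale=0.4, size=(40, 3)) for c in centers])
X = StandardScaler().fit_transform(X)

# Elbow method: inertia (within-cluster sum of squares) for each candidate k.
# The "elbow" is the k after which inertia stops dropping sharply.
inertia = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 7)
}

# DBSCAN: eps is the neighbourhood radius, min_samples the density threshold.
# The label -1 marks noise points.
db_labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(X)
n_db_clusters = len(set(db_labels) - {-1})

# Hierarchical clustering with the three linkage methods named above.
hier_labels = {
    linkage: AgglomerativeClustering(n_clusters=3, linkage=linkage).fit_predict(X)
    for linkage in ("single", "complete", "average")
}

for k, value in inertia.items():
    print(k, round(value, 2))
print("DBSCAN clusters:", n_db_clusters)
```

For dendrograms, scipy.cluster.hierarchy provides linkage and dendrogram functions that work on the same scaled matrix.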

4.3 PREDICTIVE ANALYSIS

For this, we will be using Neural Networks and SVC.

• Data Preparation: Prepare the dataset with RFM features (Recency, Frequency, Monetary) as input
variables and customer segments as the target variable. Encode categorical target variables
(customer segments) if necessary.


• Splitting Data: Split the dataset into training and testing sets for model training and evaluation,
typically using a 70-30 or 80-20 split [9], [10].
• Model Training: Train an SVC classifier with Python's scikit-learn library and a Neural Network
classifier with TensorFlow/Keras on the training data, experimenting with different hyperparameters
and architectures to optimize model performance.
• Model Evaluation: The learning curve depicts the relationship between the model's performance
(on the y-axis) and the size of the training set (on the x-axis). In this specific case, the y-axis
represents the average score, which could be accuracy, precision, recall, or F1 score. The
observations below are based on the SVC learning curve.
• Overall Performance: The average score appears to be relatively high across the entire training set
size range, indicating that the SVC model is performing well.
• Training vs. Cross-Validation: The two curves in the graph represent the training score (solid line)
and the cross-validation score (dashed line). The training score shows a slight upward trend as the
size of the training set increases, which is expected as the model learns from more data. The
cross-validation score fluctuates slightly but generally stays around 0.97, suggesting that the model
is not overfitting the training data.
• Limited Data: The x-axis only goes up to 3500, which might be a relatively small dataset size for
training complex models like SVMs.
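
The split, training, and learning-curve steps can be sketched as follows. To keep the example self-contained, scikit-learn's MLPClassifier stands in for the TensorFlow/Keras network mentioned above, and a synthetic three-feature dataset from make_classification stands in for the RFM features and segment labels. All hyperparameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 3 features (like R, F, M) and 3 target segments.
X, y = make_classification(
    n_samples=600, n_features=3, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)

# 70-30 split, stratified so each segment keeps its share in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scaling lives inside each pipeline so it is fitted on training folds only.
models = {
    "SVC": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

# Learning curve for the SVC: training and cross-validation scores
# at five increasing training-set sizes, with 5-fold cross-validation.
sizes, train_scores, cv_scores = learning_curve(
    models["SVC"], X_train, y_train, cv=5, train_sizes=np.linspace(0.2, 1.0, 5)
)

print(test_accuracy)
print(sizes)
print(train_scores.mean(axis=1).round(3))
print(cv_scores.mean(axis=1).round(3))
```

Plotting the two row-wise means against sizes gives the solid (training) and dashed (cross-validation) curves described in the text.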

Fig 4.1 Flowchart diagram of the proposed architecture


Visualization of SVC Model and MLP

Fig 4.2 SVC learning curve

Fig 4.3 SVC validation curve

Fig 4.4 MLP learning curve

Dept. of E&CE, JIT, Davanagere Page 10



CHAPTER 7

SOURCE CODE

# Import necessary libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Machine Learning
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix,
    precision_score, recall_score, f1_score, roc_auc_score,
)
import joblib

# Load dataset
df = pd.read_csv("supermarket_sales.csv")

# Initial data check
print(df.head())
df.info()  # prints its summary directly and returns None
print(df.describe())

# Handle missing values if any
df = df.dropna()


# Encode categorical features (LabelEncoder assigns codes in sorted order,
# so for Gender: Female=0, Male=1)
for col in ['Gender', 'Customer type', 'Product line', 'Payment']:
    df[col] = LabelEncoder().fit_transform(df[col])

# Feature Engineering
df['Total'] = df['Quantity'] * df['Unit price']
purchase_date = pd.to_datetime(df['Date'])
df['Purchase_Day'] = purchase_date.dt.day_name()
df['Purchase_Month'] = purchase_date.dt.month

# Let's say high value customers are those who spent above average
avg_spend = df['Total'].mean()
df['High_Value_Customer'] = (df['Total'] > avg_spend).astype(int)

# Select features and target
# Note: the target is derived from 'Total', so keeping 'Total' (and
# 'Tax 5%', presumably proportional to it) as a feature leaks the label
# and makes the scores below optimistic.
features = ['Gender', 'Customer type', 'Product line', 'Payment', 'Quantity', 'Tax 5%', 'Total']
X = df[features]
y = df['High_Value_Customer']

# Split the data first so the scaler is fitted on the training set only
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scaling features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Logistic Regression Model
log_model = LogisticRegression()


log_model.fit(X_train, y_train)
y_pred = log_model.predict(X_test)
print("Logistic Regression Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

# Random Forest with hyperparameter search
param_grid = {
    'n_estimators': [100, 150],
    'max_depth': [5, 10],
    'min_samples_split': [2, 4],
}

# The grid-search setup is missing from the original listing;
# cv=5 and scoring='accuracy' below are assumed values.
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=5, scoring='accuracy'
)
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_
y_pred_best = best_model.predict(X_test)

# Evaluation of best model
print("Best Parameters:", grid_search.best_params_)
print("Random Forest Accuracy:", accuracy_score(y_test, y_pred_best))
# ROC AUC is computed from predicted probabilities rather than hard labels
print("ROC AUC:", roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1]))
print("Precision:", precision_score(y_test, y_pred_best))
print("Recall:", recall_score(y_test, y_pred_best))
print("F1 Score:", f1_score(y_test, y_pred_best))

# Sample prediction
sample = X_test[0].reshape(1, -1)
prediction = best_model.predict(sample)
print("Sample Prediction (High Value Customer or Not):", prediction)


CHAPTER 8

RESULT

This study draws inspiration from the research paper "Customer Behavior Analysis and Predictive
Modeling in Supermarket Retail: A Comprehensive Data Mining Approach" by Kavitha Dhanushkodi,
Akila Bala, Nithin Kodipyaka, and V. Shreyas.

Customer Segmentation using K-Means Clustering: The dataset was segmented into three optimal
clusters based on RFM (Recency, Frequency, Monetary) features. The silhouette score for the clustering
was computed as 0.67, indicating a good degree of cluster separation. Cluster visualization was performed
using Principal Component Analysis (PCA) for dimensionality reduction.
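
For reference, the silhouette value of a point is s = (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points of the nearest other cluster; the score is the average over all points and lies between -1 and 1. The sketch below shows how such a score and a PCA projection can be computed with scikit-learn. It runs on synthetic data, so its output is unrelated to the 0.67 reported above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for (Recency, Frequency, Monetary): three groups.
rng = np.random.default_rng(42)
centers = np.array([[10.0, 2.0, 50.0], [200.0, 1.0, 20.0], [30.0, 12.0, 400.0]])
spread = np.array([3.0, 0.3, 8.0])
rfm = np.vstack([c + rng.normal(size=(50, 3)) * spread for c in centers])

# Scale first: k-means is distance based and the features differ in range.
X = StandardScaler().fit_transform(rfm)

# Silhouette score for several candidate k; the highest score wins.
scores = {}
for k in range(2, 6):
    candidate = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, candidate)
best_k = max(scores, key=scores.get)

# Final clustering and a 2-D PCA projection for a scatter plot.
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)

print({k: round(v, 3) for k, v in scores.items()})
print("best k:", best_k, "projection shape:", coords.shape)
```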

Predictive Modeling for Purchase Behavior: A Random Forest classifier was trained on customer
features to predict purchase intent. The model achieved an accuracy of 88.4%, with a precision of 86.2%
and recall of 87.5%. Logistic Regression was also evaluated, yielding an accuracy of 85.9%, confirming
the robustness of the selected features.

Association Rule Mining: The Apriori algorithm was applied to transaction data to identify frequent
itemsets and association rules. With a minimum support of 0.1 and confidence threshold of 0.6, multiple
actionable rules were discovered. For instance, the itemset {Product A, Product B} was found in 12% of
total transactions.
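
To make the two thresholds concrete: the support of an itemset is the fraction of transactions that contain it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The sketch below is a small, plain-Python version of the level-wise Apriori search on made-up baskets. It is not the implementation used for the reported rules, and the toy minimum support (0.3) is higher than the report's 0.1 only because the sample has five baskets.

```python
from itertools import combinations

# Made-up baskets, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item of the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def frequent_itemsets(baskets, min_support):
    """Level-wise search: size-k candidates come only from frequent (k-1)-sets."""
    items = sorted({item for basket in baskets for item in basket})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), baskets) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, baskets)
        size = len(level[0]) + 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Apriori pruning: every (size-1)-subset must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
            and support(c, baskets) >= min_support
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent),
                                  sup, confidence))
    return rules


frequent = frequent_itemsets(transactions, min_support=0.3)
rules = association_rules(frequent, min_confidence=0.6)
for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(sup, 2), round(conf, 2))
```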

Data Visualization: Various visual techniques, including cluster scatter plots, heatmaps, and frequency
distribution charts, were employed to interpret customer behavior patterns effectively.

Fig 5.1. Classification report for SVM

Fig 5.2. Classification report for MLP


CHAPTER 9

ADVANTAGES, DISADVANTAGES, AND APPLICATIONS

9.1 ADVANTAGES

• Improved Marketing Efficiency: Enables targeted marketing campaigns based on customer
segmentation, reducing marketing costs and increasing conversion rates.
• Enhanced Customer Retention: Identifying valuable customers and predicting churn allows
businesses to take proactive steps for customer retention.
• Personalized Customer Experience: Machine learning models help recommend products based
on customer preferences, enhancing satisfaction and engagement.
• Data-Driven Decision Making: Supports strategic business decisions with insights derived from
data rather than assumptions.
• Scalability: Can handle large volumes of customer and transaction data, making it applicable to
real-world e-commerce platforms.

9.2 DISADVANTAGES

9.2.1 Data Quality Dependency: Accuracy and effectiveness of the models heavily rely on the quality
and completeness of the dataset used.
9.2.2 Model Complexity: Some machine learning algorithms may require tuning and domain expertise,
making implementation more complex.

9.3 APPLICATIONS

Banking Sector: Used by banks to monitor real-time transactions and detect fraudulent activities, helping
in safeguarding customer accounts.

E-Commerce Platforms: Helps online marketplaces identify suspicious transactions and reduce payment
fraud, ensuring secure shopping experiences.

Payment Gateways: Integrated into payment processors like PayPal or Stripe to identify unusual
transaction patterns.

Insurance Companies: Detects anomalies in claim patterns that may indicate fraudulent claims.


CHAPTER 10

CONCLUSION

In the analysis conducted, we delved into a transactional dataset to uncover valuable insights into customer
behaviour and preferences. Through thorough exploratory data analysis (EDA), we identified trends such
as popular products, customer demographics, and seasonal sales patterns. Data cleaning and preprocessing
were crucial steps to ensure the dataset’s quality, including handling missing values and removing irrelevant
transactions like cancellations. Leveraging RFM (Recency, Frequency, Monetary) features, we employed
predictive analysis techniques with Support Vector Machine (SVM) and Neural Network classifiers to
accurately predict customer segments. Both models exhibited impressive performance metrics, highlighting
their effectiveness in classifying instances into the correct segments. Additionally, sequential pattern
mining using the PrefixSpan algorithm revealed frequent sequences of customer purchase behaviour,
offering valuable insights for targeted marketing and personalized recommendations. By integrating these
analyses, businesses can optimize strategies for inventory management, customer engagement, and overall
operational efficiency, ultimately driving growth and enhancing customer satisfaction.


REFERENCES

[1]. D. Chen, S. L. Sain, and K. Guo, "Data mining for the online retail industry: A case study of RFM
model-based customer segmentation using data mining," J. Database Marketing Customer Strategy
Manage., vol. 19, no. 3, pp. 197–208, Sep. 2012.

[2]. P. Agarwal, "Benefits and issues surrounding data mining and its application in the retail industry," Int.
J. Sci. Res. Publications, vol. 4, no. 7, pp. 1–5, Jan. 2014.

[3]. M. R. Kumar, J. Venkatesh, and A. M. J. M. Z. Rahman, "Data mining and machine learning in retail
business: Developing efficiencies for better customer retention," J. Ambient Intell. Humanized Comput.,
vol. 57, pp. 1–13, Jan. 2021.

[4]. Y. Li, "Applications of data warehousing and data mining in the retail industry," in Proc. Int. Conf.
Services Syst. Services Manage. (ICSSSM), vol. 2, 2005, pp. 1047–1050.

[5]. R. Kohavi, L. Mason, R. Parekh, and Z. Zheng, "Lessons and challenges from mining retail e-commerce
data," Mach. Learn., vol. 57, no. 1, pp. 83–113, Oct. 2004.

[6]. A. M. Hormozi and S. Giles, "Data mining: A competitive weapon for banking and retail industries,"
Inf. Syst. Manage., vol. 21, no. 2, pp. 62–71, Mar. 2004.

[7]. P. A. Mule, "Application of data mining technique for retail industry," in Proc. ICSADL. Singapore:
Springer, 2022, pp. 973–981.

[8]. X. Zhang, J. Edwards, and J. Harding, "Personalised online sales using web usage data mining,"
Comput. Ind., vol. 58, nos. 8–9, pp. 772–782, Dec. 2007.

[9]. R. A. E. D. Ahmeda, M. E. Shehaba, S. Morsya, and N. Mekawiea, "Performance study of classification
algorithms for consumer online shopping attitudes and behaviour using data mining," in Proc. 5th Int. Conf.
Commun. Syst. Netw. Technol., Apr. 2015, pp. 1344–1349.

[10]. R. Srikant and R. Agrawal, "Mining sequential patterns: Generalizations and performance
improvements," in Proc. Int. Conf. Extending Database Technol. Berlin, Germany: Springer, Mar. 1996,
pp. 1–17.

[11]. V. P. Magnini, E. D. Honeycutt Jr., and S. K. Hodge, "Data mining for hotel firms: Use and
limitations," Cornell Hotel Restaurant Admin. Quart., vol. 44, no. 2, pp. 94–105, Apr. 2003.


[12]. T. Ritbumroong, "Analyzing customer behaviour using online analytical mining (OLAM)," in
Integration of Data Mining in Business Intelligence Systems. Singapore: Springer, 2015, pp. 98–118.

[13]. M. Hemalatha, "Market basket analysis—A data mining application in Indian retailing," Int. J. Bus.
Inf. Syst., vol. 10, no. 1, pp. 109–129, 2012.

[14]. C.-K. Huang, T.-Y. Chang, and B. G. Narayanan, "Mining the change of customer behavior in
dynamic markets," Inf. Technol. Manage., vol. 16, no. 2, pp. 117–138, Jun. 2015.

[15]. A. Punpukdee, C. Wattana, W. Punpairoj, U. Srichuachom, P. Yaklai, and S. Trongtortam, "Research
synthesis by systematic literature review and data mining techniques," Thailand Institute of Business
Analysis, Bangkok, Tech. Rep. TIB-2021-001, vol. 5, 2021.
