Customer Segmentation Based on RFM Model and Clustering Techniques With K-Means Algorithm
Abstract- Customers perform transactions every day, and these transactions generate a large amount of data: 82,648 transactions were recorded from January to December 2017. This study aims to segment the customers of Nine Reload Credit through a data mining process based on the RFM model and clustering techniques. The algorithm used for cluster formation is K-Means. Using the RapidMiner 5.2 tool, K-Means produces a visual cluster model that represents the number of customers in each cluster based on the RFM (Recency, Frequency, and Monetary) attributes. Processing the 82,648 transactions with the RFM model yielded 102 customers. Cluster analysis with the K-Means algorithm then placed 63 customers in Cluster 1 and 39 customers in Cluster 2. The company can use the results of this research to identify the category of each customer and thus decide how to retain its existing customers.

Keywords- Data Mining; RFM Model; Cluster Analysis; Customer Segmentation; K-Means Algorithm.

[...] unknown or hidden information can be revealed by processing the data so that it is useful for the credit business agent [4]. For example, grouping the agent data shows which agents have the potential to bring the most profit to the company, which helps the company make product marketing decisions.

The model used by the researchers is RFM (Recency, Frequency, Monetary), which is commonly used to group customers by time of last visit, visit frequency, and revenue obtained by the company [5]. The RFM model continues to be used because it is easy to use and quick to implement in companies, and because it is easily understood by managers and marketing decision makers [6].

The results of this study can be used as a decision support system in the credit business to map customers and to identify potential customers.

II. LITERATURE REVIEW OF RFM MODEL
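To make the three RFM attributes concrete, the following minimal sketch aggregates raw transaction rows into one recency, frequency, and monetary value per customer. The row layout `(customer_code, date, amount)`, the function name `rfm_table`, the reference date, and the sample rows are assumptions made for illustration; they are not taken from the Nine Reload Credit dataset.

```python
from collections import defaultdict
from datetime import date

def rfm_table(transactions, reference_date):
    """Aggregate (customer_code, date, amount) rows into per-customer R, F, M.

    recency_days = days between the customer's last transaction and reference_date
    frequency    = number of transactions by the customer
    monetary     = total amount of the customer's transactions
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for code, tx_date, amount in transactions:
        # Track the most recent transaction date per customer.
        if code not in last_seen or tx_date > last_seen[code]:
            last_seen[code] = tx_date
        frequency[code] += 1
        monetary[code] += amount
    return {
        code: {
            "recency_days": (reference_date - last_seen[code]).days,
            "frequency": frequency[code],
            "monetary": monetary[code],
        }
        for code in last_seen
    }

# Hypothetical sample rows (illustrative only).
sample = [
    ("A01", date(2017, 12, 20), 50000.0),
    ("A01", date(2017, 11, 2), 25000.0),
    ("A02", date(2017, 3, 15), 10000.0),
]
rfm = rfm_table(sample, reference_date=date(2017, 12, 31))
```

In this sketch a smaller `recency_days` means a more recent customer, while larger `frequency` and `monetary` values indicate a more active and more valuable one.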
Authorized licensed use limited to: International Institute of Information Technology-Raipur. Downloaded on April 23,2025 at 12:45:50 UTC from IEEE Xplore. Restrictions apply.
transaction where the amount of the data is very large. Every month there are thousands of transactions. A total of 82,648 transactions were collected over one year, from January to December 2017. After the data is mapped to the RFM variables, it is combined with the K-Means algorithm to categorize each customer, so that the company can know the category of each customer.

III. REVIEW OF CLUSTER ANALYSIS

Data mining is a process that uses statistics, mathematics, artificial intelligence, and machine learning techniques to extract and identify useful information and related knowledge from large databases. Data mining is part of knowledge discovery in databases, the process of extracting useful, previously unknown, and hidden information from data [4].

Data mining aims to obtain a relationship or pattern that may provide useful indications [10]. The relationship sought by data mining is a relationship between two or more in one dimension [10].

This research uses K-Means to group the transaction data for the following reasons:
1. The number of clusters in the data could not be specified manually.
2. The central point of each cluster is unknown.
3. It is difficult to group customer types manually given 82,648 transactions.

d. Group the data by the closest distance between each data point and the centroids.

IV. A CASE STUDY

The dataset used in this case study is credit sales data from the Nine Reload Credit Server. The company accumulates a large amount of data, with thousands of transactions every month, which would be very difficult to analyze manually one by one. The researchers analyzed 82,648 customer transactions. The proposed model for determining profitable customers is described in Figure 1, which shows the steps to determine the profitable customer.

[Figure 1: steps to determine the profitable customer. Box labels: Transaction Dataset, Data preprocessing, RFM Segmentation, Marketing strategies]
Table 1. TRANSACTION DATASET

Data Preparation

weighting was divided into 5 scales/scores as listed in Table 4.

[Table 4, surviving row] Score 1: recency longest (>8 Month); frequency lower (<2000); monetary fewer (<50 Million)
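As a rough illustration of the 5-scale weighting, the sketch below maps raw recency, frequency, and monetary values to scores from 1 to 5. Only the score-1 boundaries (recency above 8 months, frequency below 2000, monetary below 50 million) come from the text; every other cut point, the constant names, and the helper `scale_score` are hypothetical placeholders, because the remaining rows of Table 4 are not available here.

```python
from bisect import bisect_left, bisect_right

def scale_score(value, edges, higher_is_better=True):
    """Map a raw value to a score in 1..5 using four ascending cut points."""
    if len(edges) != 4:
        raise ValueError("expected exactly four cut points for five scores")
    if higher_is_better:
        # Values below edges[0] score 1; values at or above edges[3] score 5.
        return bisect_right(edges, value) + 1
    # Lower is better (recency): values above edges[3] score 1.
    return 5 - bisect_left(edges, value)

# First/last boundary from the text; all other cut points are ASSUMED.
RECENCY_EDGES_MONTHS = [2, 4, 6, 8]            # only "> 8 months -> 1" is from the text
FREQUENCY_EDGES = [2000, 4000, 6000, 8000]     # only "< 2000 -> 1" is from the text
MONETARY_EDGES = [50_000_000, 100_000_000, 150_000_000, 200_000_000]  # only "< 50 million -> 1"

def rfm_scores(recency_months, frequency, monetary):
    """Return the (R, F, M) score triple for one customer."""
    return (
        scale_score(recency_months, RECENCY_EDGES_MONTHS, higher_is_better=False),
        scale_score(frequency, FREQUENCY_EDGES),
        scale_score(monetary, MONETARY_EDGES),
    )
```

Under these assumed cut points, a customer last seen 9 months ago with a frequency of 1500 and a monetary value of 10 million would receive the triple (1, 1, 1), the same shape as the R, F, M columns used for clustering.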
Table 7. CALCULATION RESULT OF EACH DATA

CUSTOMER CODE   R   F   M   C1            C2            Closest Distance
C001            5   2   1   0.985150517   3.669114335   0.985150517
C002            1   1   1   3.527989798   0.471593045   0.471593045
C003            1   1   1   3.527989798   0.471593045   0.471593045
C004            5   1   1   0.50619742    3.564042648   0.50619742
C005            5   1   1   0.50619742    3.564042648   0.50619742

4. After all the data is placed into the closest cluster, recalculate the new cluster center as the average of the members in the cluster.
5. After obtaining a new center point for each cluster, repeat the third step until the center point of each cluster is fixed and no data moves from one cluster to another.

Processing the customer transaction dataset with K-Means took 4 iterations and produced the clusters shown in Figure 2: 63 members in cluster 1 and 39 members in cluster 2.

9 C011 ASNEY TRONIK
10 C012 ATIKA CELL
11 C014 AYTHA CELL
12 C015 BARLI TRONIK
13 C016 BOYOUT21
14 C017 CAHAYA CELL
15 C018 DEZTI CELL
16 C019 DIA TRONIK
17 C020 EB TRONIK
18 C021 ERNI CELL
19 C022 FAIT CELL
20 C023 FITRI CELL
21 C024 FITRI POJOK CELL
22 C025 GRISELDA CELL
23 C026 HERA CELL
24 C027 HESTI CELL
25 C028 HILYA CELL
26 C029 IBU CELL
27 C030 LIA CELL
28 C031 LIDA CELL
29 C032 MUJI ASTUTI
30 C033 MUSTIKA
31 C034 NABIL CELL
32 C035 NDARI CELL
33 C036 ONDLENK CELL
34 C037 PUJI CELL
35 C038 QORY CELL
36 C039 RARA CELL
53 C064 ARRASYID RELOAD
54 C068 DEDERIZKY CELL
55 C069 FAIS CELL
56 C070 TASY CELL
57 C071 ADIVA CELL
58 C073 FAIZAL CELL
59 C076 FATH CELL
60 C077 LUCAS TRONIK
61 C079 LULU CELL
62 C088 UNYIEL
63 C090 YUNITA CELL

29 C092 AGUSTIN CELL
30 C093 FAKIH CELL
31 C094 TABALONG-RELOAD
32 C095 MEI-TRONIK
33 C096 KALILLA CELL
34 C097 DELTRA TRONIK
35 C098 DWI
36 C099 AJENG JKT
37 C100 AYU
38 C101 DWI CELL
39 C102 EGA CELL
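The K-Means procedure used in this case study (assign each customer to the nearest center, recompute each center as the average of its members, and repeat until no customer changes cluster) can be sketched in plain Python as follows. This is a minimal illustration, not the RapidMiner implementation: the function names and the choice of initial centers are assumptions, and only the five (R, F, M) rows shown in Table 7 are used, so the resulting centers and distances are not expected to match the values computed from all 102 customers.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, centers, max_iter=100):
    """Plain K-Means: nearest-center assignment, then mean update, until stable."""
    centers = [tuple(c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the closest center for every point.
        new_assignment = [
            min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # no point moved, so stop
            break
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centers

# (R, F, M) scores of C001..C005 as listed in Table 7.
points = [(5, 2, 1), (1, 1, 1), (1, 1, 1), (5, 1, 1), (5, 1, 1)]
# ASSUMED initial centers: the first two distinct rows.
assignment, centers = kmeans(points, centers=[(5, 2, 1), (1, 1, 1)])
```

With these assumed starting centers, C001, C004, and C005 fall into the first cluster and C002 and C003 into the second, which agrees with the closest-distance column of Table 7 for those five customers.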
Ratnam. 2012. “A Study of Data Mining Tools in
Knowledge Discovery Process.” International Journal of
Soft Computing and Engineering 2 (3):191–94.
[5] Wongchinsri, Pornwatthana, and Werasak Kuratach.
2016. “A Survey - Data Mining Frameworks in Credit
Card Processing.” 2016 13th International Conference
on Electrical Engineering/Electronics, Computer,
Telecommunications and Information Technology,
ECTI-CON 2016.
https://doi.org/10.1109/ECTICon.2016.7561287.
[6] Peiman Alipour Sarvari, Alp Ustundag, and Hidayet
Takci. 2014. “Performance Evaluation of Different
Customer Segmentation Approaches Based on RFM and
Demographics Analysis.” Kybernetes 43 (8):1209–23.
https://doi.org/10.1108/K-01-2015-0009.
[7] Rachid, et al. 2015. “Combining RFM Model and
Clustering Techniques for Customer Value Analysis of a
Company Selling Online.” 2015 12th International
Conference of Computer Systems and Applications
(AICCSA), 1–6.
[8] Liu Jiali and Du Hyung. 2010. “Study on Airline
Customer Value Evaluation Based on RFM Model.”
2010 International Conference on Computer
Design and Applications (ICCDA 2010), 278–281.
[9] Aviliani, U. Sumarwan, I. Sugema, and A. Saefuddin.
2011. “Segmentasi Nasabah Tabungan Mikro
Berdasarkan Recency, Frequency, dan Monetary : Kasus
Bank BRI.” Finance and Banking Journal 13 (1):95–
109.
[10] Kusrini Luthfi, Ema Taufiq. 2009. Algoritma Data
Mining. Edited by Theresia Ari Prabawati. Yogyakarta:
C.V Andi OFFSET.
https://books.google.co.id/books?id=Ojclag73O8C&pg=PA3&dq=data+mining+adalah&hl=id&sa=X&ved=0ahUKEwijrefgpYnZAhXBPY8KHWeJCQ4Q6AEIKzAA#v=onepage&q=data mining adalah&f=false.
[11] Lubis, Abdul Haris. 2016. “Model Segmentasi Pelanggan
Dengan Kernel K-Means Clustering Berbasis Customer
Relationship Management.” Jurnal & Penelitian Teknik
Informatika 1:36–41.
[12] Rahman, Aulia Tegar, Wiranto, and Rini Anggrainingsih.
2017. “Coal Trade Data Clustering Using K-Means
(Case Study PT. Global Bangkit Utama)” 6 (1):24–31.