Market Basket Analysis With Association Rules
To cite this article: Yüksel Akay Ünvan (2020): Market basket analysis with association rules,
Communications in Statistics - Theory and Methods, DOI: 10.1080/03610926.2020.1716255
1. Introduction
Information technologies continue to develop rapidly, and companies can now obtain,
store, analyze and interpret the data they are interested in much more easily and at a
lower cost. With the development of data mining, data sets have gained a great deal of
value. Association rules, which can reveal the buying behavior of customers who shop at
retail stores or e-commerce sites, are among the most widely used data mining techniques;
they identify products or product groups that are highly interdependent. Market basket
analysis using association rules is therefore one of the most popular methods. Association
rules are mined with algorithms such as Apriori and FP Growth, and the top 10 rules
obtained are reported.
In this study, the aim is to perform a market basket analysis by using association
rules. For this purpose, a supermarket data set available on the Vancouver Island
University website was used (Vancouver Island University 2019). This data set contains
information on purchases made by customers for 255 different products. After the
CONTACT Yüksel Akay Ünvan [email protected] Faculty of Management, Banking and Finance
Department, Ankara Yıldırım Beyazıt University, Ankara, Turkey.
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/lsta.
© 2020 Taylor & Francis Group, LLC
2 Y. A. ÜNVAN
necessary examinations and preparation of the data are carried out, the data are analyzed
with the Weka program and the results are interpreted.
2. Literature
Ulaş et al. (2001) applied basket analysis in their study to some of the sales data
obtained from various stores of Gima Türk A.Ş. The database used contained 756,868
transactions, 140,610 items and 7,237 kinds of goods. Each item was on average 105
items. There were 9,985 registered customers in the database. Since the data used was
taken in the summer, the best-selling products were tomatoes, bread, and cucumbers.
In addition, products such as eggs and watermelons were also identified. Based on these
data, it can be said that people tend to eat light meals, especially in the summer
months. “77% of those who buy cucumbers also buy tomatoes” and “55% of those who buy
purslane and tomatoes also buy parsley” can be cited as examples of the rules.
Chen et al. (2005) studied market basket analysis and stated that the proposed
method is efficient in terms of calculations. They also identified that the stores vary in
size and that this method is advantageous over the traditional method when more stores
and periods are used.
Using association rules, Timor and Şimşek (2008) analyzed the shopping data of
customers of a large supermarket chain operating in the retail sector in Turkey. With
association rules and basket analysis, it was determined which products the customers
purchased together. Then, the variables that affect the purchasing behavior of the
customers were determined by decision trees. When the values obtained from the analyses
were examined, it was seen that customers who buy a certain product X also tend to buy
product Y, whereas customers who buy product Y do not buy product X at the same rate.
The authors noted that this and similar information may be used in campaign and shelf
arrangements and in the promotion and sale of products not associated with related products.
Song et al. (2009) conducted a competitive structure analysis of the Chinese soybean
import market. The study revealed that Chinese soybean importers may have stronger
market power in the Chinese soybean import market. It also developed a two-country
partial equilibrium trade model to test market power in US-China soybean trade and
estimated the model simultaneously. The results support the hypothesis that Chinese
soybean importers have stronger market power than US soybean exporters.
Musalem, Aburto, and Bosch (2018) presented an approach to identify the relation-
ships between product categories used to divide a retailer’s business into category sub-
sets. Browser data were used to reveal product category dependencies. Since the number
of possible relationships between them may be very large, the authors provided an
approach that produces an intuitive graphical representation of these relationships using
data analysis techniques found in standard statistical packages such as multidimensional
scaling and clustering. As a result of the analysis, four groups of product categories pur-
chased by customers emerged. The analysis of each of these groups was conducted with
the retail store under consideration as a small sub-group. As a result, it showed that
retailers can potentially benefit if they switch to a customer management approach that
COMMUNICATIONS IN STATISTICS—THEORY AND METHODS 3
identifies relationships between product categories rather than the traditional category
management approach, where they manage product categories separately.
Setiabudi et al. (2011) implemented another MBA method in their paper. They
analyzed the buying habits of shoppers using MBA and evaluated the implemented MBA
on Minimarket X. They used the well-known Apriori method to discover sets of items
that appear frequently in the transaction history database. The itemsets exceeding the
minimum support threshold were selected as frequent itemsets. These itemsets were then
used to generate association rules, followed by decoding. Every selected frequent itemset
was able to generate association rules, and the confidence was computed using
hybrid-dimension association rules. The experimental results indicated that the
implemented MBA was able to generate knowledge about the kinds of items that were
frequently purchased in a similar time frame by customers, using the criteria of
hybrid-dimension association rules. The outcomes of the mining process showed the
correlation between association rules and confidence that can be analyzed.
Raeder and Chawla (2011) took a different approach to mining transaction data.
By modeling the data as a product network, they discovered expressive
communities (clusters) in the data. With the network-based approach, they showed
that they can isolate relationships between products precisely and reduce the need to
search through a long list of rules. They first examined the characteristics of
product networks and then showed that identifying communities within these networks
could reveal meaningful relationships between association rules and products that
may otherwise be difficult to find.
Dogan, Erol, and Buldu (2014) used the Apriori algorithm to analyze data on the
customers of a leading insurance company operating in Turkey. The analysis shows which
product groups the customers prefer to buy together. With the minimum support value set
to 4% and the minimum confidence value set to 15%, the five associations with the highest
support and confidence values were as follows. 64% of the customers who purchased
Casco Insurance also purchased Traffic Insurance, and these customers constitute 17% of
all customers. 55% of those who purchased fire insurance also purchased DASK
(Compulsory Earthquake Insurance), and they constitute 9% of all customers. 47% of
those who have Compulsory Earthquake Insurance also purchased fire insurance, and they
make up 9% of all customers. 34% of those who purchased fire insurance also purchased
Traffic Insurance, and they account for 6% of all customers. 33% of those who purchased
fire insurance also purchased Casco Insurance, and they make up 5% of all customers.
Roodpishia and Nashtaei (2015) noted that many organizations today focus on
discovering hidden patterns in their customers’ behavior through customer analysis in
order to maintain their competitive position. Organizations are now aware that customers
are their most valuable resource. The research was conducted using data on 300 customers
of an insurance company in the city of Anzali, Iran, and the K-Means clustering method
was used. Using demographic variables such as gender, age, occupation, education level,
marital status, place of residence and income of customers, the
optimum number of clusters was determined to obtain the data required to group the
customers. Later, the researchers used the association rules method to find hidden patterns
in the insurance industry.
In a study by Dogan (2015), some statistical inferences regarding the password
structures of users of an e-commerce site were revealed. User password lengths
ranged from 4 to 12 characters. The average password length was 7.1, and it
was determined that 53% of the passwords were generated using only one character.
Based on this, it was found that the majority of passwords do not have sufficient
security. In the analysis, a data set of 9,997 people with values for all variables was
used. Nine meaningful and useful rules remained after the elimination process applied to
the rules obtained from association analysis. In rule 1, persons living in the
Southeastern Anatolia Region with a password complexity value of 2 and a short password
length were in the 25–44 age group with 98% accuracy. In rule 2, for male users who live
in Central Anatolia and are between the ages of 45 and 64, the password length is short
(1). All 12 people who met the conditions in the antecedent of the rule had the
same attribute (password complexity value 1) in the consequent of the rule. The
other seven rules can be interpreted in the same way.
Kaur and Kang (2016) stated that a data mining technique, i.e., association rule mining,
is presented as a supporting technique for examining customer behavior and increasing
sales. Association rule mining is said to be useful for discovering interesting
relationships hidden in large data sets. In their research, different mining types such
as association rule mining, classification, clustering, and other techniques were
discussed. The study also discussed the two basic measures of association rules, namely
support and confidence. The association rule mining technique, the rule induction
technique, and the Apriori algorithm were examined after application. As a result of the
analysis, it was found that a strong relationship exists between milk and butter, and
that many customers bought milk and butter together. It is said that these rules can help
retailers understand customers’ purchasing behavior.
Özçalıcı (2017), in his research on the secondhand vehicle market, assembled a data
set of 73 different variables for over two hundred thousand cars from an e-commerce
site using web scraping. The values of the variables were recoded so that the Apriori
algorithm could be used, and 100 rules with a support value of 10% and a confidence value
above 70% were identified. For example, a vehicle priced between 30,000 and 50,000
Turkish Liras will most likely include ABS, electric windscreen, central lock, CD player,
electric mirror and manual gear.
Gangurde, Kumar, and Gore (2017) designed an optimized technique for Market
Basket Analysis (MBA) to estimate and analyze customers’ buying behavior. The study
faced two difficulties in making the analysis. The first challenge was data cleanup since
none of the available techniques considered the possibility of the raw data or noisy data
in the transaction history. The second challenge was that customers’ demands constantly
change in terms of season and time. In addition, the output of the market basket ana-
lysis depends entirely on time and season and is therefore required to be performed
repeatedly. Therefore, a dynamic and automated MBA framework was needed. They
employed new algorithms based on data cleaning, Apriori, and a feed-forward neural
network (FFNN) to address these challenges. The performance of the proposed approach was
evaluated against
existing methods. The results were reported to be significant and promising in
demonstrating the effectiveness of the proposed approach.
Moodley et al. (2018) stated that in recent years (especially after the recession in the UK
in 2008) competition in the shopping sector has intensified, shopping habits and demo-
graphic characteristics have changed and price sensitivity has become increasingly import-
ant. Numerous studies have been undertaken to understand the items that are often
purchased together (association rule mining/frequent product sets) with a few measures
proposed to collect substantial support and to establish confidence at different levels of
accuracy as these criteria are highly content-dependent. Uninorms was used as an alterna-
tive measure to increase support and confidence in the analysis of market basket data
using the UK grocery retail sector as a case study. Experiments were conducted on con-
sumer panel data to compare Uninorm with three other popular measures (Jaccard,
Cosine, and Conviction). Uninorm has been found to outperform other models when it
complies with the basic monotonic characteristics of support in market basket analysis.
Liew (2018) performed a study to reveal the basic nutritional habits related to phys-
ical activity with the students’ opinions about the food quality in the school cafeteria
and vending machines. The empirical analysis was based on the 2011 Healthy School
Program (HSP) Assessment. HSP assesses the demographic characteristics, nutritional
habits, and exercise patterns of a representative sample of primary, secondary, and high
school students in the United States. The findings showed that students assigned to dif-
ferent clusters have different eating habits, exercise models, weight status, weight man-
agement, and opinions about the quality of food in the school cafeteria and vending
machines. Also, there were great differences in diet profiles and lifestyle behaviors
among students who were not sure of their overweight or weight status.
Bilgiç (2019) stated that market basket analysis is very important in terms of under-
standing consumers’ preferences and purchasing behaviors in the retail sector and
developing the most suitable production and marketing strategies. The study not only
provided useful results for the company that was the focus of the research but also
shared the R programming language code in detail throughout the study, so that both
researchers and retailers can analyze their own data with advanced
algorithms. In addition, unlike many other studies, relationships were sought between
individual products rather than between more general product groups. The analysis of the
strongest buying behavior showed that customers who
buy eggs also shop from the grocery store.
Yulianto and Heryanto (2019) conducted research on e-commerce application software
that uses market basket analysis to market handmade products produced by the
handicraft industry. Two main results were obtained from the analysis. First, the
established e-commerce application is expected to assist the handicraft industry in the
business process of marketing handmade products. Second, the use of the market basket
analysis method can improve the quality of service to customers, particularly in providing
product selection information, and can directly help owners decide on innovation.
Raja et al. (2019) collected market-based data and determined the frequent
and non-frequent item sets in order to uncover the reasons behind the sales data. In this
way, the least preferred product of the market was identified. In addition, a visualization
technique was used to make the sales data more comprehensible. The
authors also claimed that their proposed approach, in which the profit and loss of
each product is examined, can be used by supermarket-based organizations in the future.
Rezende and Ladeira (2019) studied the market basket analysis of a financial
institution and derived some association rules for individual consumers in São Paulo
state. Three association algorithms were presented in the article, but only one of them
was applied. The data handled were explained in detail, with all the filters and
treatments that were applied. The modeling of association rule algorithms and examples of
these algorithms were also described in the paper. Based on the results obtained, the
authors were able to determine the shopping basket of the financial institution and
tested the results under changing rules and conditions.
not, compares the entire rule with the randomly selected right-side elements. Therefore,
the degree of leverage, as well as the degree of confidence of a rule, should be considered.
If the leverage value is greater than 1, the association between the products in the
rule is positive, meaning that products A and B appear together more often than expected
and the rule is interesting. If the value is less than 1, there is a negative correlation
between them; therefore, rules with a leverage value of less than 1 are ignored. If the
value is exactly equal to 1, there is no correlation, indicating independence. The leverage
value also indicates whether the rule arises by chance or, on the contrary, is a
genuinely expected and good rule. The name leverage can be explained as follows: if the
leverage value of a two-product rule is greater than 1, the sale of one product may
increase the sale of the other product. Finally, the threshold values of these three
measures should be determined by analysts or experts (Bilgiç 2019, 92).
The conviction value is also used when forming association rules. To calculate the
conviction value, the probability that the elements of A are seen without element B is
calculated. If the conviction is 1, A and B are independent of each other. If the conviction
value is greater than 1, the related rule can be established (Şeker 2011).
\[
\mathrm{Conviction}(A \rightarrow B) = \frac{1 - \mathrm{Support}(B)}{1 - \mathrm{Confidence}(A \rightarrow B)} \tag{2}
\]
Because high values of all these measures do not guarantee that the resulting rules are
consistently interesting and important, the degree to which a rule is interesting is
determined using the lift value (Ateş and Karabatak 2017). A lift value less than or
greater than 1 indicates that the rule is of interest, whereas a lift value equal to 1
indicates that there is no interest (Jabbour, Mazouri, and Sais 2018).
\[
\mathrm{Lift}(A \rightarrow B) = \frac{P(A \cap B)}{P(A)\,P(B)} \tag{3}
\]
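The conviction and lift measures in Equations (2) and (3) can be illustrated with a minimal Python sketch. The function name `rule_measures` and the toy baskets are hypothetical and are not taken from the study's data set; the sketch also assumes that the antecedent occurs in at least one transaction and that the consequent occurs in at least one transaction.

```python
import math

def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, lift and conviction for the rule A -> B.

    transactions: list of sets of item names (hypothetical toy data).
    """
    n = len(transactions)
    a, b = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_b = sum(1 for t in transactions if b <= t)
    count_ab = sum(1 for t in transactions if (a | b) <= t)

    support_a = count_a / n
    support_b = count_b / n
    support_ab = count_ab / n               # P(A and B)
    confidence = count_ab / count_a         # P(B | A)
    lift = support_ab / (support_a * support_b)   # Equation (3)
    # Equation (2): the ratio is undefined when confidence is exactly 1,
    # so this sketch reports infinity in that case.
    if confidence == 1:
        conviction = math.inf
    else:
        conviction = (1 - support_b) / (1 - confidence)
    return {"support": support_ab, "confidence": confidence,
            "lift": lift, "conviction": conviction}

# Hypothetical toy baskets, for illustration only.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread"},
    {"milk", "bread", "eggs"},
]
measures = rule_measures(baskets, {"milk", "bread"}, {"eggs"})
```

Under Equation (2), a rule with confidence 1 has a zero denominator; a tool that prints a finite conviction for such rules presumably applies some adjustment, which this sketch does not attempt to reproduce.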
In general, Association Rule analysis consists of two stages (Aggarwal 2015, 98; Han,
Pei, and Kamber 2011, 231).
1. All frequent product sets (itemsets) above a support level (minimum support value)
previously determined by the user are detected.
2. Strong association rules are established from the frequent product sets; the rules
must be above the minimum support and confidence values set by the user.
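The second stage can be sketched in Python as follows, assuming the first stage has already produced a dictionary that maps each frequent itemset to its support. The function name `generate_rules`, the tuple layout of a rule, and the toy supports are all hypothetical; the sketch relies on every non-empty subset of a frequent itemset also being present in the dictionary.

```python
from itertools import combinations

def generate_rules(support, min_confidence):
    """Derive rules A -> B from frequent itemsets.

    support: dict mapping frozenset itemsets to their relative support.
    Returns a list of (antecedent, consequent, support, confidence) tuples.
    """
    rules = []
    for itemset, supp in support.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = supp / support[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, supp, confidence))
    return rules

# Hypothetical output of stage 1, for illustration only.
frequent = {
    frozenset({"milk"}): 0.8,
    frozenset({"eggs"}): 0.6,
    frozenset({"milk", "eggs"}): 0.6,
}
rules = generate_rules(frequent, min_confidence=0.8)
```

With these toy numbers only the rule from eggs to milk survives, because the confidence of the reverse rule (0.6 / 0.8) falls below the chosen threshold.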
product sets are candidates for two-product rules in the future (for the right-hand
side of the rules). This process keeps merging frequent products as long as the specified
support value is reached, until finally no more frequent sets can be found. After the
frequent product sets have been detected, the rule-finding process starts. Association
rules are those that lie above a predetermined minimum support value as well as a minimum
confidence value (Aggarwal 2015, 100; Giudici 2005, 93).
This algorithm takes advantage of the previous step by using prior knowledge of
frequently repeating items (Agrawal and Srikant 1994, 487–99).
The Apriori algorithm is based on the rule that all subsets of a frequent itemset
must also be frequent, and it uses an iterative approach.
First, the frequent itemsets with one element are found. This set is called L1 (frequent
1-element itemsets). L1 is used to obtain L2 (frequent 2-element itemsets).
The algorithm works iteratively in this way until no larger frequent itemsets can be
obtained. Finding each Lk requires scanning the entire database. The database is therefore
scanned many times to find frequently occurring items, and each pass applies the
Apriori algorithm’s join (concatenation), pruning, and minimum support
criteria (Han and Kamber 2000).
4.1.2. Pruning
Ck denotes the set of candidate itemsets of length k. Each candidate in Ck must be
examined to determine whether it is frequent. The frequent itemset Lk is obtained by
discarding the candidates whose count is less than the minimum support value. To do this,
the subsets of length (k-1) of each candidate are first checked for being frequent: a
candidate with any (k-1)-length subset that does not meet the minimum support value is
discarded from Ck. The counts of the remaining candidates in Ck are then obtained by
scanning the entire data set, and the candidates that do not reach the minimum support
value are also removed from Ck. As a result, the Lk set is formed, in which all
(k-1)-length subsets of every element are frequent and every element meets the minimum
support value. The join and pruning steps continue until the Lk-1 set is equal
to the empty set (Agrawal and Srikant 1994, 487–99; Han and Kamber 2000).
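The level-wise search with its join and pruning steps can be sketched in Python as below. This is a simplified illustration and not the implementation used in the study: the function name `apriori`, the set-based join, and the toy baskets are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its support."""
    n = len(transactions)

    def frequent_only(candidates):
        # One scan of the data set to count each candidate in Ck.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    items = {item for t in transactions for item in t}
    level = frequent_only({frozenset([i]) for i in items})   # L1
    all_frequent = dict(level)
    k = 2
    while level:                      # stop when L(k-1) is empty
        prev = set(level)
        # Join step: combine itemsets of L(k-1) into candidates of length k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent_only(candidates)                     # Lk
        all_frequent.update(level)
        k += 1
    return all_frequent

# Hypothetical toy baskets, for illustration only.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread"},
    {"milk", "bread", "eggs"},
]
frequent = apriori(baskets, min_support=0.5)
```

With a minimum support of 0.5 the pair bread and eggs is infrequent, so the three-item candidate is removed in the prune step without being counted.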
4.2. FP Growth
FP Growth is an improvement on Apriori designed to eliminate some of its heaviest
bottlenecks. The algorithm was designed with the benefits of MapReduce taken
into account, so it works well with any distributed system built around
MapReduce. FP Growth addresses the problems present in Apriori by using a structure
called an FP Tree. In an FP Tree, each node represents an item and its current
count, and each branch represents a different association (Singularities 2019).
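A minimal Python sketch of how such a tree could be built in two passes over the data is given below. The class name `FPNode`, the function name `build_fp_tree`, the tie-breaking order for equally frequent items, and the toy baskets are assumptions for illustration; header links and the mining step of FP Growth are left out.

```python
from collections import Counter

class FPNode:
    """One node of the tree: an item, its count, and its child nodes."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # First pass: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None)
    # Second pass: insert each transaction with its frequent items sorted by
    # descending count (ties broken alphabetically, an arbitrary choice here).
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root

# Hypothetical toy baskets, for illustration only.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread"},
    {"milk", "bread", "eggs"},
]
tree = build_fp_tree(baskets, min_count=2)
```

Transactions that share a prefix of frequent items share a path in the tree, which is what makes the structure compact.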
The biggest advantage of FP Growth is that the algorithm only needs
to read the file twice, whereas Apriori reads it once for every iteration. This
also reduces costs. Another major advantage is that, because it uses the FP Tree, it
removes the need to calculate the pairs to be counted, which is very processing-heavy.
This makes it O(n), which is much faster than the Apriori algorithm. The FP Growth
algorithm stores a compact version of the database in memory. However, it also has the
problem of data interdependence: when the algorithm is parallelized, some data still
need to be shared, which creates a bottleneck in the
shared memory.
5. Methodology
Weka is software developed at the University of Waikato for machine learning; its name
consists of the initials of “Waikato Environment for Knowledge Analysis”. Weka is one of
the 10 most used software packages in the field of business intelligence, and it ranks in
the top 3 among the most used free software in that field (Vohra 2012). Weka, which is
widely used today, includes machine learning algorithms and methods such as Filtered
Associator, FP Growth, Generalized Sequential Patterns, Predictive Apriori and Tertius.
Because it is developed in Java and its libraries come as jar files, it can be easily
integrated into projects written in Java, which makes its use more widespread (Şeker 2013).
Weka has a completely modular design and, with the features it contains, can perform
visualization, data analysis, business intelligence applications, and data mining on data
sets. Weka natively supports files with the .arff extension, and it also has tools for
converting CSV files to ARFF format. Basically, the following 3 data
mining operations can be done with Weka:
Classification
Clustering
Association
In addition to the above operations, pre-processing and post-processing operations can be
performed on the data sets:
Data Pre-Processing
Visualization
Weka initially reads all attributes as numeric. In fact, they are all binary, having
values of either 0 (not purchased) or 1 (purchased) in the dataset. Therefore, the dataset
was converted with the NumericToNominal filter, with the no class option selected. Since
the Apriori algorithm did not handle the numeric data set and did not give good results,
this study was carried out using the FP Growth algorithm in the Weka program. The FP
Growth algorithm expects input expressed as nominal attributes with only 2
values (i.e., 0 and 1).
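As an alternative illustration of the same idea, the binary purchase matrix could be written directly as an ARFF file whose attributes are declared nominal with the two values 0 and 1. The Python sketch below is not the procedure used in the study; the function name `to_arff`, the relation name and the sample rows are hypothetical, and attribute names are assumed not to contain single quotes.

```python
def to_arff(relation, attribute_names, rows):
    """Render 0/1 rows as ARFF text with nominal {0,1} attributes."""
    lines = [f"@relation {relation}", ""]
    for name in attribute_names:
        # Names are quoted because product names may contain spaces.
        lines.append(f"@attribute '{name}' {{0,1}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(int(value)) for value in row))
    return "\n".join(lines) + "\n"

# Hypothetical two-product example, for illustration only.
arff_text = to_arff("baskets", ["Eggs", "White Bread"], [[1, 0], [0, 1]])
```

Declaring the attributes as nominal in the file header means that no numeric-to-nominal conversion is needed after loading.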
The dataset consists of 1361 transactions. The minMetric and lowerBoundMinSupport
parameters had to be adjusted because the algorithm did not give
any rules at first. The algorithm was then run with delta 0.05, lowerBoundMinSupport 0.01,
minMetric 0.7, and upperBoundMinSupport 1.0. As a result of the FP Growth algorithm,
286,304 rules were obtained, and the first 10 rules with the highest conviction value
are given.
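To make the support threshold concrete, the short Python calculation below converts the lower bound of 0.01 into a transaction count for 1361 transactions. The variable names are illustrative, and the interpretation that a rule must cover at least this many transactions is an assumption about how the bound is applied.

```python
import math

n_transactions = 1361
lower_bound_min_support = 0.01

# Smallest whole number of transactions whose share reaches the bound.
min_count = math.ceil(lower_bound_min_support * n_transactions)

# Relative support of an itemset that occurs in 24 transactions.
support_of_24 = 24 / n_transactions
```

Under this reading an itemset needs to appear in at least 14 transactions, and an itemset found in 24 transactions has a support of roughly 0.018.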
6. Results
The higher the conviction value, the more decisive the rule is. If the conviction value is
1, the products in the rule are independent of each other. So it cannot be taken as a
rule. However, the dependence increases as the conviction value moves away from 1, so
the rule with the highest conviction value can be regarded as the most decisive rule.
1. [2pct. Milk = 1, Sweet Relish = 1, Pepperoni Pizza - Frozen = 1]: 24 ==> [Eggs
= 1]: 24 conf:(1) lift:(8.15) lev:(0.02) <conv:(21.06)>
When 2pct. Milk, Sweet Relish and Pepperoni Pizza - Frozen are bought together
by customers, the probability of buying Eggs increases 8.15 times. The conviction value
is 21.06; therefore, this rule ranks first as the best rule. Since the confidence value
is 1, it is concluded that 100% of the customers who buy 2pct. Milk, Sweet Relish and
Pepperoni Pizza - Frozen also buy Eggs: all 24 of the 24 customers who bought these
products also bought Eggs. If the confidence value were 0.9, 90% of the customers who
bought these products would also buy Eggs.
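Because lift equals confidence divided by the support of the right-hand side, the reported values also imply how common Eggs are overall. The Python arithmetic below is a rough check under that identity; the variable names are illustrative, the transaction total of 1361 is the figure stated in the Methodology section, and the result is approximate because the reported lift is rounded to two decimals.

```python
n_transactions = 1361
confidence = 1.0
lift = 8.15

# lift = confidence / support(Eggs), so support(Eggs) = confidence / lift
support_eggs = confidence / lift
approx_egg_baskets = round(support_eggs * n_transactions)
```

The rule therefore suggests that Eggs appear in roughly 12% of all baskets, while they appear in every basket that contains the three left-hand products.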
2. [Onions = 1, Wheat Bread = 1, Apples = 1]: 22 ==> [2pct. Milk = 1]: 22
conf:(1) lift:(9.13) lev:(0.01) <conv:(19.59)>
When Onions, Wheat Bread, and Apples are bought together by customers, the
probability of buying 2pct. Milk increases 9.13 times. The conviction value is
19.59; therefore, this rule is the second-best rule. Since the confidence value
is 1, it is concluded that 100% of the customers who buy Onions, Wheat Bread and
Apples also buy 2pct. Milk: all 22 of the 22 customers who bought these products
also bought 2pct. Milk.
3. [White Bread = 1, 2pct. Milk = 1, Plain Bagels = 1]: 22 ==> [Eggs = 1]: 22
conf:(1) lift:(8.15) lev:(0.01) <conv:(19.3)>
Buying White Bread, 2pct. Milk, and Plain Bagels together increases the likelihood of
buying Eggs 8.15 times. Here the conviction value is 19.3; therefore, this rule is
the third-best rule. Since the confidence value is 1, it is concluded that 100%
of customers who buy White Bread, 2pct. Milk and Plain Bagels also buy Eggs: all 22 of
the 22 customers who bought these products also bought Eggs.
4. [98pct. Fat Free Hamburger = 1, Toothpaste = 1, Garlic = 1]: 20 ==> [White
Bread = 1, Potato Chips = 1]: 20 conf:(1) lift:(19.44) lev:(0.01) <conv:(18.97)>
Buying 98pct. Fat Free Hamburger, Toothpaste and Garlic together increases the
probability of buying White Bread and Potato Chips together 19.44 times. Here the
conviction value is 18.97; therefore, this rule is the fourth-best rule. Since the
confidence value is 1, 100% of customers who buy 98pct. Fat Free Hamburger, Toothpaste
and Garlic also buy White Bread and Potato Chips: all 20 of the 20 customers who bought
these products also bought White Bread and Potato Chips.
The other 6 rules can be interpreted in the same way as the first 4 rules above.
Since the remaining 6 rules also have a confidence value of 1, 100% of the customers who
bought the products on the left-hand side of each rule also bought the products on the
right-hand side. It can also be said that the dependence decreases as the conviction
value decreases.
5. [Eggs = 1, 2pct. Milk = 1, 98pct. Fat Free Hamburger = 1, Onions = 1]: 21 ==>
[Potato Chips = 1]: 21 conf:(1) lift:(10.23) lev:(0.01) <conv:(18.95)>
6. [Eggs = 1, White Bread = 1, Wheat Bread = 1, Bananas = 1]: 21 ==> [2pct.
Milk = 1]: 21 conf:(1) lift:(9.13) lev:(0.01) <conv:(18.7)>
7. [Popcorn Salt = 1, Apple Fruit Roll = 1]: 21 ==> [Eggs = 1]: 21 conf:(1)
lift:(8.15) lev:(0.01) <conv:(18.42)>
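For further processing, rule lines of this form can be turned into structured records. The Python sketch below assumes each rule is on a single line and uses `=` and `==>` as separators, as in the rule listings; the pattern, the function name `parse_rule` and the dictionary keys are illustrative choices, not part of Weka.

```python
import re

# Pattern for one rule line; assumes the layout shown in the rule listings.
RULE_PATTERN = re.compile(
    r"\[(?P<lhs>[^\]]*)\]:\s*(?P<lhs_count>\d+)\s*==>\s*"
    r"\[(?P<rhs>[^\]]*)\]:\s*(?P<rhs_count>\d+)\s*"
    r"conf:\((?P<conf>[\d.]+)\)\s*lift:\((?P<lift>[\d.]+)\)\s*"
    r"lev:\((?P<lev>-?[\d.]+)\)\s*<conv:\((?P<conv>[\d.]+)\)>"
)

def parse_rule(line):
    """Split a rule line into item lists, counts and interest measures."""
    match = RULE_PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized rule line: {line!r}")

    def items(text):
        # "Eggs = 1, White Bread = 1" -> ["Eggs", "White Bread"]
        return [part.rsplit("=", 1)[0].strip() for part in text.split(",")]

    return {
        "antecedent": items(match.group("lhs")),
        "consequent": items(match.group("rhs")),
        "antecedent_count": int(match.group("lhs_count")),
        "rule_count": int(match.group("rhs_count")),
        "confidence": float(match.group("conf")),
        "lift": float(match.group("lift")),
        "leverage": float(match.group("lev")),
        "conviction": float(match.group("conv")),
    }

line = ("[Popcorn Salt = 1, Apple Fruit Roll = 1]: 21 ==> [Eggs = 1]: 21 "
        "conf:(1) lift:(8.15) lev:(0.01) <conv:(18.42)>")
rule = parse_rule(line)
```

Parsed records of this kind can then be sorted or filtered by any of the measures, for example by conviction as in the ranking above.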
References
Aggarwal, C. C. 2015. Data mining: The textbook. New York: Springer; IBM T.J. Watson
Research Center.
Agrawal, R., and R. Srikant. 1994. Fast algorithms for mining association rules. In Proceedings of
20th International Conference on Very Large Data Bases, VLDB, Vol. 1215, 487–99. Santiago,
Chile: IBM Almaden Research Center.
Albion Research Ltd. 2019. Market basket analysis.
https://www.albionresearch.com/data_mining/market_basket.php (accessed August 10, 2019).
Ateş, Y., and M. Karabatak. 2017. Multiple minimum support value for quantitative association
rules. Fırat University Journal of Engineering Sciences 29 (2):57–65.
Bilgiç, E. 2019. Market basket analysis with R programming language: An application on con-
sumer purchasing behavior of a supermarket in Muş. Journal of Social Sciences of Mus
Alparslan University 7 (3):89–97.
Chen, Y.-L., K. Tang, R.-J. Shen, and Y.-H. Hu. 2005. Market basket analysis in a multiple store
environment. Decision Support Systems 40 (2):339–54. doi:10.1016/j.dss.2004.04.009.
Cios, K. J., W. Pedrycz, R. W. Swiniarski, and L. Kurgan. 2007. Data mining - A knowledge dis-
covery approach. New York: Springer.
Dogan, B., B. Erol, and A. Buldu. 2014. Using association rule mining for customer relationship
management in insurance sector. Marmara Journal of Natural and Applied Sciences 3:105–14.
Dogan, O. 2015. The analysis of passwords structures in an E-commerce site user accounts by
using association rules. Journal of Internet Applications and Management 6 (2):49–61.
doi:10.5505/iuyd.2015.29491.
Dunham, M. H. 2002. Data mining: Introductory and advanced topics. USA: Pearson Education.
Gangurde, R., B. Kumar, and S. D. Gore. 2017. Optimized predictive model using artificial neural
network for market basket analysis. Computer Science and Electronics Journals 9 (1):42–52.
Giudici, P. 2005. Applied data mining: Statistical methods for business and industry. USA: John
Wiley & Sons.
Han, J., and M. Kamber. 2000. Data mining concept and techniques. 1st ed. USA: Morgan
Kaufmann Publishers.
Han, J., J. Pei, and M. Kamber. 2011. Data mining: Concepts and techniques. USA: Elsevier.
Jabbour, S., F. Mazouri, and L. Sais. 2018. Mining negatives association rules using constraints.
Procedia Computer Science 127:481–8. doi:10.1016/j.procs.2018.01.146.
Kaur, M., and S. Kang. 2016. Market basket analysis: Identify the changing trends of market data
using association rule mining. Procedia Computer Science 85:78–85.
doi:10.1016/j.procs.2016.05.180.
Liew, H. 2018. Dietary habits and physical activity: Results from cluster analysis and market bas-
ket analysis. Nutrition and Health 24 (2):83–92. doi:10.1177/0260106018770942.
Moodley, R., F. Chiclana, F. Caraffini, and J. Carter. 2018. Application of uninorms to market
basket analysis. International Journal of Intelligent Systems 34 (1): 1–14. doi:10.1002/int.22039.
Musalem, A., L. Aburto, and M. Bosch. 2018. Market basket analysis insights to support category
management. European Journal of Marketing 52 (7/8):1550–73. doi:10.1108/EJM-06-2017-0367.
Özçalıcı, M. 2017. Predicting second-hand car sales price using decision trees and genetic
algorithms. Alphanumeric Journal 5 (1):103–14.
Raeder, T., and N. V. Chawla. 2011. Market basket analysis with networks. Social Network
Analysis and Mining 1 (2):97–113. doi:10.1007/s13278-010-0003-7.
Raja, B., J. Pamina, P. Madhavan, and A. S. Kumar. 2019. Market behavior analysis using
descriptive approach. https://ssrn.com/abstract=3330017; http://dx.doi.org/10.2139/ssrn.3330017
(accessed February 6, 2019).
Rezende, F., and M. Ladeira. 2019. Market basket analysis in a financial institution. Singular
Engenheria 1 (1):6–12. doi:10.33911/singular-etg.v1i1.18.
Roodpishia, M. V., and R. A. Nashtaei. 2015. Market basket analysis in insurance industry.
Management Science Letters 5:393–400.
Scheffer, T. 2001. Finding association rules that trade support optimally against confidence. In
Proceedings of the 5th European Conference on Principles and Practice of Knowledge Discovery
in Databases, 424–35. Berlin Heidelberg: Springer.
Setiabudi, D. H., G. S. Budhi, I. W. J. Purnama, and A. Noertjahyana. 2011. Data mining market
basket analysis using hybrid-dimension association rules, case study in Minimarket X. 2011
International Conference on Uncertainty Reasoning and Knowledge Engineering, IEEE, Bali,
Indonesia.
Singularities. 2019. Apriori vs FP-growth for frequent item set mining.
https://www.singularities.com/blog/our-blog-1/post/apriori-vs-fp-growth-for-frequent-item-set-mining-11
(accessed August 17, 2019).
Smartbridge. 2019. Market basket analysis 101: Anticipating customer behavior.
https://smartbridge.com/market-basket-analysis-101/ (accessed August 10, 2019).
Song, B., M. A. Marchant, M. R. Reed, and S. Xu. 2009. Competitive analysis and market power
of China’s soybean import market. International Food and Agribusiness Management Review
12 (1):21–42.
Şeker, S. E. 2011. Interest measures for association rules.
http://bilgisayarkavramlari.sadievrenseker.com/2011/09/09/birliktelik-kurallarinin-pay-olcumleri-interest-measures-for-association-rules/
(accessed August 13, 2019).
Şeker, S. E. 2013. İş Zekası ve Veri Madenciliği (Weka ile). Turkey: Cinius Publications. ISBN:
9786051276717.
Timor, M., and T. U. Şimşek. 2008. Customer behavior modeling by using market basket analysis
in data mining. İstanbul Üniversitesi İşletme Fakültesi İşletme İktisadı Enstitüsü Dergisi
19 (59):3–10.
Ulaş, M. A., E. Alpaydın, N. Sönmez, and A. Kalkan. 2001. Veri Madenciliğinde Sepet Analizi
Uygulamaları, IT Summit 2001, TBD 18. Informatics Congress, 4–7 September, İstanbul.
Vancouver Island University. 2019. Marina Barsky, Computing Science, Vancouver Island
University. http://csci.viu.ca/barskym/teaching/DM2012/labs/LAB7/PartII.html (accessed
August 11, 2019).
Vohra, G. 2012. 10 Most popular analytics tools in business.
https://analyticstraining.com/10-most-popular-analytic-tools-in-business/ (accessed August 12, 2019).
Yulianto, E., and H. Heryanto. 2019. Rancang Bangun Perangkat Lunak E-commerce
Menggunakan Metode market basket analysis. Media Informatika 18 (1):21–41.