Association Rule Mining
04/16/2025
The technique behind e-commerce recommendation systems
Learning objectives
Algorithm
Motivation & History
Why do you need to study this topic?
Example: Amazon (2005), Bigbasket (2011), Flipkart (2007), etc.
What Is Frequent Pattern Analysis?
Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
Association Rule:
An association rule has two parts: an antecedent (if) and a consequent (then). An antecedent is an item found within the data. A consequent is an item found in combination with the antecedent.
Basic Concepts: Frequent Patterns
Itemset: a set of one or more items
k-itemset: I = {I1, …, Ik}
Transaction database T: a set of transactions T = {t1, t2, …, tn}
Support count of I: the frequency (number of occurrences) of itemset I
(Relative) support s: the fraction of transactions that contain I (i.e., the probability that a transaction contains I)
Transaction data: supermarket data
Market basket transactions:
t1: {bread, cheese, milk}
t2: {apple, eggs, salt, yogurt}
… …
tn: {biscuit, eggs, milk}
Concepts:
An item: an item/article in a basket
A transactional dataset: a set of transactions
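The market-basket data above maps directly onto sets in code. A minimal sketch of this encoding (the choice of list-of-sets is mine, not from the slides):

```python
# Each transaction is a set of items; the database T is a list of transactions.
# The items follow the example transactions t1, t2, and tn above.
T = [
    {"bread", "cheese", "milk"},          # t1
    {"apple", "eggs", "salt", "yogurt"},  # t2
    {"biscuit", "eggs", "milk"},          # tn
]
```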
The model: rules
A transaction t contains X, a set of items (itemset) in I, if X ⊆ t.
An association rule is an implication of the form:
X → Y, where X, Y ⊂ I, and X ∩ Y = ∅
A k-itemset is an itemset with k items.
E.g., {milk, bread, cereal} is a 3-itemset
Rule strength measures
Support: The rule holds with support sup in T (the transaction data set) if sup% of transactions contain X ∪ Y.
sup = Pr(X ∪ Y)
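These definitions can be sketched directly in code; the function names and the small transaction list (taken from the supermarket slide) are illustrative:

```python
def support_count(itemset, transactions):
    """Absolute support: number of transactions that contain the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Relative support: fraction of transactions containing the itemset."""
    return support_count(itemset, transactions) / len(transactions)

T = [
    {"bread", "cheese", "milk"},
    {"apple", "eggs", "salt", "yogurt"},
    {"biscuit", "eggs", "milk"},
]
# {milk} appears in 2 of the 3 transactions
print(support_count({"milk"}, T))       # 2
print(round(support({"milk"}, T), 2))   # 0.67
```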
Step 1: Mining all frequent itemsets
A frequent itemset is an itemset whose support is ≥ minsup.
Key idea: the Apriori property (downward closure property): every subset of a frequent itemset is also frequent.
[Itemset lattice over {A, B, C, D}: 1-itemsets A, B, C, D; 2-itemsets AB, AC, AD, BC, BD, CD; 3-itemsets ABC, ABD, ACD, BCD]
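The downward-closure property gives a cheap pruning test: a k-itemset can be frequent only if every one of its (k-1)-subsets is frequent. A minimal sketch of that test (the function name and sample sets are mine):

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_prev):
    """Prune rule: if any (k-1)-subset of the candidate is not in
    F_{k-1}, the candidate itself cannot be frequent."""
    k = len(candidate)
    return any(frozenset(s) not in frequent_prev
               for s in combinations(candidate, k - 1))

# Example over the lattice above: if AB is infrequent (absent from F2),
# ABC is pruned without ever counting its support.
F2 = {frozenset("AC"), frozenset("BC"), frozenset("CD")}
print(has_infrequent_subset(frozenset("ABC"), F2))  # True: AB is missing
```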
Closed Itemset:
An itemset X is closed if none of its immediate supersets has the same support count as X.
What is the set of closed itemsets?
<a1, …, a100>: 1
<a1, …, a50>: 2
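In the example above, both itemsets are closed: <a1, …, a50> has support 2, strictly higher than its superset <a1, …, a100> with support 1. The closedness test can be sketched on toy data (function name and sample counts are mine):

```python
def closed_itemsets(freq):
    """freq: dict mapping frozenset itemset -> support count.
    An itemset is closed if no proper superset has the same support."""
    closed = {}
    for X, sup in freq.items():
        if not any(X < Y and freq[Y] == sup for Y in freq):
            closed[X] = sup
    return closed

freq = {frozenset({"a1"}): 2,
        frozenset({"a1", "a2"}): 2,
        frozenset({"a1", "a2", "a3"}): 1}
# {a1} is not closed ({a1, a2} has the same support); the other two are.
print(sorted(len(X) for X in closed_itemsets(freq)))  # [2, 3]
```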
[Figure: Apriori pruning in the itemset lattice over {A, B, C, D, E}. Once AB is found to be infrequent, all of its supersets (ABC, ABD, ABE, …, ABCDE) are pruned without counting their support.]
The Apriori Algorithm
An iterative algorithm (also called level-wise search): find all 1-item frequent itemsets, then all 2-item frequent itemsets, and so on.
In each iteration k, only consider itemsets built from frequent (k-1)-itemsets.
Find frequent itemsets of size 1: F1
From k = 2:
Ck = candidates of size k: those itemsets of size k that could be frequent, given Fk-1
Fk = those candidates in Ck that are actually frequent
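The level-wise loop above can be sketched as follows; this is a minimal illustration (candidate generation joins pairs of itemsets from F_{k-1}; variable names and sample data are mine):

```python
from itertools import combinations

def apriori(transactions, minsup_count):
    """Level-wise search: F1, then C2 -> F2, C3 -> F3, ..."""
    items = {i for t in transactions for i in t}
    # F1: frequent 1-itemsets
    Fk = {frozenset({i}) for i in items
          if sum(i in t for t in transactions) >= minsup_count}
    all_frequent = set(Fk)
    k = 2
    while Fk:
        # Ck: join step, then prune any candidate with an infrequent (k-1)-subset
        Ck = {a | b for a in Fk for b in Fk if len(a | b) == k}
        Ck = {c for c in Ck
              if all(frozenset(s) in Fk for s in combinations(c, k - 1))}
        # Fk: candidates that are actually frequent
        Fk = {c for c in Ck
              if sum(c <= t for t in transactions) >= minsup_count}
        all_frequent |= Fk
        k += 1
    return all_frequent

T = [{"bread", "milk"}, {"bread", "cheese"}, {"bread", "milk", "cheese"}]
F = apriori(T, minsup_count=2)
print(frozenset({"bread", "milk"}) in F)  # True: it appears in 2 transactions
```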
Transactional data
Generating Strong Rules
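Strong rules are conventionally those rules X → Y from a frequent itemset that also pass a minimum-confidence threshold, where confidence = sup(X ∪ Y) / sup(X); that definition is not spelled out on this slide, so the sketch below assumes it (function name and sample supports are mine):

```python
from itertools import combinations

def strong_rules(itemset, support, minconf):
    """support: dict frozenset -> relative support (assumed precomputed).
    Returns rules X -> Y with confidence sup(X ∪ Y) / sup(X) >= minconf."""
    rules = []
    for r in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(itemset, r)):
            consequent = itemset - antecedent
            conf = support[itemset] / support[antecedent]
            if conf >= minconf:
                rules.append((antecedent, consequent, conf))
    return rules

support = {frozenset({"bread"}): 0.9,
           frozenset({"milk"}): 0.6,
           frozenset({"bread", "milk"}): 0.5}
for X, Y, conf in strong_rules(frozenset({"bread", "milk"}), support, 0.7):
    print(set(X), "->", set(Y), round(conf, 2))
# Only {milk} -> {bread} (confidence 0.5/0.6 ~ 0.83) passes minconf = 0.7
```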
Learning Outcomes