Frequent Itemset Mining Methods

The document discusses frequent itemset mining methods, focusing on the Apriori algorithm and its efficiency improvements through techniques like FP-growth. It outlines the process of generating association rules from frequent itemsets and introduces variations to enhance the Apriori algorithm's scalability. Additionally, it covers mining frequent itemsets using vertical data formats and the steps involved in this process.

Frequent Itemset

Mining Methods

By,
Umakanth N
Session Outcomes

 Mine frequent itemsets by applying the Apriori algorithm, and improve efficiency by using the FP-growth algorithm
 Generate association rules from frequent itemsets
 Mine frequent itemsets using the vertical data format, and mine closed and max patterns

February 10, 2025 Frequent Itemset Mining Methods 2


Agenda
 Apriori algorithm – finding frequent itemsets
 Generating strong association rules
 Variations of the Apriori algorithm that improve efficiency and scalability
 Pattern-growth method
 Mining frequent itemsets using the vertical data format



Apriori Algorithm
 Uses prior knowledge of frequent itemset properties
 Principle:
     Apriori employs an iterative approach known as a level-wise search, in which frequent k-itemsets are used to explore (k + 1)-itemsets.
     The database is scanned to count each item; the 1-itemsets that satisfy the minimum support count form the set L1.
     L1 is used to find L2, the set of frequent 2-itemsets, which is used to find L3, and so on, until no more frequent k-itemsets can be found.
 Apriori property:
     All non-empty subsets of a frequent itemset must also be frequent.
     If a set cannot pass a test, all of its supersets will fail the same test; this property is called antimonotonicity.
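
The first scan and the Apriori property can be illustrated with a minimal sketch. The transaction table, the constant `MIN_SUP` and all function names below are assumptions made for illustration; they are not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Assumed toy transaction table (hypothetical, for illustration only).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
MIN_SUP = 2  # assumed minimum support count


def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def frequent_1_itemsets(transactions, min_sup):
    """L1: items whose support count reaches min_sup, with their counts."""
    counts = Counter(item for t in transactions for item in t)
    return {item: n for item, n in counts.items() if n >= min_sup}


def all_subsets_frequent(itemset, transactions, min_sup):
    """Apriori property check: every non-empty subset must be frequent."""
    items = sorted(itemset)
    return all(
        support_count(sub, transactions) >= min_sup
        for size in range(1, len(items) + 1)
        for sub in combinations(items, size)
    )


L1 = frequent_1_itemsets(TRANSACTIONS, MIN_SUP)
print(sorted(L1.items()))
print(all_subsets_frequent({"I1", "I2", "I5"}, TRANSACTIONS, MIN_SUP))
# Antimonotonicity: {I3, I5} is infrequent, so its superset {I1, I3, I5} is too.
print(support_count({"I3", "I5"}, TRANSACTIONS),
      support_count({"I1", "I3", "I5"}, TRANSACTIONS))
```

On this assumed table every item is frequent, and the infrequent pair {I3, I5} drags its superset {I1, I3, I5} below the threshold as well.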
Apriori Algorithm (Contd.)
 The Apriori property is used in the algorithm through a two-step process:
     Join step: to find Lk, a set of candidate k-itemsets, denoted Ck, is generated by joining Lk-1 with itself.
     Prune step: Ck is a superset of Lk; its members may or may not be frequent, but all the frequent k-itemsets are included in Ck. Any candidate that has a (k-1)-subset not in Lk-1 cannot be frequent and is removed from Ck before the database is scanned to count support.
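
The join and prune steps can be sketched as a single candidate-generation function. The name `apriori_gen` and the sample set `L2` are assumptions for illustration; itemsets are represented as frozensets, and two itemsets are joined when they agree on their first k-2 items in sorted order.

```python
from itertools import combinations


def apriori_gen(prev_frequent):
    """Build the candidate k-itemsets Ck from the frequent (k-1)-itemsets.

    `prev_frequent` is a collection of frozensets that all have size k-1.
    """
    prev = set(prev_frequent)
    ordered = sorted(tuple(sorted(s)) for s in prev)
    candidates = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Join step: merge two itemsets that share their first k-2 items.
            if a[:-1] != b[:-1]:
                continue
            candidate = frozenset(a + b[-1:])
            # Prune step: every (k-1)-subset of the candidate must be frequent.
            if all(frozenset(sub) in prev
                   for sub in combinations(sorted(candidate), len(a))):
                candidates.add(candidate)
    return candidates


# Hypothetical L2, used only to exercise the function.
L2 = [frozenset(p) for p in (("I1", "I2"), ("I1", "I3"), ("I1", "I5"),
                             ("I2", "I3"), ("I2", "I4"), ("I2", "I5"))]
C3 = apriori_gen(L2)
print(sorted(sorted(c) for c in C3))
```

For this assumed L2 the join produces six 3-itemsets, and the prune step discards four of them because they contain an infrequent pair such as {I3, I5} or {I4, I5}.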



Apriori Algorithm (Example)
 Assume that the minimum support count is 2.
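
The transaction table of this example did not survive in the text, so the following is only a sketch of a complete level-wise run on an assumed nine-transaction table with a minimum support count of 2. The table and the names `candidates_from` and `apriori` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed transaction table (hypothetical; the slide's own table is missing).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]


def candidates_from(prev, k):
    """Join L(k-1) with itself, then prune by the Apriori property."""
    ordered = sorted(tuple(sorted(s)) for s in prev)
    out = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if a[:-1] == b[:-1]:
                c = frozenset(a + b[-1:])
                if all(frozenset(s) in prev
                       for s in combinations(sorted(c), k - 1)):
                    out.add(c)
    return out


def apriori(transactions, min_sup):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): n for i, n in item_counts.items() if n >= min_sup}
    frequent = dict(current)
    k = 2
    while current:
        candidates = candidates_from(set(current), k)
        counts = Counter()
        for t in transactions:  # one database scan per level
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_sup}
        frequent.update(current)
        k += 1
    return frequent


FREQUENT = apriori(TRANSACTIONS, 2)
for itemset, count in sorted(FREQUENT.items(),
                             key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

On the assumed table the search stops after level 3: the only 4-itemset produced by the join, {I1, I2, I3, I5}, is pruned because {I2, I3, I5} is not frequent.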

Generating Association Rules from Frequent Itemsets
 Association rules can be generated in two steps:
     For each frequent itemset l, generate all non-empty subsets of l.
     For every non-empty proper subset s of l, output the rule $s \Rightarrow (l - s)$ if $\frac{support\_count(l)}{support\_count(s)} \geq min\_conf$, where min_conf is the minimum confidence threshold.
 Note: Because the rules are generated from frequent itemsets, each one automatically satisfies the minimum support.
 Example: Suppose {I1, I2, I5} is one of the frequent itemsets. Write all possible association rules.
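
One way to work through this example is the sketch below. It assumes a nine-transaction table in which {I1, I2, I5} has a support count of 2, and an assumed confidence threshold of 70%; the table, the threshold and the function names are all hypothetical.

```python
from itertools import combinations

# Assumed transaction table (hypothetical, for illustration only).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]


def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(l, transactions, min_conf):
    """Return (antecedent, consequent, confidence) for each strong rule."""
    l = frozenset(l)
    l_count = support_count(l, transactions)
    rules = []
    for size in range(1, len(l)):  # non-empty proper subsets only
        for s in combinations(sorted(l), size):
            s = frozenset(s)
            # confidence(s => l - s) = support_count(l) / support_count(s)
            conf = l_count / support_count(s, transactions)
            if conf >= min_conf:
                rules.append((tuple(sorted(s)), tuple(sorted(l - s)), conf))
    return rules


RULES = rules_from_itemset({"I1", "I2", "I5"}, TRANSACTIONS, min_conf=0.7)
for antecedent, consequent, conf in RULES:
    print(antecedent, "=>", consequent, f"{conf:.0%}")
```

With these assumptions, three of the six possible rules reach the threshold, each with 100% confidence; the rules with antecedents {I1}, {I2} and {I1, I2} fall below it.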
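Improving the Efficiency of Apriori
 Hash-based technique
     Used to reduce the size of the candidate k-itemsets Ck (for k > 1): itemsets are hashed into buckets, and an itemset whose bucket count is below the minimum support cannot be frequent
 Transaction reduction
     Reduces the number of transactions scanned in future iterations: a transaction that contains no frequent k-itemset cannot contain any frequent (k + 1)-itemset
 Partitioning
     The data is partitioned to find candidate itemsets: itemsets that are frequent in at least one partition become candidates, so only two database scans are needed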
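
The hash-based technique can be sketched for 2-itemsets as follows. The transaction table, the item numbering, the hash function and the number of buckets are all arbitrary assumptions chosen for illustration, and a minimum support count of 3 is used here so that some buckets actually fall below the threshold.

```python
from itertools import combinations

# Assumed transaction table (hypothetical, for illustration only).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
ITEM_ORDER = {"I1": 1, "I2": 2, "I3": 3, "I4": 4, "I5": 5}  # assumed numbering
N_BUCKETS = 7                                               # assumed table size


def bucket_of(pair):
    """Arbitrary hash function for a 2-itemset (an assumption)."""
    x, y = sorted(pair, key=ITEM_ORDER.get)
    return (ITEM_ORDER[x] * 10 + ITEM_ORDER[y]) % N_BUCKETS


def bucket_counts(transactions):
    """Hash every 2-itemset of every transaction during the first scan."""
    counts = [0] * N_BUCKETS
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[bucket_of(pair)] += 1
    return counts


def candidate_pairs(transactions, min_sup):
    """Keep only the pairs whose bucket count reaches min_sup."""
    counts = bucket_counts(transactions)
    items = sorted({i for t in transactions for i in t})
    return [p for p in combinations(items, 2)
            if counts[bucket_of(p)] >= min_sup]


BUCKETS = bucket_counts(TRANSACTIONS)
PAIRS = candidate_pairs(TRANSACTIONS, min_sup=3)
print(BUCKETS)
print(PAIRS)
```

The bucket test is only a necessary condition. In this sketch the pair (I3, I4) survives because it shares a bucket with the frequent pair (I1, I3), so the surviving candidates still need their real support counted in the next scan.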



Improving the Efficiency of Apriori (Contd.)
 Sampling
     Frequent itemsets are searched for in a random sample S of the dataset D instead of in D itself
     A support threshold lower than the minimum support is used on the sample, to reduce the chance of missing itemsets that are frequent in D
 Dynamic itemset counting
     The database is partitioned into blocks marked by start points, and new candidate itemsets can be added at any start point
     The support count accumulated so far is used as a lower bound of the actual count: once it reaches the minimum support, the itemset is added to the frequent itemsets
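
The sampling idea can be sketched as below: mine a random sample with a lowered threshold, then verify the candidates with one scan of the full dataset. The table, the `slack` factor of 0.8, the brute-force miner (usable only on tiny data) and all names are assumptions for illustration.

```python
import random
from itertools import combinations

# Assumed transaction table (hypothetical, for illustration only).
TRANSACTIONS = [frozenset(t) for t in (
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
)]


def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets_bruteforce(transactions, min_count):
    """Enumerate every itemset; a stand-in miner for tiny data only."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = support_count(itemset, transactions)
            if count >= min_count:
                result[itemset] = count
    return result


def mine_by_sampling(transactions, min_sup, sample_size, slack=0.8, seed=0):
    """Mine a sample with a lowered threshold, then verify on the full data."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)
    # Scale min_sup to the sample size and lower it by the slack factor.
    lowered = max(1, int(slack * min_sup * sample_size / len(transactions)))
    candidates = frequent_itemsets_bruteforce(sample, lowered)
    verified = {}
    for itemset in candidates:  # one scan of the full dataset
        count = support_count(itemset, transactions)
        if count >= min_sup:
            verified[itemset] = count
    return verified


EXACT = frequent_itemsets_bruteforce(TRANSACTIONS, 2)
APPROX = mine_by_sampling(TRANSACTIONS, min_sup=2, sample_size=6)
print(len(APPROX), "of", len(EXACT), "frequent itemsets found from the sample")
```

The verification scan guarantees that every reported itemset is truly frequent, but an itemset that never appears in the sample can still be missed; that loss of accuracy is the price of the reduced scanning cost.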



Pattern-Growth Approach for Mining Frequent Itemsets
 Drawbacks of the Apriori algorithm:
     It may need to generate a huge number of candidate sets
     It may need to scan the whole database repeatedly and check a large set of candidates by pattern matching
 To avoid these drawbacks, the Frequent Pattern Growth (FP-growth) algorithm adopts a divide-and-conquer strategy:
     It compresses the database representing frequent items into a frequent pattern tree, or FP-tree
     It divides the compressed database into a set of conditional databases, each associated with one frequent item or "pattern fragment"
     It mines each conditional database separately
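
The three steps above can be sketched in compact form. This is a simplified illustration, not a tuned implementation: the transaction table and all names are assumptions, and each conditional database is rebuilt as a small FP-tree of its own.

```python
from collections import Counter, defaultdict

# Assumed transaction table (hypothetical, for illustration only).
TRANSACTIONS = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]


class Node:
    """One FP-tree node: an item, a count and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted, min_sup):
    """Build an FP-tree from (items, count) pairs; return header and supports."""
    support = Counter()
    for items, count in weighted:
        for item in set(items):
            support[item] += count
    frequent = {i: c for i, c in support.items() if c >= min_sup}
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes that carry the item
    for items, count in weighted:
        # Keep frequent items only, ordered by descending support.
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(weighted, min_sup, suffix=frozenset()):
    """Return {itemset: support count} without generating candidates."""
    header, frequent = build_tree(weighted, min_sup)
    result = {}
    for item, support in frequent.items():
        pattern = suffix | {item}
        result[pattern] = support
        # Conditional pattern base: the prefix path of each node of `item`.
        conditional = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Mine the conditional database separately (divide and conquer).
        result.update(fp_growth(conditional, min_sup, pattern))
    return result


PATTERNS = fp_growth([(t, 1) for t in TRANSACTIONS], min_sup=2)
for itemset, count in sorted(PATTERNS.items(),
                             key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

Only the first tree is built from the original transactions; every later tree is built from prefix paths of an existing tree, which is where the savings over repeated full scans come from.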
FP Growth - Example

Mining Frequent Itemsets using the Vertical Data Format
 Apriori and FP-growth mine frequent patterns from data in horizontal format, i.e., {TID : itemset}, where TID is the transaction ID and itemset is the set of items bought in that transaction
 Vertical data format: {item : TID_set}, where item is the item name and TID_set is the set of IDs of the transactions that contain that item
[Figure: conversion from horizontal to vertical data format]
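
The conversion itself is a simple inversion of the mapping, as the sketch below suggests. The transaction IDs, the table contents and the name `to_vertical` are assumptions for illustration.

```python
from collections import defaultdict

# Assumed horizontal table {TID: itemset} (hypothetical).
HORIZONTAL = {
    "T100": {"I1", "I2", "I5"}, "T200": {"I2", "I4"}, "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"}, "T500": {"I1", "I3"}, "T600": {"I2", "I3"},
    "T700": {"I1", "I3"}, "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def to_vertical(horizontal):
    """Turn {TID: itemset} into {item: TID_set} with one pass over the data."""
    vertical = defaultdict(set)
    for tid, items in horizontal.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


VERTICAL = to_vertical(HORIZONTAL)
for item in sorted(VERTICAL):
    print(item, sorted(VERTICAL[item]))
```

In the vertical format the support count of an item is simply the size of its TID_set, so no further scan of the original table is needed.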



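Mining Vertical Data Format - Steps
1. Convert the data from horizontal to vertical format, if required.
2. Use the Apriori property to construct candidate (k + 1)-itemsets from the frequent k-itemsets; the TID_set of each candidate is the intersection of the TID_sets of the k-itemsets it is built from.
3. Keep only the candidates that satisfy min_support (the size of a TID_set is the support count of its itemset); these form the frequent (k + 1)-itemsets.
4. Repeat steps 2 and 3 until no further candidate (k + 1)-itemsets can be constructed.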
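
These steps can be sketched as a level-wise loop over TID-set intersections. The vertical table below and the name `mine_vertical` are assumptions for illustration; the data is taken to be already in vertical format, so step 1 is skipped.

```python
# Assumed vertical table {item: TID_set} (hypothetical).
VERTICAL = {
    "I1": {"T100", "T400", "T500", "T700", "T800", "T900"},
    "I2": {"T100", "T200", "T300", "T400", "T600", "T800", "T900"},
    "I3": {"T300", "T500", "T600", "T700", "T800", "T900"},
    "I4": {"T200", "T400"},
    "I5": {"T100", "T800"},
}


def mine_vertical(vertical, min_sup):
    """Return {itemset: support count} by intersecting TID sets level by level."""
    # Frequent 1-itemsets, keyed by sorted tuples so that prefixes can be compared.
    level = {(item,): tids for item, tids in vertical.items()
             if len(tids) >= min_sup}
    frequent = {}
    while level:
        frequent.update({frozenset(k): len(tids) for k, tids in level.items()})
        keys = sorted(level)
        next_level = {}
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                if a[:-1] != b[:-1]:
                    continue  # join only itemsets sharing their first k-1 items
                candidate = a + b[-1:]
                # Apriori property: every k-subset must be frequent.
                if any(candidate[:j] + candidate[j + 1:] not in level
                       for j in range(len(candidate))):
                    continue
                tids = level[a] & level[b]  # TID set of the (k+1)-itemset
                if len(tids) >= min_sup:
                    next_level[candidate] = tids
        level = next_level
    return frequent


RESULT = mine_vertical(VERTICAL, min_sup=2)
for itemset, count in sorted(RESULT.items(),
                             key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

Unlike the horizontal approach, support counting here never goes back to the database: the intersection of two TID sets already holds every transaction that contains the combined itemset.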



Queries?
