
FP GROWTH

The document outlines the construction of an FP-tree from a transaction database, detailing the process of identifying frequent itemsets and their organization into a header table. It distinguishes between frequent and infrequent itemsets, highlighting maximal frequent itemsets as those whose immediate supersets are infrequent. Additionally, it explains the concept of closed itemsets, noting that certain itemsets maintain unique support counts compared to their supersets.

Uploaded by

Anbumozhy T.S.
Copyright
© All Rights Reserved
FP GROWTH

Construct FP-tree from a Transaction Database

TID    Items bought                  (Ordered) frequent items
100    {f, a, c, d, g, i, m, p}      {f, c, a, m, p}
200    {a, b, c, f, l, m, o}         {f, c, a, b, m}
300    {b, f, h, j, o, w}            {f, b}
400    {b, c, k, s, p}               {c, b, p}
500    {a, f, c, e, l, p, m, n}      {f, c, a, m, p}

min_support = 3

1. Scan DB once, find the frequent 1-itemsets (single-item patterns).
2. Sort frequent items in frequency descending order to obtain the f-list.
3. Scan DB again, construct the FP-tree.

F-list = f-c-a-b-m-p

Header Table (the head column of each entry points to that item's nodes in the tree)

Item   Frequency
f      4
c      4
a      3
b      3
m      3
p      3

FP-tree (each node is item:count; indentation shows parent and child)

{}
    f:4
        c:3
            a:3
                m:2
                    p:2
                b:1
                    m:1
        b:1
    c:1
        b:1
            p:1
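The three steps above can be sketched in Python as follows. This is a minimal illustration, not a full FP-growth implementation: it only builds the tree and does not mine patterns from it. The names `FPNode`, `build_fp_tree`, and `TIE_BREAK` are hypothetical. The tie-break order is an assumption copied from the slide's F-list, since the support counts alone do not decide the order of f and c, or of a, b, m, and p.

```python
from collections import Counter

# Transaction database from the table above (TID -> items bought).
TRANSACTIONS = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o", "w"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3
# Assumption: ties between equally frequent items follow the slide's F-list.
TIE_BREAK = "fcabmp"


class FPNode:
    """One tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support, tie_break):
    # Scan 1: count each item once per transaction and keep the frequent ones.
    counts = Counter(i for items in transactions.values() for i in set(items))
    frequent = {i: n for i, n in counts.items() if n >= min_support}

    # Sort by descending frequency, then by the assumed tie-break order.
    def sort_key(item):
        pos = tie_break.index(item) if item in tie_break else len(tie_break)
        return (-frequent[item], pos, item)

    f_list = sorted(frequent, key=sort_key)
    rank = {item: pos for pos, item in enumerate(f_list)}

    # Scan 2: insert each transaction's ordered frequent items into the tree.
    root = FPNode(None)
    header = {item: [] for item in f_list}  # item -> all nodes for that item
    for items in transactions.values():
        ordered = sorted((i for i in set(items) if i in rank), key=rank.get)
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1  # shared prefixes accumulate counts
            node = child
    return root, header, f_list, frequent


root, header, f_list, frequent = build_fp_tree(TRANSACTIONS, MIN_SUPPORT, TIE_BREAK)
print("-".join(f_list))  # f-c-a-b-m-p
```

Here the header table is modeled as a plain dictionary of node lists rather than a chain of node-links; that is a simplification chosen for brevity.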


The itemsets in the lattice are divided into two groups: those that are frequent and those that are infrequent. A frequent itemset border, represented by a dashed line, is also shown in the diagram. Every itemset located above the border is frequent, while those located below the border (the shaded nodes) are infrequent.

A maximal frequent itemset is a frequent itemset none of whose immediate supersets is frequent. Among the itemsets residing near the border, {a, d}, {a, c, e}, and {b, c, d, e} are maximal frequent itemsets.

For example, {a, d} is maximal frequent because all of its immediate supersets, {a, b, d}, {a, c, d}, and {a, d, e}, are infrequent. In contrast, {a, c} is non-maximal because one of its immediate supersets, {a, c, e}, is frequent.

Maximal frequent itemsets provide a compact representation of frequent itemsets: every frequent itemset is a subset of some maximal frequent itemset.
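The maximality check can be sketched as below. The lattice diagram itself is not reproduced here, so this sketch assumes the frequent itemsets are exactly the non-empty subsets of the three maximal itemsets named above, and then verifies that the check recovers those three. The function names `downward_closure` and `maximal_itemsets` are hypothetical.

```python
from itertools import combinations

ITEMS = "abcde"
# Maximal frequent itemsets as stated in the text.
STATED_MAXIMAL = [frozenset("ad"), frozenset("ace"), frozenset("bcde")]


def downward_closure(itemsets):
    """Return every non-empty subset of the given itemsets."""
    closure = set()
    for itemset in itemsets:
        for size in range(1, len(itemset) + 1):
            for combo in combinations(sorted(itemset), size):
                closure.add(frozenset(combo))
    return closure


def maximal_itemsets(frequent, items):
    """Keep the frequent itemsets with no frequent immediate superset."""
    return {
        s for s in frequent
        if not any((s | {x}) in frequent for x in items if x not in s)
    }


# Assumption: the frequent side of the border is the closure of the stated sets.
frequent = downward_closure(STATED_MAXIMAL)
maximal = maximal_itemsets(frequent, ITEMS)

for itemset in sorted(maximal, key=sorted):
    print(sorted(itemset))
```

Under this assumption, {a, c} is frequent but is filtered out because {a, c, e} is also frequent, which matches the non-maximal example in the text.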
An itemset is closed if none of its immediate supersets has the same support count as the itemset itself.

For example, since the node {b, c} is associated with transaction IDs 1, 2, and 3, its support count is equal to three. From the transactions given in the diagram, notice that every transaction that contains b also contains c. Consequently, the support for {b} is identical to that of {b, c}, and {b} should not be considered a closed itemset.

Similarly, since c occurs in every transaction that contains both a and d, the itemset {a, d} is not closed. On the other hand, {b, c} is a closed itemset because it does not have the same support count as any of its supersets.
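The closedness test can be illustrated with a short Python sketch. The transactions from the diagram are not listed in the text, so the five-transaction database `DB` below is a hypothetical one, chosen only so that it agrees with the statements above ({b, c} occurs in transactions 1, 2, and 3; every transaction with b has c; every transaction with both a and d has c). The function names are likewise illustrative.

```python
# Hypothetical database, chosen to be consistent with the statements in the text.
DB = {
    1: set("abc"),
    2: set("abcd"),
    3: set("bce"),
    4: set("acde"),
    5: set("de"),
}


def tids(itemset, db):
    """IDs of the transactions that contain every item of the itemset."""
    return {tid for tid, items in db.items() if set(itemset) <= items}


def support(itemset, db):
    """Support count: number of transactions containing the itemset."""
    return len(tids(itemset, db))


def is_closed(itemset, db):
    """True if no immediate superset has the same support count.

    Checking immediate supersets is enough, because support can only
    stay equal or drop as items are added.
    """
    base = set(itemset)
    own = support(base, db)
    all_items = set().union(*db.values())
    return own > 0 and all(
        support(base | {x}, db) < own for x in all_items - base
    )


for candidate in ("b", "ad", "bc"):
    print(candidate, support(candidate, DB), is_closed(candidate, DB))
```

With this assumed database, {b} and {a, d} are reported as not closed and {b, c} as closed, in line with the discussion above.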
