4.1) FP Growth Algorithm
FP-Tree/FP-Growth Algorithm
• Uses a compressed representation of the database, the
FP-tree.
• Once the FP-tree has been constructed, a recursive
divide-and-conquer approach mines the frequent itemsets from it.
TID  Items
1    {A,B}
2    {B,C,D}
3    {A,C,D,E}
4    {A,D,E}
5    {A,B,C}
6    {A,B,C,D}
7    {B,C}
8    {A,B,C}
9    {A,B,D}
10   {B,C,E}

After reading TID=1:
null
  B:1
    A:1

After reading TID=2:
null
  B:2
    A:1
    C:1
      D:1
FP-Tree Construction
FP-tree for the transaction database (TIDs 1 to 10):
null
  B:8
    A:5
      C:3
        D:1
      D:1
    C:3
      D:1
      E:1
  A:2
    C:1
      D:1
        E:1
    D:1
      E:1
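The construction can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the `Node` class, `build_fp_tree`, and `count_nodes` are hypothetical names, and ties in support (A and C both occur 7 times) are assumed to be broken alphabetically, which reproduces the B, A, C, D, E order of the tree for this database.

```python
from collections import Counter

TRANSACTIONS = [
    {"A", "B"}, {"B", "C", "D"}, {"A", "C", "D", "E"}, {"A", "D", "E"},
    {"A", "B", "C"}, {"A", "B", "C", "D"}, {"B", "C"}, {"A", "B", "C"},
    {"A", "B", "D"}, {"B", "C", "E"},
]

class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions):
    # Pass 1: support count of every item.
    support = Counter(item for t in transactions for item in t)
    root = Node(None)  # the "null" root
    # Pass 2: insert each transaction, items in decreasing order of support
    # (alphabetical tie-break is an assumption of this sketch).
    for t in transactions:
        node = root
        for item in sorted(t, key=lambda i: (-support[i], i)):
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1  # shared prefixes only increment counts
    return root

def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())

root = build_fp_tree(TRANSACTIONS)
top_level = {item: child.count for item, child in root.children.items()}
total_items = sum(len(t) for t in TRANSACTIONS)
print(top_level)
print(count_nodes(root), "nodes for", total_items, "item occurrences")
```

The compression comes from shared prefixes: the 30 item occurrences in the database are stored in 14 tree nodes.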
• The size of an FP-tree also depends on how the items are ordered.
– If the ordering scheme in the preceding example is reversed,
i.e., items are sorted from lowest to highest support, the resulting
FP-tree is likely to be denser (shown in next slide).
• Not always, though: ordering by support is just a heuristic.
An FP-tree representation for the data set with a different item ordering scheme.
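The effect of the ordering can be checked numerically. The sketch below is an illustration under stated assumptions: `fp_tree_size` is a hypothetical helper, ties in support are broken alphabetically, and the tree is not built explicitly. It relies on the fact that every distinct prefix of a sorted transaction corresponds to exactly one tree node.

```python
from collections import Counter

TRANSACTIONS = ["AB", "BCD", "ACDE", "ADE", "ABC", "ABCD", "BC", "ABC", "ABD", "BCE"]

def fp_tree_size(transactions, descending=True):
    """Number of non-root FP-tree nodes under the chosen item ordering."""
    support = Counter(item for t in transactions for item in set(t))
    sign = -1 if descending else 1
    prefixes = set()
    for t in transactions:
        path = tuple(sorted(set(t), key=lambda i: (sign * support[i], i)))
        # Each distinct prefix of a sorted transaction is one tree node.
        for k in range(1, len(path) + 1):
            prefixes.add(path[:k])
    return len(prefixes)

high_to_low = fp_tree_size(TRANSACTIONS, descending=True)
low_to_high = fp_tree_size(TRANSACTIONS, descending=False)
print(high_to_low, low_to_high)  # 14 19
```

For this database, the reversed ordering needs 19 nodes instead of 14. Other data sets may behave differently, which is why the ordering is only a heuristic.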
FP-Growth (I)
• FP-growth generates frequent itemsets from an FP-tree by
exploring the tree in a bottom-up fashion.
[Figure: the FP-tree paths that contain the E nodes.]
Conditional FP-Tree for E
• We now need to build a conditional FP-Tree for E, which is the
tree of itemsets ending in E.
[Figure: conditional FP-tree for E.]
We continue recursively.
Base of recursion: when the tree
has only a single path.
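The bottom-up recursion can be sketched in Python as follows. This is a simplified illustration, not the full algorithm: instead of materialising each conditional FP-tree, it recurses directly on the conditional pattern base (the prefixes of the transactions that contain the item in focus), and it omits the single-path shortcut, so the recursion simply stops when no prefixes are left. The function name `fp_growth`, the alphabetical tie-break, and the minimum support count of 2 are assumptions made for this example.

```python
from collections import Counter

def fp_growth(pattern_base, min_count, suffix=()):
    """Yield (itemset, support) for every frequent itemset ending in `suffix`.

    pattern_base is a list of (items, count) pairs; at the top level each
    transaction appears once with count 1.
    """
    support = Counter()
    for items, count in pattern_base:
        for item in set(items):
            support[item] += count
    # Frequent items in decreasing order of support (ties alphabetical).
    frequent = sorted((i for i in support if support[i] >= min_count),
                      key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(frequent)}
    # Bottom-up: start with the least frequent item.
    for item in reversed(frequent):
        itemset = (item,) + suffix
        yield frozenset(itemset), support[item]
        # Conditional pattern base: for each entry containing `item`,
        # keep only the frequent items ranked before it.
        conditional = []
        for items, count in pattern_base:
            if item in items:
                prefix = tuple(i for i in items
                               if i in rank and rank[i] < rank[item])
                if prefix:
                    conditional.append((prefix, count))
        yield from fp_growth(conditional, min_count, itemset)

TRANSACTIONS = ["AB", "BCD", "ACDE", "ADE", "ABC", "ABCD", "BC", "ABC", "ABD", "BCE"]
result = dict(fp_growth([(tuple(t), 1) for t in TRANSACTIONS], min_count=2))
print(result[frozenset("DE")])  # 2: D and E occur together in TIDs 3 and 4
```

Each frequent itemset is produced exactly once, because an itemset is always grown from its lowest-ranked item towards the higher-ranked ones.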
FP-Tree Another Example
Transactions   Transactions with items sorted based on frequencies,
               ignoring the infrequent items
ABCEFO         ACEBF
ACG            ACG
EI             E
ACDEG          ACEGD
ACEGL          ACEG
EJ             E
ABCEFP         ACEBF
ACD            ACD
ACEGM          ACEG
ACEGN          ACEG

Freq. 1-itemsets (supp. count ≥ 2):
A:8  C:8  E:8  G:5  B:2  D:2  F:2
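The preprocessing step of this example (count the items, drop the infrequent ones, sort each transaction by decreasing frequency) can be reproduced with a short Python sketch. The helper names `frequent_items` and `sort_transaction` are hypothetical, and ties in frequency are assumed to be broken alphabetically, which matches the order A, C, E, G, B, D, F used here.

```python
from collections import Counter

TRANSACTIONS = ["ABCEFO", "ACG", "EI", "ACDEG", "ACEGL",
                "EJ", "ABCEFP", "ACD", "ACEGM", "ACEGN"]
MIN_COUNT = 2  # minimum support count used in this example

def frequent_items(transactions, min_count):
    """Support count of every item that reaches the minimum support."""
    support = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in support.items() if n >= min_count}

def sort_transaction(transaction, support):
    """Drop infrequent items and sort the rest by decreasing support."""
    kept = (item for item in set(transaction) if item in support)
    return "".join(sorted(kept, key=lambda i: (-support[i], i)))

support = frequent_items(TRANSACTIONS, MIN_COUNT)
header = sorted(support, key=lambda i: (-support[i], i))
sorted_transactions = [sort_transaction(t, support) for t in TRANSACTIONS]

print(header)
print(sorted_transactions)
```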
FP-Tree after reading 1st transaction
Header: A:8  C:8  E:8  G:5  B:2  D:2  F:2
Transaction inserted: ACEBF
null
  A:1
    C:1
      E:1
        B:1
          F:1
FP-Tree after reading 2nd transaction
Transaction inserted: ACG
null
  A:2
    C:2
      E:1
        B:1
          F:1
      G:1
FP-Tree after reading 3rd transaction
Transaction inserted: E
null
  A:2
    C:2
      E:1
        B:1
          F:1
      G:1
  E:1
FP-Tree after reading 4th transaction
Transaction inserted: ACEGD
null
  A:3
    C:3
      E:2
        B:1
          F:1
        G:1
          D:1
      G:1
  E:1
FP-Tree after reading 5th transaction
Transaction inserted: ACEG
null
  A:4
    C:4
      E:3
        B:1
          F:1
        G:2
          D:1
      G:1
  E:1
FP-Tree after reading 6th transaction
Transaction inserted: E
null
  A:4
    C:4
      E:3
        B:1
          F:1
        G:2
          D:1
      G:1
  E:2
FP-Tree after reading 7th transaction
Transaction inserted: ACEBF
null
  A:5
    C:5
      E:4
        B:2
          F:2
        G:2
          D:1
      G:1
  E:2
FP-Tree after reading 8th transaction
Transaction inserted: ACD
null
  A:6
    C:6
      E:4
        B:2
          F:2
        G:2
          D:1
      G:1
      D:1
  E:2
FP-Tree after reading 9th transaction
Transaction inserted: ACEG
null
  A:7
    C:7
      E:5
        B:2
          F:2
        G:3
          D:1
      G:1
      D:1
  E:2
FP-Tree after reading 10th transaction
Transaction inserted: ACEG
null
  A:8
    C:8
      E:6
        B:2
          F:2
        G:4
          D:1
      G:1
      D:1
  E:2
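The ten insertion steps can be replayed with a small Python sketch. It assumes the transactions are already sorted by frequency with infrequent items removed, and it stores the tree as nested dictionaries (`item -> [count, children]`), which is a simplification chosen for this illustration: there is no header table and there are no node links. The names `insert` and `render` are hypothetical.

```python
SORTED_TRANSACTIONS = ["ACEBF", "ACG", "E", "ACEGD", "ACEG",
                       "E", "ACEBF", "ACD", "ACEG", "ACEG"]

def insert(tree, transaction):
    """Add one sorted transaction, sharing any prefix already in the tree."""
    node = tree
    for item in transaction:
        entry = node.setdefault(item, [0, {}])  # [count, children]
        entry[0] += 1
        node = entry[1]

def render(tree, depth=1):
    """Return the tree as indented 'item:count' lines, children in insertion order."""
    lines = []
    for item, (count, children) in tree.items():
        lines.append("  " * depth + f"{item}:{count}")
        lines.extend(render(children, depth + 1))
    return lines

tree = {}
for n, transaction in enumerate(SORTED_TRANSACTIONS, start=1):
    insert(tree, transaction)
    print(f"After transaction {n} ({transaction}):")
    print("\n".join(["null"] + render(tree)))

final_tree = render(tree)
```

A transaction that shares a prefix with an existing path only increments counts along that path; new nodes are created from the first item where the paths diverge.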
Conditional FP-Trees
Build the conditional FP-Tree for each of the items.
For this:
2. Read the tree again to determine the new counts of the items
along those paths. Build a new header.
Paths containing D after updating the counts:
null
  A:2
    C:2
      E:1
        G:1
          D:1
      D:1

The other items (E and G) are removed as infrequent, which leaves
the single path A:2, C:2.
The tree is just a single path; it is the base case for the recursion.
So, we just produce all the subsets of the items on this path,
each merged with D:
{D} {A,D} {C,D} {A,C,D}
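The D case can be reproduced with a short Python sketch. It is an illustration under stated assumptions: it works on the sorted transactions rather than on tree nodes, the helper names are hypothetical, and the final step assumes that the conditional tree is a single path, which holds for D here because A and C appear in every prefix.

```python
from collections import Counter
from itertools import combinations

SORTED_TRANSACTIONS = ["ACEBF", "ACG", "E", "ACEGD", "ACEG",
                       "E", "ACEBF", "ACD", "ACEG", "ACEG"]
MIN_COUNT = 2

def conditional_pattern_base(transactions, item):
    """Prefixes that precede `item` in the sorted transactions, with counts."""
    base = Counter()
    for t in transactions:
        if item in t:
            base[t[:t.index(item)]] += 1
    return base

def conditional_counts(base, min_count):
    """Updated item counts along the prefix paths, infrequent items removed."""
    counts = Counter()
    for prefix, n in base.items():
        for i in prefix:
            counts[i] += n
    return {i: c for i, c in counts.items() if c >= min_count}

base = conditional_pattern_base(SORTED_TRANSACTIONS, "D")
counts = conditional_counts(base, MIN_COUNT)

# Single-path base case (assumed here): every subset of the path, merged with D.
path = sorted(counts)
itemsets = [set(c) | {"D"}
            for r in range(len(path) + 1)
            for c in combinations(path, r)]
print(base)
print(counts)
print(itemsets)
```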
Exercise: Complete the example.