DMDW Lab10[1]
1. Implement and demonstrate the FP-Growth algorithm using (i) the WEKA tool and (ii)
Python programming.
The FP-Growth algorithm finds frequent itemsets in large datasets without generating candidate
itemsets, which typically makes it faster than the Apriori algorithm. It compresses the input
dataset into a compact prefix-tree structure known as an FP-Tree (Frequent Pattern Tree). The
algorithm scans the database once to count item supports and discard infrequent items, then scans
it a second time to insert each transaction into the tree with its items sorted by descending
frequency, so that transactions sharing frequent items also share tree paths. It then recursively
mines the FP-Tree: for each frequent item it collects the conditional pattern base (the prefix
paths leading to that item), builds a conditional FP-Tree from it, and extracts the frequent
itemsets that end in that item. Because it avoids generating and testing a large number of
candidate sets, FP-Growth is efficient, especially on large and dense datasets. It is widely used
in applications like market basket analysis, customer behavior analysis, and recommender systems.
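The steps described above can be sketched in plain Python using only the standard library. This is a minimal illustration, not the lab's actual program: the names `FPNode`, `build_tree`, `mine`, and `fp_growth`, the absolute `min_count` threshold, and the small grocery dataset are all assumptions made for the example.

```python
from collections import defaultdict


class FPNode:
    """One node of the FP-Tree: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-Tree from (items, count) pairs.

    Returns a header table (item -> list of tree nodes) and the support
    counts of the frequent items.
    """
    # First pass: count the support of every item.
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_count}

    # Second pass: insert each transaction, items sorted by descending support.
    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def mine(weighted_transactions, min_count, suffix=frozenset()):
    """Recursively mine frequent itemsets; returns {frozenset: support count}."""
    header, frequent = build_tree(weighted_transactions, min_count)
    results = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        results[itemset] = frequent[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Mining the conditional base yields itemsets that end in `itemset`.
        results.update(mine(conditional, min_count, itemset))
    return results


def fp_growth(transactions, min_count):
    """Frequent itemsets of `transactions` with support count >= min_count."""
    return mine([(t, 1) for t in transactions], min_count)


# Hypothetical market-basket data, used only for illustration.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
itemsets = fp_growth(transactions, min_count=3)
for itemset, count in sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

No candidate itemsets are generated or tested against the database here: every reported itemset comes from counts already stored in a tree. A library routine such as `fpgrowth` in mlxtend would normally be preferred over a hand-written version like this one.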
(ii) Python programming
import pandas as pd
# fpgrowth (not apriori) is the mlxtend routine that implements FP-Growth
from mlxtend.frequent_patterns import fpgrowth, association_rules
OUTPUT:
2. Implement and demonstrate the Hierarchical clustering algorithm using (i) the WEKA
tool and (ii) Python programming.
Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters.
It does not require specifying the number of clusters in advance, unlike other methods such as K-
Means. In the agglomerative (bottom-up) variant described here, each data point starts as its own
cluster. Then, step by step, the two closest clusters are merged, where closeness is determined by
a chosen distance metric (like Euclidean distance) and a linkage criterion (such as single,
complete, or average linkage).
This continues until all points are combined into a single cluster, forming a tree-like structure
known as a dendrogram. This dendrogram can be cut at any level to obtain the desired number of
clusters. Hierarchical clustering is useful for visualizing data structure and is often applied in
fields like bioinformatics and social sciences.
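The merge procedure described above can be sketched in plain Python. This is a naive illustration under stated assumptions, not the lab's program: the function name `agglomerative`, its `linkage` parameter (`min` for single linkage, `max` for complete linkage), and the six sample points are hypothetical, and Euclidean distance is assumed.

```python
import math


def agglomerative(points, num_clusters, linkage=min):
    """Naive bottom-up clustering.

    `linkage=min` gives single linkage, `linkage=max` gives complete linkage.
    Returns the final clusters (lists of point indices) and the merge history
    as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []

    def cluster_distance(a, b):
        # Combine all pairwise Euclidean distances with the linkage rule.
        return linkage(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the closest pair of clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters, merges


# Hypothetical 2-D data points, used only for illustration.
points = [(1, 2), (1.5, 1.8), (5, 8), (8, 8), (1, 0.6), (9, 11)]
clusters, merges = agglomerative(points, num_clusters=2)
print(sorted(sorted(c) for c in clusters))
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Stopping at `num_clusters` corresponds to cutting the dendrogram at the level where that many clusters remain, and the recorded merge distances are the heights at which branches would join in the dendrogram. The pairwise search makes this sketch slow for large inputs, which is why library routines such as `scipy.cluster.hierarchy.linkage` are used in practice.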
(i) the WEKA tool
(ii) Python programming
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
OUTPUT:
[Figure: Data Points]
[Figure: Dendrogram]