
1st Question: Frequent Itemset Generation Using the Apriori Algorithm

What It Does: The Apriori Algorithm finds items that are often bought together in transactions.

Steps

1. Support: Check how often an item or group of items appears in transactions.


\text{Support} = \frac{\text{Number of Transactions with the Item(s)}}{\text{Total Transactions}}

2. Threshold: Only keep items that appear frequently enough (e.g., at least 40%).
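
Both steps can be written in a few lines of Python. This is a minimal sketch, assuming transactions are stored as Python sets; the helper name support is purely illustrative:

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset.issubset(t))
    return hits / len(transactions)

transactions = [
    {"Bread", "Milk", "Beer"},
    {"Bread", "Diaper", "Milk"},
    {"Milk", "Diaper", "Beer"},
    {"Bread", "Milk"},
    {"Bread", "Diaper"},
]

print(support({"Bread"}, transactions))          # 0.8 -> above the 40% threshold, keep
print(support({"Bread", "Milk"}, transactions))  # 0.6 -> keep
```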

Example

Dataset:

Transaction   Items Bought
T1            Bread, Milk, Beer
T2            Bread, Diaper, Milk
T3            Milk, Diaper, Beer
T4            Bread, Milk
T5            Bread, Diaper

1. Find Single Frequent Items:

◦ Bread: 4/5 = 0.8


◦ Milk: 4/5 = 0.8
◦ Beer: 2/5 = 0.4
◦ Diaper: 3/5 = 0.6
Keep: Bread, Milk, Beer, Diaper.
2. Find Frequent Pairs:

◦ Bread + Milk: 3/5 = 0.6 (Keep)

◦ Bread + Diaper: 2/5 = 0.4 (Keep)
◦ Milk + Diaper: 2/5 = 0.4 (Keep)
◦ Milk + Beer: 2/5 = 0.4 (Keep)
3. Find Frequent Triples:

◦ Bread + Milk + Diaper: 1/5 = 0.2 (Too low, discard).

Result:

• Frequent Single Items: Bread, Milk, Beer, Diaper.


• Frequent Pairs: {Bread, Milk}, {Bread, Diaper}, {Milk, Diaper}, {Milk, Beer}.

Conclusion: The Apriori Algorithm shows patterns like:


"If someone buys Bread, they are likely to buy Milk."

2nd Question: FP-Growth Algorithm

What It Does: FP-Growth finds frequent itemsets without generating candidates the way Apriori does. It uses a
special tree (the FP-Tree) to store transactions compactly.

Steps

1. Build the FP-Tree:

◦ Count how often items appear.


◦ Keep items meeting the support threshold (e.g., 40% or 2 transactions).
◦ Sort items by frequency and add transactions to the tree.
2. Mine the FP-Tree:

◦ Start from the bottom of the tree and find combinations of frequent items.

Example:

Transaction   Items Bought
T1            Bread, Milk, Beer
T2            Bread, Milk, Diaper
T3            Milk, Diaper, Beer
T4            Bread, Milk
T5            Bread, Diaper

Frequent Items: Bread (4), Milk (4), Diaper (3), Beer (2).

FP-Tree (items inserted in frequency order: Bread, Milk, Diaper, Beer; T3 has no Bread, so it starts its own branch from the root):

NULL
├── Bread (4)
│   ├── Milk (3)
│   │   ├── Beer (1)
│   │   └── Diaper (1)
│   └── Diaper (1)
└── Milk (1)
    └── Diaper (1)
        └── Beer (1)
Frequent Itemsets:

• {Bread}, {Milk}, {Diaper}, {Beer}, {Bread, Milk}, {Bread, Diaper}, {Milk, Diaper}, {Milk, Beer}.

Result: FP-Growth quickly finds frequent itemsets like:

• Bread and Milk often appear together.
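
If a library is acceptable, the same itemsets can be reproduced with mlxtend's fpgrowth function. This is a sketch assuming mlxtend and pandas are installed, using mlxtend's usual TransactionEncoder workflow:

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [
    ["Bread", "Milk", "Beer"],
    ["Bread", "Milk", "Diaper"],
    ["Milk", "Diaper", "Beer"],
    ["Bread", "Milk"],
    ["Bread", "Diaper"],
]

# One-hot encode the transactions into a boolean DataFrame
te = TransactionEncoder()
onehot = te.fit(transactions).transform(transactions)
df = pd.DataFrame(onehot, columns=te.columns_)

# Mine all itemsets with support >= 40%, without candidate generation
print(fpgrowth(df, min_support=0.4, use_colnames=True))
```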

3rd Question: What is Classification?

What It Does: Classification assigns categories (labels) to data based on past examples.

Examples:

• Spam Detection: Is an email "spam" or "not spam"?


• Medical Diagnosis: Is a tumor "benign" or "malignant"?

Issues in Classification

1. Imbalanced Data: If one class (e.g., "spam") is rare, the model may focus on the common class.

◦ Solution: Balance the dataset.


2. Overfitting: The model learns noise in the training data.

◦ Solution: Simplify the model or use more data.


3. Underfitting: The model is too simple and misses patterns.

◦ Solution: Use a better algorithm or features.


4. Noisy Data: Errors or irrelevant information can confuse the model.

◦ Solution: Clean the data.

Conclusion: Classification helps predict labels, but good data and careful modeling are essential.
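
As a rough illustration of two of these fixes, scikit-learn exposes simple knobs for class imbalance and model complexity. The sketch below uses synthetic data, and the parameter values are arbitrary:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Synthetic, imbalanced two-class problem (roughly 90% "not spam", 10% "spam")
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# class_weight="balanced" counters the imbalance;
# max_depth limits tree complexity to reduce overfitting
model = DecisionTreeClassifier(class_weight="balanced", max_depth=4, random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```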

4th Question: Classification by Decision Tree Induction

What It Does: A Decision Tree splits data into branches based on rules to classify it into categories.
Steps

1. Feature Selection: Choose the best feature to split the data (e.g., "Weather").

◦ Use methods like Information Gain or Gini Index to decide (a worked information-gain sketch follows these steps).


2. Build the Tree:

◦ Split data into branches based on the chosen feature.


◦ Repeat until:
▪ All data in a branch belongs to the same class, or
▪ A stopping condition is met (e.g., max depth).
3. Prune the Tree:

◦ Remove unnecessary branches to avoid overfitting.
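
The feature-selection step (step 1 above) can be made concrete with a small entropy / information-gain calculation. This sketch uses the standard definitions and the "Play Tennis" dataset from the example that follows; the helper names are illustrative:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature, target):
    """Entropy reduction obtained by splitting `rows` on `feature`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# The "Play Tennis" toy dataset used in the example below
rows = [
    {"Weather": "Sunny",    "Temperature": "Hot",  "Play": "No"},
    {"Weather": "Sunny",    "Temperature": "Mild", "Play": "No"},
    {"Weather": "Overcast", "Temperature": "Mild", "Play": "Yes"},
    {"Weather": "Rainy",    "Temperature": "Cool", "Play": "Yes"},
    {"Weather": "Rainy",    "Temperature": "Mild", "Play": "Yes"},
]
print(information_gain(rows, "Weather", "Play"))      # ~0.97 -> best split
print(information_gain(rows, "Temperature", "Play"))  # ~0.42
```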

Example

Dataset:

Weather    Temperature   Play Tennis
Sunny      Hot           No
Sunny      Mild          No
Overcast   Mild          Yes
Rainy      Cool          Yes
Rainy      Mild          Yes

Decision Tree:

Weather?
├── Sunny: No
├── Overcast: Yes
└── Rainy: Yes
Prediction:

• If Weather is Sunny, predict No.

Advantages

• Easy to understand and visualize.


• Works with numerical and categorical data.
Disadvantages

• Can overfit if the tree is too deep.


• Sensitive to noisy data.

Conclusion: Decision Trees classify data by splitting it based on features. They're simple but need pruning
to work well.
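
For comparison, the same toy dataset can be handed to scikit-learn's decision tree once the categorical features are one-hot encoded. This is a sketch assuming pandas and scikit-learn are available; the parameter choices are illustrative:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Weather":     ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy"],
    "Temperature": ["Hot", "Mild", "Mild", "Cool", "Mild"],
    "PlayTennis":  ["No", "No", "Yes", "Yes", "Yes"],
})

# One-hot encode the categorical features before fitting
X = pd.get_dummies(data[["Weather", "Temperature"]]).astype(int)
y = data["PlayTennis"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))

# Predict a new day: Sunny and Hot -> expected ['No']
new_day = pd.DataFrame([{"Weather": "Sunny", "Temperature": "Hot"}])
new_X = pd.get_dummies(new_day).reindex(columns=X.columns, fill_value=0)
print(tree.predict(new_X))
```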
Support Vector Machine (SVM)

• What It Does: SVM separates data points into classes using a line (or hyperplane in higher
dimensions).
• Key Idea: It finds the line that keeps the maximum distance from the closest points of each class
(called support vectors).
• When To Use: For data with clear separation between classes.
• Advantages:
◦ Works well with high-dimensional data.
◦ Effective when there’s a clear margin between classes.
• Disadvantages:
◦ Can be slow with large datasets.
◦ Needs careful selection of parameters (like kernels).
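
A minimal scikit-learn sketch of these points is shown below; the synthetic dataset and the kernel/C values are illustrative only:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters: the setting where a large margin helps most
X, y = make_blobs(n_samples=200, centers=2, random_state=0)

# A linear kernel finds the maximum-margin separating line;
# the kernel and C usually need tuning on real data
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("support vectors:", len(clf.support_vectors_))
print("prediction for first point:", clf.predict(X[:1]))
```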

K-Nearest Neighbors (KNN)

• What It Does: KNN assigns a class to a point based on the majority class of its k-nearest neighbors.
• Key Idea: It uses the distance to nearby points to classify new data.
• When To Use: Simple problems with smaller datasets.
• Advantages:
◦ Easy to understand and implement.
◦ No need for a training phase (lazy learning).
• Disadvantages:
◦ Can be slow for large datasets.
◦ Sensitive to irrelevant features and the value of k.
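
A matching sketch for KNN, again with an illustrative dataset and an arbitrary k:

```python
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

X, y = make_blobs(n_samples=200, centers=2, random_state=0)

# "Lazy" learner: fit() only stores the data; distances are computed at predict time
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print("prediction for first point:", knn.predict(X[:1]))
```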

Comparison of SVM and KNN

Feature            SVM                              KNN
Learning Type      Model-based (builds a rule)      Instance-based (lazy, on-the-fly)
Speed              Fast prediction, slow training   Slow prediction
Data Suitability   High-dimensional data            Small datasets

Both are useful, but SVM is better for complex and high-dimensional problems, while KNN is great for
simpler, intuitive tasks.