FML - III
1. Binary Classification:
o An instance belongs to one of exactly two classes (e.g., spam or not spam).
2. Multi-Class Classification:
o An instance belongs to one of more than two classes (e.g., cat, dog, or bird).
3. Multi-Label Classification:
o An instance can belong to more than one class at the same time.
4. Imbalanced Classification:
o One class has far fewer examples than the other.
o Example: Detecting fraud (fraud cases are fewer than normal ones).
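A small sketch of how the target labels differ for each type (assuming NumPy; the values are invented for illustration):

import numpy as np

# Binary: one label per instance, only two possible classes (e.g., spam vs not spam)
y_binary = np.array([0, 1, 0, 1])

# Multi-class: one label per instance, more than two possible classes (e.g., cat/dog/bird)
y_multiclass = np.array([0, 2, 1, 2])

# Multi-label: each instance can carry several labels at once (one column per class)
y_multilabel = np.array([[1, 0, 1],
                         [0, 1, 0],
                         [1, 1, 0]])

# Imbalanced: one class (e.g., fraud = 1) is much rarer than the other
y_imbalanced = np.array([0] * 95 + [1] * 5)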
1. Data Collection:
Gather the dataset which contains input features and class labels.
2. Data Preprocessing:
Clean the data by handling missing values, converting categories to numbers (encoding), and
scaling values if needed.
3. Splitting Data:
Divide the data into training and testing sets (e.g., 80% train, 20% test).
4. Model Selection:
Choose an algorithm like Decision Tree, SVM, k-NN, etc.
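A minimal sketch of these four steps with scikit-learn (the file name and the "label" column are assumptions made for illustration):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# 1. Data collection: load a dataset that has input features and a class label
data = pd.read_csv("dataset.csv")                      # hypothetical file

# 2. Preprocessing: drop missing rows, one-hot encode categorical columns
data = data.dropna()
X = pd.get_dummies(data.drop(columns=["label"]))       # hypothetical label column
y = data["label"]

# 3. Splitting: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features (fit on the training set only, to avoid leaking test information)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 4. Model selection: pick an algorithm, e.g. a Decision Tree
model = DecisionTreeClassifier()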
2. Explain the working principle of the k-Nearest Neighbour algorithm with an example.
When given a new data point, k-NN compares it to all the points in the training set.
It selects the ‘k’ closest points (neighbors) and assigns the class that is most frequent among
those neighbors.
Steps:
1. Choose the value of 'k' (the number of neighbours to consider).
2. Calculate the distance (usually Euclidean) from the new point to all existing points.
3. Pick the 'k' points with the smallest distances.
4. Assign the class that is most frequent among those 'k' neighbours.
Example: Predicting if a person likes tea or coffee based on age and income: the new person is assigned the drink preferred by the majority of the 'k' people closest to them in age and income.
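A minimal sketch of this example with scikit-learn's KNeighborsClassifier (the ages, incomes and preferences are invented for illustration):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Training data: [age, income in thousands], label = preferred drink
X_train = [[25, 30], [30, 40], [45, 80], [50, 90], [22, 25], [48, 85]]
y_train = ["tea", "tea", "coffee", "coffee", "tea", "coffee"]

# Scale the features so age and income contribute comparably to the distance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_train)

# k = 3: classify by the majority vote of the 3 nearest neighbours (Euclidean distance)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_scaled, y_train)

# Predict for a new person aged 28 earning 35k
new_point = scaler.transform([[28, 35]])
print(knn.predict(new_point))        # the 3 closest people drink tea, so "tea" is predicted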
A Decision Tree is a tree-shaped structure used to make decisions. It splits the data into branches
based on questions or conditions.
Working:
1. It chooses the best feature that divides the data well (based on Gini Index or Information Gain).
2. It splits the data into branches according to the values of that feature.
3. It repeats the same process on each branch.
4. This continues until the tree reaches leaves with class labels.
Example:
Is it sunny?
o Yes → Don't Play
o No → Play
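A small sketch of such a tree with scikit-learn (the weather data below is made up):

from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [is_sunny, is_windy] encoded as 0/1; label: whether to play
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = ["Don't Play", "Don't Play", "Play", "Play"]

# The tree picks the split with the best Gini Index; here it splits on is_sunny and ignores is_windy
tree = DecisionTreeClassifier(criterion="gini", max_depth=3)
tree.fit(X, y)

# Print the learned rules as a flowchart-like text
print(export_text(tree, feature_names=["is_sunny", "is_windy"]))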
Advantages:
1. Simple to Understand:
o The structure looks like a flowchart with decisions (questions) and outcomes
(answers), so it is easy to explain.
2. Versatile:
o Handles both categorical (like gender) and numerical (like age) data.
Disadvantages:
1. Overfitting:
o If the tree is too deep, it might memorize the training data and perform badly on
new data.
2. Unstable:
o A small change in the training data can produce a completely different tree.
Definition: SVM is a powerful supervised learning algorithm used mainly for binary classification. It
tries to find the best separating line or hyperplane between two classes.
Key Components:
1. Hyperplane:
o A line (2D) or plane (3D+) that divides the data.
o The best hyperplane separates the classes with the largest margin.
2. Support Vectors:
o The data points that lie closest to the hyperplane; they decide where the boundary is placed.
3. Margin:
o The distance between the hyperplane and the nearest points (support vectors) of each class; SVM tries to make this margin as large as possible.
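A minimal sketch showing the hyperplane, support vectors and margin with scikit-learn's SVC (the 2-D points are invented for illustration):

from sklearn.svm import SVC

# Two linearly separable classes in 2-D
X = [[1, 1], [2, 1], [1, 2], [6, 5], [7, 6], [6, 7]]
y = [0, 0, 0, 1, 1, 1]

# A linear SVM finds the separating line with the largest margin
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The support vectors are the training points closest to that line
print(clf.support_vectors_)

# Coefficients of the hyperplane w.x + b = 0; the margin width is 2 / ||w||
print(clf.coef_, clf.intercept_)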
Sometimes data points are not linearly separable in the current dimension. In such cases, we use
kernel tricks.
Example:
Kernel functions like RBF (Gaussian) or polynomial kernel can map the data into a higher
dimension where a straight line (hyperplane) can separate them.
Types of Kernels:
1. Linear Kernel:
o Used when the data is already linearly separable, so no mapping to a higher dimension is needed.
2. Polynomial Kernel:
o Maps the data into a polynomial feature space of a chosen degree.
3. RBF (Gaussian) Kernel:
o Used when we don't know the best shape of the decision boundary.
4. Sigmoid Kernel:
o Behaves like the activation function used in neural networks.
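A short sketch of the kernel trick on data that no straight line can separate (using scikit-learn's make_circles toy dataset):

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the original 2-D space
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear kernel struggles, while the RBF kernel separates the circles easily
linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

print("Linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))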
Benefits:
Memory Usage: SVM needs little memory because it keeps only the support vectors, whereas a lazy learner such as k-NN needs much more because it stores all the training data.
1. Problem Definition:
o Decide what has to be predicted and what the possible classes are.
2. Data Collection:
o Gather a labelled dataset containing input features and class labels.
3. Data Preprocessing:
o Clean data, handle missing values, encode categories, and scale features if needed.
4. Split Data:
o Divide the data into training and test sets (e.g., 70/30 split).
5. Model Selection:
o Choose a suitable algorithm like Decision Tree, SVM, or k-NN based on data and
accuracy needs.
6. Training:
o Use the training data to help the algorithm learn the pattern.
7. Evaluation:
o Use the test data and metrics like accuracy, precision, recall, and F1-score.
8. Improvement (Tuning):
o Adjust hyperparameters (for example, 'k' in k-NN or the tree depth) to improve the results.
9. Deployment:
o Use the trained model in a real-world system (like a medical app or email filter).
Input Features: Age, job type, income, credit score, number of dependents.
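A sketch of how the whole pipeline could look for this kind of data (the file name, column names and the loan-approval target are assumptions made for illustration):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical dataset with the input features listed above and a 0/1 class label
data = pd.read_csv("applicants.csv")
X = pd.get_dummies(data[["age", "job_type", "income", "credit_score", "dependents"]])
y = data["approved"]                                   # assumed 0/1 target

# 70/30 split, then scale using statistics from the training set only
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train and evaluate with the usual classification metrics
model = SVC(kernel="rbf")
model.fit(X_train, y_train)
pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, pred))
print("Precision:", precision_score(y_test, pred))
print("Recall:", recall_score(y_test, pred))
print("F1-score:", f1_score(y_test, pred))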
7. How does feature selection affect the performance of classification algorithms? (Expanded)
Explanation:
Features are the input values used to make predictions (like age, income, marks, etc.).
If the right features are chosen, the model will learn better patterns and make more accurate
predictions.
1. Better Accuracy:
o Removing irrelevant features stops the model from learning misleading patterns.
o Example: Using a person's "eye color" to predict salary may confuse the model.
2. Faster Computation:
o Fewer features mean less data to process, so training and prediction are quicker.
3. Less Overfitting:
o Using only important features helps the model generalize better to new data.
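A small sketch of automatic feature selection with scikit-learn's SelectKBest (the synthetic data is generated only for illustration):

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Toy data: 10 features, only 4 of which actually carry information about the class
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_redundant=0, random_state=0)

# Keep the 4 features most strongly related to the class label (ANOVA F-test)
selector = SelectKBest(score_func=f_classif, k=4)
X_selected = selector.fit_transform(X, y)

print("Original shape:", X.shape)                 # (300, 10)
print("Reduced shape:", X_selected.shape)         # (300, 4)
print("Kept feature indices:", selector.get_support(indices=True))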