
Classification: kNN & NB

REFERENCES
kNN Classifier:
Book: Machine Learning with Python for Everyone (Chapter 3)
NB Classifier:
Book: Machine Learning with Python for Everyone (Chapter 3)
Naive Bayes Classifier in Machine Learning (enjoyalgorithms.com)
Bayes Theorem - Statement, Proof, Formula, Derivation & Examples (byjus.com)
Classification Tasks
• Depending on no. of outcomes
• Binary Classification (two-class classification)
• {Yes, No}; {Red, Black}; {True, False}
• {-1, +1}; {0, 1}
• Multiclass Classification
• {Cruiser, Destroyer, Frigate, Mine Sweeper, Aircraft Carrier…}
• Depending on steps involved
• Direct outcome in one step
• K Nearest Neighbours
• Two step process
• (1) build a model of how likely the outcomes are and
• (2) pick the most likely outcome
• Naïve Bayes
Sample (& Simple) Classification Dataset
• IRIS Dataset
• Included with sklearn
• Fisher’s Dataset
• Sir Ronald Fisher, mid-20th-century statistician
• First academic paper on classification (Fisher, 1936)
• Edgar Anderson
• Gathered the data!
• Contents
• Each Row: describes one iris flower, in terms of the length and width of that flower’s sepals
and petals
• Rows: Examples / samples
• Final Column: Particular species of that iris: setosa, versicolor, or virginica
• Features / Attributes / IVs (independent variables; initial columns) and Target / Label / DV (dependent variable; final column)
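As a quick sketch of this layout (using sklearn's built-in load_iris loader; the printed shapes and names are the dataset's standard contents):

from sklearn.datasets import load_iris

iris = load_iris()
print(iris.data.shape)      # (150, 4): 150 examples (rows), 4 features (initial columns)
print(iris.feature_names)   # sepal length/width and petal length/width, in cm
print(iris.target_names)    # ['setosa' 'versicolor' 'virginica'] (the final column's labels)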
Training and Test (Data) Sets
• Generalization
• Performance on novel data (general knowledge)
• Evaluation Schemes
• in-sample evaluation or training error
• out-of-sample or test error evaluation
• sklearn’s train_test_split
• training data
• portion of the data that we will use to study and build up our understanding
• testing data
• portion of the data that we will use to test ourselves
• Split randomly
Training and Test (Data) Sets

Python variable     Symbol        Phrase
iris                D_all         (total) dataset
iris.data           D_ftrs        train and test features
iris.target         D_tgt         train and test targets
iris_train_ftrs     D_train       training features
iris_test_ftrs      D_test        testing features
iris_train_tgt      D_traintgt    training target
iris_test_tgt       D_testtgt     testing target
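A minimal sketch of producing exactly these variables with sklearn's train_test_split (the 25% test fraction and the fixed random_state are assumptions for illustration, not values from the slides):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
(iris_train_ftrs, iris_test_ftrs,
 iris_train_tgt,  iris_test_tgt) = train_test_split(iris.data, iris.target,
                                                    test_size=0.25,    # hold out 25% for testing (assumed ratio)
                                                    random_state=42)   # make the random split repeatable
print(iris_train_ftrs.shape, iris_test_ftrs.shape)   # (112, 4) (38, 4)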
Evaluation
• Accuracy
• If the answer is true and we predicted true, then we get a point!
• If the answer is false and we predicted true, we don’t get a point!!
• Formula: (#correct answers / #questions)
• sklearn’s train_test_split
• sklearn’s metrics.accuracy_score
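For example, a tiny sketch of the accuracy formula with made-up answer vectors (metrics.accuracy_score is the real sklearn helper; the data is invented):

from sklearn import metrics

actual    = [True, False, True, True]
predicted = [True, True,  True, False]   # 2 of the 4 answers match
print(metrics.accuracy_score(actual, predicted))   # 0.5 = #correct answers / #questions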
k Nearest Neighbours Classifier
• Simple Classifier
• Single step to make predictions from labelled dataset
• Method
• Find a way to describe the similarity of two different examples.
• When you need to make a prediction on a new, unknown example, simply take the
value from the most similar known example
• Consider more than just the single most similar example:
• Describe similarity between pairs of examples.
• Pick several of the most-similar examples.
• Combine those picks to get a single answer.
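That three-part recipe can be sketched in a few lines of plain Python (a hypothetical from-scratch version, not code from the referenced book):

import math
from collections import Counter

def knn_predict(train_ftrs, train_tgts, new_example, k=3):
    # Describe similarity: Euclidean distance from the new example to every known one
    dists = [math.dist(known, new_example) for known in train_ftrs]
    # Pick the k most-similar (smallest-distance) known examples
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # Combine those picks into a single answer by majority vote
    votes = Counter(train_tgts[i] for i in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict([[0, 0], [1, 1], [5, 5]], ['a', 'a', 'b'], [0.5, 0.5]))   # 'a'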
k Nearest Neighbours Classifier
• Similarity
• A distance between pairs of examples
• similarity = distance(example_one, example_two)
• Similar things are close - a small distance apart
• Dissimilar things are far away - a large distance apart
• Distance Metrics
• Euclidean Distance
• treat the two examples as points in space
• Hamming Distance
• when examples consist of simple Yes/No or True/False (Boolean) features, we can compare two examples by counting the number of features that differ
• Minkowski Distance etc…
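A short sketch of the two metrics named above, on made-up feature vectors:

import math

a = [5.1, 3.5, 1.4, 0.2]
b = [4.9, 3.0, 1.4, 0.2]
print(math.dist(a, b))   # Euclidean: straight-line distance between two points in space

x = [True, False, True]
y = [True, True,  False]
print(sum(xi != yi for xi, yi in zip(x, y)))   # Hamming: number of differing Boolean features -> 2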
k Nearest Neighbours Classifier
• k in the k-NN and Answer Combination
• 1 / 3 / 5 / 10 / 20
• Voting method to classify
• Noise problem
• Tie problem
• {cat, dog, dog, zebra, cat}
• Statistic (mean / median) to regress
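The tie problem from the list above is easy to see with a majority vote over the example labels (a small sketch using collections.Counter):

from collections import Counter

votes = Counter(['cat', 'dog', 'dog', 'zebra', 'cat'])
print(votes.most_common())   # [('cat', 2), ('dog', 2), ('zebra', 1)]: cat and dog tie at 2 votes each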
k Nearest Neighbours Classifier
• We want to use 3-NN - three nearest neighbors - as our model
• We want that model to capture the relationship between the iris
training features and the iris training targets
• We want to use that model to predict - on previously unseen test
examples - the iris target species.
• Finally, we want to evaluate the quality of those predictions, using
accuracy, by comparing predictions against reality. We don’t peek at
these known answers, but we use them as an answer key for the test.
k Nearest Neighbours Classifier
• sklearn’s terminology
• An estimator is fit on some data and then used to predict on some data.
• We fit the estimator on training data and then use the fit-estimator to predict
on the test data.
• In other words:
• Create a 3-NN model,
• Fit that model on the training data,
• Use that model to predict on the test data, and
• Evaluate those predictions using accuracy
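Those four steps map directly onto sklearn's API (KNeighborsClassifier, fit, predict, and metrics.accuracy_score are the real names; the split ratio and random_state are assumed for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
(iris_train_ftrs, iris_test_ftrs,
 iris_train_tgt,  iris_test_tgt) = train_test_split(iris.data, iris.target,
                                                    test_size=0.25, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)        # create a 3-NN model
fit = knn.fit(iris_train_ftrs, iris_train_tgt)   # fit that model on the training data
preds = fit.predict(iris_test_ftrs)              # use the fit estimator to predict on the test data
print(metrics.accuracy_score(iris_test_tgt, preds))   # evaluate those predictions using accuracy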
k Nearest Neighbours Classifier
• Hyperparameters
• 3 in our 3-nearest-neighbors is not something that we adjust by training
• If we want a 5-NN machine, we have to build a completely different model
• 3 is a hyperparameter
• Hyperparameters are not trained or manipulated by the learning method they
help define
• Hyperparameters are chosen and fixed before learning begins; the learning process never gets a chance to adjust them
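In sklearn terms, the hyperparameter is fixed in the constructor before fit ever sees any data; a different k literally means a different model object (a small sketch):

from sklearn.neighbors import KNeighborsClassifier

knn3 = KNeighborsClassifier(n_neighbors=3)   # k is set here, before any learning happens
knn5 = KNeighborsClassifier(n_neighbors=5)   # a 5-NN machine is a completely different model
# fit() adjusts the model's learned state but never changes n_neighbors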
Naïve Bayes Classifier
• Example: Football Play with single feature

• Aim: to build an ML model that receives the humidity feature value and predicts whether the play will happen or not
• Given that Humidity is Normal, let's find the chances of the play:
• p(Play = Yes | Humidity = Normal)
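The slide's data table isn't reproduced here, so the counts below are assumed (they follow the classic 14-day play-weather example) purely to show the mechanics of Bayes' theorem, p(Yes | Normal) = p(Normal | Yes) * p(Yes) / p(Normal):

# Hypothetical counts (assumed, not taken from the slides)
n_days = 14                 # total observed days
n_yes = 9                   # days on which play happened
n_normal = 7                # days with Normal humidity
n_normal_and_yes = 6        # Normal-humidity days among the Yes days

p_yes = n_yes / n_days                         # p(Play = Yes)
p_normal = n_normal / n_days                   # p(Humidity = Normal)
p_normal_given_yes = n_normal_and_yes / n_yes  # p(Normal | Yes)

# Bayes' theorem:
p_yes_given_normal = p_normal_given_yes * p_yes / p_normal
print(p_yes_given_normal)   # 6/7 ≈ 0.857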