Classification Metrics.pptx

The document discusses binary classification, defining key concepts such as instances, model, and label space, and emphasizes the importance of assessing classification performance through metrics like confusion matrix, sensitivity, specificity, accuracy, and precision. It also covers visualization techniques such as ROC curves and AUC for evaluating classifiers, as well as class probability estimation and methods for assessing probability estimates like Brier score and Mean Squared Error. Overall, it provides a comprehensive overview of classification tasks in machine learning and the metrics used to evaluate their performance.

Uploaded by

darshuipath


Evaluation

Binary classification
• The task of classifying the elements of a set into two groups on the basis of a classification rule.
• The observed response (output) y takes only two possible values: +/– or T/F.
• We need to define the relationship between h(x) and y.
• Use the decision rule:

Ex:
• Medical test: determine whether a patient has a certain disease or not
• Is the person fit or not
• Spam email classification
Definition
Instances
• The objects of interest in machine learning
Instance space
• The set of all possible instances
• Example: the set of all possible e-mails
Label space
• The label space is used in supervised learning to label the examples
Model
• To achieve the task under consideration we need a model: a mapping from the instance space to the output space.
• For instance, in classification the output space is a set of classes, while in regression it is the set of real numbers.
• To learn such a model we require a training set of labelled instances (x, l(x)), also called examples.
Assessing classification performance
• The outputs of learning algorithms need to be assessed and analyzed carefully, and this analysis must be interpreted correctly, so that different learning algorithms can be evaluated.
• The performance of classifiers can be summarized by means of a table known as a contingency table or confusion matrix.
1. Contingency table or confusion matrix
• A confusion matrix is a table that describes the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
• It is a summary of prediction results on a classification problem.
• The numbers of correct and incorrect predictions are summarized as counts and broken down by class.
• It shows the ways in which the classification model is confused when it makes predictions.
• It contains information about actual and predicted classifications.
• True Positive (TP): the actual class is spam and the classifier has correctly predicted it as spam.
• False Negative (FN): the actual class is spam, but the classifier has incorrectly predicted it as non-spam.
• False Positive (FP): the actual class is non-spam, but the classifier has incorrectly predicted it as spam.
• True Negative (TN): the actual class is non-spam and the classifier has correctly predicted it as non-spam.
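The four cells defined above can be counted directly from paired lists of actual and predicted labels. The sketch below is only an illustration: the function name `confusion_counts` and the five toy emails are made up for this example and do not come from the slides.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, FN, FP and TN for a binary problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # spam predicted as spam
        elif a == positive:
            fn += 1          # spam predicted as non-spam
        elif p == positive:
            fp += 1          # non-spam predicted as spam
        else:
            tn += 1          # non-spam predicted as non-spam
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Toy labels invented for illustration only.
actual = ["spam", "spam", "non-spam", "non-spam", "spam"]
predicted = ["spam", "non-spam", "spam", "non-spam", "spam"]

counts = confusion_counts(actual, predicted)
print(counts)
```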
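Ex: Confusion matrix of email classification
• The classification problem has spam and non-spam classes, and the dataset contains 100 examples: 65 are spam and 35 are non-spam.

                   Predicted spam   Predicted non-spam   Total
Actual spam              45                 20             65
Actual non-spam           5                 30             35
Total                    50                 50            100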
Sensitivity: also referred to as True Positive Rate or Recall.
• It measures the proportion of positive examples that the classifier labels as positive.
• It should be as high as possible.
• For instance, the proportion of emails correctly classified as spam among all spam emails.
• Out of all the positive examples, how many we predicted correctly.

Sensitivity = 45/(45+20) = 69.23% (69.23% of the spam emails are correctly classified as spam and so kept out of the non-spam emails).
• Specificity: also referred to as True Negative Rate.
• It measures the proportion of negative examples that the classifier labels as negative. It should be as high as possible.
• For instance, the proportion of emails correctly classified as non-spam among all non-spam emails.

Specificity = 30/(30+5) = 85.71% (85.71% of the non-spam emails are correctly classified as non-spam and so kept out of the spam emails).
• Accuracy is the proportion of the total number of predictions that are correct.

Accuracy = (45+30)/(45+20+5+30) = 75% (75% of the examples are correctly classified by the classifier).
• Precision is the ratio of the number of correctly classified positive examples to the total number of predicted positive examples.
• It shows the correctness achieved in positive predictions (out of all the examples we predicted as positive, how many are actually positive).
• High precision indicates that an example labeled as positive is indeed positive (a small number of FP).

Precision = 45/(45+5) = 90% (90% of the examples classified as spam are actually spam).
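The four calculations above can be checked with a few lines of Python. This is a minimal sketch that uses only the counts appearing in the formulas (TP = 45, FN = 20, FP = 5, TN = 30); the variable names are chosen here for readability.

```python
# Counts taken from the spam example above.
TP, FN, FP, TN = 45, 20, 5, 30

sensitivity = TP / (TP + FN)                  # true positive rate (recall)
specificity = TN / (TN + FP)                  # true negative rate
accuracy = (TP + TN) / (TP + FN + FP + TN)    # share of all predictions that are correct
precision = TP / (TP + FP)                    # share of predicted positives that are positive

for name, value in [("Sensitivity", sensitivity), ("Specificity", specificity),
                    ("Accuracy", accuracy), ("Precision", precision)]:
    print(f"{name}: {value:.2%}")
```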
• Recall: the ratio of the number of correctly classified positive examples to the total number of positive examples.
• Out of all the positive examples, how many we predicted correctly.
• It should be as high as possible.
• High recall indicates that the class is correctly recognized (a small number of FN).
F-measure (F1 score)
– A good choice when you seek a balance between precision and recall.
– It combines recall and precision in one equation, so that models with low recall and high precision (or vice versa) can be compared on a single number.
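The slides do not show the F1 equation itself. The standard definition is the harmonic mean of precision and recall, F1 = 2 · precision · recall / (precision + recall). The sketch below applies it to the spam example counts; the helper name `f1_score` is chosen for this illustration.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts from the spam example: TP = 45, FP = 5, FN = 20.
f1 = f1_score(tp=45, fp=5, fn=20)
print(f"F1 = {f1:.4f}")
```

With precision at 90% and recall at 69.23%, the harmonic mean sits closer to the lower of the two values, which is why F1 penalizes a model that does well on only one of them.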
• The last column and the last row give the
marginals (i.e., column and row sums).
Visualizing classification performance
1. Coverage plot: a coverage plot visualizes four numbers (the number of positives Pos, the number of negatives Neg, the number of true positives TP and the number of false positives FP) by means of a rectangular coordinate system and a point.
• In a coverage plot, classifiers with the same accuracy are connected by line segments with slope 1.
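The slope-1 property follows from writing accuracy as (TP + Neg − FP) / (Pos + Neg): accuracy depends only on TP − FP, so raising TP and FP by the same amount leaves it unchanged. The sketch below checks this numerically, assuming FP is plotted on the horizontal axis and TP on the vertical axis, and using Pos = 65 and Neg = 35 from the spam example. The second point (TP = 50, FP = 10) is hypothetical.

```python
POS, NEG = 65, 35  # class totals from the spam example


def accuracy(tp, fp, pos=POS, neg=NEG):
    """Accuracy of the classifier plotted at the point (FP, TP)."""
    tn = neg - fp
    return (tp + tn) / (pos + neg)


a = accuracy(tp=45, fp=5)    # the classifier from the spam example
b = accuracy(tp=50, fp=10)   # hypothetical: 5 more TP and 5 more FP (slope 1)

print(a, b)
```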
2. ROC curves
An ROC curve (receiver operating characteristic curve) is a graph showing the
performance of a classification model at all classification thresholds.

This curve plots two parameters:

True Positive Rate
False Positive Rate

An ROC curve plots TPR vs. FPR at different classification thresholds. Lowering the
classification threshold classifies more items as positive, thus increasing both False
Positives and True Positives
https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc
ROC Curve
Let us consider the following hypothetical data:
True Labels: [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
Predicted Probabilities: [0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55]
https://www.geeksforgeeks.org/auc-roc-curve/
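One point of the ROC curve can be computed from this data by fixing a threshold and counting. The sketch below makes two assumptions that the slide does not state: an instance is predicted positive when its probability is greater than or equal to the threshold, and the thresholds listed are picked for illustration.

```python
labels = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55]


def roc_point(labels, scores, threshold):
    """Return (FPR, TPR) when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos


# Thresholds chosen for illustration; they are not given on the slide.
thresholds = [0.0, 0.5, 0.6, 0.8, 1.0]
curve = [roc_point(labels, scores, t) for t in thresholds]

for t, (fpr, tpr) in zip(thresholds, curve):
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold from 1.0 to 0.0 moves the point from (0, 0) to (1, 1), matching the statement that a lower threshold increases both false positives and true positives.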
ROC Curve
• Set different thresholds 0, 0.2, 0.4, 0.6, 0.8, 1 and generate the ROC curve by plotting TPR against FPR.

Actual Class   Predicted Probability
1              0.8
0              0.96
1              0.4
1              0.3
0              0.2
1              0.7
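A possible way to work through this problem is to build the confusion counts at each listed threshold and derive TPR and FPR from them. The sketch assumes an instance is predicted positive when its probability is greater than or equal to the threshold. The function name `counts_at` is chosen for this illustration.

```python
actual = [1, 0, 1, 1, 0, 1]
prob = [0.8, 0.96, 0.4, 0.3, 0.2, 0.7]
thresholds = [0, 0.2, 0.4, 0.6, 0.8, 1]


def counts_at(threshold):
    """Return (TP, FN, FP, TN), predicting positive when prob >= threshold."""
    tp = fn = fp = tn = 0
    for y, p in zip(actual, prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


rows = []
for t in thresholds:
    tp, fn, fp, tn = counts_at(t)
    tpr = tp / (tp + fn)   # true positive rate
    fpr = fp / (fp + tn)   # false positive rate
    rows.append((t, tp, fn, fp, tn, tpr, fpr))
    print(f"T>={t}: TP={tp} FN={fn} FP={fp} TN={tn} TPR={tpr:.2f} FPR={fpr:.2f}")
```

Plotting the (FPR, TPR) pairs from `rows` in order gives the ROC curve for this small dataset.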
Problem 2
Problem 3
Example
1. Set threshold T >= 0.8
2. Calculate TP, TN, FP, FN
3. Calculate TPR and FPR
4. Draw the ROC curve with FPR vs TPR

Actual Predicted
P, N, N, N, N, N, N, N, N, N, N

A\P    C     ¬C
C      TP    FN    P
¬C     FP    TN    N
       P'    N'    All

FPR: the proportion of negative samples that were incorrectly classified as positive, FPR = FP / (FP + TN).
3. AUC CURVE
■ AUC stands for "Area under the ROC
Curve." That is, AUC measures the
entire two-dimensional area
underneath the entire ROC curve
(think integral calculus) from (0,0)
to (1,1).
■ AUC ranges in value from 0 to 1. A
model whose predictions are 100%
wrong has an AUC of 0.0; one
whose predictions are 100% correct
has an AUC of 1.0.
■ Thus, AUC ROC indicates how well the
probabilities from the positive classes
are separated from the negative classes.

https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc
https://medium.com/greyatom/lets-learn-about-auc-roc-curve-4a94b4d88152
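The area under the ROC curve can be approximated by sweeping the threshold through every distinct score and summing trapezoids between consecutive points. The sketch below is one simple way to do this, not a reference implementation. It reuses the two small datasets from the ROC slides and assumes a positive prediction when the score is greater than or equal to the threshold.

```python
def roc_points(labels, scores):
    """ROC points from (0, 0) to (1, 1), one per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thr)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data from the first ROC slide: positives and negatives separate perfectly.
auc_a = auc(roc_points([1, 0, 1, 0, 1, 1, 0, 0, 1, 0],
                       [0.8, 0.3, 0.6, 0.2, 0.7, 0.9, 0.4, 0.1, 0.75, 0.55]))

# Data from the threshold problem: one negative scores above every positive.
auc_b = auc(roc_points([1, 0, 1, 1, 0, 1],
                       [0.8, 0.96, 0.4, 0.3, 0.2, 0.7]))

print(auc_a, auc_b)
```

In the first dataset every positive has a higher probability than every negative, so the area is 1. In the second, the negative scored at 0.96 outranks all positives and pulls the area down to 0.5.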
AUC CURVE
■ ROC curves for different classifiers on a given dataset.
■ Useful for picking the right classifier, based on the AUC together with a good TP rate.
■ In this figure, the classifier corresponding to the red line would be selected.
Class Probability Estimation
• The probability of an event is the likelihood that the event will happen.
• Probability-based classifiers produce a class probability estimate (the probability that a test instance belongs to the predicted class).
• They not only predict the class label but also give a probability for that label, so the estimated class probability can be used for decision making.
• Defn: A probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes.
• A binary (ordinary) classifier uses a function f that assigns to a sample x a class label ŷ:
ŷ = f(x)
• Probabilistic classifiers: instead of functions, they are conditional distributions Pr(Y | X); for a given x ∈ X, they assign probabilities to all y ∈ Y (and these probabilities sum to one).
• Ex: Naive Bayes, logistic regression and multilayer perceptrons are naturally probabilistic.
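The difference between returning a label and returning a distribution can be sketched with a logistic model, one of the naturally probabilistic classifiers named above. The weights, bias and input below are hypothetical values picked for illustration. The 0.5 decision threshold is also an assumption.

```python
import math


def predict_proba(x, weights, bias):
    """Conditional distribution Pr(Y | x) over the two classes '+' and '-'."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p_pos = 1.0 / (1.0 + math.exp(-z))   # logistic (sigmoid) function
    return {"+": p_pos, "-": 1.0 - p_pos}


def predict_label(x, weights, bias, threshold=0.5):
    """Ordinary classifier y_hat = f(x), obtained by thresholding Pr(+ | x)."""
    return "+" if predict_proba(x, weights, bias)["+"] >= threshold else "-"


# Hypothetical model parameters and input, for illustration only.
weights, bias = [1.5, -2.0], 0.25
x = [1.0, 0.5]

dist = predict_proba(x, weights, bias)
label = predict_label(x, weights, bias)
print(dist, label)
```

The label alone hides how confident the model is; the distribution keeps that information available for decision making.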
Ex: Probability estimation tree
Assessing class probability estimates
1. Sum squared error (SSE): square the individual error terms (the differences between the estimated values and the actual values), which gives a non-negative number for every term, and add them up.
2. Mean squared error (MSE): measures the average of the squares of the errors.
• The average squared difference between the estimated values and the actual values (take the average, or the mean, of the individual squared error terms).
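Both quantities can be computed from the same list of squared errors. The estimated probabilities and actual outcomes below are invented for illustration and do not come from the slides.

```python
# Hypothetical estimated probabilities and actual outcomes (1 = positive, 0 = negative).
estimated = [0.9, 0.2, 0.7, 0.4]
actual = [1, 0, 1, 0]

squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]

sse = sum(squared_errors)            # sum squared error
mse = sse / len(squared_errors)      # mean squared error

print(f"SSE = {sse:.3f}, MSE = {mse:.3f}")
```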
3. Brier score
• A measure of the error in probability estimates, used in forecasting theory:

BS = (1/N) Σ_{t=1}^{N} (f_t − o_t)^2

where f_t is the probability that was forecast, o_t is the actual outcome of the event at instance t (0 if it does not happen and 1 if it does happen), and N is the number of forecasting instances.
• In effect, it is the mean squared error of the forecast.
• In this formulation, the Brier score is a proper scoring rule only for binary events (for example "rain" or "no rain").
• Ex: Suppose that one is forecasting the probability P that it will rain on a given day. Then the Brier score is calculated as follows:
• If the forecast is 100% (P = 1) and it rains, then the Brier score is 0, the best score achievable.
• If the forecast is 100% and it does not rain, then the Brier score is 1, the worst score achievable.
• If the forecast is 70% (P = 0.70) and it rains, then the Brier score is (0.70 − 1)^2 = 0.09.
• In contrast, if the forecast is 70% (P = 0.70) and it does not rain, then the Brier score is (0.70 − 0)^2 = 0.49.
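The four rain cases above can be reproduced with a short function that averages the squared differences between forecasts and outcomes. The name `brier_score` is chosen here. Averaging the four cases together at the end is an extra illustration, not part of the slide.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    n = len(forecasts)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n


# The four single-day cases from the rain example (1 = rain, 0 = no rain).
cases = [(1.0, 1), (1.0, 0), (0.7, 1), (0.7, 0)]
single_day = [brier_score([f], [o]) for f, o in cases]

for (f, o), score in zip(cases, single_day):
    print(f"forecast={f:.2f} outcome={o} Brier={score:.2f}")

# Treating the four days as one forecasting record (N = 4).
overall = brier_score([f for f, _ in cases], [o for _, o in cases])
print(f"overall={overall:.3f}")
```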
Empirical probability
• Uses the number of occurrences of an outcome within a sample set as a basis for determining the probability of that outcome.
• For example, the number of times "event X" happens out of 100 trials, divided by 100, is the empirical probability of event X happening.
• The empirical probability of an event is the ratio of the number of outcomes in which a specified event occurs to the total number of trials.
• Empirical probability (experimental probability) estimates probabilities from experience and observation.
Ex:
In a buffet, 95 out of 100 people chose to order coffee over tea. What is the empirical probability of someone ordering tea?
Ans: The empirical probability of someone ordering tea is 5/100 = 5%.
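The buffet answer can be reproduced by counting outcomes in a list of observations. This is a minimal sketch: the list `orders` is constructed here to match the 95 coffee and 5 tea orders in the example.

```python
from collections import Counter

# 100 observed orders matching the buffet example: 95 coffee, 5 tea.
orders = ["coffee"] * 95 + ["tea"] * 5

counts = Counter(orders)
p_tea = counts["tea"] / len(orders)        # occurrences of the event / total trials
p_coffee = counts["coffee"] / len(orders)

print(f"P(tea) = {p_tea:.0%}, P(coffee) = {p_coffee:.0%}")
```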
