
Introduction to Data Science

DSA1101

Semester 1, 2019/2020
Week 4
Diagnostics of Classifiers

1 / 38
Diagnostics of Classifiers

We have studied the k-nearest neighbor algorithm as an example of a classifier.
However, there is a need to evaluate the performance of such classifiers.

2 / 38
Diagnostics of Classifiers

k-nearest neighbor is often used as a classifier to assign class labels to a person, item, or transaction.
In general, for two class labels, C and ¬C, where ¬C denotes “not C,” some working definitions follow (a small counting sketch is given after the list):
- True Positive: Predict C, when actually C
- True Negative: Predict ¬C, when actually ¬C
- False Positive: Predict C, when actually ¬C
- False Negative: Predict ¬C, when actually C
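As a minimal illustration (not from the slides), the following Python sketch counts these four outcomes from a pair of hypothetical label vectors, with "spam" playing the role of the positive class C:

    actual    = ["spam", "spam", "non-spam", "non-spam", "spam"]      # hypothetical true labels
    predicted = ["spam", "non-spam", "non-spam", "spam", "spam"]      # hypothetical classifier output

    tp = sum(a == "spam"     and p == "spam"     for a, p in zip(actual, predicted))
    tn = sum(a == "non-spam" and p == "non-spam" for a, p in zip(actual, predicted))
    fp = sum(a == "non-spam" and p == "spam"     for a, p in zip(actual, predicted))
    fn = sum(a == "spam"     and p == "non-spam" for a, p in zip(actual, predicted))

    print(tp, tn, fp, fn)  # 2 1 1 1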

3 / 38
Diagnostics of Classifiers

We will study the confusion matrix, which is a specific table layout that allows visualization of the performance of a classifier.
In a two-class classification, a preset threshold may be used to separate positives from negatives (e.g. the majority rule in the k-nearest neighbor example, which predicts the positive class when the neighborhood proportion Ŷ ≥ 0.5).

                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)
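A minimal sketch (assumed details, not shown on the slides) of the majority-rule thresholding mentioned above, where Ŷ is taken to be the proportion of positive labels among the k nearest neighbors:

    def majority_rule(y_hat, threshold=0.5):
        # y_hat: proportion of the k nearest neighbors that carry the positive label
        return "positive" if y_hat >= threshold else "negative"

    print(majority_rule(0.6))  # positive
    print(majority_rule(0.2))  # negative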

4 / 38
Diagnostics of Classifiers

TP and TN are the correct guesses.
A good classifier should have large TP and TN and small (ideally zero) numbers for FP and FN.

                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)

5 / 38
Diagnostics of Classifiers: example

Consider a testing set of 100 emails (with their spam or non-spam labels known).
Below is an example confusion matrix of a k-nearest neighbor classifier used to predict whether each email is spam or not.

                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100
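For later reference, the same counts can be written down directly in Python (spam is treated as the positive class):

    TP, FN = 3, 8    # actual spam:     predicted spam / predicted non-spam
    FP, TN = 2, 87   # actual non-spam: predicted spam / predicted non-spam

    print(TP + FN)            # 11 actual spam emails
    print(FP + TN)            # 89 actual non-spam emails
    print(TP + TN + FP + FN)  # 100 emails in the testing set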

6 / 38
Diagnostics of Classifiers

The accuracy (or the overall success rate) is a metric defining the rate at which a model has classified the records correctly.
It is defined as the sum of TP and TN divided by the total number of instances:

    Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
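A minimal sketch of this definition in Python, checked against the spam confusion matrix above:

    def accuracy(tp, tn, fp, fn):
        # overall success rate, as a percentage
        return (tp + tn) / (tp + tn + fp + fn) * 100

    print(accuracy(3, 87, 2, 8))  # 90.0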

7 / 38
Diagnostics of Classifiers

A good model should have a high accuracy score, but a high accuracy score alone does not guarantee that the model performs well.
We will introduce more fine-grained measures to better evaluate the performance of a classifier.

8 / 38
Diagnostics of Classifiers

The true positive rate (TPR) shows the proportion of positive instances the classifier correctly identified:

    TPR = TP / (TP + FN)

                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)
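A short sketch of the TPR, evaluated on the spam example from the earlier slide:

    def tpr(tp, fn):
        # proportion of actual positives that were predicted positive
        return tp / (tp + fn)

    print(tpr(3, 8))  # 0.2727...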

9 / 38
Diagnostics of Classifiers

The false positive rate (FPR) shows what percent of negatives the classifier marked as positive.
The FPR is also called the false alarm rate or the type I error rate.

    FPR = FP / (FP + TN)
                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)
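A short sketch of the FPR, again using the spam example counts:

    def fpr(fp, tn):
        # proportion of actual negatives that were (wrongly) predicted positive
        return fp / (fp + tn)

    print(fpr(2, 87))  # 0.0224...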

10 / 38
Diagnostics of Classifiers

The false negative rate (FNR) shows what percent of positives the classifier marked as negative.
It is also known as the miss rate or type II error rate.

    FNR = FN / (TP + FN)

                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)
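A short sketch of the FNR for the spam example counts:

    def fnr(tp, fn):
        # proportion of actual positives that were (wrongly) predicted negative
        return fn / (tp + fn)

    print(fnr(3, 8))  # 0.7272...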

11 / 38
Diagnostics of Classifiers

Precision is the percentage of instances marked positive that really are positive:

    Precision = TP / (TP + FP)

                              Predicted Class
                              Positive                 Negative
Actual Class   Positive       True Positives (TP)      False Negatives (FN)
               Negative       False Positives (FP)     True Negatives (TN)
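A short sketch of precision for the spam example counts:

    def precision(tp, fp):
        # proportion of predicted positives that are actually positive
        return tp / (tp + fp)

    print(precision(3, 2))  # 0.6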

12 / 38
Diagnostics of Classifiers

A well-performing model should have a high TPR (ideally 1) and a low FPR and FNR (ideally 0).
In reality, it is rare to have TPR = 1, FPR = 0, and FNR = 0, but these measures are useful for comparing the performance of multiple models designed to solve the same problem.
Note that, in general, which model is preferable may depend on the business situation.

13 / 38
Diagnostics of Classifiers

During the discovery phase of the data analytics lifecycle, the team should have learned from the business what kind of errors can be tolerated.
Some business situations are more tolerant of type I errors, whereas others may be more tolerant of type II errors.

14 / 38
Diagnostics of Classifiers

Consider the example of e-mail spam filtering.
Some people (such as busy executives) only want important e-mail in their inbox and are tolerant of having some less important e-mail end up in their spam folder, as long as no spam is in their inbox.
In this case, a higher false positive rate (FPR) or type I error can be tolerated.

15 / 38
Diagnostics of Classifiers

Other people may not want any important or less important e-mail to be classified as spam, and are willing to have some spam in their inboxes as long as no important e-mail makes it into the spam folder.
In this case, a higher false negative rate (FNR) or type II error can be tolerated.

16 / 38
Diagnostics of Classifiers

Another example involves medical screening during an infectious disease outbreak.
The cost of diagnosing a person who has the disease as disease-free is extremely high, since the disease may be highly contagious.
Therefore, the false negative rate (FNR) or type II error needs to be low.
A higher false positive rate (FPR) or type I error can be tolerated.

17 / 38
Diagnostics of Classifiers

A third example involves security screening at the airport.
The cost of a false negative in this scenario is extremely high (not detecting a bomb being brought onto a plane could result in hundreds of deaths), whilst the cost of a false positive is relatively low (a reasonably simple further inspection).
Therefore, a higher false positive rate (FPR) or type I error can be tolerated, in order to keep the false negative rate (FNR) or type II error low.

18 / 38
Diagnostics of Classifiers: example

    Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
             = (3 + 87) / (3 + 87 + 2 + 8) × 100% = 90%
                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100

19 / 38
Diagnostics of Classifiers: example

    TPR = TP / (TP + FN) = 3 / (3 + 8) ≈ 0.273
                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100

20 / 38
Diagnostics of Classifiers: example

    FPR = FP / (FP + TN) = 2 / (2 + 87) ≈ 0.022
                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100

21 / 38
Diagnostics of Classifiers: example

    FNR = FN / (TP + FN) = 8 / (3 + 8) ≈ 0.727
                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100

22 / 38
Diagnostics of Classifiers: example

    Precision = TP / (TP + FP) = 3 / (3 + 2) = 0.6
                              Predicted Class
                              Spam      Non-Spam      Total
Actual Class   Spam           3         8             11
               Non-Spam       2         87            89
               Total          5         95            100
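Pulling the worked example together, a short self-contained check of the five numbers computed above:

    TP, FN, FP, TN = 3, 8, 2, 87

    print((TP + TN) / (TP + TN + FP + FN) * 100)   # accuracy:  90.0
    print(TP / (TP + FN))                          # TPR:       0.2727...
    print(FP / (FP + TN))                          # FPR:       0.0224...
    print(FN / (TP + FN))                          # FNR:       0.7272...
    print(TP / (TP + FP))                          # precision: 0.6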

23 / 38
Diagnostics of Classifiers

We have studied a number of measures that can be used to evaluate the performance of a classifier.
In practice, when we are presented with a dataset, how should we go about estimating these performance measures?
A common practice is to perform N-Fold Cross-Validation.

24 / 38
Diagnostics of Classifiers

The entire dataset is randomly split into N datasets of approximately equal size.
N − 1 of these datasets are treated as the training dataset, while the remaining one is the test dataset; a measure of the model error is obtained.
This process is repeated across the various combinations of N datasets taken N − 1 at a time.
The N observed model errors are averaged across the N folds.
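A minimal, self-contained Python sketch of this procedure for a 1-nearest neighbor classifier; the data below are made-up placeholders rather than the 10-point dataset used on the later slides:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 2))        # 10 made-up points with 2 features each
    y = rng.integers(0, 2, size=10)     # made-up binary labels (1 = spam, 0 = non-spam)

    def one_nn_predict(X_train, y_train, X_test):
        # For each test point, copy the label of its nearest training point.
        dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
        return y_train[np.argmin(dists, axis=1)]

    def n_fold_cv_accuracy(X, y, n_folds):
        indices = rng.permutation(len(y))                 # random split into N folds
        folds = np.array_split(indices, n_folds)
        fold_accuracies = []
        for i in range(n_folds):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            y_pred = one_nn_predict(X[train_idx], y[train_idx], X[test_idx])
            fold_accuracies.append(np.mean(y_pred == y[test_idx]))
        return np.mean(fold_accuracies)                   # average accuracy over the N folds

    print(n_fold_cv_accuracy(X, y, n_folds=2))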

25 / 38
Diagnostics of Classifiers

26 / 38
Example: Anti-spam techniques

Let us illustrate N-Fold Cross-Validation with an example using the k-nearest neighbor classifier for spam, where we specify k = 1.
Suppose our dataset consists of 10 data points.

27 / 38
Diagnostics of Classifiers

For 2-fold cross-validation, we randomly split the whole dataset of 10 points into two datasets of 5 points each.

28 / 38
Example: Anti-spam techniques

29 / 38
Diagnostics of Classifiers

For the first iteration, we use the first dataset as the training
set and the second dataset as the testing set.

30 / 38
Example: Anti-spam techniques

31 / 38
Example: Anti-spam techniques

32 / 38
Diagnostics of Classifiers

In this iteration, we estimate the accuracy of the 1-nearest neighbor algorithm to be equal to 4/5.

33 / 38
Diagnostics of Classifiers

For the second iteration, we use the second dataset as the training set and the first dataset as the testing set.

34 / 38
Example: Anti-spam techniques

35 / 38
Example: Anti-spam techniques

36 / 38
Diagnostics of Classifiers

In this iteration, we estimate the accuracy of the 1-nearest neighbor algorithm to be equal to 3/5.

37 / 38
Diagnostics of Classifiers

Therefore, based on 2-fold cross-validation, the accuracy of the 1-nearest neighbor algorithm is estimated to be

    (4/5 + 3/5) / 2 = 7/10.

We will continue with more examples next week to ground these ideas.

38 / 38
