DSA1101 2019 Week4 Part1
DSA1101
Semester 1, 2019/2020
Week 4
Diagnostics of Classifiers
                        Predicted Class
                        Positive                Negative
Actual Class  Positive  True Positives (TP)     False Negatives (FN)
              Negative  False Positives (FP)    True Negatives (TN)
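
As a minimal sketch of how the four cells of the table could be counted, the Python snippet below tallies TP, FN, FP and TN from paired lists of actual and predicted labels. The function name `confusion_counts` and the six example labels are hypothetical and are not taken from the lecture data.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, FN, FP and TN for a two-class problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # actual positive, predicted positive
        elif a == positive:
            fn += 1          # actual positive, predicted negative
        elif p == positive:
            fp += 1          # actual negative, predicted positive
        else:
            tn += 1          # actual negative, predicted negative
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical labels for six emails (illustration only).
actual = ["spam", "spam", "non-spam", "non-spam", "non-spam", "spam"]
predicted = ["spam", "non-spam", "non-spam", "spam", "non-spam", "spam"]

counts = confusion_counts(actual, predicted, positive="spam")
print(counts)  # {'TP': 2, 'FN': 1, 'FP': 1, 'TN': 2}
```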
Diagnostics of Classifiers: example
                        Predicted Class
                        Spam    Non-Spam    Total
Actual Class  Spam         3           8       11
              Non-Spam     2          87       89
              Total        5          95      100
$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% = \frac{3 + 87}{3 + 87 + 2 + 8} \times 100\% = 90\%
$$
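
The accuracy calculation above can be reproduced with a few lines of Python. This is only a sketch: the function name `accuracy` is an assumption, and the counts are read off the spam table (TP = 3, FN = 8, FP = 2, TN = 87).

```python
def accuracy(tp, tn, fp, fn):
    """Percentage of all cases that were classified correctly."""
    return 100 * (tp + tn) / (tp + tn + fp + fn)


# Counts from the spam example: TP = 3, TN = 87, FP = 2, FN = 8.
spam_accuracy = accuracy(tp=3, tn=87, fp=2, fn=8)
print(f"{spam_accuracy:.1f}%")  # 90.0%
```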
$$
\text{TPR} = \frac{TP}{TP + FN} = \frac{3}{3 + 8} \approx 0.273
$$
$$
\text{FPR} = \frac{FP}{FP + TN} = \frac{2}{2 + 87} \approx 0.022
$$
$$
\text{FNR} = \frac{FN}{TP + FN} = \frac{8}{3 + 8} \approx 0.727
$$
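
Because TPR and FNR share the denominator TP + FN (the number of actual positives), the two rates always sum to one. A quick check with the spam counts:

$$
\text{TPR} + \text{FNR} = \frac{TP}{TP + FN} + \frac{FN}{TP + FN} = \frac{3}{11} + \frac{8}{11} = 1
$$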
$$
\text{Precision} = \frac{TP}{TP + FP} = \frac{3}{3 + 2} = 0.6
$$
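
The four rates worked out for the spam example (TPR, FPR, FNR and precision) can be computed together from the confusion-matrix counts. The sketch below assumes a hypothetical helper named `rates`; the counts come from the spam table.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, FPR, FNR and precision from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),        # share of actual positives that were caught
        "FPR": fp / (fp + tn),        # share of actual negatives that were flagged
        "FNR": fn / (tp + fn),        # share of actual positives that were missed
        "Precision": tp / (tp + fp),  # share of predicted positives that are correct
    }


# Counts from the spam example: TP = 3, FN = 8, FP = 2, TN = 87.
spam_rates = rates(tp=3, fn=8, fp=2, tn=87)
for name, value in spam_rates.items():
    print(f"{name}: {value:.3f}")
# TPR: 0.273
# FPR: 0.022
# FNR: 0.727
# Precision: 0.600
```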
Example: Anti-spam techniques
Let us illustrate N-fold cross-validation with an example using the
k-nearest neighbor classifier for spam, where we specify k = 1.
Suppose our dataset consists of 10 data points.
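
One way to picture the N-fold split of these 10 data points is sketched below in Python. The helper names `make_folds` and `train_test_splits`, the choice of N = 5, and the use of contiguous folds (which presumes the data are already in random order) are all assumptions made for illustration; the slides do not specify them.

```python
def make_folds(n_points, n_folds):
    """Split indices 0..n_points-1 into n_folds contiguous, near-equal folds."""
    base, extra = divmod(n_points, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first folds
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_points, n_folds):
    """Yield (train_indices, test_indices); each fold is the test set exactly once."""
    folds = make_folds(n_points, n_folds)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 10 data points as in the example; N = 5 is an assumed value.
splits = list(train_test_splits(n_points=10, n_folds=5))
for train, test in splits:
    print("train:", train, "test:", test)
```

Every data point appears in a test set exactly once, so each point contributes to the performance estimate without ever being used to train the model that classifies it.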
For the first iteration, we use the first dataset as the training
set and the second dataset as the testing set.
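
A single iteration of this kind might look as follows in Python, assuming the 10 data points are split into two halves. The feature values, the labels and the function name `predict_1nn` are hypothetical; the lecture's actual data points are not reproduced here.

```python
import math


def predict_1nn(train_points, train_labels, query):
    """Return the label of the training point closest to query (k = 1)."""
    distances = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))
        for point in train_points
    ]
    nearest = distances.index(min(distances))
    return train_labels[nearest]


# Hypothetical dataset of 10 emails, each described by two numeric features.
points = [(1.0, 1.2), (0.8, 0.9), (4.1, 3.9), (3.8, 4.2), (1.1, 0.7),
          (4.0, 4.4), (0.9, 1.3), (3.9, 3.7), (1.3, 1.0), (4.3, 4.0)]
labels = ["non-spam", "non-spam", "spam", "spam", "non-spam",
          "spam", "non-spam", "spam", "non-spam", "spam"]

# First iteration: first half for training, second half for testing.
train_points, train_labels = points[:5], labels[:5]
test_points, test_labels = points[5:], labels[5:]

predictions = [predict_1nn(train_points, train_labels, q) for q in test_points]
correct = sum(p == t for p, t in zip(predictions, test_labels))
test_accuracy = 100 * correct / len(test_labels)
print(predictions)
print(f"accuracy on the test set: {test_accuracy:.0f}%")
```

In the following iteration the roles of the two subsets would be swapped, and the performance measures from all iterations are then averaged.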