Classification Metrics Mod 6
The observed response (output) y has two possible values: +/− (equivalently, True/False).
Classification requires defining the relationship between the model output h(x) and y.
This is done with a decision rule, which converts h(x) into a predicted label.
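A decision rule can be sketched as a simple thresholding function. This is a minimal illustration, assuming a hypothetical real-valued score h(x) and a threshold of 0.5 (neither is specified in the notes):

```python
def decision_rule(score, threshold=0.5):
    """Map a real-valued score h(x) to a class label via a threshold."""
    return "+" if score >= threshold else "-"

# Hypothetical scores produced by some model h(x)
print(decision_rule(0.73))  # above the threshold -> "+"
print(decision_rule(0.20))  # below the threshold -> "-"
```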
Definitions
Instances: The objects of interest in machine learning.
Instance Space: The set of all possible instances. For example, the set of all
possible e-mails.
Label Space: The set of all possible labels, used in supervised learning to label examples.
Model: A mapping from the instance space to the output space.
In classification, the output space is a set of classes.
In regression, it is the set of real numbers.
To learn a model, a training set of labeled instances (x, l(x)), also called
examples, is needed.
Created by Turbolearn AI
Key terms:
True Positive (TP): The classifier correctly predicts a spam email as spam.
False Negative (FN): The classifier incorrectly predicts a spam email as non-
spam (a miss).
False Positive (FP): The classifier incorrectly predicts a non-spam email as
spam (a false alarm).
True Negative (TN): The classifier correctly predicts a non-spam email as non-
spam.
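The four outcomes above can be tallied from paired true labels and predictions. A minimal sketch with made-up spam/non-spam data:

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Tally TP, FN, FP, TN for a binary problem."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fn, fp, tn

# Made-up labels: "spam" is the positive class, "ham" the negative class.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
print(confusion_counts(y_true, y_pred))  # (2, 1, 1, 1)
```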
Sensitivity (True Positive Rate): Measure of positive examples labeled as
positive by the classifier. Should be higher. (Out of all the positive examples,
how many are correctly classified.)

Sensitivity = TP / (TP + FN)
Specificity (True Negative Rate): Measure of negative examples labeled as
negative by the classifier. Should be higher. (Out of all the negative examples,
how many are accurately classified.)
For instance, the proportion of emails labeled non-spam among all non-spam
emails.

Specificity = TN / (TN + FP) = 30 / (30 + 5) ≈ 0.86
Accuracy: Proportion of the total number of predictions that are correct. (Out of
all the examples, how many are correctly classified.)

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (45 + 30) / (45 + 30 + 5 + 20) = 0.75
Precision: Ratio of correctly classified positive examples to the total number of
examples predicted as positive.
Shows correctness achieved in positive prediction (out of all the examples
we predicted as positive, how many are actually positive).
High precision indicates that an example labeled as positive is indeed
positive (a small number of FPs).

Precision = TP / (TP + FP) = 45 / (45 + 5) = 0.9
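The metric values above can be checked directly from the running example's counts (TP = 45, FN = 20, FP = 5, TN = 30):

```python
tp, fn, fp, tn = 45, 20, 5, 30  # counts from the running example

specificity = tn / (tn + fp)                 # 30 / 35
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 75 / 100
precision = tp / (tp + fp)                   # 45 / 50

print(round(specificity, 3))  # 0.857
print(accuracy)               # 0.75
print(precision)              # 0.9
```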
The last column and the last row give the marginals (i.e., column and row sums).
1. Coverage Plot
A coverage plot visualizes the four numbers (number of positives Pos, number of
negatives Neg, number of true positives TP, and number of false positives FP) using a
rectangular coordinate system and a point. In a coverage plot, classifiers with the
same accuracy are connected by line segments with slope 1.
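The slope-1 property follows because accuracy = (TP + TN) / (Pos + Neg) and TN = Neg − FP, so equal accuracy means equal TP − FP. A quick numerical check with made-up counts:

```python
pos, neg = 50, 50  # made-up class totals

def acc(tp, fp):
    """Accuracy from a point (FP, TP) in coverage space."""
    tn = neg - fp
    return (tp + tn) / (pos + neg)

# Two points on a slope-1 line in (FP, TP) space: TP - FP is constant (30).
print(acc(tp=40, fp=10))  # 0.8
print(acc(tp=45, fp=15))  # 0.8 -- same accuracy
```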
2. ROC Curves
An ROC curve (receiver operating characteristic curve) is a graph showing
the performance of a classification model at all classification thresholds.
An ROC curve plots TPR vs. FPR at different classification thresholds. Lowering the
classification threshold classifies more items as positive, thus increasing both False
Positives and True Positives.
Example:
Hypothetical Data:
TPR = TP / (TP + FN) = 4 / (4 + 1) = 0.8

FPR = FP / (FP + TN) = 0 / (0 + 5) = 0
AUC Curve
AUC stands for "Area Under the ROC Curve." AUC measures the entire
two-dimensional area underneath the ROC curve from (0,0) to (1,1).
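The area can be approximated with the trapezoidal rule over (FPR, TPR) points sorted by FPR. A minimal sketch using hypothetical ROC points (not taken from the notes):

```python
def auc(points):
    """Trapezoidal area under an ROC curve given (fpr, tpr) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical ROC points from (0,0) to (1,1)
roc = [(0.0, 0.0), (0.0, 0.8), (0.4, 0.9), (1.0, 1.0)]
print(auc(roc))  # ≈ 0.91
```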
Definition:
A binary (ordinary) classifier uses a function that assigns to a sample x a class
label ŷ:
ŷ = f (x)
1. Sum Squared Error (SSE): Square the individual error terms (the differences
between the estimated values and the actual values), which makes every term
non-negative, and sum them.
2. Mean Squared Error (MSE): Measures the average of the squares of the errors.
The average squared difference between the estimated values and the
actual value (take the average, or the mean, of the individual squared error
terms).
3. Brier Score:
Definition of error in probability estimates, used in forecasting theory.
f is the probability that was forecast; the outcome is scored as 1 if the
event occurs and 0 otherwise.
If the forecast is 70% (P = 0.70) and it does not rain, then the Brier
Score is (0.70 − 0)² = 0.49.
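The three error measures can be computed directly. A minimal sketch with made-up regression values, reproducing the 0.49 Brier value from the rain example:

```python
actual = [3.0, 5.0, 2.5, 7.0]     # made-up actual values
estimated = [2.5, 5.0, 4.0, 8.0]  # made-up model estimates

# SSE: sum of squared differences; MSE: its average.
sse = sum((a - e) ** 2 for a, e in zip(actual, estimated))
mse = sse / len(actual)
print(sse, mse)  # 3.5 0.875

# Brier score: mean squared gap between forecast probabilities
# and 0/1 outcomes (1 = event occurred).
forecasts = [0.70]
outcomes = [0]  # it did not rain
brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
print(round(brier, 2))  # 0.49
```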
Empirical Probability
Empirical probability uses the number of occurrences of an outcome
within a sample set as a basis for determining the probability of that
outcome.
The number of times "event X" happens out of 100 trials, divided by 100,
estimates the probability of event X happening.
The empirical probability of an event is the ratio of the number of outcomes in
which a specified event occurs to the total number of trials.
Empirical probability (experimental probability) estimates probabilities from
experience and observation.
Example: In a buffet, 95 out of 100 people chose to order coffee over tea. What
is the empirical probability of someone ordering tea?
Answer: The empirical probability of someone ordering tea is 5/100 = 0.05.
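The arithmetic can be checked in code, using the counts from the buffet example:

```python
trials = 100
tea_orders = trials - 95  # 95 of the 100 people chose coffee

# Empirical probability: occurrences of the event / total trials
empirical_p_tea = tea_orders / trials
print(empirical_p_tea)  # 0.05
```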