The document discusses techniques for assessing the performance of data mining models, including R-squared for linear regression, AUC and the Gini coefficient for categorical response models, and confusion matrices. Higher R-squared, AUC, and Gini values indicate better model performance.

Section 5: Measuring Models

1
Main objective of this session

Aim

• To present different techniques used to assess the performance of data mining models.

Learning outcomes

1. Understand how the performance of linear regression models is measured: R-squared.

2. Understand how the discriminatory power of scorecards is measured: ROC and AUROC, and the Gini coefficient.
2
1.4 SEMMA Process

SAMPLE
• Acquire an unbiased sample of the data which describes the situation.
• Define the "target" variables which capture the response of the situation.

EXPLORE
• For each variable: get a feel for typical values.
• Outlier detection.
• Inter-relationships between different variables.

MODIFY
• Data adjustment.
• Outlier treatments.
• Adjust/take functions of the data to put it in its most useful form.

MODEL
• Statistical techniques and models on the data to undertake the required data mining task.

ASSESS
• Determine how well the model fits the data.
• What confidence you should have in the results obtained (measurement function).

3
6.1 Assessing linear models

• Coefficient of determination R2

– Says how much of the variation in y is explained by the model.

– Is a goodness-of-fit measure.

• Use the adjusted R2 for multivariate linear models.

• R2 is a number between 0 and 1.

– High values mean that most of the variation in y is explained by the model; low values mean that little variation is explained in this way.
4
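As a minimal sketch (with made-up data), R2 and the adjusted R2 for a simple least-squares fit can be computed directly from the residual and total sums of squares:

```python
import numpy as np

# Hypothetical example: simulated data, not from the slides.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0.0, 2.0, size=50)

# Least-squares fit y ~ a*x + b
a, b = np.polyfit(x, y, 1)
y_hat = a * x + b

ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot                 # coefficient of determination

n, p = len(y), 1                         # n observations, p predictors
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

The adjusted R2 penalises the plain R2 for the number of predictors, which is why it is the preferred figure for multivariate models.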
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– Given a cut-off, the model can make the classification. Example, credit card fraud detection: we have a model that assigns a probability P(Y = 1 | X1, ..., Xn) that a credit card transaction is fraudulent. Transactions scoring above the cut-off are classified as "Fraud"; the rest as "No Fraud".
5
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– Confusion Matrix: given a particular cut-off

                                            True values
  Test result (from the model)   FRAUD                 NO FRAUD               TOTAL
  Model said "Fraud"             TRUE POSITIVE (TP)    FALSE POSITIVE (FP)    TOTAL POSITIVES
  Model said "No Fraud"          FALSE NEGATIVE (FN)   TRUE NEGATIVE (TN)     TOTAL NEGATIVES
  TOTAL                          TOTAL FRAUDS          TOTAL NO FRAUDS        TOTAL CASES

– True Positive: correct detection that the event has happened.

6


6.2 Assessing categorical response models – scorecards

• Discriminatory power

– From the same confusion matrix, given a particular cut-off:

   Classification accuracy = (TP + TN) / (TOTAL CASES)

   Error rate = (FP + FN) / (TOTAL CASES)

7
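A small sketch (labels and scores are made up for illustration) of building the confusion matrix counts for one cut-off and deriving accuracy and error rate:

```python
# Hypothetical data: 1 = fraud, 0 = no fraud.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3, 0.5, 0.95]
cut_off = 0.5

# Classify: score at or above the cut-off -> "Fraud" (1).
y_pred = [1 if s >= cut_off else 0 for s in scores]

# The four cells of the confusion matrix.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)       # (TP+TN) / TOTAL CASES
error_rate = (fp + fn) / len(y_true)     # (FP+FN) / TOTAL CASES
```

By construction the two measures always sum to 1, since every case is either classified correctly or incorrectly.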
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– For each cut-off we obtain a new confusion matrix and one pair of (sensitivity, 1 - specificity), where sensitivity = TP / (TP + FN) and 1 - specificity = FP / (FP + TN). Sweeping the cut-off over the range of P(Y = 1 | X1, ..., Xn) produces all such pairs.
8
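The sweep above can be sketched as follows (same made-up labels and scores as before; one ROC point per distinct cut-off):

```python
# Hypothetical data: 1 = fraud, 0 = no fraud.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3, 0.5, 0.95]

roc_points = []
for cut_off in sorted(set(scores), reverse=True):
    y_pred = [1 if s >= cut_off else 0 for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    one_minus_specificity = fp / (fp + tn)
    roc_points.append((one_minus_specificity, sensitivity))
```

Plotting `roc_points` with 1 - specificity on the x-axis and sensitivity on the y-axis gives the ROC curve of the next slide.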
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– Plotting all (sensitivity, 1 - specificity) pairs, we obtain the Receiver Operating Characteristic (ROC) curve.

– The diagonal represents a "random" classifier model.

[Figure: ROC curve, sensitivity against 1 - specificity]
9
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– Interpretation of the ROC and AUROC

 • The Area Under the ROC curve (AUROC, or C) is a measure of discriminatory power. Notice that 0.5 ≤ AUROC ≤ 1.

 • An intuitive interpretation of the AUROC is that it provides an estimate of the probability that a randomly chosen instance of class 1 is correctly ranked higher than a randomly chosen instance of class 0 (Hanley and McNeil, 1983).

 • The higher, the better!

10
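The pairwise-ranking interpretation above can be turned directly into a sketch of the AUROC computation (ties count one half; data is made up):

```python
import itertools

# Hypothetical data: 1 = class 1 (fraud), 0 = class 0 (no fraud).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3, 0.5, 0.95]

pos = [s for s, t in zip(scores, y_true) if t == 1]
neg = [s for s, t in zip(scores, y_true) if t == 0]

# Proportion of (class-1, class-0) pairs ranked correctly.
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
           for p, n in itertools.product(pos, neg))
auroc = wins / (len(pos) * len(neg))
```

This pair-counting estimate coincides with the area obtained by integrating under the empirical ROC curve.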
6.2 Assessing categorical response models – scorecards

• Discriminatory power

[Figure: three ROC curves with AUROC = 0.74, AUROC = 0.85 and AUROC = 0.94]

– Models with AUROC larger than 0.7 are acceptable (Mays, 2004).

– Models with AUROC larger than 0.95: warning, "too good to be true".

11
6.2 Assessing categorical response models – scorecards

• Discriminatory power

– The Gini coefficient:

   GINI = 2 × (area between the ROC curve and the diagonal)
        = (area between the ROC curve and the random scorecard curve) / (area between the perfect scorecard curve and the random scorecard curve)

 • If GINI = 1, then perfect discrimination.

 • If GINI = 0, then no discrimination.

 • Relationship between AUROC and GINI:

   GINI = 2(AUROC − 0.5) = 2 AUROC − 1

[Figure: scorecard curves plotting F(s | B) against F(s | G), with areas A, B, C]

12
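Given the relationship above, the Gini coefficient follows immediately from any AUROC value (the AUROC below is made up for illustration):

```python
# Hypothetical AUROC value, e.g. from the pair-counting estimate.
auroc = 0.85

# GINI = 2(AUROC - 0.5) = 2*AUROC - 1
gini = 2 * auroc - 1

# Sanity checks on the two extremes:
# a random scorecard (AUROC = 0.5) gives GINI = 0;
# a perfect scorecard (AUROC = 1.0) gives GINI = 1.
```

Because the mapping is linear, ranking models by Gini and ranking them by AUROC always give the same order.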