
Model Evaluation and Selection
• One would like an estimate of how accurately a classifier can classify data on which it has not been trained.
• What if we have more than one classifier and want to choose the best one?
• These questions are answered through model evaluation and selection.
• What is accuracy?
• How can we estimate it?
• Are some measures of a classifier’s accuracy more appropriate than others?
Terminology-Introduction
• Positive tuples: (tuples of the main class of interest)
• Negative tuples: (all other tuples).
• Example,
• the positive tuples may be buys_computer=yes while the negative tuples are
buys_computer=no.
• Let P be the number of positive tuples and N the number of negative tuples.
Terminology-Introduction
• True positives (TP): These refer to the positive tuples that were correctly
labeled by the classifier. Let TP be the number of true positives.
• True negatives (TN): These are the negative tuples that were correctly labeled
by the classifier. Let TN be the number of true negatives.
• False positives (FP): These are the negative tuples that were incorrectly labeled
as positive (e.g., tuples of class buys_computer =no for which the classifier
predicted buys_computer=yes). Let FP be the number of false positives.
• False negatives (FN): These are the positive tuples that were incorrectly labeled
as negative (e.g., tuples of class buys_computer=yes for which the classifier
predicted buys_computer=no). Let FN be the number of false negatives.
Model Evaluation Metrics
Confusion Matrix
• The confusion matrix is a useful tool for analyzing how well your
classifier can recognize tuples of different classes.
• TP and TN tell us when the classifier is getting things right, while FP
and FN tell us when the classifier is getting things wrong.
Confusion Matrix where m≥2
• Given m classes (where m ≥ 2), a confusion matrix is a table of at
least size m by m.
• An entry, CMi,j in the first m rows and m columns indicates the
number of tuples of class i that were labeled by the classifier as
class j.
• For a classifier to have good accuracy, ideally most of the tuples
would be represented along the diagonal of the confusion
matrix, from entry CM1,1 to entry CMm,m, with the rest of the
entries being zero or close to zero. That is, ideally, FP and FN are
around zero.
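To make the definitions above concrete, here is a minimal Python sketch (not from the original slides) that builds an m × m confusion matrix from lists of actual and predicted labels; the names y_true and y_pred are illustrative assumptions.

```python
def confusion_matrix(y_true, y_pred, classes):
    """cm[i][j] = number of tuples of actual class i labeled by the classifier as class j."""
    cm = {i: {j: 0 for j in classes} for i in classes}
    for actual, predicted in zip(y_true, y_pred):
        cm[actual][predicted] += 1
    return cm

# Hypothetical two-class example (positive class: "yes")
y_true = ["yes", "yes", "no", "no", "yes", "no"]
y_pred = ["yes", "no",  "no", "yes", "yes", "no"]
cm = confusion_matrix(y_true, y_pred, classes=["yes", "no"])

TP = cm["yes"]["yes"]   # positive tuples correctly labeled positive
FN = cm["yes"]["no"]    # positive tuples incorrectly labeled negative
FP = cm["no"]["yes"]    # negative tuples incorrectly labeled positive
TN = cm["no"]["no"]     # negative tuples correctly labeled negative
print(TP, FN, FP, TN)   # 2 1 1 2
```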
Accuracy
• The accuracy of a classifier on a given test set is the percentage of test set tuples that are correctly classified by the classifier. That is,

accuracy(M) = (TP + TN) / (P + N)

This is also known as the recognition rate of the classifier.
Error rate or misclassification rate
• We can also speak of the error rate or misclassification rate of a classifier, M, which is simply 1 - accuracy(M), where accuracy(M) is the accuracy of M.
• This can also be computed as

error rate(M) = (FP + FN) / (P + N)

• If we were to use the training set (instead of a test set) to estimate the error rate of a model, this quantity is known as the resubstitution error.
Class imbalance problem
• Accuracy (recognition rate) is most effective when the class distribution is relatively balanced, i.e., the main class of interest (the positive class) and the other classes (the negative class) are fairly evenly represented.
• It becomes an ineffective measure for imbalanced classes, i.e., situations where the main class of interest (the positive class) is rare and the negative class dominates.
• Examples:
• 1. In fraud detection applications, the class of interest, "fraudulent", occurs much less frequently than the negative class, "non-fraudulent".
• 2. In medical data, the class of interest, "cancerous", occurs much more rarely than the class "non-cancerous". With an accuracy of, say, 97%, the classifier might be correctly labeling only the negative class tuples.
• Therefore, other measures are required that depict separately how well the classifier recognizes positive tuples and how well it recognizes negative tuples.
Sensitivity and Specificity
• Sensitivity is also referred to as the true positive (recognition) rate (i.e., the proportion of positive tuples that are correctly identified), while specificity is the true negative rate (i.e., the proportion of negative tuples that are correctly identified). These measures are defined as

sensitivity = TP / P          specificity = TN / N

• It can be shown that accuracy is a function of sensitivity and specificity:

accuracy = sensitivity × (P / (P + N)) + specificity × (N / (P + N))
Example
• Find the sensitivity, specificity and accuracy for the following data:

Actual \ Predicted      yes       no     Total
yes (positive)           90      210       300
no  (negative)          140     9560      9700
Total                   230     9770    10,000

• The sensitivity of the classifier is 90/300 = 30.00%.
• The specificity is 9560/9700 = 98.56%.
• The classifier's overall accuracy is 9650/10,000 = 96.50%.
• Thus, we note that although the classifier has a high accuracy, its ability to correctly label the positive (rare) class is poor, given its low sensitivity.
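The arithmetic above can be checked with a few lines of Python; the counts below are taken from the confusion matrix of this example.

```python
TP, FN = 90, 210      # positive tuples: P = TP + FN = 300
FP, TN = 140, 9560    # negative tuples: N = FP + TN = 9700
P, N = TP + FN, FP + TN

sensitivity = TP / P              # 90/300     = 0.3000 -> 30.00%
specificity = TN / N              # 9560/9700  ≈ 0.9856 -> 98.56%
accuracy = (TP + TN) / (P + N)    # 9650/10000 = 0.9650 -> 96.50%

# accuracy expressed as a function of sensitivity and specificity
accuracy_check = sensitivity * P / (P + N) + specificity * N / (P + N)
print(round(sensitivity, 4), round(specificity, 4), round(accuracy, 4))
```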
Precision and Recall
• Precision is defined as what percentage of tuples labeled as positive
are actually such. It is a measure of exactness.
• Recall is defined as what percentage of positive tuples are labeled as
such. It is a measure of completeness
• These measures can be computed as

precision = TP / (TP + FP)
recall = TP / (TP + FN) = TP / P = sensitivity
• A perfect precision score of 1.0 for a class C means that every tuple
that the classifier labeled as belonging to class C does indeed belong to
class C.
• However, it does not tell us anything about the number of class C
tuples that the classifier mislabeled.
• A perfect recall score of 1.0 for C means that every item from class C
was labeled as such, but it does not tell us how many other tuples
were incorrectly labeled as belonging to class C.
• There tends to be an inverse relationship between precision and recall,
where it is possible to increase one at the cost of reducing the other.
Problem
• Find the precision and recall for the confusion matrix given in the previous example.
• The precision for the yes (positive) class is 90/230 = 39.13%.
• The recall is 90/300 = 30.00%.
F measure and Fβ measure
• An alternative way to use precision and recall is to combine them into a single measure.
• This is the approach of the F measure (also known as the F1 score or F-score) and the Fβ measure. They are defined as

F = (2 × precision × recall) / (precision + recall)
Fβ = ((1 + β²) × precision × recall) / (β² × precision + recall)

where β is a non-negative real number.
• The F measure is the harmonic mean of precision and recall. It gives equal weight to precision and recall.
• The Fβ measure is a weighted measure of precision and recall. It assigns β² times as much weight to recall as to precision. Commonly used Fβ measures are F2 (which weights recall twice as much as precision) and F0.5 (which weights precision twice as much as recall).
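A minimal sketch of these formulas in Python, reusing the counts from the earlier example (the helper names are illustrative, not from the slides):

```python
def precision(TP, FP):
    return TP / (TP + FP)

def recall(TP, FN):
    return TP / (TP + FN)

def f_beta(prec, rec, beta=1.0):
    """beta = 1 gives the F (harmonic-mean) measure; beta = 2 weights recall
    twice as much as precision; beta = 0.5 weights precision twice as much."""
    b2 = beta * beta
    return (1 + b2) * prec * rec / (b2 * prec + rec)

prec = precision(90, 140)   # 90/230 ≈ 0.3913
rec = recall(90, 210)       # 90/300 = 0.3000
print(round(f_beta(prec, rec), 4))            # F1  ≈ 0.3396
print(round(f_beta(prec, rec, beta=2), 4))    # F2
print(round(f_beta(prec, rec, beta=0.5), 4))  # F0.5
```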
Additional measures
In addition to accuracy-based measures, classifiers can also be compared with respect to the
following additional aspects:
• Speed: This refers to the computational costs involved in generating and using the given
classifier.
• Robustness: This is the ability of the classifier to make correct predictions given noisy data
or data with missing values. Robustness is typically assessed with a series of synthetic data
sets representing increasing degrees of noise and missing values.
• Scalability: This refers to the ability to construct the classifier efficiently given large
amounts of data. Scalability is typically assessed with a series of data sets of increasing size.
• Interpretability: This refers to the level of understanding and insight that is provided by the
classifier or predictor. Interpretability is subjective and therefore more difficult to assess.
Decision trees and classification rules can be easy to interpret, yet their interpretability may diminish as they become more complex.
Obtaining reliable classifier accuracy
estimates (I)
• The holdout method: the given data are randomly partitioned into two independent sets, a training set and a test set.
• Typically, two-thirds of the data are allocated to the training set, and the remaining one-third is allocated to the test set.
• The training set is used to derive the model. The model’s accuracy is then estimated with the test set.
• The estimate is pessimistic because only a portion of the initial data is used to derive the model.
• Random subsampling is a variation of the holdout method in which the holdout
method is repeated k times.
• The overall accuracy estimate is taken as the average of the accuracies obtained
from each iteration.
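A minimal sketch of the holdout split and its repetition as random subsampling, assuming a generic train_and_score(train, test) function that builds a model and returns its accuracy on the test set (a hypothetical placeholder, not a library call):

```python
import random

def holdout_split(data, train_fraction=2/3, seed=None):
    """Randomly partition the data into independent training and test sets."""
    rng = random.Random(seed)
    shuffled = data[:]          # copy so the original order is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def random_subsampling(data, train_and_score, k=10):
    """Repeat the holdout method k times and average the accuracy estimates."""
    scores = [train_and_score(*holdout_split(data, seed=i)) for i in range(k)]
    return sum(scores) / k
```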
Obtaining reliable classifier accuracy
estimates (II)
• In k-fold cross-validation, the initial data are randomly partitioned into k mutually exclusive subsets or “folds,” D1, D2, …, Dk, each of approximately equal size.
• Training and testing are performed k times.
• In iteration i, partition Di is reserved as the test set, and the remaining
partitions are collectively used to train the model.
• Unlike the holdout and random subsampling methods, here each sample is
used the same number of times for training and once for testing.
• For classification, the accuracy estimate is the overall number of correct
classifications from the k iterations, divided by the total number of tuples in
the initial data.
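A minimal k-fold cross-validation sketch in the same style; train_and_count_correct(train, test) is assumed to train a model and return how many test tuples it classifies correctly.

```python
import random

def k_fold_cross_validation(data, train_and_count_correct, k=10, seed=0):
    """Each fold serves exactly once as the test set; the accuracy estimate is
    the total number of correct classifications divided by the number of tuples."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]   # k roughly equal-sized folds

    correct_total = 0
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        correct_total += train_and_count_correct(train, test)
    return correct_total / len(shuffled)
```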
Obtaining reliable classifier accuracy
estimates (III)
• The bootstrap method samples the given training tuples uniformly with replacement.
• That is, each time a tuple is selected, it is equally likely to be selected
again and re-added to the training set.
• For instance, imagine a machine that randomly selects tuples for our
training set. In sampling with replacement, the machine is allowed to
select the same tuple more than once.
.632 bootstrap
• Suppose we are given a data set of d tuples.
• The data set is sampled d times, with replacement, resulting in a bootstrap
sample or training set of d samples.
• It is very likely that some of the original data tuples will occur more than once
in this sample.
• The data tuples that did not make it into the training set end up forming the
test set.
• Suppose we were to try this out several times. As it turns out, on average, 63.2% of the original data tuples will end up in the bootstrap sample, and the remaining 36.8% will form the test set (hence the name, .632 bootstrap).
• This is because each tuple is missed on a single draw with probability (1 - 1/d), so the probability that it never enters the sample after d draws is (1 - 1/d)^d ≈ e⁻¹ ≈ 0.368 for large d, leaving about 63.2% of the tuples in the sample.
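A minimal sketch of drawing a single bootstrap sample; running it on a reasonably large data set shows the ~63.2% / ~36.8% split emerge.

```python
import random

def bootstrap_sample(data, seed=None):
    """Sample len(data) tuples uniformly with replacement; the tuples that were
    never selected form the test set (items must be hashable here)."""
    rng = random.Random(seed)
    d = len(data)
    train = [data[rng.randrange(d)] for _ in range(d)]
    chosen = set(train)
    test = [x for x in data if x not in chosen]
    return train, test

data = list(range(10_000))
train, test = bootstrap_sample(data, seed=42)
print(len(set(train)) / len(data))   # ≈ 0.632 distinct tuples in the sample
print(len(test) / len(data))         # ≈ 0.368 left over for the test set
```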
Comparing Classifiers Based on Cost–Benefit
and ROC Curves
• The true positives, true negatives, false positives, and false negatives are also useful in
assessing the costs and benefits (or risks and gains) associated with a classification model.
• The cost associated with a false negative (such as incorrectly predicting that a cancerous
patient is not cancerous) is far greater than those of a false positive (incorrectly yet
conservatively labeling a noncancerous patient as cancerous).
• In such cases, we can weight one type of error more heavily than the other by assigning a different cost to each.
• These costs may consider the danger to the patient, financial costs of resulting therapies,
and other hospital costs.
• Similarly, the benefits associated with a true positive decision may be different than those
of a true negative. Up to now, to compute classifier accuracy, we have assumed equal
costs and essentially divided the sum of true positives and true negatives by the total
number of test tuples.
Comparing Classifiers Based on Cost–Benefit
and ROC Curves
• Receiver operating characteristic curves are a useful visual tool for
comparing two classification models.
• An ROC curve for a given model shows the trade-off between the true
positive rate (TPR) and the false positive rate (FPR).
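A minimal sketch of how the points of an ROC curve can be computed by sweeping a decision threshold over a classifier's continuous scores; the labels and scores below are made-up illustrative values.

```python
def roc_points(labels, scores):
    """labels: 1 = positive, 0 = negative; scores: higher means 'more positive'.
    Returns (FPR, TPR) pairs, one per candidate threshold."""
    P = sum(labels)
    N = len(labels) - P
    points = [(0.0, 0.0)]                       # curve starts at the origin
    for t in sorted(set(scores), reverse=True):
        TP = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        FP = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((FP / N, TP / P))         # (false positive rate, true positive rate)
    return points

labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.10]
print(roc_points(labels, scores))
```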
Increasing classifier Accuracy
• An ensemble for classification is a composite model, made up of a combination of classifiers.
• The individual classifiers vote, and a class label prediction is returned by the ensemble based on
the collection of votes.
• Ensembles tend to be more accurate than their component classifiers.
• Bagging, boosting, and random forests are examples of ensemble methods
Increasing classifier Accuracy
• An ensemble tends to be more accurate than its base classifiers.
• For example, consider an ensemble that performs majority voting.
• That is, given a tuple X to classify, it collects the class label predictions returned from the base classifiers and outputs the class that receives the majority of the votes.
• The base classifiers may make mistakes, but the ensemble will misclassify X only if
over half of the base classifiers are in error.
• Ensembles yield better results when there is significant diversity among the models.
• That is, ideally, there is little correlation among classifiers.
• The classifiers should also perform better than random guessing.
• Each base classifier can be allocated to a different CPU and so ensemble methods
are parallelizable.
Bagging-bootstrap aggregation
• Given a set, D, of d tuples, bagging works as follows. For iteration i (i=1, 2,… , k), a training
set, Di , of d tuples is sampled with replacement from the original set of tuples, D.
• Each training set is a bootstrap sample.
• Because sampling with replacement is used, some of the original tuples of D may not be
included in Di , whereas others may occur more than once.
• A classifier model, Mi , is learned for each training set, Di .
• To classify an unknown tuple, X, each classifier, Mi , returns its class prediction, which
counts as one vote.
• The bagged classifier, M*, counts the votes and assigns the class with the most votes to
X.
• Bagging can be applied to the prediction of continuous values by taking the average of the predictions made by the classifiers for a given test tuple.
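A minimal bagging sketch, assuming a generic train_model(dataset) that returns an object with a predict(x) method; these names are placeholders rather than a specific library API.

```python
import random
from collections import Counter

def bagging(data, train_model, k=10, seed=0):
    """Learn k component classifiers, each on a bootstrap sample Di of the data."""
    rng = random.Random(seed)
    d = len(data)
    models = []
    for _ in range(k):
        Di = [data[rng.randrange(d)] for _ in range(d)]   # sampled with replacement
        models.append(train_model(Di))
    return models

def bagged_predict(models, x):
    """Each component classifier casts one vote; return the majority class."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]
```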
Boosting
• In boosting, weights are also assigned to each training tuple.
• A series of k classifiers is iteratively learned.
• After a classifier, Mi , is learned, the weights are updated to allow the
subsequent classifier,Mi+1, to “pay more attention” to the training
tuples that were misclassified by Mi .
• The final boosted classifier, M*, combines the votes of each individual
classifier, where the weight of each classifier’s vote is a function of its
accuracy.
AdaBoost (Adaptive Boosting)
• AdaBoost is a popular boosting algorithm.
• Suppose we want to boost the accuracy of a learning method.
• We are given D, a data set of d class-labeled tuples, (X1, y1), (X2, y2),… , (Xd, yd), where yi
is the class label of tuple Xi.
• Initially, AdaBoost assigns each training tuple an equal weight of 1/d.
• Generating k classifiers for the ensemble requires k rounds through the rest of the
algorithm.
• In round i, the tuples from D are sampled to form a training set, Di , of size d.
• A classifier model, Mi , is derived from the training tuples of Di .
• Its error is then calculated using Di as a test set.
• The weights of the training tuples are then adjusted according to how they were
classified.
AdaBoost (Adaptive Boosting)
• If a tuple was incorrectly classified, its weight is increased.
• If a tuple was correctly classified, its weight is decreased.
• A tuple’s weight reflects how difficult it is to classify— the higher the weight,
the more often it has been misclassified.
• These weights will be used to generate the training samples for the classifier
of the next round.
• The basic idea is that when we build a classifier, we want it to focus more on
the misclassified tuples of the previous round.
• Some classifiers may be better at classifying some “difficult” tuples than
others.
• In this way, we build a series of classifiers that complement each other.
AdaBoost
• To compute the error rate of model Mi, we sum the weights of each of the tuples in Di that Mi misclassified:

error(Mi) = Σj wj × err(Xj)

where err(Xj) is the misclassification error of tuple Xj: if the tuple was misclassified, then err(Xj) is 1; otherwise, it is 0.
If the performance of classifier Mi is so poor that its error exceeds 0.5,
then we abandon it.
Instead, we try again by generating a new Di training set, from which
we derive a new Mi .
AdaBoost
• The error rate of Mi affects how the weights of the training tuples are
updated.
• If a tuple in round i was correctly classified, its weight is multiplied by error(Mi) / (1 - error(Mi)).
• Once the weights of all the correctly classified tuples are updated, the
weights for all tuples (including the misclassified ones) are normalized
so that their sum remains the same as it was before.
• To normalize a weight, we multiply it by the sum of the old weights,
divided by the sum of the new weights.
• As a result, the weights of misclassified tuples are increased and the
weights of correctly classified tuples are decreased, as described before.
AdaBoost
• Unlike bagging, where each classifier was assigned an equal vote, boosting
assigns a weight to each classifier’s vote, based on how well the classifier
performed.
• The lower a classifier’s error rate, the more accurate it is, and therefore, the
higher its weight for voting should be.
• The weight of classifier Mi’s vote is log((1 - error(Mi)) / error(Mi)).
• For each class, c, we sum the weights of each classifier that assigned class c
to X.
• The class with the highest sum is the “winner” and is returned as the class
prediction for tuple X.
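A minimal AdaBoost-style sketch of the weight updates and voting weights described above, assuming a generic train_model(sample) that returns an object with predict(x). For simplicity, the weighted error here is computed over the full data set D rather than the sampled set Di, so this is an illustration of the idea, not the exact algorithm in the slides.

```python
import math
import random

def adaboost(data, labels, train_model, k=10, seed=0):
    rng = random.Random(seed)
    d = len(data)
    weights = [1.0 / d] * d                 # every tuple starts with weight 1/d
    models, alphas = [], []

    for _ in range(k):
        # sample a training set of size d according to the current tuple weights
        idx = rng.choices(range(d), weights=weights, k=d)
        model = train_model([(data[i], labels[i]) for i in idx])

        # weighted error: sum of the weights of the misclassified tuples
        errs = [0.0 if model.predict(data[i]) == labels[i] else 1.0 for i in range(d)]
        error = sum(w * e for w, e in zip(weights, errs))
        if error >= 0.5:                    # too weak: skip (the slides retry with a new Di)
            continue
        error = max(error, 1e-10)           # avoid division by zero for a perfect round

        # correctly classified tuples get weight * error/(1 - error), then normalize
        for i in range(d):
            if errs[i] == 0.0:
                weights[i] *= error / (1.0 - error)
        total = sum(weights)
        weights = [w / total for w in weights]

        models.append(model)
        alphas.append(math.log((1.0 - error) / error))   # classifier's voting weight
    return models, alphas
```

To classify a new tuple, each class sums the voting weights (alphas) of the models that predict it, and the class with the highest sum is returned.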
Random Forests
• Imagine that each of the classifiers in the ensemble is a decision tree classifier, so that the collection of classifiers forms a “forest.”
• The individual decision trees are generated using a random selection
of attributes at each node to determine the split.
• More formally, each tree depends on the values of a random vector
sampled independently and with the same distribution for all trees in
the forest.
• During classification, each tree votes and the most popular class is
returned.
Forest-RC
• Another form of random forest, called Forest-RC, uses random linear combinations of
the input attributes.
• Instead of randomly selecting a subset of the attributes, it creates new attributes (or
features) that are a linear combination of the existing attributes.
• That is, an attribute is generated by specifying L, the number of original attributes to
be combined.
• At a given node, L attributes are randomly selected and added together with
coefficients that are uniform random numbers on [-1, 1].
• F linear combinations are generated, and a search is made over these for the best
split.
• This form of random forest is useful when there are only a few attributes available, so
as to reduce the correlation between individual classifiers.
Random Forests Vs AdaBoost
• Random forests are comparable in accuracy to AdaBoost, yet are
more robust to errors and outliers.
• The generalization error for a forest converges as long as the number
of trees in the forest is large.
• Thus, overfitting is not a problem. The accuracy of a random forest
depends on the strength of the individual classifiers and a measure of
the dependence between them.
• The ideal is to maintain the strength of individual classifiers without
increasing their correlation. Random forests are insensitive to the
number of attributes selected for consideration at each split.
Improving Classification Accuracy of
Class-Imbalanced Data
• Given two-class data, the data are class-imbalanced if the main class of interest
(the positive class) is represented by only a few tuples, while the majority of
tuples represent the negative class.
• The class imbalance problem is closely related to cost-sensitive learning, wherein
the costs of errors, per class, are not equal.
• In medical diagnosis, for example, it is much more costly to falsely diagnose a
cancerous patient as healthy (a false negative) than to misdiagnose a healthy
patient as having cancer (a false positive).
• A false negative error could lead to the loss of life and therefore is much more
expensive than a false positive error.
• Other applications involving class-imbalanced data include fraud detection, the
detection of oil spills from satellite radar images, and fault monitoring.
How to evaluate classifiers on imbalanced data?
• Sensitivity or recall (the true positive rate),
• specificity (the true negative rate),
• the F1 measure, and
• ROC curves, which plot sensitivity versus (1 - specificity) (i.e., the false positive rate),
are better suited than overall accuracy for assessing a classifier on class-imbalanced data.
How to improve the classification accuracy of class-imbalanced data?
General approaches for improving the classification accuracy of class-imbalanced data
include
(1) Oversampling
―Oversampling works by resampling the positive tuples so that the resulting training set
contains an equal number of positive and negative tuples.
―Example SMOTE algorithm
(2) Undersampling
―Undersampling works by decreasing the number of negative tuples. It randomly eliminates
tuples from the majority (negative) class until there are an equal number of positive and
negative tuples.
(3) threshold moving, and
(4) ensemble techniques
―As discussed previously
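A minimal sketch of plain random oversampling and undersampling (note this is not SMOTE, which synthesizes new positive tuples rather than duplicating existing ones); the positives/negatives lists are hypothetical.

```python
import random

def random_oversample(positives, negatives, seed=0):
    """Resample the minority (positive) tuples with replacement until balanced."""
    rng = random.Random(seed)
    extra = [rng.choice(positives) for _ in range(len(negatives) - len(positives))]
    return positives + extra, negatives

def random_undersample(positives, negatives, seed=0):
    """Randomly discard majority (negative) tuples until balanced."""
    rng = random.Random(seed)
    return positives, rng.sample(negatives, len(positives))
```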
Threshold-moving
• It applies to classifiers that, given an input tuple, return a continuous output value.
• That is, for an input tuple, X, such a classifier returns as output a mapping, f(X) ∈ [0, 1].
• Rather than manipulating the training tuples, this method returns a classification decision based on the output values. In the simplest approach, tuples for which f(X) ≥ t, for some threshold t, are considered positive, while all other tuples are considered negative (see the sketch below).
• Other approaches may involve manipulating the outputs by weighting.
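A minimal threshold-moving sketch, assuming the classifier exposes a continuous score f(X) in [0, 1] for each tuple (the scores list below is illustrative):

```python
def classify_with_threshold(scores, t=0.5):
    """Tuples whose score f(X) >= t are labeled positive ("yes"), all others
    negative ("no"). Lowering t favors recall of the rare positive class."""
    return ["yes" if s >= t else "no" for s in scores]

scores = [0.92, 0.40, 0.15, 0.55, 0.08]
print(classify_with_threshold(scores, t=0.5))   # ['yes', 'no', 'no', 'yes', 'no']
print(classify_with_threshold(scores, t=0.3))   # ['yes', 'yes', 'no', 'yes', 'no']
```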
