Model Validation and Perf Metrics - v2 - Noman - 08 - 06 - 24

Model validation for supervised learning in data science


Model Validation and Performance Metrics

1. Data Preparation: splitting the dataset into training, validation, and test sets.
2. Model Fitting: training multiple candidate models.
3. Model Evaluation:
   - For regression problems: Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) on the validation set for each model.
   - For classification problems: confusion matrix, accuracy, precision, recall (sensitivity), specificity, F1-score, etc. on the validation set.
4. Cross-Validation: performing cross-validation on the training set.
5. Final Model Selection and Evaluation: applying the best model to the test set.
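The splitting step above can be sketched in plain Python (the 60/20/20 ratios and the `split_dataset` helper are illustrative, not from the slides):

```python
import random

def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle and split data into train/validation/test lists."""
    rng = random.Random(seed)
    shuffled = data[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # remaining ~20%
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```

In practice a library routine (e.g. a stratified splitter) would be used, but the idea is the same: every example lands in exactly one of the three sets.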
Model Validation and Performance Metrics

• Mean Squared Error (MSE)

MSE = (1/n) × Σ (yᵢ − ŷᵢ)²

• MSE is a measure of the average of the squared errors, that is, the average squared difference between the estimated values and the actual values. It is a common measure of the estimation accuracy of a predictive model in regression tasks.

• Root Mean Squared Error (RMSE)

RMSE = √MSE = √[ (1/n) × Σ (yᵢ − ŷᵢ)² ]

• RMSE is the square root of the MSE. It is a widely used measure of the differences between values predicted by a model or an estimator and the values observed, expressed in the same units as the target. The RMSE represents the sample standard deviation of the differences between predicted and observed values.
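Both formulas can be computed directly; a small sketch in plain Python (the values are illustrative):

```python
import math

def mse(y_true, y_pred):
    """Mean of squared differences between actual and predicted values."""
    n = len(y_true)
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / n

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
print(mse(actual, predicted))   # (0.25 + 0 + 4 + 1) / 4 = 1.3125
print(rmse(actual, predicted))
```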
Model Validation and Performance Metrics
• Accuracy
• One of the most commonly used metrics for evaluating classification models. It measures the proportion of total correct predictions (both true positives and true negatives) out of all predictions made.
• Accuracy = Number of Correct Predictions / Total Number of Predictions
Or, using the terms of the confusion matrix:
• Accuracy = (TP + TN) / (TP + FP + FN + TN)
• Specificity
• Specificity measures the proportion of actual negatives that are correctly identified as such (e.g., in a medical context, the percentage of healthy people who are correctly identified as not having the condition). It is a key metric when the cost of false positives is high.
Specificity = True Negatives (TN) / (True Negatives (TN) + False Positives (FP))
• Recall (Sensitivity)
• Recall, also known as sensitivity, is the ratio of true positive predictions to the total actual positives. It answers the question: "Of all the actual positive instances, how many did we correctly classify as positive?"
Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
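These metrics can all be derived from raw labels; a sketch in plain Python (the labels and the `confusion_counts` helper are made up for illustration):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classification problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy    = (tp + tn) / (tp + tn + fp + fn)
specificity = tn / (tn + fp)
recall      = tp / (tp + fn)
print(tp, tn, fp, fn)                  # 3 3 1 1
print(accuracy, specificity, recall)   # 0.75 0.75 0.75
```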
Model Validation and Performance Metrics
• Precision
• Precision is the ratio of true positive predictions to the total positive predictions (including both true positives and false positives). It answers the question: "Of all instances classified as positive, how many are actually positive?"
Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))

• F1-Score
• The F1-score is the harmonic mean of precision and recall. It provides a single score that balances both precision and recall, which is particularly useful when you need to trade the two off, such as with imbalanced datasets.
F1-score = 2 × (Precision × Recall) / (Precision + Recall)
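Precision and the F1-score follow directly from the confusion-matrix counts; a sketch with illustrative numbers (not from the slides):

```python
def precision_f1(tp, fp, fn):
    """Precision and F1-score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, f1

precision, f1 = precision_f1(tp=8, fp=2, fn=4)
print(precision)  # 8 / 10 = 0.8
print(f1)         # harmonic mean of precision 0.8 and recall 8/12
```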
Model Validation and Performance Metrics
Logistic Regression Training and Thresholds

Logistic regression curve; points above the chosen threshold are predicted as obese

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.5

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.1 (for example)

Think about an infectious disease: here it is very important to correctly predict all the "yes" (infected) cases, so a low threshold that favours recall may be preferred.

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.9 (for example)

This is certainly better than 0.5 here, but which threshold is the best?

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
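The threshold trade-off can be sketched in plain Python (the probabilities below are made up, not the video's data): lowering the threshold raises recall at the cost of specificity, and vice versa.

```python
def classify(probs, threshold):
    """Turn predicted probabilities into 0/1 labels at a given threshold."""
    return [1 if p >= threshold else 0 for p in probs]

def recall_specificity(y_true, y_pred):
    """Recall (sensitivity) and specificity from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs  = [0.95, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.05]

for t in (0.1, 0.5, 0.9):
    rec, spec = recall_specificity(y_true, classify(probs, t))
    print(t, rec, spec)   # 0.1 -> (1.0, 0.25); 0.5 -> (0.75, 0.75); 0.9 -> (0.25, 1.0)
```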
ROC (Receiver Operating Characteristic) Curve and
AUC (Area Under the Curve)

https://mlwhiz.com/images/roc-auc-curves-explained/5_hu84485d9ac406291b53c5109d7ec0e2a3_312173_1500x0_resize_box_2.png
https://www.researchgate.net/profile/Jeffrey_Cochran2/publication/327954265/figure/fig5/AS:893652507312129@1590074769668/ROC-Area-Under-the-Curve-AUC-ROC-curves-plot-the-true-positive-rate-vs-the-false.ppm
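The ROC curve plots the true positive rate against the false positive rate as the threshold sweeps from high to low, and the AUC is the area under it. A plain-Python sketch (trapezoidal rule; the data are illustrative and assume no tied probabilities):

```python
def roc_points(y_true, probs):
    """(FPR, TPR) points as the threshold sweeps from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs  = [0.95, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.05]
print(auc(roc_points(y_true, probs)))  # 0.8125
```

An AUC of 1.0 means the model ranks every positive above every negative; 0.5 means the ranking is no better than chance.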
ROC (Receiver Operating Characteristic) Curve and
AUC (Area Under the Curve)
KNN (K-Nearest Neighbours)

https://matlab1.com/wp-content/uploads/2017/11/knn-concept.jpg

• Distance metric: Euclidean (Cartesian)/Manhattan/Cosine, etc.
• Feature space: 2D, 3D, or N-dimensional
• Prediction: mean of the neighbours (regression) or majority voting (classification)
• K = ? (a hyperparameter to tune)

KNN (K-Nearest Neighbours): which, and how many, independent variables (features) should we consider?
Any feature engineering? (standardization/normalization/unit transforms, etc.)
Example: house price prediction from features such as the number of bedrooms.

https://www.jeremyjordan.me/content/images/2017/06/Screen-Shot-2017-06-17-at-9.30.39-AM-1.png
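A minimal KNN classifier sketch in plain Python (Euclidean distance, majority vote; the points and labels are invented for illustration):

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    dists = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["cheap", "cheap", "cheap", "expensive", "expensive", "expensive"]
print(knn_predict(points, labels, (2, 2), k=3))    # cheap
print(knn_predict(points, labels, (8.5, 8), k=3))  # expensive
```

Note that because KNN relies on raw distances, features on different scales should be standardized or normalized first, which is exactly the feature-engineering question raised above.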
SVM (Support Vector Machine)

https://www.geeksforgeeks.org/introduction-to-support-vector-machines-svm/
https://medium.com/@vk.viswa/support-vector-regression-unleashing-the-power-of-non-linear-predictive-modeling-d4495836884
SVM (Support Vector Machine) – Kernel trick

https://gregorygundersen.com/blog/2019/12/10/kernel-trick/
SVM (Support Vector Machine) – Kernel trick

https://towardsdatascience.com/svm-and-kernel-svm-fed02bef1200
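A small numerical illustration of the kernel trick (a generic example, not taken from the linked posts): the polynomial kernel k(x, z) = (x·z)² equals an ordinary dot product after the explicit mapping φ(x) = (x₁², √2·x₁x₂, x₂²), so the model never needs to compute φ at all.

```python
import math

def poly_kernel(x, z):
    """k(x, z) = (x . z)**2, computed directly in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map into 3-D space for the same kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
print(poly_kernel(x, z))    # (1*3 + 2*4)**2 = 121.0
print(dot(phi(x), phi(z)))  # same value, via the explicit 3-D map
```

This is why a kernel SVM can find a separating boundary in a higher-dimensional space at the cost of evaluating k in the original space.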
Random Forest
Bagging
Boosting
Regularization: Lasso vs Ridge vs Elastic Net
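As a small illustration of regularization (a generic 1-D example, not from the slides): ridge regression with a single feature and no intercept has the closed form w = Σxy / (Σx² + λ), so increasing the penalty λ shrinks the coefficient toward zero.

```python
def ridge_1d(xs, ys, lam):
    """Closed-form ridge coefficient for y ~ w*x (single feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x

for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_1d(xs, ys, lam))   # the coefficient shrinks as lam grows
```

Lasso uses an absolute-value penalty instead of a squared one (which can drive coefficients exactly to zero), and Elastic Net mixes the two penalties.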
