Model Validation and Perf Metrics - v2 - Noman - 08 - 06 - 24

Model validation for supervised learning in data science


Model Validation and Performance Metrics

1. Data Preparation: splitting the dataset into training, validation, and test sets.
2. Model Fitting: training multiple candidate models.
3. Model Evaluation:
   - For regression problems: Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) on the validation set for each model.
   - For classification problems: confusion matrix, accuracy, precision, recall (sensitivity), specificity, F1-score, etc. on the validation set.
4. Cross-Validation: performing cross-validation on the training set.
5. Final Model Selection and Evaluation: applying the best model to the test set.
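The splitting step above can be sketched in plain Python (the 60/20/20 ratios and the `split_dataset` helper are illustrative, not from the slides):

```python
import random

def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle and split data into train/validation/test lists."""
    rng = random.Random(seed)
    shuffled = data[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # remaining ~20%
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```

In practice a library routine (e.g. a stratified splitter) would be used, but the idea is the same: every example lands in exactly one of the three sets.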
Model Validation and Performance Metrics

• Mean Squared Error (MSE)

MSE = (1/n) × Σ (yᵢ − ŷᵢ)²

• MSE is a measure of the average of the squared errors, that is, the average squared difference between the estimated values and the actual values. It is a common measure of the estimation accuracy of a predictive model in regression tasks.

• Root Mean Squared Error (RMSE)

RMSE = √MSE = √[ (1/n) × Σ (yᵢ − ŷᵢ)² ]

• RMSE is the square root of the MSE. It is a widely used measure of the differences between values predicted by a model or an estimator and the values observed, expressed in the same units as the target. The RMSE represents the sample standard deviation of the differences between predicted and observed values.
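Both formulas can be computed directly; a small sketch in plain Python (the values are illustrative):

```python
import math

def mse(y_true, y_pred):
    """Mean of squared differences between actual and predicted values."""
    n = len(y_true)
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / n

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
print(mse(actual, predicted))   # (0.25 + 0 + 4 + 1) / 4 = 1.3125
print(rmse(actual, predicted))
```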
Model Validation and Performance Metrics
• Accuracy
• One of the most commonly used metrics for evaluating classification models. It measures the proportion of total correct predictions (both true positives and true negatives) out of all predictions made.
• Accuracy = Number of Correct Predictions / Total Number of Predictions
Or, using the terms of the confusion matrix:
• Accuracy = (TP + TN) / (TP + FP + FN + TN)
• Specificity
• Specificity measures the proportion of actual negatives that are correctly identified as such (e.g., in a medical context, the percentage of healthy people who are correctly identified as not having the condition). It is a key metric when the cost of false positives is high.
Specificity = True Negatives (TN) / (True Negatives (TN) + False Positives (FP))
• Recall (Sensitivity)
• Recall, also known as sensitivity, is the ratio of true positive predictions to the total actual positives. It answers the question: "Of all the actual positive instances, how many did we correctly classify as positive?"
Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
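These metrics can all be derived from raw labels; a sketch in plain Python (the labels and the `confusion_counts` helper are made up for illustration):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classification problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy    = (tp + tn) / (tp + tn + fp + fn)
specificity = tn / (tn + fp)
recall      = tp / (tp + fn)
print(tp, tn, fp, fn)                  # 3 3 1 1
print(accuracy, specificity, recall)   # 0.75 0.75 0.75
```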
Model Validation and Performance Metrics
• Precision
• Precision is the ratio of true positive predictions to the total positive predictions (including both true positives and false positives). It answers the question: "Of all instances classified as positive, how many are actually positive?"
Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))

• F1-Score
• The F1-score is the harmonic mean of precision and recall. It provides a single score that balances both precision and recall, which is particularly useful when you need to trade the two off, such as with imbalanced datasets.
F1-score = 2 × (Precision × Recall) / (Precision + Recall)
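Precision and the F1-score follow directly from the confusion-matrix counts; a sketch with illustrative numbers (not from the slides):

```python
def precision_f1(tp, fp, fn):
    """Precision and F1-score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, f1

precision, f1 = precision_f1(tp=8, fp=2, fn=4)
print(precision)  # 8 / 10 = 0.8
print(f1)         # harmonic mean of precision 0.8 and recall 8/12
```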
Model Validation and Performance Metrics
Logistic Regression Training and Thresholds

Logistic regression curve; points above the chosen threshold are predicted as obese

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.5

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.1 (for example)

Think about an infectious disease: here it is very important to correctly predict all the "yes" (infected) cases, so a low threshold that favours recall may be preferred.

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
Performance of logistic regression on test data with different thresholds

Predictions on test data with threshold = 0.9 (for example)

This is certainly better than 0.5 here, but which threshold is the best?

https://www.youtube.com/watch?v=4jRBRDbJemM&t=936s
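The threshold trade-off can be sketched in plain Python (the probabilities below are made up, not the video's data): lowering the threshold raises recall at the cost of specificity, and vice versa.

```python
def classify(probs, threshold):
    """Turn predicted probabilities into 0/1 labels at a given threshold."""
    return [1 if p >= threshold else 0 for p in probs]

def recall_specificity(y_true, y_pred):
    """Recall (sensitivity) and specificity from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs  = [0.95, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.05]

for t in (0.1, 0.5, 0.9):
    rec, spec = recall_specificity(y_true, classify(probs, t))
    print(t, rec, spec)   # 0.1 -> (1.0, 0.25); 0.5 -> (0.75, 0.75); 0.9 -> (0.25, 1.0)
```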
ROC (Receiver Operating Characteristic) Curve and
AUC (Area Under the Curve)

https://mlwhiz.com/images/roc-auc-curves-explained/5_hu84485d9ac406291b53c5109d7ec0e2a3_312173_1500x0_resize_box_2.png
https://www.researchgate.net/profile/Jeffrey_Cochran2/publication/327954265/figure/fig5/AS:893652507312129@1590074769668/ROC-Area-Under-the-Curve-AUC-ROC-curves-plot-the-true-positive-rate-vs-the-false.ppm
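The ROC curve plots the true positive rate against the false positive rate as the threshold sweeps from high to low, and the AUC is the area under it. A plain-Python sketch (trapezoidal rule; the data are illustrative and assume no tied probabilities):

```python
def roc_points(y_true, probs):
    """(FPR, TPR) points as the threshold sweeps from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs  = [0.95, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.05]
print(auc(roc_points(y_true, probs)))  # 0.8125
```

An AUC of 1.0 means the model ranks every positive above every negative; 0.5 means the ranking is no better than chance.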
ROC (Receiver Operating Characteristic) Curve and
AUC (Area Under the Curve)
KNN (K-Nearest Neighbours)

https://matlab1.com/wp-content/uploads/2017/11/knn-concept.jpg

• Distance metric: Euclidean (Cartesian)/Manhattan/Cosine, etc.
• Feature space: 2D, 3D, or N-dimensional
• Prediction: mean of the neighbours (regression) or majority voting (classification)
• K = ? (a hyperparameter to tune)

KNN (K-Nearest Neighbours): which, and how many, independent variables (features) should we consider?
Any feature engineering? (standardization/normalization/unit transforms, etc.)
Example: house price prediction from features such as the number of bedrooms.

https://www.jeremyjordan.me/content/images/2017/06/Screen-Shot-2017-06-17-at-9.30.39-AM-1.png
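A minimal KNN classifier sketch in plain Python (Euclidean distance, majority vote; the points and labels are invented for illustration):

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    dists = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["cheap", "cheap", "cheap", "expensive", "expensive", "expensive"]
print(knn_predict(points, labels, (2, 2), k=3))    # cheap
print(knn_predict(points, labels, (8.5, 8), k=3))  # expensive
```

Note that because KNN relies on raw distances, features on different scales should be standardized or normalized first, which is exactly the feature-engineering question raised above.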
SVM (Support Vector Machine)

https://www.geeksforgeeks.org/introduction-to-support-vector-machines-svm/
https://medium.com/@vk.viswa/support-vector-regression-unleashing-the-power-of-non-linear-predictive-modeling-d4495836884
SVM (Support Vector Machine) – Kernel trick

https://gregorygundersen.com/blog/2019/12/10/kernel-trick/
SVM (Support Vector Machine) – Kernel trick

https://towardsdatascience.com/svm-and-kernel-svm-fed02bef1200
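A small numerical illustration of the kernel trick (a generic example, not taken from the linked posts): the polynomial kernel k(x, z) = (x·z)² equals an ordinary dot product after the explicit mapping φ(x) = (x₁², √2·x₁x₂, x₂²), so the model never needs to compute φ at all.

```python
import math

def poly_kernel(x, z):
    """k(x, z) = (x . z)**2, computed directly in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map into 3-D space for the same kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
print(poly_kernel(x, z))    # (1*3 + 2*4)**2 = 121.0
print(dot(phi(x), phi(z)))  # same value, via the explicit 3-D map
```

This is why a kernel SVM can find a separating boundary in a higher-dimensional space at the cost of evaluating k in the original space.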
Random Forest
Bagging
Boosting
Regularization: Lasso vs Ridge vs Elastic Net
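As a small illustration of regularization (a generic 1-D example, not from the slides): ridge regression with a single feature and no intercept has the closed form w = Σxy / (Σx² + λ), so increasing the penalty λ shrinks the coefficient toward zero.

```python
def ridge_1d(xs, ys, lam):
    """Closed-form ridge coefficient for y ~ w*x (single feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x

for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_1d(xs, ys, lam))   # the coefficient shrinks as lam grows
```

Lasso uses an absolute-value penalty instead of a squared one (which can drive coefficients exactly to zero), and Elastic Net mixes the two penalties.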
