
MACHINE LEARNING TERMINOLOGY
TRAINING & TESTING SET
Training Dataset
1. The sample of data used to fit the model.
2. The model sees and learns from this data.
Test Dataset
1. The sample of data used to provide an unbiased evaluation of a final
model fit on the training dataset.
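
As a minimal sketch of this split, assuming scikit-learn is available and using its built-in Iris data purely for illustration:

```python
# Minimal sketch of a train/test split (scikit-learn assumed; Iris used only as example data).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples as the test set; the model never sees them during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)                            # learns only from the training set
print("Test accuracy:", model.score(X_test, y_test))   # unbiased evaluation on unseen data
```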
CROSS VALIDATION (K-FOLD)

Cross-validation is a technique in which we train our model on a subset of the dataset and then evaluate it on the complementary subset of the dataset.

The three steps involved in cross-validation are as follows (a minimal sketch appears after this list):
 Reserve a portion of the dataset.
 Train the model on the remaining data.
 Test the model on the reserved portion of the dataset.
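
The sketch below shows k-fold cross-validation with scikit-learn (an assumed library choice); each fold is held out once while the model trains on the rest:

```python
# Minimal sketch of k-fold cross-validation (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)   # k = 5 folds
scores = cross_val_score(model, X, y, cv=kfold)            # train on k-1 folds, test on the held-out fold

print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```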
CONFUSION MATRIX
A confusion matrix is a tool for determining the performance of a classifier. It contains information about actual and predicted classifications.

1. True Positive (TP): an actual positive example predicted as positive
2. False Negative (FN): an actual positive example predicted as negative
3. False Positive (FP): an actual negative example predicted as positive
4. True Negative (TN): an actual negative example predicted as negative
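
Below is a minimal sketch of building such a matrix with scikit-learn (an assumed library choice); the actual and predicted labels are hypothetical:

```python
# Minimal sketch of a confusion matrix from hypothetical labels (1 = positive, 0 = negative).
from sklearn.metrics import confusion_matrix

y_actual    = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Rows are actual classes, columns are predicted classes.
# With labels=[1, 0] the layout is [[TP, FN], [FP, TN]].
cm = confusion_matrix(y_actual, y_predicted, labels=[1, 0])
tp, fn, fp, tn = cm.ravel()
print(cm)
print("TP:", tp, "FN:", fn, "FP:", fp, "TN:", tn)
```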
RECALL (SENSITIVITY)
Recall is the measure of actual positive examples that the classifier labels as positive. It should be high.

Recall (Sensitivity) = TP / (TP + FN) = 45 / (45 + 20) = 69.23%
SPECIFICITY (TRUE NEGATIVE RATE)
Specificity is the measure of actual negative examples that the classifier labels as negative. It should be high.

Specificity = TN / (TN + FP) = 30 / (30 + 5) = 85.71%
PRECISION
Precision is the ratio of the number of correctly classified positive examples to the total number of examples predicted as positive.

Precision = TP / (TP + FP) = 45 / (45 + 5) = 90%
ACCURACY
Accuracy is the proportion of the total number of predictions that are correct.

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (45 + 30) / (45 + 20 + 5 + 30) = 75%
F1 SCORE
The F1 score is the harmonic mean of recall (sensitivity) and precision, balancing the two in a single number.

F1 = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.90 × 0.6923) / (0.90 + 0.6923) ≈ 78.26%
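
As a worked check of the metrics above, using the counts implied by the slides (TP = 45, FN = 20, FP = 5, TN = 30); plain Python, no libraries needed:

```python
# Reproduces the metric values quoted on the preceding slides.
TP, FN, FP, TN = 45, 20, 5, 30

recall      = TP / (TP + FN)            # sensitivity, true positive rate
specificity = TN / (TN + FP)            # true negative rate
precision   = TP / (TP + FP)
accuracy    = (TP + TN) / (TP + TN + FP + FN)
f1          = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"Recall:      {recall:.2%}")       # 69.23%
print(f"Specificity: {specificity:.2%}")  # 85.71%
print(f"Precision:   {precision:.2%}")    # 90.00%
print(f"Accuracy:    {accuracy:.2%}")     # 75.00%
print(f"F1 score:    {f1:.2%}")           # about 78.26%
```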
SUMMARY OF CONFUSION MATRIX
WHAT IS THE AUC-ROC CURVE?
The Receiver Operating Characteristic (ROC) curve is an evaluation metric for binary classification problems. It plots the true positive rate (TPR) against the false positive rate (FPR) at various classification threshold values.
The Area Under the Curve (AUC) measures the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve.
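
A minimal sketch of computing a ROC curve and its AUC with scikit-learn (the synthetic dataset and classifier choice are assumptions for illustration):

```python
# Minimal sketch of ROC / AUC on a synthetic binary classification problem.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]        # predicted probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)  # TPR vs FPR at each threshold
print("AUC:", roc_auc_score(y_test, scores))      # 1.0 = perfect, 0.5 = random guessing
```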
VARIANCE
•When a model performs well on the training data but not as well on new data, the model likely has high variance.
•Variance basically tells how scattered the predicted values are from the actual values.
BIAS-VARIANCE TRADE-OFF
Bias: error from overly simple model assumptions; it shows up as high error on the training data.

Variance: error from sensitivity to the particular training set; it shows up as the gap between training and test error.

The trade-off: increasing model complexity lowers bias but raises variance, so the goal is to balance the two.
OVERFITTING
A statistical model is said to be overfitted when it learns the training data too closely, capturing noise rather than the underlying pattern.
Training data accuracy is high and test data accuracy is low.
UNDERFITTING
Underfitting occurs when the model is too simple to capture the underlying pattern, for example when, in order to avoid overfitting, we stop the training at too early a stage.
Training data accuracy is low and test data accuracy is low (see the sketch below).
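
The following sketch illustrates how overfitting and underfitting show up in training versus test accuracy; the synthetic data and the use of decision-tree depth as a complexity knob are assumptions for illustration:

```python
# Minimal sketch: compare training and test accuracy at different model complexities.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (1, 5, None):   # very simple, moderate, unrestricted
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")

# Low train and test accuracy suggests underfitting (high bias);
# high train but much lower test accuracy suggests overfitting (high variance).
```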
Thank you…
