
Model Evaluation and Selection
Unit 4
Model Evaluation
 Model evaluation is the process of using different evaluation metrics
to understand a machine learning model's performance, as well as its
strengths and weaknesses.
 It is a crucial step in the development and deployment of machine
learning systems.
 The primary goal of model evaluation is to determine how well the
model generalizes to unseen data and whether it meets the desired
objectives.
Model Evaluation Techniques
 Data mining and machine learning
 Independent and dependent variables
 Splitting of data
 Underfitting and overfitting
 Model evaluation and selection
Data mining:
 Data mining is the process of discovering patterns, trends, and insights from large sets of data.
 It involves extracting useful information and knowledge from raw
data by using various techniques, including statistical analysis,
machine learning, and artificial intelligence.
 Data mining helps businesses and researchers make informed
decisions, predict future trends, identify relationships between
variables, and gain a deeper understanding of their data.
 It is widely used across industries such as finance, marketing,
healthcare, and telecommunications to uncover valuable
insights that can lead to improved strategies, increased
efficiency, and better decision-making.
Independent and dependent variables
Independent Variables: Variables that are not affected by the other variables are called independent variables.
Dependent Variables: Variables whose values depend on other variables or factors are called dependent variables.
Machine Learning:
Training Data, Validation Data, Testing Data

 Training Data:
 Training data are collections of examples or samples that are used to 'teach' or 'train' the machine learning model.
 The model uses a training data set to understand the patterns
and relationships within the data, thereby learning to make
predictions or decisions without being explicitly programmed to
perform a specific task.
 It is the set of data that is used to train and make the model
learn the hidden features/patterns present in the data.
 Validation Data:
 The validation data is a set of data that is used to validate the
model performance during training.
 This data is held out from the data used to fit the model's parameters and is consulted repeatedly during development, for example to tune hyperparameters and detect overfitting.
 After training a machine learning model using the training data,
the model's performance is evaluated using the validation data.
 This evaluation typically involves measuring metrics such as
accuracy, precision, recall, F1 score, or other relevant
performance indicators, depending on the nature of the
problem being solved.
 Testing Data:
 The testing data is used to evaluate the accuracy of the trained
algorithm.
 It is data that is held aside during the modelling process and used only to evaluate the model after modelling is complete.
 Test data has the same variables as the training data: the same set of independent variables and the same dependent variable.
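A minimal sketch of producing these three splits with scikit-learn follows; the 60/20/20 ratio and the synthetic dataset are illustrative assumptions, not a fixed rule.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Illustrative synthetic dataset (an assumption for this sketch).
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# First hold out the test set (20% of the data).
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Then split the remainder into training (60%) and validation (20%) sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200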
Overfitting:
 Definition:
 Overfitting occurs when a model learns the training data too well,
capturing noise or random fluctuations in the data as if they were
genuine patterns. Consequently, the model performs well on the
training data but fails to generalize to new, unseen data.
 Characteristics:
 Low bias: The model has low bias as it fits the training data very closely. Bias is the inability of an ML model to capture the true relationship between the variables.
 High variance: However, it has high variance because it fails to generalize well to unseen data. In ML, the difference in fits between data sets is called variance.
 It will have excellent performance on training data but poor
performance on test data.
 Causes:
 Using an overly complex model or algorithm.
 Having too many features relative to the amount of training data.
 Insufficient regularization. Regularization refers to techniques that constrain a machine learning model, typically by adding a penalty term to the loss function, in order to prevent overfitting or underfitting. Using regularization, we can fit a machine learning model appropriately to the training data while still generalizing, and hence reduce its errors. Loss functions are a measurement of how good a model is at predicting the expected outcome.
 Regularization in machine learning is thus a technique used to prevent overfitting and improve the generalization ability of a model.
 Remedies:
 Simplify the model by reducing the number of features or
decreasing its complexity.
 Cross-validation to tune hyper parameters and prevent
overfitting.
 Early stopping during training to prevent the model from
learning noise in the data.
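As a sketch of the regularization remedy: the dataset, polynomial degree, and penalty strength below are assumptions chosen for illustration, with a ridge penalty shrinking the coefficients of an otherwise overly flexible model.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)  # noisy samples

# Degree-15 polynomial: complex enough to memorize the noise (overfits).
overfit = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X, y)

# Same features, but the ridge penalty (alpha) constrains the coefficients.
regularized = make_pipeline(PolynomialFeatures(15), Ridge(alpha=1e-3)).fit(X, y)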
Underfitting
Definition: Underfitting occurs when a model is too simple to capture the underlying structure of the data. In other words, the model fails to learn the patterns in the training data, resulting in poor performance not only on the training data but also on unseen data (test data).
Characteristics:
 High bias: The model is biased toward a certain set of assumptions and fails to
capture the complexity of the data.
 Poor performance: Both on training and test data, the model's performance is
poor.
Causes:
 Using an overly simple model or algorithm.
 Insufficient training data.
 Insufficient training time.
Remedies:
 Increase model complexity by adding more features or
increasing the model's capacity.
 Use more advanced algorithms that can capture complex
patterns.
 Gather more training data.
 Train the model for longer periods.
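As a sketch of the first remedy, increasing model capacity: the quadratic ground truth and model choices below are assumptions for illustration.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (200, 1))
y = X.ravel() ** 2 + rng.normal(0, 0.5, 200)  # quadratic ground truth

linear = LinearRegression().fit(X, y)  # too simple: underfits
quadratic = make_pipeline(PolynomialFeatures(2), LinearRegression()).fit(X, y)

print(linear.score(X, y), quadratic.score(X, y))  # R^2: low vs. high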
How to overcome overfitting and underfitting in a model?
 Introduce a validation set
 Variance-bias tradeoff
 Cross-validation
 Hyperparameter tuning
 Regularization
Introduce a validation set:
 A validation set is a set of data held out during training and used to fine-tune and select the best model for a given problem.
 The training set is used to train the model. The validation set is used to
fine-tune the model's hyper parameters. The test set serves as a
benchmark to assess the model's performance on new data.

Variance-bias tradeoff:
 If the algorithm is too simple, it may be in a high-bias, low-variance condition and thus be error-prone. If the algorithm is too complex, it may be in a high-variance, low-bias condition.
 The ideal model lies between these two extremes.
 If you make the model more complex (to reduce bias), you risk increasing variance.
 If you simplify the model (to reduce variance), you risk increasing bias.
 The challenge is to find a balance where the model is complex enough to
capture important patterns but simple enough to generalize well.
Cross-validation
 Cross-validation is a technique used to evaluate the
performance of a machine learning model by splitting the data
into multiple parts. Instead of using just one training and one
test set, the data is divided into "folds," and the model is
trained and tested on different combinations of these folds.
 How It Works:
1. Split the Data: The dataset is divided into k equal parts (folds).
2. Train and Test:
The model is trained on k-1 folds.
It is tested on the remaining 1 fold.
3. Repeat: This process is repeated k times, with each fold used as the
test set once.
4. Average Results: The final model performance is the average of the
results from all folds.
Example:
K-Fold Cross-Validation (k=5)
• Split data into 5 parts.
• Train on 4 parts and test on the 5th part.
• Repeat this 5 times, using a different part as the test set each time.
• Average the results to get a reliable performance estimate.
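A minimal sketch of 5-fold cross-validation in scikit-learn; the iris dataset and the logistic regression model are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)  # one accuracy score per fold
print(scores.mean())                         # average the fold results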

Prevents Overfitting:
• By training the model on different parts of the data and validating on unseen data, it ensures that the model doesn’t memorize the training data.
• This helps in identifying models that perform well on both training and test data.
Advantages :
1. Improved Model Performance: Provides a reliable estimate of model
performance by testing on multiple data splits.
2. Reduces Overfitting: Prevents memorization by validating on different
subsets of data.
3. Effective for Hyperparameter Tuning: Helps identify optimal model
parameters through repeated evaluation.
4. Ensures Stability and Reliability: Produces consistent and unbiased
performance estimates.

Disadvantages :
1. Computationally Expensive: Involves multiple training and testing cycles,
increasing resource usage.
2. Time-Consuming: Can be slow for large datasets or complex models.
3. Not Always Necessary: May be excessive for small datasets or already well-performing models.
4. Risk of Data Leakage: Improper splitting can introduce information from
test data into training.
Hyperparameter tuning:
Hyperparameters are external configurations that are not learned from the data but are set before training.

Examples:
• Learning Rate: Controls how much the model adjusts during training.
• Number of Trees: In decision trees or random forests.
• Batch Size: Number of samples processed before updating the model.

• Hyperparameter tuning is the process of finding the best settings for a machine learning model to improve its accuracy and performance.
• Poorly tuned hyperparameters can lead to underfitting or overfitting.
• Tuning helps identify the best combination to achieve higher accuracy and better generalization.
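A minimal sketch of one common tuning approach, grid search with cross-validation; the model and parameter grid below are assumptions for illustration.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Try every combination in the grid, scoring each with 5-fold CV.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100, 200], "max_depth": [2, 4, None]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)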
Model Evaluation Metrics
 Model evaluation is the process of using different evaluation metrics
to understand a machine learning model's performance, as well as
its strengths and weaknesses.
 To evaluate the performance of a classification model, different
metrics are used, and some of them are as follows:
1. Accuracy
2. Confusion Matrix
3. Precision
4. Recall
5. F-Score
6. AUC(Area Under the Curve)-ROC
1. Confusion Matrix
 Classification is the process of categorizing a given set of
data into different categories.
 In machine learning, to measure the performance of a classification model, we use the confusion matrix.
 The confusion matrix is a tool used to evaluate the
performance of a model and is visually represented as a
table.
 It provides a deeper layer of insight to data practitioners on
the model's performance, errors, and weaknesses.
The Confusion Matrix Structure
Let’s learn about the basic structure of a
confusion matrix, using the example of
identifying an email as spam or not spam.
 True Positive (TP) - Your model correctly predicted the positive class. For example, identifying a spam email as spam.
 True Negative (TN) - Your model correctly
predicted the negative class. For example,
identifying a regular email as not spam.
 False Positive (FP) - Your model incorrectly
predicted the positive class. For example,
identifying a regular email as spam.
 False Negative (FN) - Your model incorrectly
predicted the negative class. For example,
identifying a spam email as a regular email.
 In general, the table is divided into four terminologies, which are as follows:
 True Positive (TP): In this case, the predicted outcome is positive, and it is positive in reality as well.
 True Negative (TN): In this case, the predicted outcome is negative, and it is negative in reality as well.
 False Positive (FP): In this case, the predicted outcome is positive, but it is negative in actuality.
 False Negative (FN): In this case, the predicted outcome is negative, but it is positive in actuality.
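A minimal sketch of computing these four counts for the spam example; the labels below are made up for illustration.

from sklearn.metrics import confusion_matrix

# 1 = spam, 0 = not spam (made-up true labels and predictions)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# scikit-learn orders the 2x2 matrix as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1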
2. Accuracy
 Accuracy is a metric that measures how often a machine learning model correctly predicts
the outcome.
 You can calculate accuracy by dividing the number of correct predictions by the total
number of predictions
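 In terms of the confusion matrix:

Accuracy = (TP + TN) / (TP + TN + FP + FN)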
3. Precision
 The precision metric is used to overcome the limitation of accuracy. Precision determines the proportion of positive predictions that were actually correct.
 Precision is defined as the ratio of correctly classified positive samples
(True Positive) to a total number of classified positive samples (either
correctly or incorrectly).
 Precision measures the accuracy of the positive predictions made by the
model.
 A high precision indicates that the model has a low false positive rate,
meaning it is good at avoiding misclassifying negative instances as
positive.
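 In terms of the confusion matrix:

Precision = TP / (TP + FP)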
4. Recall
 It is similar to the precision metric; however, it aims to calculate the proportion of actual positives that were identified correctly.
 It is the ratio of true positive predictions to all actual positive instances in
the dataset, including both true positives and false negatives.
 Recall measures the ability of the model to correctly identify all positive
samples.
 A high recall indicates that the model has a low false negative rate,
meaning it is good at capturing positive instances without missing many.
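 In terms of the confusion matrix:

Recall = TP / (TP + FN)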
5. F1 Score
 The F1 score is the harmonic mean of precision and recall.
 It balances both precision and recall, making it a useful metric for situations where there is an imbalance between the classes or when both false positives and false negatives are equally important.
 The F1 score ranges from 0 to 1, where 1 indicates perfect precision and recall, and 0 indicates the worst possible performance.
 It's used as a single measure to compare different models or to tune model parameters in classification tasks.
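 As the harmonic mean of the two metrics:

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)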
Difference between Precision and Recall in Machine Learning:
 Precision answers: of all the instances the model predicted as positive, how many are actually positive? It matters most when false positives are costly.
 Recall answers: of all the instances that are actually positive, how many did the model find? It matters most when false negatives are costly.
Stochastic Gradient Descent (SGD)
 Stochastic Gradient Descent (SGD) is an optimization algorithm used in machine learning, particularly when dealing with large datasets.
 It is a variant of the traditional gradient descent algorithm but offers several advantages in terms of efficiency and scalability, making it the go-to method for many deep-learning tasks.
 Gradient Descent is an iterative optimization algorithm
used to minimize a loss function, which represents how far
the model’s predictions are from the actual values. The main
goal is to adjust the parameters of a model (weights, biases,
etc.) so that the error is minimized.
How SGD works:

Step 1: Initialize the model parameters randomly or with some starting values.
Step 2: Randomly select one data point (or a mini-batch of data points).
Step 3: Calculate the gradient of the loss function with respect to the model
parameters using that single data point.
Step 4: Update the model parameters by moving in the opposite direction of the
gradient (to minimize the loss). The update rule is typically:

θ=θ−η⋅∇L(θ)
where:
θ are the model parameters,
η is the learning rate (step size),
∇L(θ) is the gradient of the loss function with respect to the parameters.

Step 5: Repeat this process for a specified number of iterations (epochs), going
through the dataset multiple times.
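A minimal sketch of these five steps for simple linear regression; the synthetic data, learning rate, and epoch count are assumptions chosen for illustration.

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (100, 1))
y = 3 * X.ravel() + 2 + rng.normal(0, 0.1, 100)  # true parameters: w=3, b=2

w, b = 0.0, 0.0  # Step 1: initialize the parameters
eta = 0.1        # learning rate (step size)

for epoch in range(50):                # Step 5: repeat for several epochs
    for i in rng.permutation(len(X)):  # Step 2: visit points in random order
        x_i, y_i = X[i, 0], y[i]
        error = (w * x_i + b) - y_i    # prediction error for this one point
        grad_w = 2 * error * x_i       # Step 3: gradient of the squared loss
        grad_b = 2 * error
        w -= eta * grad_w              # Step 4: move against the gradient
        b -= eta * grad_b

print(w, b)  # should approach 3 and 2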
Advantages of SGD:

1. Faster Convergence:
Quicker updates: SGD updates parameters using a single data point (or
mini-batch) at a time, leading to faster parameter adjustments compared to
traditional gradient descent, which requires computing gradients over the
entire dataset.
Frequent updates: With each data point processed, the model receives
immediate feedback, speeding up convergence in the early stages.
Resource efficiency: SGD doesn't require loading the entire dataset into
memory, making it faster and more computationally efficient.

2. Ability to Handle Large Datasets:
Memory efficiency: SGD processes data one point at a time, avoiding the
need to store the full dataset in memory, which is useful for large-scale
datasets.
Scalability: As datasets grow, SGD remains efficient by working
incrementally, avoiding the memory and performance issues faced by batch
gradient descent.
