
TE (CS)

Spring Semester 2024

Machine Learning (CS-324)

Lecture #33-34
Bias & Variance Tradeoff

Dr Syed Zaffar Qasim
Assistant Professor (CIS)

Bias & Variance


▪ Algorithm selection is an important step in forming an
accurate prediction model, but
o deploying an algorithm with high accuracy can be
a difficult balancing act.
▪ Each algorithm can produce vastly different models
o based on the hyperparameters provided,
o which can lead to dramatically different results.
▪ Hyperparameters are the algorithm’s settings,
o similar to the controls on the dashboard of an airplane,
o except hyperparameters are lines of code
(a small example is sketched below).
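▪ A minimal sketch of this idea (assuming scikit-learn is available; the dataset
and the max_depth values are invented for illustration): hyperparameters are
simply arguments set in code.

```python
# A minimal sketch (assuming scikit-learn): the same algorithm, configured with
# different hyperparameters, can yield very different models.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two settings of the same "dashboard control" (max_depth):
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=0)    # simple, rigid
deep_tree = DecisionTreeClassifier(max_depth=None, random_state=0)    # complex, flexible

for name, model in [("max_depth=2", shallow_tree), ("max_depth=None", deep_tree)]:
    model.fit(X_train, y_train)
    print(name,
          "train acc:", round(model.score(X_train, y_train), 3),
          "test acc:", round(model.score(X_test, y_test), 3))
```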



Bias & Variance
▪ A constant challenge in machine learning is
o navigating underfitting and overfitting,
o which describe how closely the model follows the actual
patterns of the dataset.
▪ To understand underfitting and overfitting,
o one must first understand a model’s bias and variance,
o the two fundamental causes of prediction error.

Bias & Variance


▪ Bias refers to the gap between the predicted value and
the actual value.

Fig 1

▪ In the case of high bias,
o predictions are likely to be skewed in a certain direction,
o away from the actual values.
▪ Variance describes how scattered the different
predicted values are with respect to each other.



Bias & Variance
▪ Assume there are many training sets, all unique, but
equally representative of the population.
▪ A model with a high bias will produce similar errors
for an input regardless of the training set it was
trained with;
o the model is biased towards its own assumptions about
the real relationship
o over the relationship demonstrated in the training
data.
▪ A model with high variance, conversely, will produce
different errors for an input depending on the
training set it was trained with (see the sketch below).
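▪ A minimal sketch of this thought experiment (assuming NumPy and scikit-learn;
the cubic data-generating process and the two models are invented for
illustration): a high-bias model makes similar, systematically off predictions
regardless of the training set, while a high-variance model’s predictions swing
with each set.

```python
# A minimal sketch (assuming NumPy/scikit-learn): compare a high-bias and a
# high-variance model trained on many unique, equally representative training sets.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def sample_training_set(n=50):
    X = rng.uniform(-3, 3, size=(n, 1))
    y = X[:, 0] ** 3 + rng.normal(scale=3.0, size=n)   # cubic relationship + noise
    return X, y

x_query = np.array([[2.0]])            # one fixed test input (true value is 8)
linear_preds, tree_preds = [], []
for _ in range(200):                   # many unique training sets
    X, y = sample_training_set()
    linear_preds.append(LinearRegression().fit(X, y).predict(x_query)[0])
    tree_preds.append(DecisionTreeRegressor().fit(X, y).predict(x_query)[0])

# The high-bias linear model gives similar (but systematically off) predictions;
# the high-variance tree gives predictions that vary strongly with the training set.
print("linear: mean", round(np.mean(linear_preds), 2), "std", round(np.std(linear_preds), 2))
print("tree:   mean", round(np.mean(tree_preds), 2), "std", round(np.std(tree_preds), 2))
```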

Bias & Variance


▪ A model with high bias is inflexible.
▪ A model with high variance may be so flexible that it
models the noise in the training set.
▪ A model with high variance over-fits the training data,
o while a model with high bias under-fits the training data.



Visualizing Bias and Variance with a dartboard
▪ Imagine that the center of the target, the bull’s-eye,
represents the correct value your model is trying to predict.

Fig 2

▪ The more the darts (predictions) deviate from the
bull’s-eye, the higher the bias.
▪ A model with high bias but low variance will throw darts
that are far from the bull's eye, but tightly clustered.
▪ A model with high bias and high variance will throw darts
all over the board; the darts are far from the bull's eye and
each other.

Bias and Variance of an Estimator

▪ Let X be a sample from a population specified up to a
parameter θ, and let d = d(X) be an estimator of θ.
▪ To evaluate the quality of this estimator, we can
measure how much it differs from θ, that is,
(d(X) − θ)².
▪ But since this is a random variable,
o we need to average it over possible X and
o consider r(d, θ), the mean square error of the estimator
d, defined as
r(d, θ) = E[(d(X) − θ)²]



Bias and Variance of an Estimator
▪ The mean square error can be rewritten as follows (d
is short for d(X)):
r(d, θ) = E[(d − θ)²]
        = E[(d − E[d] + E[d] − θ)²]
        = E[(d − E[d])²] + (E[d] − θ)²
o the cross term vanishes because E[d − E[d]] = 0.
▪ The first term is the variance of the estimator, Var(d);
the second is the square of its bias, bθ(d) = E[d] − θ.
▪ We then write the error as the sum of these two terms, the
variance and the square of the bias:
r(d, θ) = Var(d) + (bθ(d))²
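▪ As a quick check on this decomposition, a minimal simulation sketch (assuming
NumPy; the biased “shrinkage” estimator is chosen only for illustration):

```python
# A minimal sketch (assuming NumPy): verify r(d, θ) = Var(d) + bias² by simulation,
# using a deliberately biased shrinkage estimator of a normal mean θ.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n = 5.0, 2.0, 20
estimates = []
for _ in range(100_000):                     # many samples X of size n
    X = rng.normal(theta, sigma, size=n)
    estimates.append(X.sum() / (n + 10))     # d(X): shrinks the usual mean towards 0

d = np.array(estimates)
mse = np.mean((d - theta) ** 2)              # r(d, θ) = E[(d − θ)²]
var = np.var(d)                              # Var(d)
bias = np.mean(d) - theta                    # bθ(d) = E[d] − θ

print("MSE          =", round(mse, 4))
print("Var + bias^2 =", round(var + bias ** 2, 4))   # agrees up to simulation noise
```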

Underfitting and Overfitting


▪ Mismanaging the bias-variance trade-off can result in
the model becoming
o overly simple and inflexible (underfitting) or
o overly complex and flexible (overfitting).

▪ Underfitting results from low variance and high bias.
▪ Overfitting means high variance and low bias.




Underfitting
▪ Underfitting is when the model is too simple to capture
the real complexity of patterns in the data,
o e.g. when trying to fit a line to data sampled from a
third-order polynomial (sketched after this list).
▪ It can lead to inaccurate predictions for both the
training data and the test data.
▪ Common causes of underfitting:
o insufficient training data to cover all possible patterns
o the training and test data not being properly randomized.
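▪ A minimal sketch of the line-to-cubic example (assuming NumPy and scikit-learn;
the data-generating polynomial is invented for illustration): the straight line
is inaccurate on both the training data and the test data.

```python
# A minimal sketch (assuming NumPy/scikit-learn): a degree-1 fit to data sampled
# from a third-order polynomial underfits; a degree-3 fit matches the data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(scale=1.0, size=200)
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

for degree in (1, 3):                       # degree 1 underfits; degree 3 fits
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print("degree", degree,
          "train MSE:", round(mean_squared_error(y_train, model.predict(X_train)), 2),
          "test MSE:", round(mean_squared_error(y_test, model.predict(X_test)), 2))
```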

Overfitting
▪ A natural temptation is
o to add complexity to the model
o in order to improve accuracy,
o which can, in turn, lead to overfitting.

▪ Overfitting typically occurs when


o a model, besides learning the underlying function,
o also perfectly learns to classify noisy training
examples.



Overfitting
▪ A model that memorizes noise or coincidence in the
data fails to achieve good generalization ability.
▪ This happens, for example, when fitting a sixth-order
polynomial to noisy data sampled from a third-order
polynomial (sketched after this list).
▪ An overfitted model will yield accurate predictions
on the training data but prove less accurate at
formulating predictions on the test data.
▪ Overfitting can also occur if
o the training and test data aren’t randomized before
they are split and
o patterns in the data aren’t distributed across the two
segments of data.
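▪ A minimal sketch of the sixth-order example (assuming NumPy and scikit-learn;
the small noisy dataset is invented for illustration): training error is low
while test error is high.

```python
# A minimal sketch (assuming NumPy/scikit-learn): a sixth-order polynomial fitted to
# a small, noisy sample from a third-order polynomial memorizes the noise.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
X_train = rng.uniform(-3, 3, size=(12, 1))            # deliberately small training set
y_train = 0.5 * X_train[:, 0] ** 3 - X_train[:, 0] + rng.normal(scale=2.0, size=12)
X_test = rng.uniform(-3, 3, size=(200, 1))
y_test = 0.5 * X_test[:, 0] ** 3 - X_test[:, 0] + rng.normal(scale=2.0, size=200)

for degree in (3, 6):                                  # degree 6 chases the noise
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print("degree", degree,
          "train MSE:", round(mean_squared_error(y_train, model.predict(X_train)), 2),
          "test MSE:", round(mean_squared_error(y_test, model.predict(X_test)), 2))
```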


Overfitting in polynomial classifiers


▪ The eight training examples in Fig. fall into two groups.

▪ The two classes are linearly separable, but noise has caused
one negative example to be mislabeled as positive.
▪ The high-order polynomial on the right overfits the data,
o ignoring the possibility of noise,
o in an attempt to avoid any error on the training set.
▪ The ideal solution often lies somewhere between the extremes
of linear classifiers and high-order polynomials.
▪ The best choice can be determined experimentally
(a cross-validation sketch follows).
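▪ A minimal sketch of that experiment (assuming scikit-learn; the dataset and the
single mislabeled point are invented for illustration): cross-validation compares
polynomial degrees of a classifier so the compromise can be chosen experimentally.

```python
# A minimal sketch (assuming scikit-learn): choose the polynomial degree of a
# classifier experimentally with cross-validation, on linearly separable data
# that contains one mislabeled (noisy) example.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(80, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # a linearly separable concept
y[0] = 1 - y[0]                           # noise: one example is mislabeled

for degree in (1, 2, 6):                  # linear, mild, and high-order polynomials
    clf = SVC(kernel="poly", degree=degree, C=100.0)
    scores = cross_val_score(clf, X, y, cv=5)
    print("degree", degree, "mean CV accuracy:", round(scores.mean(), 3))
```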



Overcoming underfitting and overfitting
▪ To eradicate both underfitting and overfitting,
o modify the model’s hyperparameters to ensure that
the model fits patterns
o in both the training and test data and
o not just one half of the data.
▪ This may also mean re-randomizing the training and
test data or adding new data points so as to better
detect underlying patterns.
▪ However, in most instances, we probably need to
consider switching algorithms or
o modifying hyperparameters based on trial and error
o to minimize and manage the bias-variance trade-off
(a grid-search sketch follows).
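▪ One way to do that systematically, as a minimal sketch (assuming scikit-learn;
the dataset and parameter grid are invented for illustration): re-randomize the
split and grid-search the hyperparameters.

```python
# A minimal sketch (assuming scikit-learn): shuffle/re-split the data and tune
# hyperparameters by trial and error with a cross-validated grid search.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
# shuffle=True re-randomizes the data before splitting it into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, random_state=42)

param_grid = {"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 20]}
search = GridSearchCV(DecisionTreeRegressor(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

print("best hyperparameters:", search.best_params_)
print("test R^2:", round(search.score(X_test, y_test), 3))
```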

Underfitting and Overfitting

▪ Specifically, this might entail switching from linear
regression to non-linear regression to reduce bias by
increasing variance.
▪ Or it could mean increasing “k” in k-NN to reduce
variance (by averaging together more neighbors).
▪ A third example could be reducing variance by
switching from a single decision tree (which is prone
to overfitting) to a random forest with many decision
trees, as in the sketch below.
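▪ A minimal sketch of the third example (assuming scikit-learn; the dataset and
settings are invented for illustration): averaging many trees trades a little
bias for a large drop in variance.

```python
# A minimal sketch (assuming scikit-learn): reduce variance by switching from a
# single (overfitting-prone) decision tree to a random forest of many trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# The single tree typically fits the training set almost perfectly but generalizes
# worse; the forest averages many trees and usually holds up better on the test set.
for name, model in [("single tree", tree), ("random forest", forest)]:
    print(name,
          "train acc:", round(model.score(X_train, y_train), 3),
          "test acc:", round(model.score(X_test, y_test), 3))
```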




Regularization
▪ Regularization is another effective strategy to combat
overfitting and underfitting (Breiman 1998).
▪ In this approach, we write an augmented error function
E = error_on_data + λ × model_complexity
▪ The second term penalizes complex models with large
variance, where λ gives the weight of this penalty.
▪ When we minimize the augmented error function, E,
o instead of the error on data only,
o we penalize complex models and
o thus decrease variance,
o at the cost of amplifying the bias error
(a ridge-regression sketch follows).
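▪ Ridge regression is one concrete instance of this idea; a minimal sketch
(assuming scikit-learn), where model_complexity is the squared size of the
weights and λ is passed as the alpha argument:

```python
# A minimal sketch (assuming scikit-learn): ridge regression minimizes an augmented
# error of the form  error_on_data + λ × ||w||²,  with λ given by alpha.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(4)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(scale=2.0, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for lam in (0.001, 1.0, 100.0):            # larger λ penalizes complexity more heavily
    model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=lam))
    model.fit(X_train, y_train)
    print("λ =", lam,
          "train MSE:", round(mean_squared_error(y_train, model.predict(X_train)), 2),
          "test MSE:", round(mean_squared_error(y_test, model.predict(X_test)), 2))
```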


Regularization
▪ If λ is taken too large,
o only very simple models are allowed and
o we risk introducing bias.
▪ λ is optimized using cross-validation (see the sketch below).
▪ In effect, this add-on parameter acts as a guard
o to keep high variance in check
o while the original parameters are being optimized.
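▪ A minimal sketch of that cross-validation (assuming scikit-learn), continuing
the ridge example above: λ is swept over a grid and the best value is chosen by
5-fold cross-validation.

```python
# A minimal sketch (assuming scikit-learn): optimize λ (alpha) by cross-validation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + rng.normal(scale=2.0, size=60)

pipe = Pipeline([("poly", PolynomialFeatures(degree=9)), ("ridge", Ridge())])
lambdas = {"ridge__alpha": np.logspace(-3, 3, 13)}     # candidate values of λ
search = GridSearchCV(pipe, lambdas, cv=5)             # 5-fold cross-validation
search.fit(X, y)

print("best λ:", search.best_params_["ridge__alpha"])
print("cross-validated R^2 at best λ:", round(search.best_score_, 3))
```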


