unit-online-1.2

The document discusses the concept of Empirical Risk Minimization (ERM) in neural networks and deep learning, emphasizing its importance in selecting models that minimize empirical risk based on training data. It explains the distinction between empirical risk and true risk, as well as the impact of bias and variance errors on model performance. Additionally, it highlights regularization techniques to prevent overfitting and improve model generalization on unseen data.


NEURAL NETWORKS & DEEP LEARNING

(21MCA24DB3)

Prepared & Presented By:


Dr. Balkishan
Assistant Professor
Department of Computer Science & Applications
Maharshi Dayanand University
Rohtak
Empirical Risk Minimization
The empirical risk minimization principle states that the learning algorithm should choose the function/model/hypothesis that minimizes the empirical risk.
Understanding the concept of risk

• What is a loss function?
• Given a set of inputs and outputs, a loss function measures the difference between the predicted output and the true output.
• But this measurement applies only to the given set of inputs and outputs.
• We want to know what the loss is over all possible inputs.
• This is where "true risk" comes into the picture.
• True risk is the average loss over all the possibilities, i.e. over the whole data distribution.
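The distinction above can be sketched in code. As a minimal illustration (the toy model and dataset below are invented, not from the slides), the empirical risk is just the average of a per-example loss over the finite dataset we happen to have:

```python
import numpy as np

def squared_loss(y_pred, y_true):
    # Loss for a single example: squared difference between
    # the predicted output and the true output
    return (y_pred - y_true) ** 2

def empirical_risk(model, X, y):
    # Empirical risk: average loss over the finite dataset
    return float(np.mean([squared_loss(model(x), t) for x, t in zip(X, y)]))

# A toy linear model and a tiny made-up dataset
model = lambda x: 2.0 * x
X = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 7.0])

print(empirical_risk(model, X, y))  # average of losses (0, 0, 1) = 1/3
```

The true risk would instead be this same average taken over the entire distribution of possible inputs, which we normally cannot compute.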
What exactly is Empirical Risk Minimization?

• If we compute the average loss using the data points in our dataset, it is called the empirical risk.
• It is "empirical" (experimental) and not "true" because we are using a dataset that is only a subset of the whole population.
• When we build our learning model, we pick the function that minimizes the empirical risk, i.e. the difference between the predicted output and the actual output for the data points in our dataset.
• This process of finding the function that minimizes the empirical risk is called empirical risk minimization.
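The selection step described above can be sketched as a tiny search over a hypothesis class. In this illustrative example (the hypothesis class f(x) = w·x and the data are assumptions made for the sketch), ERM evaluates the empirical risk of every candidate w on a grid and keeps the minimizer:

```python
import numpy as np

# Made-up dataset that roughly follows y = 2x
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.9, 4.2, 5.8])

def emp_risk(w):
    # Mean squared error of the hypothesis f(x) = w * x on the dataset
    return float(np.mean((w * X - y) ** 2))

# Hypothesis class: f(x) = w*x for w on a grid; ERM picks the minimizer
candidates = np.linspace(0.0, 4.0, 401)
w_erm = min(candidates, key=emp_risk)
print(round(w_erm, 2))  # prints 1.98
```

In practice the minimization is done with gradient-based optimization rather than a grid, but the principle — choose the hypothesis with the smallest empirical risk — is the same.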
Importance of Empirical Risk Minimization

• ERM is essential to understanding the limits of machine learning algorithms and forms a good basis for practical problem-solving skills.
Empirical risk minimization (ERM)
• ERM is a principle in statistical learning theory that defines a family of learning algorithms and is used to give theoretical bounds on their performance.
• The idea is that we do not know exactly how well an algorithm will work in practice (the true "risk") because we do not know the true distribution of the data the algorithm will operate on; as an alternative, we can measure its performance on a known set of training data.
• We assume that our samples come from this distribution and use our dataset as an approximation of it.
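As a small illustration of this approximation (the distribution and hypothesis below are invented for the sketch, chosen so the true risk has a closed form), the empirical risk on a large enough sample lands close to the analytically computed true risk:

```python
import random

random.seed(0)

# Assumed toy setup: x uniform on [0, 1], noise-free label y = x,
# and a fixed constant hypothesis f(x) = 0.5.
# True risk under squared loss is the integral of (0.5 - x)^2
# over [0, 1], which equals 1/12 ≈ 0.0833.
true_risk = 1 / 12

# Empirical risk on a finite sample approximates the true risk
sample = [random.random() for _ in range(100_000)]
emp_risk = sum((0.5 - x) ** 2 for x in sample) / len(sample)

print(round(true_risk, 4), round(emp_risk, 4))
```

With more samples the empirical estimate concentrates around the true risk, which is why measuring performance on training data is a usable stand-in.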
Example of Empirical Risk Minimization
• Example: We want to build a model that can differentiate between males and females based on specific features.
• If we select 150 random people in which the women happen to be very short and the men very tall, the model might incorrectly conclude that height is the differentiating feature.
• To build a truly accurate model, we would have to gather all the women and men in the world to extract the differentiating features.
• Unfortunately, that is not possible! So we select a small number of people and hope that this sample is representative of the whole population.
• If we compute the loss using the data points in our dataset, it is called the empirical risk.
• It is "empirical" and not "true" because we are using a dataset that is a subset of the whole population.
• Once our learning model is built, we pick the function that minimizes the empirical risk, i.e. the delta between the predicted output and the actual output for the data points in the dataset.
• This process of finding such a function is called empirical risk minimization (ERM).
• Ultimately, what we want to minimize is the true risk.
Training and Testing of Model
Model (Function) Fitting
• How well a model performs on the training and evaluation datasets defines its characteristics:

                      Underfit     Overfit      Good Fit
  Training Dataset    Poor         Very Good    Good
  Evaluation Dataset  Very Poor    Poor         Good
Model Fitting – Visualization
(Figure: variations of model fitting — underfit, good fit, overfit)
Errors in Machine Learning

• In machine learning, an error is a measure of how accurately an algorithm can make predictions on a previously unseen dataset.
• On the basis of these errors, we select the machine learning model that performs best on the particular dataset.
Machine Learning Errors
• Reducible errors: errors that can be reduced to improve model accuracy. They can be further classified into bias and variance.
• Irreducible errors: errors that will always be present in the model regardless of which algorithm is used. They are caused by unknown variables whose influence cannot be reduced.
What is Bias?

• In general, a machine learning model analyses the data, finds patterns in it, and makes predictions.
• During training, the model learns these patterns in the dataset and applies them to the test data for prediction.
• While making predictions, a difference occurs between the values predicted by the model and the actual/expected values; this difference is known as bias error, or error due to bias.
What is a Variance Error?

• Variance specifies how much the predictions would change if a different training dataset were used.
• In simple words, variance tells us how much a random variable differs from its expected value.
• Ideally, a model should not vary too much from one training dataset to another, which means the algorithm should be good at capturing the hidden mapping between the input and output variables.
• Variance errors are classified as either low variance or high variance.
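The idea of variance error can be sketched directly (the toy data-generating process, polynomial degrees, and evaluation point below are assumptions made for illustration): refit a simple and a flexible model on many freshly drawn training sets and measure how much their predictions at one fixed point move around. The flexible model's predictions vary much more:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_poly(degree, n=20):
    # Draw a fresh training set from y = sin(x) + noise and fit a polynomial
    x = rng.uniform(0, 3, n)
    y = np.sin(x) + rng.normal(0, 0.2, n)
    return np.polyfit(x, y, degree)

def pred_at(coeffs, x0=1.5):
    # Prediction of the fitted polynomial at a fixed query point
    return float(np.polyval(coeffs, x0))

# Variance error: how much predictions move when the training set changes
variances = {}
for degree in (1, 9):
    preds = [pred_at(fit_poly(degree)) for _ in range(200)]
    variances[degree] = float(np.var(preds))
    print(degree, round(variances[degree], 4))
```

The degree-9 fit tracks the noise of each particular sample (high variance), while the degree-1 fit barely changes between samples (low variance, but higher bias).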

Over-fitted model where we see model performance on, a)


training data b) new data
Regularizing a Deep Network
(Technique to prevent overfitting)
• Regularization is a technique that makes slight modifications to the learning algorithm so that the model generalizes better.
• This in turn improves the model's performance on unseen data.
• A common way to achieve this is to reduce the complexity of the model.
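One standard regularization technique is L2 (ridge / weight-decay) regularization, which penalizes large weights. As a minimal sketch with invented toy data (nearly collinear features, where unregularized least squares is unstable), the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy shrinks the weights relative to the plain least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up regression data: y ≈ 3*x1 plus noise, with a second,
# nearly collinear feature that makes plain least squares unstable
n = 30
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + rng.normal(scale=0.01, size=n)])
y = 3 * x1 + rng.normal(scale=0.1, size=n)

def ridge(X, y, lam):
    # L2-regularized least squares, closed form:
    # w = (X^T X + lam * I)^(-1) X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_ols = ridge(X, y, 0.0)  # unregularized fit: large, unstable weights
w_reg = ridge(X, y, 1.0)  # regularized fit: smaller, stable weights

print(np.linalg.norm(w_ols), np.linalg.norm(w_reg))
```

The penalty λ‖w‖² trades a little training error (empirical risk) for a simpler function, which typically generalizes better to unseen data. In deep learning frameworks the same idea appears as the weight-decay option of the optimizer.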
