19 Assessing Model Accuracy

The document discusses assessing the accuracy of models. It explains that mean squared error (MSE) is used to measure how well models fit training and test data, with lower MSE indicating better fit. Overfitting occurs when a model has very low training MSE but high test MSE, indicating it has learned patterns from the training data that do not generalize. The bias-variance tradeoff is described as the relationship between model flexibility, bias, and variance, with the goal being to minimize both bias and variance for best accuracy on new data.

Assessing Model Accuracy

Prof. Asim Tewari


IIT Bombay

Asim Tewari, IIT Bombay ME 781: Engineering Data Mining and Applications
Measuring the Quality of Fit
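Mean Squared Error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{f}(x_i) \right)^2$$

where $\hat{f}(x_i)$ is the prediction for the $i$-th observation.

We are interested in the accuracy of the predictions that we obtain when we apply our
method to previously unseen data (test MSE), not in the accuracy on the data used to
train it (training MSE).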
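The distinction between training MSE and test MSE can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the data-generating line `y = 2x + 1`, the noise level, the sample sizes, and the helper name `mse` are all assumptions chosen for the example.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical data: y = 2x + 1 plus Gaussian noise (assumed for illustration).
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=50)
y_train = 2.0 * x_train + 1.0 + rng.normal(0.0, 0.1, size=50)
x_test = rng.uniform(0.0, 1.0, size=50)
y_test = 2.0 * x_test + 1.0 + rng.normal(0.0, 0.1, size=50)

# Fit a straight line using the training data only.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

# Training MSE uses the data the model has seen; test MSE uses unseen data.
train_mse = mse(y_train, slope * x_train + intercept)
test_mse = mse(y_test, slope * x_test + intercept)
print(f"training MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The test MSE is the quantity of interest; the training MSE is reported only for comparison.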

Overfitting
[Figure: two panels; legend: test MSE, training MSE]

Left: Data simulated from f, shown in black. Three estimates of f are shown:
the linear regression line (orange curve), and two smoothing spline fits (blue
and green curves). Right: Training MSE (grey curve), test MSE (red curve), and
minimum possible test MSE over all methods (dashed line). Squares represent
the training and test MSEs for the three fits shown in the left-hand panel.

When a given method yields a small training MSE but a large test MSE, we are said to
be overfitting the data (the U-shape in the test MSE curve).
Overfitting
• When a given method yields a small training MSE but a
large test MSE, we are said to be overfitting the data.
• The statistical learning procedure may be picking up some
patterns that are just caused by random chance rather
than by true properties of the unknown function f.
• Regardless of whether or not overfitting has occurred,
we almost always expect the training MSE to be
smaller than the test MSE because most statistical
learning methods either directly or indirectly seek to
minimize the training MSE.
• Overfitting refers specifically to the case in which a less
flexible model would have yielded a smaller test MSE.
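The points above can be explored with a small simulation that fits polynomials of increasing degree and records both errors. This is a sketch under stated assumptions: polynomial degree stands in for flexibility (the slides use smoothing splines), and the true function `sin(2πx)`, the noise level, and the sample sizes are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import Polynomial

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def true_f(x):
    # Hypothetical "unknown" function f, chosen only for this illustration.
    return np.sin(2.0 * np.pi * x)

rng = np.random.default_rng(1)
n_train, n_test, sigma = 30, 1000, 0.3
x_train = rng.uniform(0.0, 1.0, n_train)
y_train = true_f(x_train) + rng.normal(0.0, sigma, n_train)
x_test = rng.uniform(0.0, 1.0, n_test)
y_test = true_f(x_test) + rng.normal(0.0, sigma, n_test)

degrees = list(range(1, 13))  # polynomial degree as a proxy for flexibility
train_mses, test_mses = [], []
for d in degrees:
    # Polynomial.fit rescales x internally, which keeps higher degrees well conditioned.
    model = Polynomial.fit(x_train, y_train, deg=d)
    train_mses.append(mse(y_train, model(x_train)))
    test_mses.append(mse(y_test, model(x_test)))

for d, tr, te in zip(degrees, train_mses, test_mses):
    print(f"degree {d:2d}: training MSE = {tr:.3f}, test MSE = {te:.3f}")

best_degree = degrees[int(np.argmin(test_mses))]
print("degree with the smallest test MSE:", best_degree)
```

Because each polynomial model contains all lower-degree models, the training MSE cannot increase with degree; the test MSE carries no such guarantee. Any degree above `best_degree` is overfitting in the sense of the last bullet, since a less flexible model would have given a smaller test MSE.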

Overfitting examples
[Figure; legend: test MSE, training MSE]

A different true f that is much closer to linear. In this setting, linear
regression provides a very good fit to the data.

Overfitting examples
[Figure; legend: test MSE, training MSE]

A different f that is far from linear. In this setting, linear
regression provides a very poor fit to the data.

The Bias-Variance Trade-Off
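The U-shape observed in the test MSE curves is a
result of two competing properties of statistical
learning methods.

It can be shown that the expected test MSE for
a given value $x_0$ is given by:

$$E\left[ \left( y_0 - \hat{f}(x_0) \right)^2 \right] = \mathrm{Var}\left( \hat{f}(x_0) \right) + \left[ \mathrm{Bias}\left( \hat{f}(x_0) \right) \right]^2 + \mathrm{Var}(\varepsilon)$$

Here $\mathrm{Var}(\varepsilon)$ is the variance of the irreducible error term.

In order to minimize the expected test error, we
need to select a statistical learning method that
simultaneously achieves low variance and low bias.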
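The decomposition of the expected test MSE into variance, squared bias, and irreducible error can be checked numerically. The sketch below repeatedly draws new training sets, refits a model, and records its prediction at one point. Everything specific here is an assumption for illustration: the true function `sin(2πx)`, a cubic polynomial as the learning method, the point `x0 = 0.25`, and the noise level.

```python
import numpy as np
from numpy.polynomial import Polynomial

def true_f(x):
    # Hypothetical true function f (assumed for illustration).
    return np.sin(2.0 * np.pi * x)

rng = np.random.default_rng(2)
n_train, sigma, x0, degree, n_repeats = 30, 0.3, 0.25, 3, 2000

# Prediction at x0 from a model refitted on each new training set.
preds = np.empty(n_repeats)
for r in range(n_repeats):
    x = rng.uniform(0.0, 1.0, n_train)
    y = true_f(x) + rng.normal(0.0, sigma, n_train)
    preds[r] = Polynomial.fit(x, y, deg=degree)(x0)

# Fresh test responses at x0, independent of every training set.
y0 = true_f(x0) + rng.normal(0.0, sigma, n_repeats)

variance = float(np.var(preds))                         # spread of the estimate across training sets
bias_sq = float((np.mean(preds) - true_f(x0)) ** 2)     # squared systematic error
noise_var = sigma ** 2                                  # variance of the irreducible error
reducible = float(np.mean((preds - true_f(x0)) ** 2))   # equals variance + bias_sq
expected_test_mse = float(np.mean((y0 - preds) ** 2))   # Monte Carlo estimate of the left-hand side

print(f"variance + bias^2 + noise = {variance + bias_sq + noise_var:.4f}")
print(f"simulated expected test MSE = {expected_test_mse:.4f}")
```

The two printed numbers should agree up to Monte Carlo error. The identity `reducible == variance + bias_sq` holds exactly for the sample, while the noise term matches only approximately because `y0` is simulated.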

Training and Testing

The Bias-Variance Trade-Off
• Variance refers to the amount by which
$\hat{f}(x_0)$ would change if we estimated it using a
different training data set.
• Ideally the estimate for f should not vary too
much between training sets.
• In general, more flexible statistical methods
have higher variance.

The Bias-Variance Trade-Off
• Bias refers to the error that is introduced by
approximating a real-life problem, which may
be extremely complicated, by a much simpler
model.
• A biased method will, on average, either overpredict or
underpredict the results.
• Generally, more flexible methods result in less
bias.
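The opposite behavior of bias and variance as flexibility grows can be illustrated by comparing a rigid and a flexible model on the same simulated problem. This is a sketch under assumptions not taken from the lecture: the true function `sin(2πx)`, polynomial degrees 1 and 9 as the rigid and flexible methods, a fixed grid of training inputs (so that only the noise differs between training sets), and the helper name `bias_variance_at`.

```python
import numpy as np
from numpy.polynomial import Polynomial

def true_f(x):
    # Hypothetical true function f (assumed for illustration).
    return np.sin(2.0 * np.pi * x)

def bias_variance_at(x0, degree, n_repeats=1000, n_train=30, sigma=0.3, seed=3):
    """Estimate squared bias and variance of a polynomial fit at the point x0."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_train)  # fixed inputs; only the noise is redrawn
    preds = np.empty(n_repeats)
    for r in range(n_repeats):
        y = true_f(x) + rng.normal(0.0, sigma, n_train)
        preds[r] = Polynomial.fit(x, y, deg=degree)(x0)
    bias_sq = float((preds.mean() - true_f(x0)) ** 2)
    variance = float(preds.var())
    return bias_sq, variance

# Degree 1 plays the rigid method, degree 9 the flexible one.
results = {d: bias_variance_at(0.25, d) for d in (1, 9)}
for d, (b2, v) in results.items():
    print(f"degree {d}: bias^2 = {b2:.4f}, variance = {v:.4f}")
```

In this setup the straight line cannot follow the sine curve, so its bias dominates, while the degree-9 fit tracks the curve closely but moves more from one training set to the next.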

The Bias-Variance Trade-Off
[Figure; legend: test MSE, training MSE]

The vertical dotted line indicates the flexibility level corresponding to the smallest test MSE.
