Machine Learning Course - CS-433

Underfitting and Overfitting

Sept 24, 2024

Martin Jaggi
Last updated on: September 24, 2024
credits to Mohammad Emtiyaz Khan & Rüdiger Urbanke
Motivation
Models can be too limited or they can be too rich. In the first case we cannot find a function in our model family that is a good fit for the data; we then say that we underfit. In the second case we have such a rich model family that we do not just fit the underlying function but in fact fit the noise in the data as well; we then talk about an overfit. Both of these phenomena are undesirable. This discussion is made more difficult by the fact that all we have is data, so we do not know a priori what part is the underlying signal and what part is noise.

Underfitting with Linear Models


It is easy to see that linear models might underfit. Consider
a scalar case as shown in the figure below.

[Figure: data and fit for M = 0; horizontal axis x, vertical axis t.]

The solid curve is the underlying function and the circles are the actual data. E.g., we assume that there is a scalar function g(x), but we do not observe g(x_n) directly, only a noisy version of it, y_n = g(x_n) + Z_n, where Z_n is the noise. The noise might be due, for example, to some measurement inaccuracies. The y_n are shown as blue circles. If our model family consists of only linear functions of the scalar input x, i.e., H = {f_w(x) = wx}, where w is a scalar constant (the slope of the function), then it is clear that we cannot match the given function accurately, regardless of how many samples we get and how small the noise is. We therefore will underfit.
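
For concreteness, here is a minimal Python/NumPy sketch of this situation; the true function g(x) = sin(2πx), the noise level 0.1, and the sample sizes are assumptions made only for this illustration. It fits the best slope w by least squares and shows that the training error stays large no matter how many samples we draw.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # assumed underlying function, chosen to match the figures
    return np.sin(2 * np.pi * x)

for n in (10, 1000, 100000):
    x = rng.uniform(0, 1, size=n)
    y = g(x) + 0.1 * rng.standard_normal(n)   # noisy observations y_n = g(x_n) + Z_n
    w = (x @ y) / (x @ x)                      # least-squares slope for f_w(x) = w * x
    mse = np.mean((w * x - y) ** 2)
    print(f"N = {n:6d}: w = {w:+.3f}, training MSE = {mse:.3f}")
```

The mean squared error does not shrink as N grows: the model family {wx} simply cannot represent the underlying function.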

Extended/Augmented Feature Vectors

From the above example it might seem that linear models are too simple to ever overfit. But in fact, linear models are highly prone to overfitting, much more so than complicated models like neural nets.
Since linear models are inherently not very rich, the following is a standard “trick” to make them more powerful. In order to increase the representational power of linear models we typically “augment” the input. E.g., if the input (feature) is one-dimensional we might add a polynomial basis (of arbitrary degree M),

ϕ(x_n) := [1, x_n, x_n^2, x_n^3, . . . , x_n^M]

so that we end up with an extended feature vector.


We then fit a linear model to this extended feature vector ϕ(x_n):

y_n ≈ w_0 + w_1 x_n + w_2 x_n^2 + . . . + w_M x_n^M =: ϕ(x_n)^T w.
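
As an illustration, a minimal NumPy sketch of this augmentation followed by an ordinary least-squares fit might look as follows; the degree M, the sample size, the true function, and the noise level are assumptions chosen only for the example.

```python
import numpy as np

def poly_features(x, M):
    # phi(x_n) = [1, x_n, x_n^2, ..., x_n^M] for every sample x_n
    return np.vander(x, M + 1, increasing=True)

rng = np.random.default_rng(1)
N, M = 10, 3                                   # assumed sample size and degree
x = rng.uniform(0, 1, size=N)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(N)

Phi = poly_features(x, M)                      # shape (N, M + 1)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # least-squares weights w_0, ..., w_M
y_hat = Phi @ w                                # predictions phi(x_n)^T w
print("weights:", np.round(w, 3))
```

Note that the model is still linear in the weights w; only the features are nonlinear in x, which is why ordinary least squares still applies.
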
Overfitting with Linear Models
In the following four figures, circles are data points, the green
line represents the “true function”, and the red line is the
model. The parameter M is the maximum degree in the
polynomial basis.

[Figure: four panels showing the data points (circles), the true function (green), and the fitted model (red) for M = 0, M = 1, M = 3, and M = 9; horizontal axis x, vertical axis t.]

For M = 0 (the model is a constant) the model is underfitting, and the same is true for M = 1. For M = 3 the model fits the data fairly well and is not yet so rich as to also fit the small “wiggles” caused by the noise. But for M = 9 we have such a rich model that it can fit every single data point, and we see severe overfitting taking place.

What can we do to avoid overfitting? If you increase the amount of data (increase N, but keep M fixed), overfitting might be reduced. This is shown in the following two figures, where we again consider the same model complexity M = 9 but have extra data (N = 15 or even N = 100).

[Figure: two panels showing the M = 9 fit with N = 15 and N = 100 data points; horizontal axis x, vertical axis t.]
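
The following small sketch illustrates this effect numerically; the true function sin(2πx), the noise level, and the sample sizes are again assumptions made for the example. It fits the same degree-9 model to data sets of different sizes and measures the error of the fit against the noiseless function on a dense grid.

```python
import numpy as np

rng = np.random.default_rng(2)
g = lambda x: np.sin(2 * np.pi * x)            # assumed underlying function
x_grid = np.linspace(0, 1, 500)                # dense grid to measure the true error
M = 9                                          # model complexity is kept fixed

for N in (10, 15, 100):
    x = rng.uniform(0, 1, size=N)
    y = g(x) + 0.1 * rng.standard_normal(N)
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    fit = np.vander(x_grid, M + 1, increasing=True) @ w
    err = np.mean((fit - g(x_grid)) ** 2)      # error w.r.t. the noiseless function
    print(f"N = {N:4d}: error on grid = {err:.3f}")
```

With N close to M + 1 the fit passes through every point and the error away from the data can be huge, while with N = 100 the same degree-9 model stays close to the underlying function.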

A Word About Notation


If it is important to distinguish the original input x from
the augmented input then we will use ϕ(x) to denote this
augmented input vector. But we can consider this augmen-
tation as part of the pre-processing, and then we might sim-
ply write x to denote the input. This will save us a lot of
notation.

Additional Materials
Read about overfitting in the paper by Pedro Domingos (Sections 3 and 5
of “A few useful things to know about machine learning”).
