
4/3/18

Regression:
Predicting House Prices
STAT/CSE 416: Intro to Machine Learning
Emily Fox
University of Washington
April 3, 2018
©2018 Emily Fox STAT/CSE 416: Intro to Machine Learning

[ML pipeline: Training Data → Feature extraction → x → ML model → ŷ;
 Quality metric compares ŷ to y; ML algorithm outputs f̂]

Inputs vs. features

"Hi ML expert, here is a data table to analyze"


[ML pipeline: Training Data → Feature extraction → h(x) → ML model → ŷ;
 Quality metric compares ŷ to y; ML algorithm outputs ŵ]

Generic linear regression model


Model:
yi = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) + εi
   = Σ_{j=0}^{D} wj hj(xi) + εi

feature 1   = h0(x) … e.g., 1 (constant)
feature 2   = h1(x) … e.g., x[1] = sq. ft.
feature 3   = h2(x) … e.g., x[2] = #bath
              or, log(x[7]) x x[2] = log(#bed) x #bath
…
feature D+1 = hD(x) … some other function of x[1],…, x[d]
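As a minimal sketch (not part of the slides), evaluating such a model in Python might look like this; the feature functions and coefficients below are made up for illustration:

```python
# Hypothetical feature functions h_j(x); x is a dict of raw inputs.
features = [
    lambda x: 1.0,          # h0: constant feature
    lambda x: x["sqft"],    # h1: square feet
    lambda x: x["bath"],    # h2: number of bathrooms
]

def predict(w, x):
    """Compute sum_j w_j * h_j(x) for one input x."""
    return sum(wj * hj(x) for wj, hj in zip(w, features))

w = [10000.0, 150.0, 5000.0]   # made-up coefficients
print(predict(w, {"sqft": 2000, "bath": 2}))   # 10000 + 150*2000 + 5000*2 = 320000.0
```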

Simple linear regression model


Fit a line through the data:
yi = w0 + w1 xi + εi
f(x) = w0 + w1 x, with parameters w0, w1 of the model
[plot: price ($) vs. square feet (sq.ft.)]

Simple linear regression


Model:
yi = w0 + w1 xi + εi

Input: x (sq. ft.)   Output: y (price)

feature 1 = 1 (constant)   parameter 1 = w0
feature 2 = x              parameter 2 = w1
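A sketch (not from the slides) of fitting w0 and w1 by the closed-form least-squares solution, on made-up data:

```python
# Closed-form least-squares fit of y = w0 + w1*x (made-up data, prices in $1000s).
xs = [1000.0, 1500.0, 2000.0, 2500.0]
ys = [200.0, 290.0, 410.0, 500.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
# Slope: covariance of x and y over variance of x.
w1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
     sum((x - x_bar) ** 2 for x in xs)
w0 = y_bar - w1 * x_bar   # intercept passes through the means
print(w0, w1)             # -7.0, 0.204
```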


Even higher order polynomial


f(x) = w0 + w1 x + w2 x² + … + wp x^p
[plot: price ($) vs. square feet (sq.ft.)]



Polynomial regression
Model:
yi = w0 + w1 xi + w2 xi² + … + wp xi^p + εi

Input: x (sq. ft.)   Output: y (price)

feature 1   = 1 (constant)   parameter 1   = w0
feature 2   = x              parameter 2   = w1
feature 3   = x²             parameter 3   = w2
…                            …
feature p+1 = x^p            parameter p+1 = wp
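Polynomial regression is still linear regression on the powers of x. A sketch using NumPy's `polyfit` on made-up data with a known quadratic trend:

```python
import numpy as np

# Polynomial regression as least squares on powers of x (made-up data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x**2 + 1.0                     # exactly quadratic, so the fit recovers it

w = np.polyfit(x, y, deg=2)        # coefficients, highest power first
print(np.round(w, 6))              # recovers [1, 0, 1] for x^2 + 0*x + 1
```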

Capturing trends and seasonality


Model:
yi = w0 + w1 ti + w2 sin(2πti / 12) + w3 cos(2πti / 12) + εi
w1 ti: linear trend
Seasonal component: sin/cos with period 12 (resets annually)

Input: t (month)   Output: y

feature 1 = 1 (constant)
feature 2 = t
feature 3 = sin(2πt / 12)
feature 4 = cos(2πt / 12)
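This model is again ordinary linear regression, just on a design matrix built from the trend and seasonal features. A sketch on a made-up monthly series with no noise, so the fit recovers the true coefficients:

```python
import numpy as np

# Trend + seasonality via least squares on [1, t, sin(2*pi*t/12), cos(2*pi*t/12)].
t = np.arange(48, dtype=float)                        # 4 years of monthly data
y = 100 + 2.0 * t + 10 * np.sin(2 * np.pi * t / 12)   # made-up series

H = np.column_stack([np.ones_like(t), t,
                     np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 12)])
w, *_ = np.linalg.lstsq(H, y, rcond=None)
print(np.round(w, 6))    # recovers [100, 2, 10, 0]
```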

Adding more inputs


f(x) = w0 + w1 sq.ft. + w2 #bath
[3D plot: price ($) vs. square feet x[1] and # bathrooms x[2]]

Example of linear regression with multiple inputs


Model:
yi = w0 + w1 xi[1] + w2 xi[2] + εi

Input: x = (x[1], x[2])   Output: y (price)

feature 1 = 1 (constant)   parameter 1 = w0
feature 2 = x[1]           parameter 2 = w1
feature 3 = x[2]           parameter 3 = w2


Generic linear regression model


Model:
yi = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) + εi
   = Σ_{j=0}^{D} wj hj(xi) + εi

feature 1   = h0(x) … e.g., 1 (constant)
feature 2   = h1(x) … e.g., x[1] = sq. ft.
feature 3   = h2(x) … e.g., x[2] = #bath
              or, log(x[7]) x x[2] = log(#bed) x #bath
…
feature D+1 = hD(x) … some other function of x[1],…, x[d]

[ML pipeline, as before: Training Data → Feature extraction → h(x) → ML model → ŷ;
 Quality metric compares ŷ to y; ML algorithm outputs ŵ]

RSS for multiple regression

RSS(w) = Σi (yi − fw(xi))²
[3D plot: price ($) vs. square feet x[1] and # bathrooms x[2]]



[ML pipeline, as before: Training Data → Feature extraction → h(x) → ML model → ŷ;
 Quality metric compares ŷ to y; ML algorithm outputs ŵ]

Gradient descent
Algorithm:

while not converged:
    w(t+1) ← w(t) − η ∇RSS(w(t))
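A sketch (not from the slides) of this update for linear regression, with a made-up design matrix and a fixed step size η; the gradient of RSS(w) = ‖y − Hw‖² is −2Hᵀ(y − Hw):

```python
import numpy as np

# Gradient descent on RSS for y = Hw (made-up data, fixed step size eta).
H = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])   # [1, x] features
y = np.array([3.0, 5.0, 7.0])                        # exactly y = 1 + 2x

w = np.zeros(2)
eta = 0.01
for _ in range(20000):
    grad = -2 * H.T @ (y - H @ w)    # gradient of RSS(w)
    w = w - eta * grad               # descent step
print(np.round(w, 4))                # approaches [1, 2]
```

With a quadratic objective like RSS, a small enough η guarantees convergence to the least-squares solution.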


[ML pipeline, as before: Training Data → Feature extraction → h(x) → ML model → ŷ;
 Quality metric compares ŷ to y; ML algorithm outputs ŵ]

Compact notation
f(xi) = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) = Σ_{j=0}^{D} wj hj(xi)


Interpreting the fitted function


Interpreting the coefficients – simple linear regression

ŷ = ŵ0 + ŵ1 x
ŵ1 = predicted change in $ per 1 sq. ft.
[plot: price ($) vs. square feet (sq.ft.)]



Interpreting the coefficients – two linear features

ŷ = ŵ0 + ŵ1 x[1] + ŵ2 x[2]   (fix x[2] to interpret ŵ1)
[3D plot: price ($) vs. square feet x[1] and # bathrooms x[2]]

Interpreting the coefficients – two linear features

ŷ = ŵ0 + ŵ1 x[1] + ŵ2 x[2]   (fix x[1])
ŵ2 = predicted change in $ per 1 bathroom, for fixed # sq.ft.
[plot: price ($) vs. # bathrooms x[2]]

Interpreting the coefficients – multiple linear features

ŷ = ŵ0 + ŵ1 x[1] + … + ŵj x[j] + … + ŵd x[d]
(fix all features other than x[j] to interpret ŵj)
[3D plot: price ($) vs. square feet x[1] and # bathrooms x[2]]

Interpreting the coefficients – polynomial regression

ŷ = ŵ0 + ŵ1 x + … + ŵj x^j + … + ŵp x^p
Can't hold the other features fixed: they are all functions of the same x!
[plot: price ($) vs. square feet (sq.ft.)]

BONUS: Influence of high leverage points


SWITCH TO IPYNB


BONUS: Asymmetric errors


Symmetric cost functions


Residual sum of squares (RSS):
RSS(w0, w1) = Σi (yi − [w0 + w1 xi])²

Assumes the cost of over-estimating the sales price is the same as under-estimating it.
[plot: price ($) vs. square feet (sq.ft.)]

Asymmetric cost functions


What if listing the house too high carries a bigger cost?
Too high → no offers ($ = 0)
Too low → offers for lower $
An asymmetric cost leads to a different solution.
[plot: price ($) vs. square feet (sq.ft.)]
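One way such an asymmetric cost could be sketched (the slides don't give a specific formula; the weighting below is a made-up illustration):

```python
# Sketch of an asymmetric loss: overpredicting (listing too high) costs more.
def asymmetric_loss(y, y_hat, over_weight=2.0):
    err = y - y_hat
    if err >= 0:                       # underprediction: listed too low
        return err ** 2
    return over_weight * err ** 2      # overprediction penalized more heavily

print(asymmetric_loss(300.0, 280.0))   # under by 20 -> 400.0
print(asymmetric_loss(300.0, 320.0))   # over by 20  -> 800.0
```

Minimizing this loss instead of RSS would shift the fitted line downward, since high predictions are penalized more.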

Summary for regression


What you can do now…


• Describe the input (features) and output (real-valued predictions) of a
regression model
• Calculate a goodness-of-fit metric (e.g., RSS)
• Understand how gradient descent is used to estimate model parameters
by minimizing RSS
• Exploit the estimated model to form predictions
• Describe a regression model using multiple features
• Interpret coefficients in a regression model with multiple features
• Describe other applications where regression is useful


Assessing Performance
STAT/CSE 416: Intro to Machine Learning
Emily Fox
University of Washington
April 3, 2018

Make predictions, get $, right??


Model + algorithm → fitted function f̂
Predictions → decisions → outcome


Or, how much am I losing?


Example: lost $ due to inaccurate listing price
- Too low → low offers
- Too high → few lookers + no/low offers

How much am I losing compared to perfection?

Perfect predictions: Loss = 0


My predictions: Loss = ???


Measuring loss
Loss function: cost of using ŵ at x when y is the true value

L(y, fŵ(x))
  y = actual value
  fŵ(x) = predicted value ŷ

Examples: (assuming loss for underpredicting = overpredicting)


Absolute error: L(y, fŵ(x)) = |y − fŵ(x)|
Squared error:  L(y, fŵ(x)) = (y − fŵ(x))²
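The two example losses are one-liners; a sketch in Python:

```python
# The two example loss functions from the slide.
def absolute_error(y, y_hat):
    return abs(y - y_hat)

def squared_error(y, y_hat):
    return (y - y_hat) ** 2

print(absolute_error(10.0, 7.0))   # 3.0
print(squared_error(10.0, 7.0))    # 9.0
```

Squared error punishes large mistakes much more heavily than absolute error, which is why it is more sensitive to outliers.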

Fit data with a line or … ?


[plot: price ($) vs. square feet (sq.ft.)]
"Dude, it's not a linear relationship!"

What about a quadratic function?


[plot: price ($) vs. square feet (sq.ft.)]
"Dude, it's not a linear relationship!"

Even higher order polynomial


[plot: price ($) vs. square feet (sq.ft.)]
"I can minimize your RSS"

Do you believe this fit?


[plot: price ($) vs. square feet (sq.ft.)]
"My house isn't worth so little"



Do you believe this fit?


[plot: price ($) vs. square feet (sq.ft.)]
Minimizes RSS, but bad predictions



"Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful." – George Box, 1987


Assessing the loss


Assessing the loss


Part 1: Training error


Define training data


[plot: training data, price ($) vs. square feet (sq.ft.)]





Example:
Fit quadratic to minimize RSS
[plot: price ($) vs. square feet (sq.ft.)]
ŵ minimizes RSS of training data

Compute training error


1. Define a loss function L(y,fŵ(x))
- E.g., squared error, absolute error,…

2. Training error
   = avg. loss on houses in training set
   = (1/N) Σ_{i=1}^{N} L(yi, fŵ(xi))
   (ŵ fit using training data)
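Training error is just the average loss over the training set; a minimal sketch with made-up values:

```python
# Training error = average loss over the training set (a sketch).
def training_error(ys, y_hats, loss):
    return sum(loss(y, y_hat) for y, y_hat in zip(ys, y_hats)) / len(ys)

squared = lambda y, y_hat: (y - y_hat) ** 2
print(training_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0], squared))  # (1+0+4)/3
```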


Example:
Use squared error loss (y-fŵ(x))2
Training error(ŵ) = 1/N ×
  [($train 1 − fŵ(sq.ft.train 1))²
 + ($train 2 − fŵ(sq.ft.train 2))²
 + ($train 3 − fŵ(sq.ft.train 3))²
 + … include all training houses]
[plot: price ($) vs. square feet (sq.ft.)]

Example:
Use squared error loss (y-fŵ(x))2
Training error(ŵ) = (1/N) Σ_{i=1}^{N} (yi − fŵ(xi))²

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (yi − fŵ(xi))² )

[plot: price ($) vs. square feet (sq.ft.)]
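RMSE is the square root of the mean squared error, which puts the error back in the units of y (e.g., dollars). A sketch with made-up values:

```python
import math

# Root-mean-squared error (a sketch).
def rmse(ys, y_hats):
    mse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats)) / len(ys)
    return math.sqrt(mse)

print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))   # sqrt(5/3) ~ 1.291
```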



Training error vs. model complexity

[plot: training error decreases as model complexity increases]


Training error vs. model complexity

[plots: training error vs. model complexity; low-complexity fit (left) vs. high-complexity fit (right)]

Is training error a good measure of predictive performance?

How do we expect to perform on a new house?
[plot: price ($) vs. square feet (sq.ft.)]


Is training error a good measure of predictive performance?

Is there something particularly bad about having xt sq.ft.??
[plot: price ($) vs. square feet (sq.ft.), with xt marked]

Is training error a good measure of predictive performance?

Issue: training error is overly optimistic… ŵ was fit to the training data.

Small training error ≠> good predictions, unless the training data includes everything you might ever see.
[plot: price ($) vs. square feet (sq.ft.), with xt marked]


Assessing the loss


Part 2: Generalization (true) error


Generalization error

Really want an estimate of loss over all possible (house, $) pairs.

Lots of houses in the neighborhood, but not in the dataset.


Distribution over houses


In our neighborhood, houses of what # sq.ft. (x) are we likely to see?

[distribution over square feet (sq.ft.)]

Distribution over sales prices


For houses with a given # sq.ft. (x), what house prices $ are we likely to see?

[distribution over price ($), for fixed # sq.ft.]


Generalization error definition


Really want an estimate of loss over all possible (x, y) pairs.

Formally:
generalization error = Ex,y[L(y, fŵ(x))]
  = average over all possible (x, y) pairs, weighted by how likely each is
  (ŵ fit using training data)
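The expectation Ex,y[L(y, fŵ(x))] can't be computed in practice because the true distribution is unknown, but if we *pretend* to know it, it can be approximated by Monte Carlo sampling. A sketch with a made-up population and a made-up fitted predictor:

```python
import random

# Monte Carlo estimate of generalization error E_{x,y}[L(y, f(x))]
# under an assumed (made-up) data distribution.
random.seed(0)

def f_hat(x):            # some fitted predictor (made up)
    return 2.0 * x

def sample_pair():       # assumed population: y = 2x + Gaussian noise
    x = random.uniform(0.0, 10.0)
    y = 2.0 * x + random.gauss(0.0, 1.0)
    return x, y

n = 100_000
est = sum((y - f_hat(x)) ** 2 for x, y in (sample_pair() for _ in range(n))) / n
print(round(est, 2))     # close to the noise variance, 1.0
```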


Generalization error vs. model complexity

[plot: generalization error first decreases, then increases, as model complexity grows]



Generalization error vs. model complexity

Can't compute generalization error!
[plots: error vs. model complexity; low-complexity fit (left) vs. high-complexity fit (right)]
