Regression: Predicting House Prices

The document discusses using linear regression models to predict house prices based on features like square footage. It covers fitting a linear regression line, assessing model fit, overfitting, and strategies for improving the model like adding additional features or using higher-order polynomials.

Regression:

Predicting House Prices


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington

2015 Emily Fox & Carlos Guestrin

Machine Learning Specialization

Predicting house prices


How much is my house worth?

I want to list my house for sale.

$$ ????

Look at recent sales in my neighborhood


How much did they sell for?


Plot recent house sales (past 2 years)

[Figure: plot of recent house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Terminology:
x: feature, covariate, or predictor
y: observation or response

Predict your house by similar houses

[Figure: price ($) vs. square feet (sq.ft.)]

No house sold recently had exactly the same sq.ft.

Predict your house by similar houses

[Figure: price ($) vs. square feet (sq.ft.)]

Look at average price in range
- Still only 2 houses!
- Throwing out info from all other sales
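The "average price in range" idea can be sketched in a few lines of Python. The sales list, the window width, and the function name `predict_by_range_average` are made up for illustration and are not from the lecture.

```python
def predict_by_range_average(sales, sqft, window):
    """Average the sale prices of houses within +/- window sq.ft. of sqft.

    sales: list of (square_feet, price) pairs.
    Returns None if no sale falls inside the range.
    """
    nearby = [price for size, price in sales if abs(size - sqft) <= window]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)


# Hypothetical recent sales: (square feet, price in $)
sales = [(1200, 250_000), (1850, 410_000), (1950, 430_000), (2600, 560_000)]

# Only the two sales between 1800 and 2000 sq.ft. contribute to the estimate
estimate = predict_by_range_average(sales, sqft=1900, window=100)
print(estimate)
```

With this toy data only two of the four sales are used, which is the weakness the slide points out: every sale outside the range is ignored.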

Linear regression


Use a linear regression model

Fit a line through the data:

$f_w(x) = w_0 + w_1 x$

[Figure: line fit through the data, price ($) vs. square feet (sq.ft.)]

$w_0$ and $w_1$ are the parameters of the model; $f_w$ is the function parameterized by $w = (w_0, w_1)$.

Which line?

[Figure: several candidate lines, price ($) vs. square feet (sq.ft.)]

$f_w(x) = w_0 + w_1 x$

Each line corresponds to different parameters $w$.

Cost of using a given line


Residual sum of squares (RSS)

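[Figure: price ($) vs. square feet (sq.ft.)]

$$
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={} & (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
& + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
& + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
& + \ldots \text{ [include all houses]}
\end{aligned}
$$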
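As a minimal sketch, the RSS of a given line can be computed directly from its definition. The three sales and the two candidate lines below are made-up numbers chosen only to make the arithmetic easy to follow.

```python
def rss(w0, w1, sqft, price):
    """Residual sum of squares of the line w0 + w1 * x over all houses."""
    return sum((p - (w0 + w1 * x)) ** 2 for x, p in zip(sqft, price))


# Hypothetical sales data, one entry per house
sqft = [1000, 2000, 3000]
price = [200_000, 300_000, 450_000]

# Two candidate lines; the one with the smaller RSS fits these sales better
rss_a = rss(50_000, 125, sqft, price)
rss_b = rss(100_000, 100, sqft, price)
print(rss_a, rss_b)
```

Here the first line misses two houses by $25,000 each, while the second misses one house by $50,000, so squaring the residuals penalizes the second line more.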

Find best line


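Minimize cost over all possible $w_0, w_1$

[Figure: price ($) vs. square feet (sq.ft.)]

$$
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={} & (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
& + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
& + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
& + \ldots \text{ [include all houses]}
\end{aligned}
$$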
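The slides defer the fitting algorithms to later material. As one possible sketch, a single-feature line has a standard closed-form least-squares solution, shown below on made-up data. The function names and the sales figures are assumptions for illustration, not the course's method.

```python
def fit_line(sqft, price):
    """Return (w0, w1) minimizing RSS for the line w0 + w1 * x.

    Uses the closed-form least-squares solution for one feature:
    w1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    w0 = mean_y - w1 * mean_x
    """
    n = len(sqft)
    x_mean = sum(sqft) / n
    y_mean = sum(price) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(sqft, price))
    sxx = sum((x - x_mean) ** 2 for x in sqft)
    w1 = sxy / sxx
    w0 = y_mean - w1 * x_mean
    return w0, w1


def rss(w0, w1, sqft, price):
    """Residual sum of squares of the line w0 + w1 * x."""
    return sum((p - (w0 + w1 * x)) ** 2 for x, p in zip(sqft, price))


# Hypothetical sales data
sqft = [1000, 2000, 3000]
price = [200_000, 300_000, 450_000]

w0_hat, w1_hat = fit_line(sqft, price)
best_rss = rss(w0_hat, w1_hat, sqft, price)

# Prediction for a hypothetical 2500 sq.ft. house using the fitted line
predicted_price = w0_hat + w1_hat * 2500
print(w0_hat, w1_hat, best_rss, predicted_price)
```

Nudging either fitted parameter away from its value should never lower the RSS on this data, which is a quick sanity check that the line is the minimizer.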

Predicting your house price


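$f_{\hat{w}}(x) = \hat{w}_0 + \hat{w}_1 x$

where $\hat{w} = (\hat{w}_0, \hat{w}_1)$ are the parameters that minimize RSS.

[Figure: best-fit line, price ($) vs. square feet (sq.ft.)]

Best guess of your house price:

$\hat{y} = \hat{w}_0 + \hat{w}_1\,\text{sq.ft.}_{\text{your house}}$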

Adding higher order effects

Fit data with a line or ?

[Figure: price ($) vs. square feet (sq.ft.)]

You show your friend your analysis.

"Dude, it's not a linear relationship!"

What about a quadratic function?

$f_w(x) = w_0 + w_1 x + w_2 x^2$
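The quadratic model is still linear in the parameters $w$, so it can be fit by minimizing RSS over the features $(1, x, x^2)$. The sketch below does this by solving the normal equations in plain Python. The helper names, the solver choice, and the data are all assumptions for illustration; the data is generated exactly from a known quadratic so the recovered parameters can be checked.

```python
def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_least_squares(features, y):
    """Minimize RSS by solving the normal equations (H^T H) w = H^T y."""
    d = len(features[0])
    HtH = [[sum(h[j] * h[k] for h in features) for k in range(d)] for j in range(d)]
    Hty = [sum(h[j] * yi for h, yi in zip(features, y)) for j in range(d)]
    return solve(HtH, Hty)


def quadratic_features(x):
    """Map a size x to the features (1, x, x^2)."""
    return [1.0, x, x * x]


# Made-up data: size in 1000s of sq.ft., price in $1000s,
# generated exactly from price = 50 + 100*x + 20*x^2.
size = [1.0, 1.5, 2.0, 2.5, 3.0]
price = [170.0, 245.0, 330.0, 425.0, 530.0]

w = fit_least_squares([quadratic_features(x) for x in size], price)
print([round(wi, 6) for wi in w])
```

Sizes are expressed in thousands of square feet here only to keep the powers of x small and the arithmetic well behaved.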

Even higher order polynomial

[Figure: high-order polynomial fit, price ($) vs. square feet (sq.ft.)]

"I can minimize your RSS"

Do you believe this fit?

[Figure: price ($) vs. square feet (sq.ft.)]

"My house isn't worth so little"

Evaluating overfitting via training/test split

Do you believe this fit?

[Figure: price ($) vs. square feet (sq.ft.)]

Minimizes RSS, but bad predictions

What about a quadratic function?

[Figure: price ($) vs. square feet (sq.ft.)]

$f_w(x) = w_0 + w_1 x + w_2 x^2$

How to choose model order/complexity

Want good predictions, but can't observe future

Simulate predictions:
1. Remove some houses
2. Fit model on remaining
3. Predict held-out houses

Training/test split

Terminology: training set, test set
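One simple way to form a training set and a test set is to shuffle the houses and hold some out. The sketch below uses only Python's standard library; the function name `train_test_split`, the 25% test fraction, the seed, and the sales data are assumptions made for illustration.

```python
import random


def train_test_split(houses, test_fraction=0.25, seed=0):
    """Randomly hold out a fraction of the houses as a test set.

    Returns (training_set, test_set). The shuffle is seeded so the
    split is repeatable from run to run.
    """
    shuffled = list(houses)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical sales: (square feet, price in $)
houses = [
    (1000, 200_000), (1300, 240_000), (1600, 300_000), (1900, 330_000),
    (2200, 390_000), (2500, 420_000), (2800, 480_000), (3100, 520_000),
]

train, test = train_test_split(houses)
print(len(train), "training houses,", len(test), "test houses")
```

The model would then be fit on `train` only, and `test` would be used solely to assess its predictions.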

Training error

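[Figure: training houses, price ($) vs. square feet (sq.ft.)]

$$
\begin{aligned}
\text{Training error}(w) ={} & (\$_{\text{train 1}} - f_w(\text{sq.ft.}_{\text{train 1}}))^2 \\
& + (\$_{\text{train 2}} - f_w(\text{sq.ft.}_{\text{train 2}}))^2 \\
& + (\$_{\text{train 3}} - f_w(\text{sq.ft.}_{\text{train 3}}))^2 \\
& + \ldots \text{ [include all training houses]}
\end{aligned}
$$

Minimize to find $\hat{w}$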

Test error

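[Figure: test houses, price ($) vs. square feet (sq.ft.)]

Assess predictions using test error:

$$
\begin{aligned}
\text{Test error}(\hat{w}) ={} & (\$_{\text{test 1}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 1}}))^2 \\
& + (\$_{\text{test 2}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 2}}))^2 \\
& + (\$_{\text{test 3}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 3}}))^2 \\
& + \ldots \text{ [include all test houses]}
\end{aligned}
$$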
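Training error and test error are the same sum of squared differences, evaluated on different houses. In the sketch below the parameter values are made up and simply assumed to have been fit already; they are not the least-squares solution for this toy training set, and the sales data is hypothetical as well.

```python
def sum_squared_errors(f, houses):
    """Sum of squared differences between actual prices and f's predictions."""
    return sum((price - f(sqft)) ** 2 for sqft, price in houses)


# Made-up parameter values, assumed to come from an earlier fit
w0_hat, w1_hat = 50_000, 125


def f_hat(sqft):
    """Predicted price for a house of the given size."""
    return w0_hat + w1_hat * sqft


# Hypothetical split of the sales: (square feet, price in $)
train = [(1000, 175_000), (2000, 310_000), (3000, 420_000)]
test = [(1500, 250_000), (2500, 350_000)]

training_error = sum_squared_errors(f_hat, train)  # over training houses only
test_error = sum_squared_errors(f_hat, test)       # over held-out houses only
print(training_error, test_error)
```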

Training/Test Curves

[Figure: training and test error curves; x-axis: model complexity, y-axis: error]
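The curves can be traced numerically by fitting polynomials of increasing degree on the training houses and recording both errors. This is a rough sketch under stated assumptions: the data is made up, model complexity is taken to be polynomial degree, and the fit solves the normal equations in plain Python rather than using any particular library.

```python
def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_polynomial(data, degree):
    """Least-squares polynomial fit via the normal equations (H^T H) w = H^T y."""
    H = [[x ** k for k in range(degree + 1)] for x, _ in data]
    y = [price for _, price in data]
    d = degree + 1
    HtH = [[sum(h[j] * h[k] for h in H) for k in range(d)] for j in range(d)]
    Hty = [sum(h[j] * yi for h, yi in zip(H, y)) for j in range(d)]
    return solve(HtH, Hty)


def sum_squared_errors(w, data):
    """Sum of squared prediction errors of the polynomial with coefficients w."""
    return sum((y - sum(wk * x ** k for k, wk in enumerate(w))) ** 2 for x, y in data)


# Made-up sales: (size in 1000s of sq.ft., price in $1000s)
train = [(1.0, 210.0), (1.3, 235.0), (1.6, 300.0), (2.0, 330.0),
         (2.4, 410.0), (2.7, 420.0), (3.0, 500.0)]
test = [(1.2, 240.0), (1.8, 320.0), (2.2, 370.0), (2.9, 470.0)]

curves = []  # one (degree, training error, test error) triple per model
for degree in range(1, 5):
    w = fit_polynomial(train, degree)
    curves.append((degree, sum_squared_errors(w, train), sum_squared_errors(w, test)))

for degree, train_err, test_err in curves:
    print(degree, round(train_err, 2), round(test_err, 2))
```

Training error cannot increase as the degree grows, because each higher-degree model contains the lower-degree ones. Test error has no such guarantee, which is why it is the quantity used to compare models.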

Adding other features


Predictions just based on house size

[Figure: price ($) vs. square feet (sq.ft.)]

Only 1 bathroom! Not same as my 3 bathrooms

Add more features

$f_w(x) = w_0 + w_1\,\text{sq.ft.} + w_2\,\#\text{bath}$

[Figure: price ($) plotted against x1 = square feet (sq.ft.) and x2 = # bathrooms]
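With two features the fit works the same way as with one: choose $w_0, w_1, w_2$ to minimize RSS. The sketch below solves the normal equations in plain Python. The helper names, the solver choice, and the sales data are assumptions for illustration; the prices are generated exactly from known parameters so the result is easy to check.

```python
def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_least_squares(features, y):
    """Minimize RSS by solving the normal equations (H^T H) w = H^T y."""
    d = len(features[0])
    HtH = [[sum(h[j] * h[k] for h in features) for k in range(d)] for j in range(d)]
    Hty = [sum(h[j] * yi for h, yi in zip(features, y)) for j in range(d)]
    return solve(HtH, Hty)


# Made-up sales: (size in 1000s of sq.ft., # bathrooms, price in $1000s),
# generated exactly from price = 20 + 150*size + 30*baths.
sales = [(1.0, 1, 200.0), (1.5, 2, 305.0), (2.0, 2, 380.0),
         (2.5, 3, 485.0), (3.0, 2, 530.0)]

features = [[1.0, size, baths] for size, baths, _ in sales]  # leading 1 pairs with w0
prices = [price for _, _, price in sales]

w0, w1, w2 = fit_least_squares(features, prices)

# Prediction for a hypothetical 2,200 sq.ft. house with 3 bathrooms
prediction = w0 + w1 * 2.2 + w2 * 3
print(round(w0, 6), round(w1, 6), round(w2, 6), round(prediction, 6))
```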

How many features to use?


Possible choices:
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- ...

See Regression Course!

Other regression examples


Salary after ML specialization

hard work

How much will your salary be? (y = $$)


Depends on x = performance in courses, quality of capstone project, # of forum responses, ...

$\hat{y} = \hat{w}_0 + \hat{w}_1\,\text{performance} + \hat{w}_2\,\text{capstone} + \hat{w}_3\,\text{forum}$

(informed by other students who completed specialization)

Stock prediction
Predict the price of a stock
Depends on
-Recent history of stock price
-News events
-Related commodities


Tweet popularity
How many people will retweet your tweet?
Depends on # followers, # of followers of followers, features of text tweeted, popularity of hashtag, # of past retweets, ...

Smart houses
Smart houses have many distributed sensors
What's the temperature at your desk? (no sensor)
- Learn spatial function to predict temp

Also depends on
- Thermostat setting
- Blinds open/closed or window tint
- Vents
- Temperature outside
- Time of day

Summary for regression


What you can do now


- Describe the input (features) and output (real-valued predictions) of a regression model
- Calculate a goodness-of-fit metric (e.g., RSS)
- Estimate model parameters by minimizing RSS (algorithms to come)
- Exploit the estimated model to form predictions
- Perform a training/test split of the data
- Analyze performance of various regression models in terms of test error
- Use test error to avoid overfitting when selecting amongst candidate models
- Describe a regression model using multiple features
- Describe other applications where regression is useful