Linear Regression
Machine Learning
h: the hypothesis function, mapping an input x (e.g. house size) to a predicted output y (price).
How do we represent the hypothesis h? As a linear function:

$h_\theta(x) = \theta_0 + \theta_1 x$

The $\theta_i$ are the parameters:
- $\theta_0$ is the intercept (the value of $h$ when $x = 0$)
- $\theta_1$ is the gradient (slope) of the line
Parameters: $\theta_0, \theta_1$

Cost function (squared error):
$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

(For fixed $\theta_0, \theta_1$, $h_\theta(x)$ is a function of $x$; $J(\theta_0, \theta_1)$ is a function of the parameters $\theta_0, \theta_1$.)
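A minimal NumPy sketch of this cost function (the function and variable names are my own, not from the slides):

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Squared-error cost J(theta0, theta1) for h(x) = theta0 + theta1 * x."""
    m = len(y)                          # number of training examples
    predictions = theta0 + theta1 * x   # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy data lying exactly on the line y = 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(compute_cost(x, y, 0.0, 2.0))   # 0.0 -- this line fits perfectly
print(compute_cost(x, y, 0.0, 0.0))   # larger cost for a poor fit
```

Parameters with zero cost reproduce the data exactly; any other choice of $\theta_0, \theta_1$ gives a strictly larger $J$.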
[Figure: training data, house price ($) in 1000's (y) vs. size in feet² (x)]
$J(\theta_0, \theta_1)$ can be visualized as a contour plot (contour figure).
Minimizing a function

Want: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Outline:
• Start with some initial $\theta_0, \theta_1$ (say $\theta_0 = 0$, $\theta_1 = 0$)
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$, until we hopefully end up at a minimum

[Figure: surface plot of $J(\theta_0, \theta_1)$ over $\theta_0, \theta_1$]
If the function has multiple local minima, where one starts can determine which minimum is reached.

[Figure: surface plot of $J(\theta_0, \theta_1)$ with multiple local minima]
Gradient descent algorithm

repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$; $\alpha$ is the learning rate)
}

Update $\theta_0$ and $\theta_1$ simultaneously:
  temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
  temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
  $\theta_0$ := temp0
  $\theta_1$ := temp1
“Batch” gradient descent: each step of gradient descent uses all $m$ training examples.
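A runnable sketch of batch gradient descent for one feature (the function name, toy data, learning rate, and iteration count are my own choices):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iters=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x.

    'Batch': every update sums over all m training examples.
    """
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        err = (theta0 + theta1 * x) - y           # h(x^(i)) - y^(i)
        # Compute both new values from the *old* parameters, then assign:
        # this is the simultaneous update described above.
        temp0 = theta0 - alpha * np.sum(err) / m
        temp1 = theta1 - alpha * np.sum(err * x) / m
        theta0, theta1 = temp0, temp1
    return theta0, theta1

# Toy data lying exactly on y = 1 + 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
theta0, theta1 = gradient_descent(x, y)
print(theta0, theta1)   # converges close to (1.0, 2.0)
```

Note that both temporaries are computed before either parameter is overwritten; updating $\theta_0$ first and then using the new value inside the $\theta_1$ update would not be the algorithm above.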
Multiple features (variables).

Size (feet²) | Bedrooms | Floors | Age (years) | Price ($1000)
        2104 |        5 |      1 |          45 |           460
        1416 |        3 |      2 |          40 |           232
        1534 |        3 |      2 |          30 |           315
         852 |        2 |      1 |          36 |           178
           … |        … |      … |           … |             …
Notation:
$n$ = number of features
$m$ = number of training examples
$x^{(i)}$ = input (features) of the $i$-th training example
$x_j^{(i)}$ = value of feature $j$ in the $i$-th training example
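To make the indexing concrete, here is a small NumPy sketch of the housing data above (array names and the column interpretation are my reading of the table, not given in the slides):

```python
import numpy as np

# Housing training set, one row per example.
# Columns (assumed): size (feet^2), bedrooms, floors, age (years).
X = np.array([[2104, 5, 1, 45],
              [1416, 3, 2, 40],
              [1534, 3, 2, 30],
              [ 852, 2, 1, 36]], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)   # price in $1000s

m, n = X.shape     # m = 4 training examples, n = 4 features
x_2 = X[1]         # x^(2): features of the 2nd example (row index 1, 0-based)
x_2_3 = X[1, 2]    # x_3^(2): feature 3 of the 2nd example -> 2.0
print(m, n, x_2, x_2_3)
```

The off-by-one between the 1-based notation of the slides and 0-based array indexing is a common source of bugs.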
Hypothesis:
Previously: $h_\theta(x) = \theta_0 + \theta_1 x$
Now: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$
With the convention $x_0 = 1$: $h_\theta(x) = \theta^T x$

Parameters: $\theta_0, \theta_1, \ldots, \theta_n$ (equivalently, the vector $\theta \in \mathbb{R}^{n+1}$)

Cost function:
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Gradient descent:
Repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$
}   (simultaneously update $\theta_j$ for every $j$)

Previously ($n = 1$):
Repeat {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
  $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$
}   (simultaneously update $\theta_0, \theta_1$)

New algorithm ($n \ge 1$):
Repeat {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
}   (simultaneously update $\theta_j$ for $j = 0, \ldots, n$)
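The multivariate update can be written in one vectorized line. A NumPy sketch (names and toy data are my own; the update applies the rule above to all $\theta_j$ at once):

```python
import numpy as np

def gradient_descent_multi(X, y, alpha=0.1, iters=2000):
    """Gradient descent for h(x) = theta^T x with the convention x0 = 1.

    Vectorized form of the per-coordinate update:
      theta := theta - alpha * (1/m) * X^T (X theta - y)
    """
    m = X.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X])    # prepend the x0 = 1 column
    theta = np.zeros(Xb.shape[1])
    for _ in range(iters):
        grad = Xb.T @ (Xb @ theta - y) / m  # (1/m) * sum_i (h - y) * x_j^(i)
        theta = theta - alpha * grad        # simultaneous update of every theta_j
    return theta

# Toy data with two features, generated from y = 1 + 2*x1 + 3*x2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, 4.0, 6.0])
theta = gradient_descent_multi(X, y)
print(theta)   # approaches [1.0, 2.0, 3.0]
```

Because `theta` is replaced in a single assignment, every component is computed from the old parameter vector, which is exactly the simultaneous update the slides require.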
Practical aspects of applying gradient descent
Feature Scaling
Idea: make sure the features are on a similar scale (e.g. get every feature into approximately the range $-1 \le x_j \le 1$), so that gradient descent converges faster.

Mean normalization:
Replace $x_j$ with $x_j - \mu_j$ to make the features have approximately zero mean (do not apply to $x_0 = 1$). Commonly one also divides by a scale $s_j$ (the range or standard deviation): $x_j := \frac{x_j - \mu_j}{s_j}$.
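A minimal sketch of mean normalization in NumPy (names are mine; dividing by the range $s_j = \max - \min$ is one common choice):

```python
import numpy as np

def mean_normalize(X):
    """Replace each feature x_j with (x_j - mu_j) / s_j.

    mu_j is the feature's mean and s_j its range (max - min), so the result
    has roughly zero mean and lies in about [-1, 1]. Never apply this to
    the intercept feature x0 = 1.
    """
    mu = X.mean(axis=0)
    s = X.max(axis=0) - X.min(axis=0)   # the standard deviation also works
    return (X - mu) / s, mu, s

# Features on very different scales: size in feet^2 vs. bedroom count.
X = np.array([[2104.0, 5.0], [1416.0, 3.0], [1534.0, 3.0], [852.0, 2.0]])
X_norm, mu, s = mean_normalize(X)
print(X_norm.mean(axis=0))   # approximately [0, 0]
```

Returning `mu` and `s` matters in practice: any new example must be normalized with the training set's statistics before the learned hypothesis is applied to it.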
[Figure: price (y) vs. size (x)]