
CS-471: Machine Learning

Week 2 - Lecture 3
Instructor: Dr. Daud Abdullah
General Conduct
• Be respectful of others

• Only speak at your turn, and preferably raise your hand if you want to say something

• Do not cut off others when they are talking

• Join the class on time, and always close the door with minimum disturbance

Lecture Outline

• Recap of Previous Lecture

• Linear Regression with multiple variables

• Linear Regression Example

• Dataset Splits

• Regression Performance Evaluation

Recap of Previous Lecture
• What is the loss function used in linear regression?
• What is a residual?
• What is the gradient of this loss?
• How does gradient descent find the optimum weights?
• What is the formula for gradient descent?
• What are the possible problems with gradient descent?

Training, Validation and Testing

• The dataset is usually split into a training set, a validation set and a testing set (see the sketch below)
• The training set is used to train your model and estimate its parameters
• The validation set is used to validate the performance of your model and tune the hyper-parameters
• The testing set is used to check the accuracy of your final model
• We need our model to perform well on unseen data
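
A minimal split sketch in NumPy, assuming X and y are arrays of examples and targets; the 60/20/20 ratios, the seed, and the function name are illustrative choices, not from the slides:

```python
import numpy as np

def split_dataset(X, y, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the examples, then cut them into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))               # random order of example indices
    n_train = int(train_frac * len(X))
    n_val = int(val_frac * len(X))
    train, val, test = (idx[:n_train],
                        idx[n_train:n_train + n_val],
                        idx[n_train + n_val:])  # the remainder is the test set
    return X[train], y[train], X[val], y[val], X[test], y[test]
```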

Choosing Step Size (Learning Rate)

$\mathbf{w}_{\text{new}} = \mathbf{w}_{\text{old}} - \alpha \nabla_{\mathbf{w}} \text{TrainLoss}(\mathbf{w})$

• Could be constant

• Could be decreasing, e.g. $\alpha_t = \alpha_0 / \sqrt{t}$, where $t$ is the number of updates made so far and $\alpha_0$ is the initial value (e.g. 1); see the sketch below

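A small sketch of both options, assuming the initial value of 1 noted on the slide; the function name is mine:

```python
import math

def step_size(t, alpha0=1.0, constant=False):
    """Learning rate for update number t (t >= 1).

    constant=True  -> always alpha0
    constant=False -> alpha0 / sqrt(t), decreasing with the number of updates
    """
    return alpha0 if constant else alpha0 / math.sqrt(t)
```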
Regression Model Performance Evaluation

• Mean Squared Error (MSE): average squared difference between predicted and actual values.
• Root Mean Squared Error (RMSE): square root of MSE.
• Mean Absolute Error (MAE): average absolute difference between predicted and actual values.
• R-Squared (R²): proportion of variance in the target variable explained by the model. Typically ranges between 0 and 1; higher means better performance. A sketch computing all four follows below.
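
A sketch of all four metrics in NumPy; the function name is mine (scikit-learn's sklearn.metrics module provides equivalents):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R^2 for a set of predictions."""
    errors = y_true - y_pred
    mse = np.mean(errors ** 2)                        # Mean Squared Error
    rmse = np.sqrt(mse)                               # Root Mean Squared Error
    mae = np.mean(np.abs(errors))                     # Mean Absolute Error
    ss_res = np.sum(errors ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                        # R-Squared
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}
```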

Multiple features (variables)
Size in feet² (x)    Price ($) in 1000's (y)
2104                 400
1416                 232
1534                 315
852                  178
…                    …

$f_{w,b}(x) = wx + b$

Multiple features (variables)

Size in feet²   Number of bedrooms   Number of floors   Age of home in years   Price ($) in $1000's
2104            5                    1                  45                     460
1416            3                    2                  40                     232
1534            3                    2                  30                     315
852             2                    1                  36                     178
…               …                    …                  …                      …

Notation (illustrated in the sketch below):
$x_j$ = the $j^{\text{th}}$ feature
$n$ = the number of features
$\mathbf{x}^{(i)}$ = the features of the $i^{\text{th}}$ training example
$x_j^{(i)}$ = the value of feature $j$ in the $i^{\text{th}}$ training example
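
This notation maps directly onto a 2-D array; a sketch using the table above (note that the slides index from 1 while NumPy indexes from 0):

```python
import numpy as np

# Rows are training examples; columns are the features
# size (ft^2), bedrooms, floors, age (years).
X = np.array([[2104, 5, 1, 45],
              [1416, 3, 2, 40],
              [1534, 3, 2, 30],
              [ 852, 2, 1, 36]])
y = np.array([460, 232, 315, 178])  # price in $1000's

n = X.shape[1]  # n = number of features (4)
x1 = X[0]       # x^(1): features of the 1st training example -> [2104, 5, 1, 45]
x1_4 = X[0, 3]  # x_4^(1): value of feature 4 in the 1st example -> 45
```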

Model:
Previously (one feature): $f_{w,b}(x) = wx + b$

Now ($n$ features): $f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$

$f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$

can be written compactly with a dot product:

$f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$

This is multiple linear regression.
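
A sketch of the dot-product form, with illustrative parameter values (not fitted to the data):

```python
import numpy as np

def predict(x, w, b):
    """Multiple linear regression: f_{w,b}(x) = w . x + b."""
    return np.dot(w, x) + b

w = np.array([0.1, 4.0, 10.0, -2.0])  # illustrative weights, one per feature
b = 80.0
x = np.array([2104, 5, 1, 45])        # one house from the table above
print(predict(x, w, b))               # a price prediction in $1000's
```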

Previous notation vs. vector notation:

Parameters:
  Previous: $w_1, \ldots, w_n$ and $b$
  Vector: $\mathbf{w} = [w_1 \ \cdots \ w_n]$ and $b$

Model:
  Previous: $f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + \cdots + w_n x_n + b$
  Vector: $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$

Cost function:
  Previous: $J(w_1, \ldots, w_n, b)$
  Vector: $J(\mathbf{w}, b)$

Gradient descent:
  Previous:
  repeat {
    $w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(w_1, \ldots, w_n, b)$
    $b = b - \alpha \frac{\partial}{\partial b} J(w_1, \ldots, w_n, b)$
  }
  Vector:
  repeat {
    $w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(\mathbf{w}, b)$
    $b = b - \alpha \frac{\partial}{\partial b} J(\mathbf{w}, b)$
  }
Gradient descent

One feature:
repeat {
  $w = w - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}$
  $b = b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$
  simultaneously update $w$ and $b$
}

$n$ features ($n \geq 2$):
repeat {
  $w_1 = w_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_1^{(i)}$
  $\vdots$
  $w_n = w_n - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_n^{(i)}$
  $b = b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)$
  simultaneously update $w_j$ (for $j = 1, \ldots, n$) and $b$
}
(In each update, the averaged sum is the partial derivative $\frac{\partial}{\partial w_j} J(\mathbf{w}, b)$ or $\frac{\partial}{\partial b} J(\mathbf{w}, b)$.)
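
A minimal NumPy sketch of these simultaneous updates (batch gradient descent; the function name, default hyper-parameters, and fixed iteration count are mine):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for multiple linear regression.

    X is the (m, n) feature matrix, y the (m,) target vector.
    """
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(num_iters):
        residuals = X @ w + b - y          # f_{w,b}(x^(i)) - y^(i) for every i
        grad_w = (X.T @ residuals) / m     # one gradient component per feature j
        grad_b = residuals.sum() / m
        w = w - alpha * grad_w             # simultaneous update of every w_j ...
        b = b - alpha * grad_b             # ... and of b
    return w, b
```

Computing all m residuals first and only then changing w and b is what makes the updates simultaneous.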

Linear Regression Example
• The training data given for linear regression is:

x        y
[1, 0]   2
[1, 0]   4
[0, 1]   1

• Initialize the weights as 0 and calculate the updated weights for this problem using 2 iterations of gradient descent (a worked sketch follows below).
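
A worked sketch of this exercise using the update rule above. The slide does not specify a learning rate or whether a bias term is learned, so the value α = 0.1 and an included bias b (also initialized to 0) are assumptions:

```python
import numpy as np

X = np.array([[1, 0],
              [1, 0],
              [0, 1]])
y = np.array([2, 4, 1])

w = np.zeros(2)  # weights initialized to 0
b = 0.0          # assumed bias term, also initialized to 0
alpha = 0.1      # assumed; the slide does not specify a learning rate
m = len(y)

for it in range(2):
    residuals = X @ w + b - y
    w = w - alpha * (X.T @ residuals) / m
    b = b - alpha * residuals.sum() / m
    print(f"after iteration {it + 1}: w = {w}, b = {b:.4f}")
```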

QUESTIONS???
ACKNOWLEDGEMENT!
• Various contents in this presentation have been taken from different books, lecture notes, and the web. These solely belong to their owners and are used here only to clarify various educational concepts. No copyright infringement is intended.
