
Machine Learning

EE514 – CS535

Linear Regression: Formulation, Solutions,


Polynomial Regression, Gradient Descent
and Regularization

Zubair Khalid

School of Science and Engineering


Lahore University of Management Sciences

https://www.zubairkhalid.org/ee514_2023.html
Outline

- Regression Set-up
- Linear Regression
- Polynomial Regression
- Underfitting/Overfitting
- Regularization
- Gradient Descent Algorithm
Regression
Regression: Quantitative Prediction on a continuous scale
- Given a data sample, predict a numerical value

Example: Linear relationship


[Block diagram: the input x feeds a Process or System producing f(x); additive noise n corrupts f(x) to give the observed output y.]

Here, PROCESS or SYSTEM refers to any underlying physical or logical phenomenon which maps our input data to our observed and noisy output data.
Regression
Overview:

[Block diagram: input x → Process or System → observed output y.]

One-variable regression: 𝒚 is a scalar
Multi-variable regression: 𝐲 is a vector

We will cover:
Single-feature regression: 𝐱 is a scalar
Multiple-feature regression: 𝐱 is a vector


Regression
Examples:

Single Feature:
- Predict the score in the course given the number of hours of effort per week.
- Establish the relationship between monthly e-commerce sales and advertising costs.
Multiple Feature:
- Study the operational efficiency of a machine given sensor (temperature, vibration) data.
- Predict the remaining useful life (RUL) of a battery from charging and discharging information.
- Estimate sales volume given population demographics, GDP indicators, climate data, etc.
- Predict crop yield using remote sensing (satellite images, gravity information).
- Dynamic or surge pricing by ride-sharing applications (Uber).
- Rate the condition (fatigue or distraction) of a driver given video.
- Rate the quality of driving given data from sensors installed on the car or driving patterns.
Regression
Model Formulation and Setup:
True Model:
We assume there is an inherent but unknown relationship between input and output.

[Block diagram: input x → Process or System, with noise n added to produce the observed output y.]

Goal:
Given noisy observations, we need to estimate the unknown functional relationship as accurately as possible.

[Figure: noisy observations scattered around the true unknown function of 𝐱.]
Regression
Model Formulation and Setup:
- Single Feature Regression, Example:

Training Data:
First data sample: $(x^{(1)}, y^{(1)})$
Second data sample: $(x^{(2)}, y^{(2)})$
⋮
n-th data sample: $(x^{(n)}, y^{(n)})$

[Figure: the n training samples plotted in the (𝐱, 𝒚) plane.]
Regression
Model Formulation and Setup:
We have:

[Block diagram: the input is fed both to the true Process or System (whose output, corrupted by noise n, is the observed output) and to the Model (which produces the model output); the error is the difference between the observed output and the model output.]
Linear Regression
Overview:
- Second learning algorithm of the course

- Scalar output is a linear function of the inputs

- Different from KNN: linear regression adopts a modular approach, which we will use most of the time in this course:
- Select a model.
- Define a loss function.
- Formulate an optimization problem to find the model parameters such that the loss function is minimized.
- Employ different techniques to solve the optimization problem, i.e., to minimize the loss function.
Linear Regression
Model:

$$\hat{y} = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_d x_d$$

What is Linear?
Interpretation: the prediction $\hat{y}$ (or $y$) is linear in the parameters $w_0, w_1, \dots, w_d$: $w_0$ is the bias (intercept), and each $w_j$ weights the contribution of feature $x_j$ to the output.
Linear Regression
Define Loss Function:
- Loss function should be a function of model parameters.

[Figure: observed values 𝒚 plotted against 𝐱 around the true unknown function; the residual error is the vertical distance between each observation and the model's prediction.]
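A standard squared-error (least-squares) loss consistent with this setup, summing the squared residuals over the $n$ training samples, is

$$L(w_0, \dots, w_d) = \sum_{i=1}^{n} \Big( y^{(i)} - w_0 - \sum_{j=1}^{d} w_j x_j^{(i)} \Big)^2.$$

Scaling by $1/n$ (mean squared error) or by $1/2$ is also common and does not change the minimizer.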
Linear Regression
Define Loss Function:

- Positive scalings of the loss (sum of squared errors, mean squared error, half of either) share one minimizer, so the choice among these loss functions is a matter of convenience.


Linear Regression
Define Loss Function:

How to solve?
Linear Regression
Define Loss Function:
Reformulation:

Stack the observations into a vector $\mathbf{y}$, the inputs into a matrix $\mathbf{X}$ (one row per data sample, with a leading 1 to absorb the bias), and the model parameters into a vector $\mathbf{w}$. The residual error is then $\mathbf{e} = \mathbf{y} - \mathbf{X}\mathbf{w}$.

Consequently, the loss can be written compactly in terms of the observations and inputs as

$$L(\mathbf{w}) = \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2.$$
Linear Regression
Solve Optimization Problem: (Analytical Solution employing Calculus)

- $L(\mathbf{w})$ is quadratic and convex in the parameters; a very beautiful, elegant function to minimize!


Linear Regression
Solve Optimization Problem: (Analytical Solution employing Calculus)
Gradient of a function: Overview

For a scalar-valued function $f(\mathbf{w})$ of a vector $\mathbf{w}$, the gradient $\nabla_{\mathbf{w}} f$ collects the partial derivatives $\partial f / \partial w_j$ into a vector pointing in the direction of steepest increase.

Examples:
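Two standard gradient identities of the kind typically used here; they suffice for the derivation below:

$$\nabla_{\mathbf{w}} \big( \mathbf{b}^T \mathbf{w} \big) = \mathbf{b},
\qquad
\nabla_{\mathbf{w}} \big( \mathbf{w}^T \mathbf{A}\, \mathbf{w} \big) = (\mathbf{A} + \mathbf{A}^T)\,\mathbf{w} = 2\mathbf{A}\mathbf{w} \;\text{ for symmetric } \mathbf{A}.$$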
Linear Regression
Solve Optimization Problem: (Analytical Solution employing Calculus)
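Using the matrix form of the loss and the identities above, a standard derivation of the analytical solution: expand the loss, set its gradient to zero, and solve.

$$L(\mathbf{w}) = \|\mathbf{y}-\mathbf{X}\mathbf{w}\|^2 = \mathbf{y}^T\mathbf{y} - 2\,\mathbf{w}^T\mathbf{X}^T\mathbf{y} + \mathbf{w}^T\mathbf{X}^T\mathbf{X}\,\mathbf{w}$$

$$\nabla_{\mathbf{w}} L = -2\,\mathbf{X}^T\mathbf{y} + 2\,\mathbf{X}^T\mathbf{X}\,\mathbf{w} = \mathbf{0}
\;\;\Rightarrow\;\;
\mathbf{w}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y},$$

assuming $\mathbf{X}^T\mathbf{X}$ is invertible.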
Linear Regression
So far and moving forward:
- We assumed that we know the structure of the model, that is, there is a linear

relationship between inputs and output.

- Number of parameters = dimension of the feature space + 1 (bias parameter)

- Formulated loss function using residual error.

- Formulated the optimization problem and obtained an analytical solution.

- Linear regression is one of the models for which we can obtain an analytical solution.

- We will shortly learn an algorithm to solve the optimization problem numerically; a code sketch of the analytical solution follows below.
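As a concrete companion to the analytical solution, here is a minimal NumPy sketch of closed-form least squares on synthetic data; the function and variable names are illustrative, not from the slides.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Closed-form least squares: w* = (X^T X)^{-1} X^T y.

    X : (n, d) feature matrix, y : (n,) targets.
    A column of ones is prepended to model the bias w0.
    """
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # absorb the bias into X
    # lstsq is preferred over an explicit matrix inverse for numerical stability
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w  # w[0] is the bias, w[1:] are the feature weights

# Synthetic single-feature example: y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 1))
y = 2 + 3 * x[:, 0] + 0.1 * rng.standard_normal(100)
print(fit_linear_regression(x, y))  # approximately [2, 3]
```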


Outline

- Regression Set-up
- Linear Regression
- Polynomial Regression
- Underfitting/Overfitting
- Regularization
- Gradient Descent Algorithm
Polynomial Regression
Overview:

[Figure: scatter of 𝒚 against 𝐱 showing a clearly curved trend. Is it linear?]

- If the relationship between the inputs and output is not linear, we can use a polynomial to model the relationship.
- We will formulate the polynomial regression model for the single-feature regression problem.
- Polynomial regression is often termed non-linear regression or, more precisely, linear-in-parameter regression.
- We will also revisit the concept of 'over-fitting'.

Polynomial Regression
Single Feature Regression:
Formulation:

Model the output as a degree-$M$ polynomial of the single feature $x$:

$$\hat{y} = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j$$

Although this is non-linear in $x$, it is linear in the parameters $w_0, \dots, w_M$: treating $(1, x, x^2, \dots, x^M)$ as the feature vector recovers exactly the linear regression problem. We have seen this before, and we are capable of solving it!
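A minimal NumPy sketch of this reduction; the helper name and the data-generating choices are illustrative.

```python
import numpy as np

def fit_polynomial(x, y, M):
    """Least-squares fit of a degree-M polynomial y ≈ w0 + w1*x + ... + wM*x^M."""
    X = np.column_stack([x ** j for j in range(M + 1)])  # design matrix [1, x, ..., x^M]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Noisy samples of sin(2*pi*x), in the spirit of the CB Section 1.1 example
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(30)
print(fit_polynomial(x, y, M=3))  # coefficients w0..w3 of the fitted cubic
```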
Polynomial Regression
Single Feature Regression:
Example (Ref: CB. Section 1.1):
Polynomial Regression
Single Feature Regression:
Example:

[Figure: degree-M polynomial fits for several values of M. Underfitting: the model is too simple (M too small). Overfitting: the model is too complex (M too large).]
Polynomial Regression
Single Feature Regression:
Example:
[Figure: training and test error versus the polynomial degree M. Overfitting shows up as low training error but high test error; a good choice of M balances the two.]

Solution 1: Restrict the model complexity, i.e., limit the number of parameters by limiting the polynomial degree M.
Polynomial Regression
Single Feature Regression:
How to Handle Overfitting?
- The polynomial degree M is a hyper-parameter of our model (like k in kNN) and controls the complexity of the model.
- If we stick with the M=3 model, this amounts to restricting the number of parameters (Solution 1).
- We encounter overfitting for M=9 because we do not have sufficient data.
Solution 2: Take more data points to avoid over-fitting.

Solution 3: Regularization
Outline

- Regression Set-up
- Linear Regression
- Polynomial Regression
- Underfitting/Overfitting
- Regularization
- Gradient Descent Algorithm
Regularization
Regularization overview:
- The concept is broad, but we will study it in the context of linear regression (and polynomial regression, which we formulated as linear regression).
- Encourages the model coefficients to be small by adding a penalty term to the error.
- We had a loss function of the following form (see the linear regression formulation), which we minimize to find the coefficients:

$$L(\mathbf{w}) = \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2$$

- We add a 'penalty term', known as a regularizer, to the loss function:

$$\tilde{L}(\mathbf{w}) = L(\mathbf{w}) + \lambda\, R(\mathbf{w}),$$

where $\tilde{L}$ is the regularized loss function, $R(\mathbf{w})$ is the regularizer, and $\lambda \ge 0$ controls the strength of the penalty.
Regularization
L2 Least-squares Regularization – Ridge Regression:
- Since we want to discourage the model coefficients from reaching large values, we can use the following simple regularizer:

$$R(\mathbf{w}) = \|\mathbf{w}\|^2 = \mathbf{w}^T\mathbf{w}$$

- For this choice, the regularized loss function becomes

$$\tilde{L}(\mathbf{w}) = \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2 + \lambda\, \|\mathbf{w}\|^2$$

- This regularization maintains a trade-off between the 'fit of the model to the data' and the 'squared norm of the coefficients'.
- If the model fits poorly, the first term is large.
- If the coefficients have large values, the second term (penalty term) is large.

Intuitive Interpretation: We want to minimize the error while keeping the norm of the coefficients bounded.
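This intuition can be made precise; an equivalent constrained formulation (standard, though not spelled out on these slides) is

$$\min_{\mathbf{w}} \; \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2 \quad \text{subject to} \quad \|\mathbf{w}\|^2 \le \eta,$$

where each penalty strength $\lambda$ corresponds to some bound $\eta$ on the squared norm of the coefficients.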
Regularization
L2 Least-squares Regularization – Ridge Regression:
- The regularized loss function is still quadratic in $\mathbf{w}$, and we can find a closed-form solution:

$$\mathbf{w}^* = (\mathbf{X}^T\mathbf{X} + \lambda\,\mathbf{I})^{-1}\,\mathbf{X}^T\mathbf{y}$$

Unlike the unregularized case, $\mathbf{X}^T\mathbf{X} + \lambda\,\mathbf{I}$ is always invertible for $\lambda > 0$.
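A minimal NumPy sketch of the ridge solution above. Whether the bias w0 is penalized varies by convention; here it is penalized for simplicity, and all names are illustrative.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: w* = (X^T X + lam*I)^{-1} X^T y.

    X : (n, d) design matrix (include a ones column for the bias),
    lam : regularization strength (lambda >= 0).
    """
    d = X.shape[1]
    # Solve the regularized normal equations; solve() beats an explicit inverse
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Ridge tames the wild coefficients of a high-degree polynomial fit
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(10)
X = np.column_stack([x ** j for j in range(10)])  # M = 9 polynomial
print(np.abs(fit_ridge(X, y, lam=0.0)).max())     # huge coefficients (overfitting)
print(np.abs(fit_ridge(X, y, lam=1e-3)).max())    # much smaller coefficients
```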
Regularization
L2 Least-squares Regularization – Ridge Regression:
Example:

[Figure: M = 9 polynomial fits for increasing regularization strength. With no regularization the fit oscillates wildly (overfitting); with too much regularization the fit is overly smooth (underfitting).]
Regularization
L2 Least-squares Regularization – Ridge Regression:
Graphical Visualization:

[Figure: contours of the squared-error term together with the circular L2 constraint region; the ridge solution lies where an error contour first touches the circle.]
Regularization
L1 Least-squares Regularization – Lasso Regression:
Graphical Visualization:

[Figure: the same error contours with the diamond-shaped L1 constraint region; because the diamond has corners on the axes, the solution often has some coefficients exactly zero, so Lasso yields sparse models.]
Regularization
Elastic Net Regression, L1 vs L2:

Elastic Net combines the L1 and L2 penalties, inheriting sparsity-inducing behaviour from L1 and smooth shrinkage from L2.
Outline

- Regression Set-up
- Linear Regression
- Polynomial Regression
- Underfitting/Overfitting
- Regularization
- Gradient Descent Algorithm
Gradient Descent Algorithm
Optimization and Gradient Descent - Overview:
Gradient descent is a general-purpose iterative method for minimizing a differentiable loss function; it is the tool of choice when no closed-form solution exists or when computing one is too expensive.
Gradient Descent Algorithm
Formulation:

Starting from an initial guess $\mathbf{w}^{(0)}$, repeatedly step in the direction of the negative gradient of the loss:

$$\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} - \alpha\, \nabla_{\mathbf{w}} L\big(\mathbf{w}^{(t)}\big),$$

where $\alpha > 0$ is the step size (learning rate).
Gradient Descent Algorithm
Algorithm:
Overall: iterate the update above until the parameters (or the loss) stop changing appreciably, or until a maximum number of iterations is reached.

Pseudo-code: see the sketch below.

Note: Simultaneous update; all parameters are updated using gradients evaluated at the current parameter vector, never at a mixture of old and new values.

Convergence and Step size: if $\alpha$ is too small, convergence is slow; if $\alpha$ is too large, the iterates can overshoot, oscillate, or diverge.
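The pseudo-code from the original slide is not reproduced in this text; below is a minimal generic sketch of the algorithm above, with illustrative names.

```python
import numpy as np

def gradient_descent(grad, w0, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Generic gradient descent: w <- w - alpha * grad(w).

    grad : function returning the gradient of the loss at w,
    w0   : initial parameter vector (all entries updated simultaneously).
    """
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iters):
        step = alpha * grad(w)
        w = w - step                      # all coordinates updated at once
        if np.linalg.norm(step) < tol:    # stop when updates become tiny
            break
    return w

# Example: minimize f(w) = (w0 - 3)^2 + (w1 + 1)^2, minimum at (3, -1)
print(gradient_descent(lambda w: 2 * (w - np.array([3.0, -1.0])), np.zeros(2)))
```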


Gradient Descent Algorithm
Linear Regression Case:

Gradient Descent:

For the least-squares loss $L(\mathbf{w}) = \|\mathbf{y}-\mathbf{X}\mathbf{w}\|^2$, the gradient is $\nabla_{\mathbf{w}} L = -2\,\mathbf{X}^T(\mathbf{y}-\mathbf{X}\mathbf{w})$, so the update becomes

$$\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} + 2\alpha\, \mathbf{X}^T\big(\mathbf{y}-\mathbf{X}\mathbf{w}^{(t)}\big)$$

Note: Simultaneous update.
Gradient Descent Algorithm
Linear Regression Case:
Visualization:

[Figures: surface plot and contour plot of the quadratic loss over the parameters; gradient descent traces a downhill path to the unique minimum.]
Gradient Descent Algorithm
Notes:

- Each gradient-descent update for linear regression touches all n training samples, since the gradient sums over the whole dataset; a single iteration is therefore expensive when n is large.

Why? This motivates a cheaper alternative:

Stochastic Gradient Descent: update the parameters using the gradient evaluated on a single (randomly chosen) training sample instead of the full dataset.
Gradient Descent Algorithm
Stochastic Gradient Descent (SGD) - Rationale:
The full gradient is a sum of per-sample gradients, so the gradient at a single randomly drawn sample is a noisy but unbiased estimate of it; following these cheap estimates still makes progress on average.
Gradient Descent Algorithm
Stochastic Gradient Descent (SGD):

Pros:
- Each update is cheap; its cost does not grow with the dataset size n.
- Well suited to large-scale and streaming data.
- The noise in the updates can help escape shallow local minima for non-convex losses.
Gradient Descent Algorithm
SGD for Linear Regression Case:

Iteration: one parameter update using a single training sample. Epoch: one complete pass through all n training samples (n iterations). A sketch follows below.
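A minimal SGD sketch for the least-squares loss; the per-sample loss is taken as the squared residual, and all names are illustrative.

```python
import numpy as np

def sgd_linear_regression(X, y, alpha=0.01, epochs=50, seed=0):
    """SGD for least squares: one update per randomly ordered sample.

    The per-sample loss (y_i - x_i^T w)^2 has gradient -2 (y_i - x_i^T w) x_i.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):                    # one epoch = one pass over the data
        for i in rng.permutation(n):           # each i is one iteration
            residual = y[i] - X[i] @ w
            w += 2 * alpha * residual * X[i]   # noisy step toward lower loss
    return w

# Recover y = 2 + 3x from noisy data (the ones column models the bias)
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, 200)
X = np.column_stack([np.ones(200), x])
y = 2 + 3 * x + 0.1 * rng.standard_normal(200)
print(sgd_linear_regression(X, y))  # approximately [2, 3]
```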
Gradient Descent Algorithm
Mini-batch Stochastic Gradient Descent (SGD):
Mini-batch SGD sits between the two extremes: Batch Gradient Descent uses all n samples per update, Stochastic Gradient Descent uses one, and mini-batch SGD averages the gradient over a small batch of m samples, trading gradient noise against per-update cost (see the sketch below).
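A sketch of the mini-batch variant of the SGD update above; the batch size m is a hyper-parameter (values like 32 to 256 are common), and the names are illustrative.

```python
import numpy as np

def minibatch_sgd(X, y, alpha=0.05, m=32, epochs=100, seed=0):
    """Mini-batch SGD for least squares: average the gradient over m samples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)                 # reshuffle each epoch
        for start in range(0, n, m):
            batch = order[start:start + m]
            residual = y[batch] - X[batch] @ w
            w += 2 * alpha * X[batch].T @ residual / len(batch)  # averaged gradient step
    return w
```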
