Notes 04
EE514 – CS535
Zubair Khalid
https://www.zubairkhalid.org/ee514_2023.html
Outline
- Regression Set-up
- Linear Regression
- Polynomial Regression
- Underfitting/Overfitting
- Regularization
- Gradient Descent Algorithm
Regression
Regression: Quantitative Prediction on a continuous scale
- Given a data sample, predict a numerical value
[Block diagram: input x → Process or System f(x), with noise n → observed output y]
Single Feature:
- Predict score in the course given the number of hours of effort per week.
- Establish the relationship between the monthly e-commerce sales and the advertising costs.
Multiple Feature:
- Studying the operational efficiency of a machine given sensor (temperature, vibration) data.
- Predicting remaining useful life (RUL) of the battery from charging and discharging information.
- Estimate sales volume given population demographics, GDP indicators, climate data, etc.
- Predict crop yield using remote sensing (satellite images, gravity information).
- Dynamic Pricing or Surge Pricing by ride sharing applications (Uber).
- Rate the condition (fatigue or distraction) of the driver given the video.
- Rate the quality of driving given the data from sensors installed on car or driving patterns.
Regression
Model Formulation and Setup:
True Model:
We assume there is an inherent but unknown relationship between input and output.
[Block diagram: input x → Process or System, with noise n → observed output y]
Goal:
Given noisy observations, we need to estimate the unknown functional relationship as accurately as possible.
[Figure: true unknown function plotted against x, with noisy observations]
Regression
Model Formulation and Setup:
- Single Feature Regression, Example:
[Block diagram: input x → Process or System, with noise n → observed output y]
Training Data:
First data sample: (x^(1), y^(1))
Second data sample: (x^(2), y^(2))
…
n-th data sample: (x^(n), y^(n))
[Figure: scatter plot of the training data, y versus x]
Regression
Model Formulation and Setup:
We have: the observed output is the process/system output plus noise, i.e., y = f(x) + n.
The model produces the model output ŷ; the error is the difference between the observed output y and the model output ŷ.
[Block diagram: input x → Process or System (noise n) → observed output y; model → model output ŷ; comparison of the two gives the error]
Linear Regression
Overview:
- Second learning algorithm of the course
- Different from kNN: linear regression adopts a modular approach, which we will use most of the time in the course.
- Select a model
- Define a loss function
- Formulate an optimization problem to find the model parameters such that the loss function is minimized.
- Employ different techniques to solve the optimization problem, i.e., minimize the loss function.
Linear Regression
Model:
For input features x = (x_1, …, x_d), the linear model is
ŷ = w_0 + w_1 x_1 + … + w_d x_d = wᵀx (absorbing the bias by setting x_0 = 1)
What is linear? The model is linear in the parameters w.
Interpretation: w_0 is the intercept (bias); each w_j is the change in ŷ per unit change in the feature x_j.
Linear Regression
Define Loss Function:
- The loss function should be a function of the model parameters.
- For the i-th sample, the residual error is the difference between the observed value y^(i) and the model output ŷ^(i).
- Summing the squared residuals gives the loss: J(w) = Σ_i (y^(i) − ŷ^(i))²
[Figure: observed values y plotted against x, with residual errors shown as vertical distances to the fitted line]
Linear Regression
Define Loss Function:
How to solve?
Linear Regression
Define Loss Function:
Reformulation:
Stacking the n samples, the residual error becomes the vector y − Xw, where X is the design matrix (one row per sample) and w is the vector of model parameters, so
J(w) = ‖y − Xw‖²
Examples:
Linear Regression
Solve Optimization Problem: (Analytical Solution employing Calculus)
Setting the gradient to zero, ∇J(w) = −2Xᵀ(y − Xw) = 0, gives the normal equations XᵀXw = Xᵀy and the solution w* = (XᵀX)⁻¹Xᵀy (when XᵀX is invertible).
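The analytical solution can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data (the line y = 2x + 1 plus noise is assumed here, not taken from the notes); `lstsq` solves the same least-squares problem as the normal equations, but in a numerically stable way.

```python
import numpy as np

# Synthetic data (assumed for illustration): y = 2x + 1 plus small noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(x.size)

# Design matrix with a leading column of ones for the intercept w0
X = np.column_stack([np.ones_like(x), x])

# Solve the least-squares problem; equivalent to the normal equations
# w* = (X^T X)^{-1} X^T y, but numerically more stable
w, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Here `w` should come out close to (1, 2), the intercept and slope used to generate the data.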
Linear Regression
So far and moving forward:
- We assumed that we know the structure of the model, that is, there is a linear relationship between the input and the output.
- Linear regression is one of the models for which we can obtain an analytical solution.
Polynomial Regression
Overview:
Extend the linear model to a polynomial in the input:
ŷ = w_0 + w_1 x + w_2 x² + … + w_M x^M
This model is still linear in the parameters w; we have seen this before, and we are capable of solving it!
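Because the polynomial model is linear in w, the same least-squares machinery applies to the expanded features [1, x, x², …, x^M]. A minimal sketch, assuming synthetic data from sin(2πx) plus noise (in the spirit of the CB Section 1.1 example; the exact data here is made up):

```python
import numpy as np

# Assumed synthetic data: noisy samples of sin(2*pi*x)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2.0 * np.pi * x) + 0.05 * rng.standard_normal(x.size)

M = 3  # polynomial degree (the hyper-parameter)

# Feature expansion: columns [1, x, x^2, ..., x^M]
Phi = np.vander(x, M + 1, increasing=True)

# Exactly the same least-squares machinery as plain linear regression
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
mse = np.mean((Phi @ w - y) ** 2)
```

The only change from linear regression is the design matrix: powers of x instead of raw features.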
Polynomial Regression
Single Feature Regression:
Example (Ref: CB. Section 1.1):
Polynomial Regression
Single Feature Regression:
Example:
Underfitting: the model is too simple.
Overfitting: the model is too complex.
Polynomial Regression
Single Feature Regression:
Example:
Overfitting occurs for large M; a good choice of M balances fit and complexity.
Solution 1: Restrict the model complexity, i.e., choose a smaller degree M.
Polynomial Regression
Single Feature Regression:
Example:
Polynomial Regression
Single Feature Regression:
How to Handle Overfitting?
- The polynomial degree M is the hyper-parameter of our model, like k in kNN, and controls the complexity of the model.
- Sticking with the M = 3 model restricts the number of parameters.
- We encounter overfitting for M = 9 because we do not have sufficient data.
Solution 2: Take more data points to avoid overfitting.
Solution 3: Regularization
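The effect of the hyper-parameter M can be sketched by comparing training and validation error for a few degrees; the data below is synthetic (sin(2πx) plus noise, assumed for illustration). With only 10 training points, M = 9 drives the training error to nearly zero while typically inflating the validation error.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_poly(x, y, M):
    # Least-squares fit of a degree-M polynomial via feature expansion
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def mse_poly(w, x, y):
    Phi = np.vander(x, len(w), increasing=True)
    return np.mean((Phi @ w - y) ** 2)

# Small noisy training set and a separate validation set (both assumed)
x_tr = np.linspace(0.0, 1.0, 10)
y_tr = np.sin(2.0 * np.pi * x_tr) + 0.15 * rng.standard_normal(10)
x_va = np.linspace(0.03, 0.97, 50)
y_va = np.sin(2.0 * np.pi * x_va) + 0.15 * rng.standard_normal(50)

errs = {}  # M -> (training MSE, validation MSE)
for M in (1, 3, 9):
    w = fit_poly(x_tr, y_tr, M)
    errs[M] = (mse_poly(w, x_tr, y_tr), mse_poly(w, x_va, y_va))
```

Picking M by validation error is Solution 1 in practice: training error alone always prefers the most complex model.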
Regularization
Regularization overview:
- The concept is broad, but we will see it in the context of linear regression (and of polynomial regression, which we formulated as linear regression).
- Encourages the model coefficients to be small by adding a penalty term to the error.
- We had a loss function of the following form that we minimize to find the coefficients:
  J(w) = Σ_i (y^(i) − ŷ^(i))²
  Adding the penalty gives the regularized loss:
  J(w) = Σ_i (y^(i) − ŷ^(i))² + λ‖w‖²
- The regularization term maintains a trade-off between the 'fit of the model to the data' and the 'square of the norm of the coefficients'.
- If the model is fitted poorly, the first term is large.
- If the coefficients have large values, the second term (penalty term) is large.
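For the squared penalty, the minimizer again has a closed form, w* = (ΦᵀΦ + λI)⁻¹Φᵀy. A minimal sketch on assumed synthetic data (the overfitting-prone M = 9 setting; λ = 10⁻³ is an arbitrary illustrative value), showing that the penalty shrinks the coefficient norm:

```python
import numpy as np

# Assumed data: 10 noisy samples of sin(2*pi*x), fitted with degree M = 9
rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2.0 * np.pi * x) + 0.15 * rng.standard_normal(10)
Phi = np.vander(x, 10, increasing=True)  # columns [1, x, ..., x^9]

def ridge(Phi, y, lam):
    # Minimizes ||y - Phi w||^2 + lam * ||w||^2 via the closed form
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

w_unreg, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # lambda = 0
w_reg = ridge(Phi, y, 1e-3)                        # lambda = 0.001 (arbitrary)
```

The norm of `w_reg` comes out smaller than that of `w_unreg`: the penalty trades a little fit for much smaller coefficients.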
Gradient Descent Algorithm
Optimization and Gradient Descent - Overview
Gradient Descent Algorithm
Formulation:
Given a differentiable loss J(w), repeatedly step in the direction of steepest descent:
w ← w − α ∇J(w), where α > 0 is the learning rate (step size).
Gradient Descent Algorithm
Algorithm:
Overall: start from an initial guess for w and repeat the gradient step until convergence.
Pseudo-code:
repeat until convergence:
    for every j: w_j := w_j − α ∂J(w)/∂w_j
Note: simultaneous update. Evaluate all partial derivatives at the current w before overwriting any w_j.
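The pseudo-code above can be sketched for linear regression with a mean-squared-error loss; computing the gradient as one vector expression makes the simultaneous update automatic. The data and step size below are assumed for illustration.

```python
import numpy as np

# Assumed data for illustration: y = 2x + 1 plus small noise
rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.01 * rng.standard_normal(50)
X = np.column_stack([np.ones_like(x), x])

def gradient_descent(X, y, alpha=0.5, iters=2000):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        # Gradient of the mean squared error: (2/n) X^T (Xw - y);
        # every component is evaluated at the same current w
        grad = (2.0 / n) * X.T @ (X @ w - y)
        w = w - alpha * grad  # simultaneous update of all w_j
    return w

w = gradient_descent(X, y)
```

With this step size the iterates converge to roughly the same (intercept, slope) the analytical solution would give.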
Gradient Descent Algorithm
Linear Regression Case:
For J(w) = Σ_i (y^(i) − wᵀx^(i))², the partial derivatives are ∂J/∂w_j = −2 Σ_i (y^(i) − wᵀx^(i)) x_j^(i), giving the update w_j := w_j + 2α Σ_i (y^(i) − wᵀx^(i)) x_j^(i) for all j.
[Visualization: contours of the loss surface with the path taken by gradient descent]
Note: simultaneous update.
Gradient Descent Algorithm
Notes:
- Why use gradient descent when linear regression has an analytical solution? The closed form requires solving a system involving XᵀX, which is expensive for many features, and many other models have no closed-form solution at all.
- Pros: iterative updates are cheap, apply to a wide range of differentiable losses, and scale to large problems.
Gradient Descent Algorithm
SGD for Linear Regression Case:
Update using a single randomly chosen sample at a time. One iteration is a single parameter update (one sample); one epoch is a complete pass through all n training samples.
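A minimal SGD sketch for linear regression (data and the fixed learning rate are assumed here). The inner loop performs one iteration per sample; the outer loop counts epochs.

```python
import numpy as np

# Illustrative data: y = 2x + 1 plus noise (assumed, not from the notes)
rng = np.random.default_rng(5)
x = np.linspace(0.0, 1.0, 100)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(100)
X = np.column_stack([np.ones_like(x), x])

def sgd(X, y, alpha=0.1, epochs=50):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):            # one epoch = one full pass over the data
        for i in rng.permutation(n):   # one iteration = one single-sample update
            err = X[i] @ w - y[i]
            w = w - alpha * 2.0 * err * X[i]
    return w

w = sgd(X, y)
```

With a fixed step size the iterates hover near the least-squares solution rather than converging exactly; decaying alpha over epochs would tighten this.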
Gradient Descent Algorithm
Mini-batch Stochastic Gradient Descent (SGD) :
Batch gradient descent uses all n samples per update; stochastic gradient descent uses a single sample. Mini-batch SGD uses a batch of b samples (1 < b < n), trading gradient noise against per-update cost.
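Mini-batch SGD sits between the two extremes; a sketch with an assumed batch size of 16 on the same kind of synthetic data as before:

```python
import numpy as np

# Assumed synthetic data: y = 2x + 1 plus noise
rng = np.random.default_rng(6)
x = np.linspace(0.0, 1.0, 200)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(200)
X = np.column_stack([np.ones_like(x), x])

def minibatch_sgd(X, y, batch_size=16, alpha=0.3, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = rng.permutation(n)  # reshuffle every epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            # Gradient estimated from the current batch only
            grad = (2.0 / len(b)) * X[b].T @ (X[b] @ w - y[b])
            w = w - alpha * grad
    return w

w = minibatch_sgd(X, y)
```

Averaging over a batch reduces the gradient noise relative to single-sample SGD while keeping each update far cheaper than a full batch pass.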