Regression and Optimization in ML
Uploaded by omvati343
© All Rights Reserved

Regression and Optimization Techniques in ML
Trainer: Ms. Nidhi Grover Raheja
Define Linear Regression
• Linear regression is a fundamental statistical method used to
understand the relationship between two variables: one independent
variable (often denoted 𝑋) and one dependent variable (often
denoted 𝑦).
• It assumes a linear relationship between these variables,
which means that a change in 𝑋 is associated with a proportional change in 𝑦.
Regression Basics
• To perform regression, we need to find a function that maps some features or
variables to others sufficiently well.
• The dependent features are called the dependent variables, outputs, or responses.
• The independent features are called the independent variables, inputs,
or predictors.
This is similar to the line equation y = m·x + c: the model 𝑓 maps each input 𝑥ᵢ to a predicted response 𝑓(𝑥ᵢ) ≈ 𝑦ᵢ.
Ordinary Least Squares Method (OLS)
• The estimated or predicted response, 𝑓(𝐱ᵢ), for each observation 𝑖
= 1, …, 𝑛, should be as close as possible to the corresponding actual
response 𝑦ᵢ.
• The differences 𝑦ᵢ - 𝑓(𝐱ᵢ) for all observations 𝑖 = 1, …, 𝑛, are called
the residuals.
• Regression is about determining the best predicted weights, that
is the weights corresponding to the smallest residuals.
• To get the best weights, you usually minimize the sum of squared
residuals (SSR) for all observations 𝑖 = 1, …, 𝑛: SSR = Σᵢ(𝑦ᵢ - 𝑓(𝐱ᵢ))².
• This approach is called the method of ordinary least squares.
Minimize SSR = Σᵢ(𝑦ᵢ - 𝑓(𝐱ᵢ))²
for all observations 𝑖 = 1, …, 𝑛
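The OLS idea above can be sketched with NumPy's least-squares solver, which minimizes exactly the SSR defined here. The toy data (a noiseless line y = 2x + 1) is an illustrative assumption, not from the slides:

```python
import numpy as np

# Toy data generated from y = 2x + 1, so OLS should recover m = 2, c = 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

# Design matrix with a column of ones for the intercept term
X = np.column_stack([x, np.ones_like(x)])

# lstsq minimizes the sum of squared residuals SSR = sum((y - X @ w)**2)
w, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
m, c = w

residuals = y - X @ w
print(m, c)                    # estimated slope and intercept
print(np.sum(residuals ** 2))  # SSR, ~0 for noiseless data
```

For noiseless data the residuals are essentially zero; with real, noisy data the solver returns the weights with the smallest achievable SSR.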
Evaluate the performance of Linear Regression
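Common ways to evaluate a fitted regression line are the mean squared error (MSE) and the coefficient of determination R². A minimal NumPy sketch (the data points and fitted weights below are illustrative assumptions):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])  # roughly y = 2x + 1
m, c = 2.0, 1.0                           # weights of a fitted line

y_hat = m * x + c
mse = np.mean((y - y_hat) ** 2)           # mean squared error
ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)    # total sum of squares
r2 = 1.0 - ss_res / ss_tot                # coefficient of determination

print(mse)
print(r2)  # close to 1 for a good fit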
When should we use Regression?
• Typically, you need regression to answer whether and how some
phenomenon influences another, or how several variables are
related.
• For example, you can use it to determine whether and to what extent
experience or gender impacts salaries.
• Regression is also useful when you want to forecast a
response using a new set of predictors.
Linear Regression
• Linear regression may be defined as a statistical model that
analyzes the linear relationship between a dependent variable and a
given set of independent variables.
• A linear relationship between variables means that when the value of
one or more independent variables changes (increases or
decreases), the value of the dependent variable changes
accordingly (increases or decreases).
Source: Linear Regression using Gradient Descent | by Adarsh Menon | Towards Data Science
Optimization in Linear Regression
• Optimization in machine learning refers to the process of fine-tuning
a model to achieve the best possible performance.
• These techniques vary in complexity and applicability depending on
the nature of the problem, dataset size, computational resources, and
desired model performance. Choosing the right optimization
technique often involves experimentation and tuning to achieve
optimal results in machine learning tasks.
• Commonly used optimization techniques include gradient descent and its variants, discussed next.
Gradient Descent Optimization
• Gradient Descent is an optimization algorithm that finds a local
minimum of a differentiable function. It is a minimization algorithm:
it minimizes a given function.
• It works by iteratively moving in the direction of the steepest
descent (the negative gradient). For linear regression, we use
gradient descent to minimize the loss function, i.e. the MSE.
Loss Function:
• The loss is the error in our predicted values of m and c. Our goal is to
minimize this error to obtain the most accurate values of m and c. We
will use the Mean Squared Error function to calculate the loss.
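The MSE loss for a line ŷ = m·x + c can be written as a short function; the sample points below are an illustrative assumption:

```python
import numpy as np

def mse_loss(m, c, x, y):
    """Mean Squared Error of the line y_hat = m*x + c:
    MSE = (1/n) * sum((y_i - y_hat_i)**2)."""
    y_hat = m * x + c
    return np.mean((y - y_hat) ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])    # exactly y = 2x + 1

print(mse_loss(2.0, 1.0, x, y))  # perfect fit -> 0.0
print(mse_loss(0.0, 0.0, x, y))  # poor fit -> large error
```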
• Imagine a valley and a person with no sense of direction who wants to
get to the bottom of the valley.
• He goes down the slope, taking large steps when the slope is steep
and small steps when the slope is less steep.
• He decides his next position based on his current position, and stops
when he reaches the bottom of the valley, which was his goal.
• Here, the minimum is at the origin (0, 0). The slope at any point is tan θ.
• The slope on the right side is positive, since 0 < θ < 90 and tan θ is
positive there.
• The slope on the left side is negative, since 90 < θ < 180 and tan θ is
negative there.
One important observation in the graph is that the slope changes
its sign at the minimum. As we move closer to the minimum, the
slope reduces.
• Now going back to our analogy, m can be considered the current
position of the person.
• D (the derivative of the loss) is equivalent to the steepness of the
slope, and L (the learning rate) is the speed with which he moves.
• The new value of m, computed with the update rule m ← m − L × D,
will be his next position, and L × D is the size of the step he takes.
• When the slope is steeper (D is larger) he takes longer steps, and
when it is less steep (D is smaller) he takes smaller steps.
• Finally he arrives at the bottom of the valley, which corresponds to
our minimum loss (here, loss = 0).
Step-by-step Gradient Descent
General Approach of Gradient Descent Optimization
Gradient Descent in Linear Regression
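The step-by-step procedure for linear regression can be sketched as follows. The update rule applies m ← m − L·D_m and c ← c − L·D_c, where D_m and D_c are the partial derivatives of the MSE; the data, learning rate L = 0.01, and iteration count are illustrative assumptions:

```python
import numpy as np

# Illustrative noiseless data from y = 2x + 1, so the loop
# should converge toward m ≈ 2, c ≈ 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

m, c = 0.0, 0.0  # initial guess
L = 0.01         # learning rate (step-size factor)
n = len(x)

for _ in range(5000):
    y_hat = m * x + c
    # Partial derivatives of MSE = (1/n) * sum((y - y_hat)**2)
    D_m = (-2.0 / n) * np.sum(x * (y - y_hat))
    D_c = (-2.0 / n) * np.sum(y - y_hat)
    # Step against the gradient; L * D is the step size
    m -= L * D_m
    c -= L * D_c

print(round(m, 3), round(c, 3))
```

Each iteration moves (m, c) a little in the direction that decreases the loss, exactly like the person stepping down the valley.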
Learning Rate (denoted by ‘r’ or ‘alpha’)
• The learning rate is a hyperparameter (tuning parameter) that
determines the step size at each iteration while moving toward the
minimum of the function.
• For example, if r = 0.1 at the initial step, it can be reduced to
r = 0.01 at a later step.
• Likewise, it can be decayed exponentially as we iterate further; this
is used especially in deep learning.
• If we keep r constant and too large, we can end up with an oscillation
problem around the minimum. So we often reduce the value of r with
each iteration, i.e. decrease r as the iteration step increases.
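The exponential decay mentioned above can be sketched as a simple schedule; the initial rate 0.1 matches the slide's example, while the decay factor 0.99 is an illustrative assumption:

```python
# Exponential learning-rate decay: r(step) = r0 * decay**step
r0 = 0.1      # initial learning rate (from the slide's example)
decay = 0.99  # assumed decay factor per iteration

def lr(step, r0=r0, decay=decay):
    """Learning rate at a given iteration step."""
    return r0 * decay ** step

print(lr(0))    # 0.1 at the first step
print(lr(100))  # much smaller after 100 iterations
```

Shrinking r like this damps the oscillation problem: early steps are large for fast progress, and later steps are small enough to settle into the minimum.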
