2. Linear Regression, Polynomial, Gradient Descent

The document provides an overview of regression in machine learning, detailing various types of regression techniques such as linear, polynomial, and logistic regression. It discusses concepts such as univariate and multivariate regression, performance measures, gradient descent, and the bias-variance trade-off. Additionally, it outlines methods to reduce bias and variance in models to improve their predictive accuracy.

Regression

Regression in machine learning consists of mathematical methods that allow data scientists to predict a continuous outcome (y) based on the value of one or more predictor variables (x).

Ex.: Predicting the salary of an employee on the basis of years of experience.
Types of Regression Techniques/Models

Some common regression models are:

• Linear Regression
• Polynomial Regression
• Ridge Regression
• Lasso Regression
• Logistic Regression
Univariate and Multivariate
• Predicting a single output value is a univariate regression problem.
• Predicting multiple output values is a multivariate regression problem.
Simple Linear Regression
Ex.

Size in feet² (X)    Price in $1000s (Y)
2104                 460
1416                 232
1534                 315
852                  178
Notation:
m = number of training examples
x's = input variables / features
y's = output variable / target variable
Simple Linear Model

The general equation for the linear regression model is written as:

ŷ = θ₀ + θ₁x₁ + θ₂x₂ + … + θₙxₙ

where ŷ is the predicted value, n is the number of features, xᵢ is the i-th feature value, and θⱼ is the j-th model parameter (θ₀ is the bias term).

Hypothesis function in Linear Regression

For simple (univariate) linear regression, the hypothesis is h_θ(x) = θ₀ + θ₁x.

Linear regression model prediction (vectorized form)

ŷ = h_θ(x) = θ · x = θᵀx, where θ is the model's parameter vector and x is the instance's feature vector (with x₀ = 1).
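As a quick illustration (not part of the original slides), the sketch below fits θ to the example table from earlier with NumPy's least-squares solver and then computes the vectorized prediction ŷ = θᵀx for every instance; all variable names are illustrative.

```python
import numpy as np

# Example data from the table above: house size -> price.
X = np.array([[2104.0], [1416.0], [1534.0], [852.0]])
y = np.array([460.0, 232.0, 315.0, 178.0])

# Prepend the bias feature x0 = 1 so theta_0 is learned with theta_1.
X_b = np.c_[np.ones((len(X), 1)), X]

# Solve the least-squares problem for theta (closed form, for brevity).
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Vectorized prediction: y_hat = X_b @ theta, i.e. h_theta(x) = theta^T x.
y_hat = X_b @ theta
print("theta:", theta)
print("predictions:", y_hat)
```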
Best Fit Line
• The best-fit line equation provides a straight line that represents the relationship between the dependent and independent variables.
• The primary objective in linear regression is to locate this best-fit line, which means the error between the predicted and actual values should be kept to a minimum.
Performance Measures
1. Root mean square error (RMSE):
RMSE = √( (1/m) Σᵢ (ŷ⁽ⁱ⁾ − y⁽ⁱ⁾)² )

2. Mean absolute error (MAE):
MAE = (1/m) Σᵢ |ŷ⁽ⁱ⁾ − y⁽ⁱ⁾|
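A minimal sketch of both measures in NumPy (the sample predictions are made up for illustration):

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean square error: square root of the mean squared residual.
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def mae(y_true, y_pred):
    # Mean absolute error: mean of the absolute residuals.
    return np.mean(np.abs(y_pred - y_true))

y_true = np.array([460.0, 232.0, 315.0, 178.0])
y_pred = np.array([450.0, 240.0, 300.0, 190.0])  # illustrative predictions
print("RMSE:", rmse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))
```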
Cost function
The cost function is the error, i.e., the difference between the predicted values and the actual values, aggregated over the training set. A common choice for linear regression is the mean squared error (MSE): J(θ) = (1/m) Σᵢ (θᵀx⁽ⁱ⁾ − y⁽ⁱ⁾)².
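A one-function sketch of this cost, assuming X_b already contains the bias column (illustrative, not the slides' code):

```python
import numpy as np

def mse_cost(theta, X_b, y):
    # J(theta) = (1/m) * sum over i of (theta^T x_i - y_i)^2
    m = len(y)
    errors = X_b @ theta - y
    return (errors @ errors) / m
```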
Gradient Descent
Gradient descent is a generic optimization algorithm capable of finding optimal solutions to a wide range of problems.

Idea: Start by filling θ with random values (this is called random initialization). Then improve it gradually, taking one baby step at a time, each step attempting to decrease the cost function (e.g., the MSE), until the algorithm converges to a minimum.
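A minimal batch implementation of this idea for linear regression with the MSE cost (a sketch with illustrative names, not the slides' code): θ is randomly initialized, then repeatedly nudged along the negative gradient.

```python
import numpy as np

def batch_gradient_descent(X_b, y, lr=0.1, n_iterations=1000):
    m = len(y)
    theta = np.random.randn(X_b.shape[1])  # random initialization
    for _ in range(n_iterations):
        # Gradient of the MSE cost over the whole training set:
        # grad = (2/m) * X^T (X theta - y)
        gradients = (2.0 / m) * X_b.T @ (X_b @ theta - y)
        theta -= lr * gradients  # one "baby step" downhill
    return theta

# Usage on synthetic data y = 4 + 3x + noise:
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X[:, 0] + np.random.randn(100)
X_b = np.c_[np.ones((100, 1)), X]
print(batch_gradient_descent(X_b, y))  # roughly [4, 3]
```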
Step Size and Learning Rate
• If the learning rate is too small, then the algorithm will have to go through many iterations to converge, which will take a long time.
• If the learning rate is too high, you might jump across the valley and end up on the other side, possibly even higher up than you were before.
Local and Global Minima

Not every cost function is a smooth convex bowl: gradient descent can get stuck in a local minimum instead of reaching the global minimum. The MSE cost function for linear regression is convex, so it has a single global minimum.
Types of Gradient Descent
• Batch Gradient Descent
• Stochastic Gradient Descent
• Mini-Batch Gradient Descent
Batch Gradient Descent
In Batch Gradient Descent, all the training data is taken into
consideration to take a single step.
Stochastic Gradient Descent

In Stochastic Gradient Descent (SGD), we consider just one example at a time to take a single step.

The algorithm is much faster because it has very little data to manipulate at every iteration.
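A sketch of the same loop in SGD form (illustrative): each update uses the gradient of a single, randomly chosen example.

```python
import numpy as np

def stochastic_gradient_descent(X_b, y, n_epochs=50, lr=0.01):
    m = len(y)
    theta = np.random.randn(X_b.shape[1])  # random initialization
    for _ in range(n_epochs):
        for _ in range(m):
            i = np.random.randint(m)       # pick one example at random
            xi, yi = X_b[i], y[i]
            # Gradient of the squared error of this single instance.
            gradients = 2.0 * xi * (xi @ theta - yi)
            theta -= lr * gradients
    return theta
```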
Mini-Batch Gradient Descent

Mini-batch GD computes the gradients on small random sets of instances called mini-batches. The main advantage of mini-batch GD over stochastic GD is that you can get a performance boost from hardware optimization of matrix operations, especially when using GPUs.
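A mini-batch sketch (illustrative): the per-batch update is a matrix product, which is exactly the kind of operation that vectorized hardware and GPUs accelerate.

```python
import numpy as np

def minibatch_gradient_descent(X_b, y, n_epochs=50, batch_size=16, lr=0.05):
    m = len(y)
    theta = np.random.randn(X_b.shape[1])  # random initialization
    for _ in range(n_epochs):
        idx = np.random.permutation(m)     # shuffle once per epoch
        for start in range(0, m, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X_b[batch], y[batch]
            # MSE gradient on the mini-batch only.
            gradients = (2.0 / len(batch)) * Xb.T @ (Xb @ theta - yb)
            theta -= lr * gradients
    return theta
```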
Polynomial Regression

• Add powers of each feature as new features, then train a linear model on this extended set of features. This technique is called polynomial regression.
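With scikit-learn this is a two-step pipeline, sketched below on synthetic quadratic data (the dataset and degree are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Quadratic data: y = 0.5 x^2 + x + 2 + noise.
np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + np.random.randn(100)

# Step 1: add powers of each feature (x -> x, x^2).
# Step 2: train a plain linear model on the extended feature set.
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X, y)
print(model.predict([[1.5]]))  # prediction for x = 1.5
```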
Overfitting and Underfitting

A high-degree polynomial regression model can severely overfit the training data, while a linear model underfits it.
Learning Curve

• Learning curves are plots of the model's training error and validation error as a function of the training iteration (or the training set size).
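A sketch of how such curves can be produced, here plotted against the training set size, a common variant (model, data, and names are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def plot_learning_curves(model, X, y):
    # Train on ever-larger subsets and record train/validation RMSE.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42)
    train_errors, val_errors = [], []
    for m in range(2, len(X_train) + 1):
        model.fit(X_train[:m], y_train[:m])
        train_errors.append(
            mean_squared_error(y_train[:m], model.predict(X_train[:m])))
        val_errors.append(
            mean_squared_error(y_val, model.predict(X_val)))
    plt.plot(np.sqrt(train_errors), "r-+", label="training")
    plt.plot(np.sqrt(val_errors), "b-", label="validation")
    plt.xlabel("training set size")
    plt.ylabel("RMSE")
    plt.legend()
    plt.show()
```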
Learning Curve

• Learning curves for an overfitting model.
• Note: One way to improve an
overfitting model is to feed it
more training data until the
validation error reaches the
training error.
Cross Validation

• Cross-validation is a technique used in machine learning to evaluate the performance of a model on unseen data. It involves dividing the available data into multiple folds or subsets, using one of these folds as a validation set, and training the model on the remaining folds.
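A minimal scikit-learn sketch of 5-fold cross-validation (synthetic data; the scoring choice is an assumption):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X[:, 0] + np.random.randn(100)

# Each of the 5 folds serves once as the validation set while the
# model is trained on the remaining 4 folds.
scores = cross_val_score(LinearRegression(), X, y,
                         scoring="neg_mean_squared_error", cv=5)
print("RMSE per fold:", np.sqrt(-scores))
```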
The bias/variance trade-off

The model's generalization error can be expressed as the sum of three very different errors:
• Bias
• Variance
• Irreducible error
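For a squared-error loss this decomposition is commonly written as follows, where σ² is the variance of the noise (the irreducible part):

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \big(\operatorname{Bias}[\hat{f}(x)]\big)^2
+ \operatorname{Var}[\hat{f}(x)]
+ \sigma^2
$$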
What is Bias?
• This part of the generalization error is due to wrong assumptions, such as assuming the data is linear when it is actually more complex.
• High bias gives a large error on both training and testing data. It is recommended that an algorithm should always be low-biased to avoid the problem of underfitting.
• With high bias, the predictions follow a straight-line form that does not fit the data in the data set accurately. Such fitting is known as underfitting of data.
• This happens when the hypothesis is too simple or linear in nature.

In such a problem, the hypothesis is a very simple function, e.g. h_θ(x) = θ₀ + θ₁x.
A model can be in either of two situations:
• Low bias – Low bias value implies fewer assumptions have been
made to build the target function. In this scenario, the model will
closely match the training dataset.
• High bias – High bias value implies more assumptions have been
made to build the target function. In this scenario, the model will not
match the dataset closely.
A high-bias model will be unable to capture the dataset trend. It has a
high error rate and is considered an underfitting model. This happens
because of a very simplified algorithm. For instance, a linear regression
model might be biased if the data has a non-linear relationship.
Ways To Reduce High Bias
Since we have discussed some disadvantages of having high bias, here
are some ways to reduce high bias in machine learning.
• Use a complex model: The extremely simplified model is the main
cause of high bias. It is incapable of capturing the data complexity. In
such scenarios, the model can be made more complex.
• Increase the training data size: Increasing the training data size can
help reduce bias. This is because the model is being provided with
more examples to learn from the dataset.
• Increase the features: Increasing the number of features will
increase the complexity of the model. This improves the ability of
the model to capture the underlying data patterns.
• Reduce regularisation of the model: L1 and L2 regularisation can
help prevent overfitting and improve the model’s generalisation
ability. Reducing the regularisation or removing it completely can
help improve the performance.
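To make the last point concrete, here is a small sketch of weakening L2 regularisation with scikit-learn's Ridge (the data and alpha values are illustrative; Lasso exposes the same alpha knob for L1):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=10, noise=10.0,
                       random_state=42)

# Strong L2 regularisation (large alpha) shrinks the weights and can
# introduce bias; a smaller alpha relaxes the constraint.
strong = Ridge(alpha=100.0).fit(X, y)
weak = Ridge(alpha=0.01).fit(X, y)
print("R^2 strong:", strong.score(X, y))
print("R^2 weak:  ", weak.score(X, y))

# Lasso (L1) is tuned the same way via its alpha parameter.
lasso = Lasso(alpha=0.1).fit(X, y)
```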
What is Variance?
• This part of the generalization error is due to the model's excessive sensitivity to small variations in the training data. A model with many degrees of freedom (such as a high-degree polynomial model) is likely to have high variance and thus overfit the training data.
• When a model is high on variance, it is said to overfit the data.
• Overfitting means fitting the training set very accurately via a complex curve and a high-order hypothesis, but this is not a good solution, as the error on unseen data is high.
• While training a model, variance should be kept low.
In such a problem, the hypothesis is a very complex function, e.g. a high-degree polynomial h_θ(x) = θ₀ + θ₁x + θ₂x² + … + θₙxⁿ, whose curve chases the individual training points.
Variance error is either low or high:
• Low variance: Low variance implies that the ML model is less sensitive to changes in the training data. The model will produce consistent estimates of the target function across different data subsets from the same distribution. Combined with high bias, this leads to underfitting, where the model cannot generalize well on either test or training data.
• High variance: High variance implies that the ML model is very sensitive to changes in the training data. When trained on different subsets of data from the same distribution, the model's estimate of the target function can change significantly. This scenario is known as overfitting: the model does well on the training data but not on any new data.
Ways To Reduce High Variance
Here are some ways high variance can be reduced:
• Simplifying the model: Decreasing the number of parameters of
neural network layers can help reduce the complexity of the model.
This, in turn, helps in reducing the variance of the model.
• Ensemble methods: Boosting, stacking and bagging are common
ensemble techniques that can help reduce the variance of an ML
model and improve the generalisation performance.
• Early stopping: This is a technique for preventing overfitting by stopping model training when the performance on a validation set stops improving.
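A sketch of early stopping with scikit-learn's SGDRegressor (the data and hyperparameters are illustrative): train one epoch at a time, snapshot the model whenever the validation error improves, and keep the best snapshot.

```python
import copy

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

np.random.seed(42)
X = 2 * np.random.rand(200, 1)
y = 4 + 3 * X[:, 0] + np.random.randn(200)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

# max_iter=1 + warm_start=True means each .fit() call runs one more epoch.
model = SGDRegressor(max_iter=1, tol=None, warm_start=True, penalty=None,
                     learning_rate="constant", eta0=0.005)
best_error, best_model = float("inf"), None
for epoch in range(500):
    model.fit(X_train, y_train)
    val_error = mean_squared_error(y_val, model.predict(X_val))
    if val_error < best_error:
        best_error, best_model = val_error, copy.deepcopy(model)
```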
Bias Variance Tradeoff
• If the algorithm is too simple (a hypothesis with a linear equation), then it may be in a high-bias, low-variance condition and thus be error-prone.
• If the algorithm is too complex (a hypothesis with a high-degree equation), then it may be in a high-variance, low-bias condition.
• In the latter condition, the model will not perform well on new entries. There is a balance between these two conditions, known as the bias-variance trade-off.
• This trade-off in complexity is why there is a trade-off between bias and variance: an algorithm cannot be more complex and less complex at the same time.
1. Low-Bias, Low-Variance: This combination is the ideal machine learning model. However, it is practically very difficult to achieve.
2. Low-Bias, High-Variance: This is a case of overfitting, where model predictions are inconsistent but accurate on average. The predicted values will be accurate on average but scattered.
3. High-Bias, Low-Variance: This is a case of underfitting, where predictions are consistent but inaccurate on average. The predicted values will be inaccurate but not scattered.
4. High-Bias, High-Variance: With high bias and high variance, predictions are inconsistent and also inaccurate on average.
