
UNIT III

REGRESSION

CO-3: Compare different types of classification models and their relevant applications.


CONTENTS:
 Regression: Introduction, Univariate Regression – Least-
Square Method, Model Representation, Cost Functions: MSE,
MAE, R-Square, Performance Evaluation, Optimization of
Simple Linear Regression with Gradient Descent - Example.
Estimating the values of the regression coefficients
 Multivariate Regression: Model Representation

 Introduction to Polynomial Regression: Generalization-

Overfitting Vs. Underfitting, Bias Vs. Variance.


INTRODUCTION
 Regression is a supervised learning technique
 It falls under supervised learning wherein the algorithm is trained with
both input features and output labels
 Linear regression establishes the linear relationship between two variables based on a
line of best fit. Linear regression is thus graphically depicted using a straight line with the
slope defining how the change in one variable impacts a change in the other. The y-intercept
of a linear regression relationship represents the value of one variable when the value of the
other is zero.
 "Regression shows a line or curve that passes through all the datapoints on target-
predictor graph in such a way that the vertical distance between the datapoints and the
regression line is minimum." The distance between datapoints and line tells whether a model
has captured a strong relationship or not.

 Some examples of regression can be as:


 Prediction of rain using temperature and other factors
 Determining Market trends
 Prediction of road accidents due to rash driving.
LINEAR MODEL:
 Linear regression is a linear approach for modelling the relationship
between a scalar response and one or more explanatory variables (also
known as dependent and independent variables).
 The case of one explanatory variable is called
simple linear regression; for more than one, the process is
called multiple linear regression.
 Regression models a target prediction value based on
independent variables.
 It is mostly used for finding out the relationship between
variables and forecasting.
 Different regression models differ based on – the kind of
relationship between dependent and independent variables
they are considering, and the number of independent
variables getting used.
LINEAR MODEL:
 In simple linear regression analysis, each observation has two variables.
 Multiple linear regression analysis consists of two or more independent variables.
LINEAR MODEL:

The univariate linear model predicts ŷ = θ1 + θ2.x, where:
x: input training data (univariate – one input variable/parameter)
y: labels to data (supervised learning)
θ1: intercept (constant)
θ2: coefficient of x
LINEAR MODEL:
 Positive relationship

 Negative relationship
LINEAR MODEL:
TERMINOLOGIES RELATED TO THE REGRESSION
ANALYSIS:
 Dependent Variable: The main factor in regression analysis which we want to predict or understand is called the dependent variable. It is also called the target variable.

 Independent Variable: The factors which affect the dependent variable, or which are used to predict its values, are called independent variables, also known as predictors.

 Outliers: An outlier is an observation with a very low or very high value in comparison to the other observed values. An outlier may distort the result, so it should be handled carefully.
 Outliers are abnormal values in a dataset that do not follow the regular distribution and have the potential to significantly distort any regression model.

 Multicollinearity: If the independent variables are highly correlated with each other, the condition is called multicollinearity. It should not be present in the dataset because it creates problems when ranking the most influential variables.
WHY DO WE USE REGRESSION ANALYSIS?
 Regression estimates the relationship between the target and the independent
variable.
 It is used to find the trends in data.
 It helps to predict real/continuous values.
 By performing the regression, we can confidently determine the most important
factor, the least important factor, and how each factor is affecting the other
factors.

 Types of Regression
 Linear Regression
 Logistic Regression
 Polynomial Regression
 Support Vector Regression
 Decision Tree Regression
 Random Forest Regression
 Ridge Regression
 Lasso Regression
LINEAR REGRESSION

 Linear regression is one of the easiest and most popular Machine Learning algorithms.
 It is a statistical method that is used for predictive analysis.
 Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc.
 The linear regression algorithm shows a linear relationship between a dependent (y) and one or more independent (x) variables, hence it is called linear regression.
 Since linear regression shows a linear relationship, it describes how the value of the dependent variable changes according to the value of the independent variable (a short fitting sketch follows below).
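As an illustration of the points above, the following is a minimal sketch, assuming scikit-learn and NumPy are installed; the experience/salary numbers are invented for demonstration and are not from these slides:

```python
# A minimal sketch, assuming scikit-learn and NumPy are installed.
# The experience/salary figures below are illustrative, not real data.
import numpy as np
from sklearn.linear_model import LinearRegression

experience = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])   # independent variable x
salary = np.array([30000, 35000, 41000, 44000, 50000])       # dependent variable y

model = LinearRegression()
model.fit(experience, salary)

print("intercept (a0):", model.intercept_)
print("slope (a1):", model.coef_[0])
print("predicted salary for 6 years:", model.predict([[6.0]])[0])
```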
LINEAR REGRESSION
LINEAR REGRESSION
 Mathematically, we can represent a linear regression as:
 y = a0 + a1.x + ε
 y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
 The values of the x and y variables form the training dataset used for the Linear Regression model representation.


TYPES OF LINEAR REGRESSION

 Linear regression can be further divided into two types of algorithm:
 Simple Linear Regression:
If a single independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression
algorithm is called Simple Linear Regression.
 Multiple Linear regression:
If more than one independent variable is used to predict the
value of a numerical dependent variable, then such a Linear
Regression algorithm is called Multiple Linear Regression.
FINDING THE BEST FIT LINE

 Finding the best fit line:


 When working with linear regression, our main goal is to find the best fit line, which means the error between the predicted values and the actual values should be minimized. The best fit line will have the least error.
 Different values of the weights, or line coefficients (a0, a1), give different regression lines, so we need to calculate the best values for a0 and a1 to find the best fit line; to calculate this we use a cost function (see the sketch below).
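To make this concrete, here is a minimal sketch, assuming NumPy; the data points and the two candidate coefficient pairs are invented for illustration and show how different (a0, a1) values give different costs:

```python
# A minimal sketch (illustrative numbers, not from the slides) showing how the
# cost depends on the chosen coefficients a0 (intercept) and a1 (slope).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

def mse(a0, a1):
    predictions = a0 + a1 * x
    return np.mean((y - predictions) ** 2)

print("cost for a0=0, a1=1:", mse(0.0, 1.0))   # a poor line, large error
print("cost for a0=1, a1=2:", mse(1.0, 2.0))   # close to the best fit, small error
```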
FINDING THE BEST FIT LINE
UNIVARIATE REGRESSION - LEAST SQUARE METHOD:

 Univariate linear regression focuses on determining the relationship between one independent (explanatory) variable and one dependent variable.
 Univariate data is the type of data in which the result depends only on one
variable.
UNIVARIATE REGRESSION:

 It is also called simple linear regression.

Features of best fit regression line


 Regression line results in minimum sum of errors.
 It does not need to go through all data points.
 It does not need the same number of data points above and below it.
MODEL REPRESENTATION:
COST FUNCTIONS:
 It is a mechanism used in supervised machine learning.
 The cost function returns the error between the predicted outcomes and the actual outcomes.
 In other words, it quantifies how costly the model's prediction errors are.
 A cost function is a measure of how wrong the model is in terms of its ability to estimate the relationship between x and y.
 Loss function: Used when we refer to the error for a single training
example.
 Cost function: Used to refer to an average of the loss functions over an
entire training dataset.
WHY ON EARTH DO WE NEED A COST FUNCTION? :
 Why on earth do we need a cost function? Consider a scenario where we
wish to classify data. Suppose we have the height & weight details of
some cats & dogs.
WHY ON EARTH DO WE NEED A COST FUNCTION? :
 Blue dots are cats & red dots are dogs. Following are some solutions
to the above classification problem.
 Essentially all three classifiers have very high accuracy but the third
solution is the best because it does not misclassify any point. The
reason why it classifies all the points perfectly is that the line is
almost exactly in between the two groups, and not closer to any one
of the groups. This is where the concept of cost function comes in.
Cost function helps us reach the optimal solution. The cost
function is the technique of evaluating “the performance of our
algorithm/model”.
REGRESSION COST FUNCTION :
 Regression models deal with predicting a continuous value
for example salary of an employee, price of a car, loan
prediction, etc. A cost function used in the regression
problem is called “Regression Cost Function”. They are
calculated on the distance-based error as follows:
 Error = y − y′
 Where,
 y – actual output
 y′ – predicted output

The most used regression cost functions are listed below:
 Mean Squared Error (MSE)
 Mean Absolute Error (MAE)
 Root Mean Squared Error (RMSE)
 R-Squared (R²) Error
MEAN SQUARED ERROR (MSE) :
 This improves the drawback we encountered in Mean Error above. Here a
square of the difference between the actual and predicted value is calculated
to avoid any possibility of negative error.
 It is measured as the average of the sum of squared differences between
predictions and actual observations.

 MSE = (sum of squared errors)/n


 It is also known as L2 loss.
 In MSE, since each error is squared, it helps to penalize even small
deviations in prediction when compared to MAE. But if our dataset has
outliers that contribute to larger prediction errors, then squaring this error
further will magnify the error many times more and also lead to higher MSE
error.
 Hence we can say that it is less robust to outliers.
MEAN ABSOLUTE ERROR (MAE):
 This cost function also addresses the shortcoming of mean
error differently. Here an absolute difference between the
actual and predicted value is calculated to avoid any
possibility of negative error.
 So in this cost function, MAE is measured as the average of
the sum of absolute differences between predictions and
actual observations.

 It is also known as L1 Loss.


 It is robust to outliers thus it will give better results even
when our dataset has noise or outliers.
Regression Metrics
EXAMPLE:
Suppose we have a regression model that predicts
house prices based on certain features. We collected
data on actual house price and the corresponding
predicted prices for a set of 5 houses:
Actual Prices(y): [200,300,400,500,600]
Predicted Prices(y^):[220,320,420,520,590]

Calculate MAE, MSE, and RMSE.
1. MEAN ABSOLUTE ERROR (MAE):

Calculate the absolute errors:


∣200−220∣=20
∣300−320∣=20
∣400−420∣=20
∣500−520∣=20
∣600−590∣=10

Sum of absolute errors:


20+20+20+20+10=90
Number of observations (n) is 5. Thus:
MAE=90/5 ​=18
2. MEAN SQUARED ERROR (MSE):
 Calculate the squared errors:
(200−220)² = 400
(300−320)² = 400
(400−420)² = 400
(500−520)² = 400
(600−590)² = 100
 Sum of squared errors: 400+400+400+400+100 = 1700
 MSE = 1700/5 = 340
 RMSE = √MSE = √340 ≈ 18.44

These metrics provide insight into the model's performance in terms of prediction accuracy and error magnitude.
Lower values of MAE and RMSE indicate better model performance in minimizing prediction error relative to the actual values (a short code sketch of these calculations follows below).
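The same numbers can be reproduced in a few lines; this is a minimal sketch, assuming NumPy, using the house-price figures from the example above:

```python
# A minimal sketch, assuming NumPy is installed; values are taken from the example above.
import numpy as np

actual = np.array([200, 300, 400, 500, 600])
predicted = np.array([220, 320, 420, 520, 590])

errors = actual - predicted
mae = np.mean(np.abs(errors))          # mean absolute error
mse = np.mean(errors ** 2)             # mean squared error
rmse = np.sqrt(mse)                    # root mean squared error

print(f"MAE  = {mae}")       # 18.0
print(f"MSE  = {mse}")       # 340.0
print(f"RMSE = {rmse:.2f}")  # 18.44
```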
GRADIENT DESCENT:
 Gradient Descent:
 Gradient descent is used to minimize the MSE by calculating the gradient of the cost function.
 A regression model uses gradient descent to update the coefficients of the line by reducing the cost function.
 This is done by starting from randomly selected coefficient values and then iteratively updating them to reach the minimum of the cost function, as sketched below.
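The following is a minimal sketch of this update loop for simple linear regression, assuming NumPy; the toy data, learning rate, and iteration count are illustrative choices, not values from these slides:

```python
# A minimal sketch of gradient descent for simple linear regression (y ≈ a0 + a1*x),
# assuming NumPy; the data and learning rate are illustrative, not from the slides.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])   # generated from y = 1 + 2x

a0, a1 = 0.0, 0.0          # initial guesses for the coefficients
learning_rate = 0.01
n = len(x)

for _ in range(5000):
    predictions = a0 + a1 * x
    error = predictions - y
    # Gradients of MSE = (1/n) * sum((a0 + a1*x - y)^2) with respect to a0 and a1
    grad_a0 = (2.0 / n) * np.sum(error)
    grad_a1 = (2.0 / n) * np.sum(error * x)
    a0 -= learning_rate * grad_a0
    a1 -= learning_rate * grad_a1

print(f"a0 ≈ {a0:.3f}, a1 ≈ {a1:.3f}")   # should approach a0 = 1, a1 = 2
```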

 Model Performance:
 The goodness of fit determines how well the regression line fits the set of observations. The process of finding the best model out of various models is called optimization. It can be achieved by the methods below:
MEAN SQUARED ERROR (MSE)
 Mean squared error (MSE) is the average of the sum of squared differences between the actual value and the predicted or estimated value. It is also termed mean squared deviation (MSD). Mathematically it is represented as:

 MSE = (1/n) Σᵢ (Yᵢ − Y′ᵢ)²

 The value of MSE is always positive or greater than zero. A value close to zero represents a better quality estimator/predictor (regression model). An MSE of zero (0) means that the predictor is a perfect predictor. Taking the square root of the MSE value gives the root mean squared error (RMSE). In the above equation, Y represents the actual value and Y′ the predicted value.
MEAN SQUARED ERROR (MSE)
MEAN ABSOLUTE ERROR (MAE)
 MAE is a very simple metric which calculates the absolute difference between actual and predicted values.
 To understand it better, suppose you have input data and output data and use linear regression, which draws a best-fit line.
 Now you have to find the MAE of your model, which is basically the mistake made by the model, known as the error. Find the difference between the actual value and the predicted value; that is the absolute error, but we have to find the mean absolute error over the complete dataset.
 So, sum all the absolute errors and divide them by the total number of observations; this is the MAE. We aim for a minimum MAE, because a lower MAE means the predictions are closer to the actual values.
MEAN ABSOLUTE ERROR (MAE)
R-SQUARED METHOD:
 R-squared is a statistical method that determines the
goodness of fit.
 It measures the strength of the relationship between the
dependent and independent variables on a scale of 0-100%.
 0% indicates that the model explains none of the variability
of the response data around its mean.
 100% indicates that the model explains all the variability of
the response data around its mean.
 A high value of R-square indicates a smaller difference between the predicted and actual values and hence represents a good model.
 It is also called the coefficient of determination, or the coefficient of multiple determination for multiple regression.
 It can be calculated from the formula below:

 R² = 1 − (SS_res / SS_tot) = 1 − Σᵢ(yᵢ − ŷᵢ)² / Σᵢ(yᵢ − ȳ)²

 where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean ȳ.
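A minimal sketch of this calculation, assuming NumPy and reusing the house-price numbers from the earlier example for illustration:

```python
# A minimal sketch of computing R-squared, assuming NumPy; reuses the
# house-price numbers from the earlier MAE/MSE example for illustration.
import numpy as np

actual = np.array([200, 300, 400, 500, 600])
predicted = np.array([220, 320, 420, 520, 590])

ss_res = np.sum((actual - predicted) ** 2)          # residual sum of squares
ss_tot = np.sum((actual - actual.mean()) ** 2)      # total sum of squares
r_squared = 1 - ss_res / ss_tot

print(f"R-squared = {r_squared:.4f}")   # close to 1, so the fit explains most of the variance
```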
LEAST-SQUARE METHOD
 The least-squares method is a form of mathematical
regression analysis used to determine the
line of best fit for a set of data, providing a visual
demonstration of the relationship between the data
points. Each point of data represents the
relationship between a known independent variable
and an unknown dependent variable.

 The best-fit slope and intercept are obtained from the means as:

 a1 = Σᵢ(xᵢ − x̄)(yᵢ − ȳ) / Σᵢ(xᵢ − x̄)²
 a0 = ȳ − a1.x̄

 Here x̄ is the mean of all the values in the input X and ȳ is the mean of all the values in the desired output Y. This is the least squares method.
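The following is a minimal sketch of these formulas, assuming NumPy; the x/y values are illustrative:

```python
# A minimal sketch of the least-squares formulas above, assuming NumPy;
# the x/y values are illustrative.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

x_mean, y_mean = x.mean(), y.mean()
a1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)  # slope
a0 = y_mean - a1 * x_mean                                             # intercept

print(f"best-fit line: y = {a0:.3f} + {a1:.3f} x")
```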
SIMPLE LINEAR REGRESSION
 Simple Linear Regression is a type of Regression algorithms
that models the relationship between a dependent variable
and a single independent variable. The relationship shown
by a Simple Linear Regression model is linear or a sloped
straight line, hence it is called Simple Linear Regression.
 The key point in Simple Linear Regression is that
the dependent variable must be a continuous/real
value. However, the independent variable can be
measured on continuous or categorical values.
 Simple Linear regression algorithm has mainly two
objectives:
 Model the relationship between the two
variables. Such as the relationship between Income and
expenditure, experience and Salary, etc.
 Forecasting new observations. Such as Weather
forecasting according to temperature, Revenue of a
company according to the investments in a year, etc.
SIMPLE LINEAR REGRESSION MODEL:
 The Simple Linear Regression model can be represented
using the below equation:
 y = a0 + a1.x + ε
 Where,
 a0= It is the intercept of the Regression line (can be
obtained putting x=0)
a1= It is the slope of the regression line, which tells
whether the line is increasing or decreasing.
ε = The error term. (For a good model it will be
negligible)
MULTIVARIATE REGRESSION
 Multivariate Regression is a supervised machine learning algorithm
involving multiple data variables for analysis. A Multivariate regression
is an extension of multiple regression with one dependent variable and
multiple independent variables. Based on the number of independent
variables, we try to predict the output.
 Multivariate regression tries to find out a formula that can explain how
factors in variables respond simultaneously to changes in others.

 The equation for a model with two input variables can be written as:
 y = β0 + β1.x1 + β2.x2

 The equation for a model with three input variables can be written as:
 y = β0 + β1.x1 + β2.x2 + β3.x3

 Below is the generalized equation for the multivariate regression


model-
 y = β0 + β1.x1 + β2.x2 +….. + βn.xn
 where n represents the number of independent variables, β0…βn represent the coefficients, and x1…xn are the independent variables.
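As a concrete sketch of the two-variable case above, the following fits y = β0 + β1.x1 + β2.x2 by ordinary least squares, assuming NumPy; the data values are illustrative:

```python
# A minimal sketch of fitting y = b0 + b1*x1 + b2*x2 with ordinary least squares,
# assuming NumPy; the data below is illustrative (generated from y = 1 + 2*x1 + 1*x2).
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])            # two independent variables x1, x2
y = np.array([5.0, 6.0, 11.0, 12.0, 16.0])

# Add a column of ones so the first coefficient acts as the intercept b0
X_design = np.column_stack([np.ones(len(X)), X])
coefficients, *_ = np.linalg.lstsq(X_design, y, rcond=None)

b0, b1, b2 = coefficients
print(f"y ≈ {b0:.2f} + {b1:.2f}·x1 + {b2:.2f}·x2")
```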
COST FUNCTION
 In simple words, it is a function that assigns a cost to instances where the model deviates from the observed data. In this case, our cost is the sum of squared errors. The cost function for multiple linear regression is given by:

 J(β) = (1 / 2n) Σᵢ (ŷᵢ − yᵢ)²

 We can read this equation as the summation of the squared differences between our predicted values and the actual values, divided by twice the length of the dataset. A smaller mean squared error implies better performance. Generally, a cost function is used along with the gradient descent algorithm to find the best parameters.
 Cost functions are used to estimate how badly models are performing. Put
simply, a cost function is a measure of how wrong the model is in terms of its
ability to estimate the relationship between X and y. This is typically
expressed as a difference or distance between the predicted value and the
actual value.
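A minimal sketch of evaluating this cost for a given set of coefficients, assuming NumPy; the design matrix X (with a leading column of ones for the intercept), y, and the coefficient vector are illustrative:

```python
# A minimal sketch of the cost J(beta) = (1/2n) * sum((X·beta - y)^2) described above,
# assuming NumPy; X, y, and beta are illustrative values.
import numpy as np

def cost(beta, X, y):
    n = len(y)
    predictions = X @ beta            # X already includes a column of ones for the intercept
    return np.sum((predictions - y) ** 2) / (2 * n)

X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0]])
y = np.array([5.0, 6.0, 11.0])
beta = np.array([1.0, 2.0, 1.0])      # the exact coefficients, so the cost is 0

print(cost(beta, X, y))               # 0.0
print(cost(np.zeros(3), X, y))        # a worse parameter choice gives a larger cost
```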
POLYNOMIAL REGRESSION
 Polynomial Regression is a regression algorithm that models the
relationship between a dependent(y) and independent variable(x) as nth
degree polynomial. The Polynomial Regression equation is given below:
y = b0 + b1x1 + b2x1² + b3x1³ + … + bnx1ⁿ
 It is also called the special case of Multiple Linear Regression in ML.
Because we add some polynomial terms to the Multiple Linear
regression equation to convert it into Polynomial Regression.
 It is a linear model with some modification in order to increase the
accuracy.
 The dataset used in Polynomial regression for training is of non-linear
nature.
 It makes use of a linear regression model to fit the complicated and non-
linear functions and datasets.
 Hence, "In Polynomial regression, the original features are converted
into Polynomial features of required degree (2,3,..,n) and then modeled
using a linear model."
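A minimal sketch of exactly that idea, assuming NumPy and scikit-learn; the quadratic toy data and the degree choice are illustrative:

```python
# A minimal sketch of the idea quoted above: expand the original feature into
# polynomial features, then fit an ordinary linear model on them.
# Assumes NumPy and scikit-learn; the toy data is generated from y = x^2 + 1 (non-linear).
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.0, 5.0, 10.0, 17.0, 26.0])

poly = PolynomialFeatures(degree=2)               # adds x^0, x^1, x^2 columns
x_poly = poly.fit_transform(x)

model = LinearRegression().fit(x_poly, y)
print(model.predict(poly.transform([[6.0]])))     # ≈ 37, matching 6^2 + 1
```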
NEED FOR POLYNOMIAL REGRESSION:
 The need of Polynomial Regression in ML can be
understood in the below points:
 If we apply a linear model to a linear dataset, it provides a good result, as we have seen in Simple Linear Regression; but if we apply the same model, without any modification, to a non-linear dataset, it produces poor results: the loss function will increase, the error rate will be high, and the accuracy will decrease.
 So for such cases, where the data points are arranged in a non-linear fashion, we need the Polynomial Regression model. We can understand this in a better way using a comparison diagram of a linear dataset and a non-linear dataset.
NEED FOR POLYNOMIAL REGRESSION:
 In the comparison, we take a dataset which is arranged non-linearly. If we try to cover it with a linear model, we can clearly see that it hardly covers any data point. On the other hand, a curve is suitable to cover most of the data points; that curve corresponds to the Polynomial model.
 Hence, if the datasets are arranged in a non-linear fashion, then

we should use the Polynomial Regression model instead of


Simple Linear Regression.
NEED FOR POLYNOMIAL REGRESSION:

 The three equations being compared are:
 Simple Linear Regression: y = b0 + b1x
 Multiple Linear Regression: y = b0 + b1x1 + b2x2 + … + bnxn
 Polynomial Regression: y = b0 + b1x + b2x² + … + bnxⁿ
 When we compare these three equations, we can clearly see that all three are polynomial equations but differ in the degree of the variables. The Simple and Multiple Linear equations are polynomial equations of degree one, and the Polynomial regression equation is a linear equation of degree n. So if we add a degree to our linear equations, they are converted into Polynomial Linear equations.
GENERALIZATION
 The main goal of each machine learning model
is to generalize well.
 Here, generalization defines the ability of an ML model to provide a suitable output by adapting to a given set of unknown inputs.
 It means that, after being trained on the dataset, the model can produce reliable and accurate output.
 Hence, underfitting and overfitting are the two terms that need to be checked to determine whether the model is generalizing well or not.
BIAS AND VARIANCE
 Bias: Bias is a prediction error that is introduced in
the model due to oversimplifying the machine
learning algorithms. Or it is the difference between
the predicted values and the actual values.

 Variance: If the machine learning model performs


well with the training dataset, but does not
perform well with the test dataset, then variance
occurs.
BIAS VS. VARIANCE
BIAS-VARIANCE TRADEOFF
 The two are complementary to each other. In other
words, if the bias of a model is decreased, the
variance of the model automatically increases. The
vice-versa is also true, that is if the variance of a
model decreases, bias starts to increase.
 Hence, it can be concluded that it is nearly

impossible to have a model with no bias or no


variance since decreasing one increases the other.
This phenomenon is known as the Bias-Variance Tradeoff.
BIAS-VARIANCE TRADEOFF
 Another way of looking at the Bias-Variance Tradeoff graphically is to
plot the graphical representation for error, bias, and variance versus
the complexity of the model. In the graph shown below, the green
dotted line represents variance, the blue dotted line represents bias
and the red solid line represents the error in the prediction of the
concerned model.
 Since bias is high for a simpler model and decreases with an increase
in model complexity, the line representing bias exponentially
decreases as the model complexity increases.
 Similarly, Variance is high for a more complex model and is low for
simpler models. Hence, the line representing variance increases
exponentially as the model complexity increases.
 Finally, it can be seen that on either side, the generalization
error is quite high. Both high bias and high variance lead to a
higher error rate.
 The most optimal complexity of the model is right in the middle,
where the bias and variance intersect. This part of the graph is shown
to produce the least error and is preferred.
 Also, as discussed earlier, the model underfits for high-bias
situations and overfits for high-variance situations.
OVERFITTING
 Overfitting occurs when our machine learning model tries to cover all the data points, or more than the required data points, present in the given dataset. Because of this, the model starts capturing the noise and inaccurate values present in the dataset, and all these factors reduce the efficiency and accuracy of the model. The overfitted model has low bias and high variance.
 The chances of overfitting increase the more training we provide to our model: the more we train the model, the higher the chance of obtaining an overfitted model.
 Overfitting is the main problem that occurs in
supervised learning
UNDERFITTING
 Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data. To avoid overfitting, the feeding of training data can be stopped at an early stage, due to which the model may not learn enough from the training data. As a result, it may fail to find the best fit for the dominant trend in the data.
 In the case of underfitting, the model is not able to

learn enough from the training data, and hence it


reduces the accuracy and produces unreliable
predictions.
 An underfitted model has high bias and low

variance.
ASSIGNMENT NO:03
