
Logistic Regression
Dr. Rupak Chakraborty
Techno India University, Kolkata
Introduction

 Used to predict binary outcomes for a given set of independent variables.
 One of the algorithms used for classification, as the target variable contains categorical values.
 The name may be a little confusing because it has 'regression' in it, but it is actually used for performing classification, as the output is discrete instead of a continuous numerical value.
Explanation
Logistic Regression is a type of statistical model that is used to predict the probability
of a certain event happening. It works by taking input variables and transforming
them into a probability value between 0 and 1, where 0 represents a low probability
and 1 represents a high probability.
For example, imagine you want to predict whether someone will buy a product based
on their age and income. Logistic Regression would take these input variables and use
them to calculate the probability of the person buying the product.
It's called "logistic" because the transformation of the input variables is done using a
mathematical function called the logistic function, which creates an S-shaped curve.
Overall, Logistic Regression is a useful tool for making predictions and understanding
the relationship between variables in a dataset.
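As a minimal sketch of the age-and-income example above (the data values and the use of scikit-learn are illustrative assumptions, not from the slides):

```python
# Minimal sketch: predict a purchase from age and income with scikit-learn.
# The data below is made up purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row is [age, income]; label 1 = bought the product, 0 = did not.
X = np.array([[22, 25000], [35, 60000], [28, 32000],
              [45, 90000], [51, 40000], [30, 75000]])
y = np.array([0, 1, 0, 1, 0, 1])

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns [P(no purchase), P(purchase)] for each input row.
print(model.predict_proba([[40, 55000]]))
```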

Example
Imagine it's been several years since you serviced your car. One day you find yourself wondering whether your car will break down in the near future or not. This is like classification, as the answer will be either 'Yes' or 'No'.

As we can imagine, when the number of years since the service is on the lower side, like 1, 2, or 3 years, the chance of the car breaking down is very limited.

[Figure: probability of breakdown plotted against years since service.]

Here, the dependent variable's output is discrete.
Why not Linear Regression?
Take, for example, a dataset of employee ratings along with whether each employee got a promotion. If we plot the promotion outcome as Yes or No (coding No as 0 and Yes as 1) against employee rating, the graph will look like the one shown.

[Figure: probability of getting promotion (0 or 1) plotted against employee rating.]

In the graph, we can see that the output is either 0 or 1, with nothing in between, as the output is discrete in this case. Employee rating, on the other hand, is a continuous number, so there is no issue plotting it on the x-axis.
As you can see, a fitted straight line doesn't look right here. There would be a lot of error, and the RMSE would be very high. Also, a straight line's predictions can fall below 0 or rise above 1, while the actual output never can.

[Figure: a straight regression line fitted through the 0/1 promotion data against employee rating.]

Therefore, instead of using linear regression, we need to come up with something different. This is where the logistic model comes into the picture; the sketch below shows the line misbehaving.
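A small sketch of the problem (the ratings and labels below are made up for illustration):

```python
# Sketch (made-up 0/1 data): a straight line fitted to binary labels
# can predict values outside [0, 1]; logistic regression cannot.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rating = np.array([[1], [2], [3], [4], [6], [8], [9], [10]])
promoted = np.array([0, 0, 0, 0, 1, 1, 1, 1])

lin = LinearRegression().fit(rating, promoted)
log = LogisticRegression().fit(rating, promoted)

x = np.array([[0], [11]])          # ratings at the extremes
print(lin.predict(x))              # line gives values < 0 and > 1
print(log.predict_proba(x)[:, 1])  # sigmoid stays within (0, 1)
```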
Odds of Success
To understand Logistic Regression, let’s talk about the odds of success.

Odds(θ) = (Probability of the event happening) / (Probability of the event not happening)

or, θ = p / (1 − p)   (p = probability of getting the promotion; 1 − p = probability of not getting it)

 The values of probability range from 0 to 1.
 The value of the odds ranges from 0 to ∞:
If p = 0, θ = 0 / (1 − 0) = 0/1 = 0
If p = 1, θ = 1 / (1 − 1) = 1/0 → ∞
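Following the formula above, a tiny sketch of how the odds grow as p rises (the probability values are arbitrary):

```python
# Sketch: odds of success for a few probability values.
def odds(p: float) -> float:
    """Odds = p / (1 - p); grows without bound as p approaches 1."""
    return p / (1 - p)

for p in (0.0, 0.2, 0.5, 0.8, 0.99):
    print(f"p = {p:.2f} -> odds = {odds(p):.2f}")
# p = 0.50 gives odds of 1 (even odds); p near 1 gives very large odds.
```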
Predicting Odds of Success
ln(p(x) / (1 − p(x))) = β0 + β1x   (β0 = constant; the log of the odds is linear in x)

Exponentiating both sides,

e^(ln(p(x) / (1 − p(x)))) = e^(β0 + β1x)

or, p(x) / (1 − p(x)) = e^(β0 + β1x)

Let Y = e^(β0 + β1x).

Then, p(x) / (1 − p(x)) = Y
or, p(x) = Y(1 − p(x))

or, p(x) = Y − Y·p(x)

or, p(x) + Y·p(x) = Y

or, p(x)·(1 + Y) = Y

or, p(x) = Y / (1 + Y)
Substituting Y = e^(β0 + β1x) back in gives the sigmoid:

p(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x))   [Sigmoid]

Dividing the numerator and denominator by e^(β0 + β1x), the equation of the sigmoid function can also be written as

p(x) = 1 / (1 + e^−(β0 + β1x))
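A quick numerical check that the two forms above are the same curve (the β0 and β1 values are arbitrary assumptions):

```python
# Sketch: the two algebraic forms of the sigmoid are the same function.
import numpy as np

b0, b1 = -2.0, 0.8  # illustrative coefficients, not from the slides

def p_ratio(x):
    """p(x) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))"""
    z = np.exp(b0 + b1 * x)
    return z / (1 + z)

def p_neg(x):
    """p(x) = 1 / (1 + e^-(b0 + b1*x))"""
    return 1 / (1 + np.exp(-(b0 + b1 * x)))

x = np.linspace(-10, 10, 5)
print(np.allclose(p_ratio(x), p_neg(x)))  # True: identical curves
```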
Sigmoid function curve

[Figure: the S-shaped sigmoid curve; outputs approach 0 for large negative inputs and 1 for large positive inputs.]
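To reproduce the curve, a short matplotlib sketch (β0 = 0 and β1 = 1 chosen arbitrarily):

```python
# Sketch: plot the S-shaped sigmoid curve with matplotlib.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 200)
p = 1 / (1 + np.exp(-x))  # sigmoid with beta0 = 0, beta1 = 1

plt.plot(x, p)
plt.axhline(0.5, linestyle="--", linewidth=0.8)  # 0.5 decision threshold
plt.xlabel("x")
plt.ylabel("p(x)")
plt.title("Sigmoid function")
plt.show()
```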
Compare Linear Regression and Logistic Regression

Linear Regression:
 Used to solve regression problems.
 The response variable is continuous in nature.
 It helps estimate the dependent variable when there is a change in the independent variable.
 Its graph is a straight line.

Logistic Regression:
 Used to solve classification problems.
 The response variable is categorical in nature.
 It helps calculate the probability of a particular event taking place.
 Its graph is an S-curve (S = sigmoid).
Example: Weather Prediction
 Linear Regression: predicting the temperature of the coming week. The output is a continuous number.
 Logistic Regression: predicting whether it will rain tomorrow or not. The output is a discrete value; the predictions will be either 'Yes' or 'No'.
Compare Logistic Regression and Classification

Logistic Regression:
Logistic regression is a statistical modeling technique used to analyze and model the relationship between a dependent variable and one or more independent variables. The dependent variable is categorical (i.e., it takes on a limited number of values), but the model's output, a probability, is continuous in nature. The goal of logistic regression is to predict the probability of an event occurring (i.e., the dependent variable taking a certain value) based on the values of the independent variables.

Classification:
Classification, on the other hand, is a machine learning task that involves assigning an input to one of several predefined categories. It can be thought of as a kind of prediction problem, where the goal is to predict the class or category of a given input.
Applications of Logistic Regression

1. Fraud Detection: here, the binary outcome variable will be either 'Detected' or 'Not Detected'.
2. Disease Diagnosis: here, the outcome will be either 'Positive' or 'Negative'.
3. Emergency Detection: here, the binary outcome variable will be either 'Emergency' or 'Not Emergency'.
4. Spam Filter: here, the outcome will be either 'Spam' or 'Not Spam'.
Logistic Regression Assumptions
 Binary Outcome:
The dependent variable, also known as the outcome variable or response
variable, is binary in nature.
This means that it takes on one of two possible values, typically coded as 0
and 1, or as "success" and "failure", "yes" and "no", "true" and "false", or some
other binary coding.
The logistic regression model is designed to estimate the probability of the
"success" outcome as a function of one or more independent variables, also
known as predictors or covariates.
The logistic function, which transforms a linear combination of the
predictors into a probability between 0 and 1, is used to model the
relationship between the predictors and the outcome.

 No Multicollinearity:
The assumption of no or low multicollinearity among the independent variables is
important in logistic regression. Multicollinearity refers to a situation where two or
more independent variables are highly correlated with each other, which can lead to
problems in the estimation of the model parameters and in the interpretation of the
results.
Multicollinearity can cause unstable and imprecise estimates of the logistic
regression parameters, and may make it difficult to identify which independent
variable(s) are driving the observed effects on the outcome variable. One way to
check for multicollinearity is to calculate the correlation matrix between the
independent variables and look for high correlations (i.e., correlations greater than
0.7 or 0.8).

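One simple way to run the correlation check described above, as a sketch (the column names and values are made up for illustration):

```python
# Sketch: checking for multicollinearity with a correlation matrix.
import pandas as pd

df = pd.DataFrame({
    "age":        [22, 35, 28, 45, 51, 30],
    "income":     [25000, 60000, 32000, 90000, 40000, 75000],
    "experience": [1, 12, 5, 22, 28, 8],  # likely correlated with age
})

corr = df.corr()
print(corr)

# Flag pairs with |correlation| above ~0.7, per the rule of thumb above
# (the diagonal of exact 1.0 self-correlations is excluded).
high = corr.abs().gt(0.7) & (corr.abs() < 1.0)
print(high)
```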
 Large Sample Size:
Sample size is an important consideration in logistic regression. A relatively large
sample size is typically required to ensure stable estimates and adequate
statistical power to detect meaningful effects.
The sample size requirements for logistic regression depend on several factors,
such as the number and complexity of the independent variables, the prevalence
of the outcome in the population, and the desired level of statistical power. As a
general rule of thumb, a sample size of at least 10-15 observations per
independent variable is often recommended.
If the sample size is too small, the logistic regression model may suffer from
issues such as overfitting, where the model fits the noise in the data instead of
the underlying signal, and underpowered statistical tests, where important
effects may be missed due to insufficient sample size.

Confusion Matrix
 A confusion matrix is a table used to evaluate the performance of a machine learning
algorithm for classification tasks. It is a square matrix that compares the actual and
predicted values of a classifier.
 Let's consider an example of a binary classification problem where we have a dataset
of 100 patients with diabetes, and we want to build a model that can predict whether
a patient has diabetes or not based on their medical data. The model output will be
either "Positive" or "Negative".
 By examining the values in the confusion matrix, we can calculate various
performance metrics, such as accuracy, precision, recall, and F1-score, which can help
us evaluate the model's performance. The confusion matrix provides a clear and
concise way of visualizing the model's performance in terms of its ability to correctly
classify positive and negative cases.

Suppose the model has made predictions on the test set and we have the following results:

                    Predicted Positive    Predicted Negative
Actual Positive             60                    10
Actual Negative             15                    15

Here, we have a 2x2 matrix, where the rows represent the actual values and the columns represent the predicted values. The diagonal elements of the matrix represent the correctly classified cases, and the off-diagonal elements represent the incorrectly classified cases.

 The values in the confusion matrix are as follows:
 True Positives (TP): the number of cases that were correctly classified as positive (60 in this case).
 False Positives (FP): the number of cases that were incorrectly classified as positive (15 in this case).
 True Negatives (TN): the number of cases that were correctly classified as negative (15 in this case).
 False Negatives (FN): the number of cases that were incorrectly classified as negative (10 in this case).
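From these four counts, the metrics mentioned on the previous slide can be computed directly (a short sketch using the standard formulas):

```python
# Sketch: performance metrics from the confusion matrix above.
tp, fn = 60, 10  # actual positives: 60 predicted positive, 10 negative
fp, tn = 15, 15  # actual negatives: 15 predicted positive, 15 negative

total = tp + fp + tn + fn                           # 100 patients
accuracy  = (tp + tn) / total                       # 0.75
precision = tp / (tp + fp)                          # 0.80
recall    = tp / (tp + fn)                          # ~0.857
f1 = 2 * precision * recall / (precision + recall)  # ~0.828

print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```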
Thank you
