Unit II Deep Learning

Unit 2 focuses on logistic regression, covering the distinction between regression and classification in machine learning, where regression predicts continuous values and classification predicts discrete values. It explains maximum likelihood estimation as a crucial concept for classification algorithms like logistic regression, which uses the sigmoid function to map inputs to probabilities. The unit also details types of logistic regression, assumptions, key terminologies, and how the logistic regression model operates.

Uploaded by

Ramprakash Reddy

UNIT- 2 Deep Learning

UNIT 2 LOGISTIC REGRESSION 9 hours


Basic concepts of regression and classification problems, linear models addressing
regression and classification, maximum likelihood, logistic regression classifiers.

Regression vs. Classification in Machine Learning
Regression and Classification algorithms are both Supervised Learning algorithms. Both
are used for prediction in machine learning and both work with labeled datasets. They
differ in the kind of machine learning problem each one is used to solve.

The main difference between Regression and Classification algorithms is that Regression
algorithms are used to predict continuous values such as price, salary, age, etc., while
Classification algorithms are used to predict or classify discrete values such as Male
or Female, True or False, Spam or Not Spam, etc.

Consider the below diagram:

T.Swetha,MITS CSE-Data Science



Classification:
Classification is a process of finding a function which helps in dividing the dataset into
classes based on different parameters. In Classification, a computer program is trained on
the training dataset and based on that training, it categorizes the data into different
classes.

The task of the classification algorithm is to find the mapping function to map the input(x)
to the discrete output(y).

Example: Email spam detection is a good example of a classification problem. The model
is trained on millions of emails described by different parameters, and whenever it
receives a new email, it identifies whether the email is spam or not. If the email is
spam, it is moved to the Spam folder.

Types of ML Classification Algorithms:

Classification Algorithms can be further divided into the following types:

o Logistic Regression
o K-Nearest Neighbours
o Support Vector Machines
o Kernel SVM
o Naïve Bayes
o Decision Tree Classification
o Random Forest Classification

Regression:
Regression is a process of finding the relationship between dependent and independent
variables. It helps in predicting continuous variables, such as market trends, house
prices, etc.

The task of the Regression algorithm is to find the mapping function to map the input
variable(x) to the continuous output variable(y).

Example: Suppose we want to do weather forecasting. For this, we use a Regression
algorithm. The model is trained on past data, and once the training is completed, it can
predict a continuous quantity, such as the temperature, for future days.


Types of Regression Algorithm:

o Simple Linear Regression
o Multiple Linear Regression
o Polynomial Regression
o Support Vector Regression
o Decision Tree Regression
o Random Forest Regression

Difference between Regression and Classification


Regression Algorithm | Classification Algorithm
In Regression, the output variable must be continuous (a real value). | In Classification, the output variable must be a discrete value.
The task of the regression algorithm is to map the input value (x) to the continuous output variable (y). | The task of the classification algorithm is to map the input value (x) to the discrete output variable (y).
Regression algorithms are used with continuous data. | Classification algorithms are used with discrete data.
In Regression, we try to find the best-fit line, which can predict the output more accurately. | In Classification, we try to find the decision boundary, which can divide the dataset into different classes.
Regression algorithms can be used to solve regression problems such as weather prediction, house price prediction, etc. | Classification algorithms can be used to solve classification problems such as identification of spam emails, speech recognition, identification of cancer cells, etc.
Regression algorithms can be further divided into Linear and Non-linear Regression. | Classification algorithms can be divided into Binary Classifier and Multi-class Classifier.
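
The contrast between a continuous output and a discrete output can be sketched in a few lines of Python. This is only an illustration: the weight, bias, and boundary values are made up, and the function names are hypothetical.

```python
# Hypothetical parameters, chosen only for illustration.
WEIGHT = 2.0
BIAS = 1.0

def predict_regression(x):
    """Regression: return a continuous value on the fitted line."""
    return WEIGHT * x + BIAS

def predict_class(x, boundary=5.0):
    """Classification: return a discrete label by comparing the
    score with a decision boundary (assumed to be 5.0 here)."""
    return 1 if predict_regression(x) > boundary else 0

inputs = [0.5, 1.5, 2.5, 3.5]
continuous_outputs = [predict_regression(x) for x in inputs]
discrete_outputs = [predict_class(x) for x in inputs]
print(continuous_outputs)  # real-valued predictions
print(discrete_outputs)    # labels drawn from {0, 1}
```

The same input produces a real number in the first case and one of a fixed set of labels in the second, which is the distinction the comparison above draws.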


Maximum Likelihood
Introduction

Maximum likelihood is an approach commonly used for density estimation problems, in
which a likelihood function is defined to measure how probable the observed data are
under an assumed distribution. It is important to study and understand the concept of
maximum likelihood, as it is one of the core concepts needed for learning more advanced
machine learning and deep learning techniques and algorithms.

In this article, we will discuss the likelihood function, the core idea behind that,
and how it works with code examples. This will help one to understand the concept
better and apply the same when needed.

Let us dive into the likelihood first to understand the maximum likelihood
estimation.

What is the Likelihood?

In machine learning, the likelihood measures how well a model with particular parameter
values explains the observed data points and their target values. In simple words, as
the name suggests, the likelihood is a function that tells us how likely a specific
data point is under the assumed data distribution.

For example, suppose there are two data points in the dataset and the likelihood of the
first is greater than that of the second. In that case, the first data point is more
consistent with the model's assumed distribution than the second.

After this discussion, a natural question arises: if the likelihood function works the
same way as the probability function, then what is the difference?


Difference Between Probability and Likelihood

Although probability and likelihood appear to work the same way, there is a difference.
Likelihood is a function of the model parameters: with the observed data held fixed, it
tells us how well each candidate set of parameters explains that data.

Probability, in simple words, describes the chance of some event happening under given
circumstances or conditions, and is often written as a conditional probability.

Also, the probabilities of all possible outcomes of a problem sum to one and cannot
exceed it, whereas a likelihood is not constrained in this way and, for continuous
data, can be greater than one.
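
The last point can be checked numerically. The sketch below is an illustration only: it assumes a normal distribution as the model, and the observation and standard deviations are made-up values.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mean) / std) ** 2)

# Probabilities of a discrete event sum to one (a fair coin here).
coin_probs = {"heads": 0.5, "tails": 0.5}
total_probability = sum(coin_probs.values())

# Likelihood of one observation x = 0.0 under two candidate models.
# The narrow model (std = 0.1) gives a density value above 1.
likelihood_narrow = gaussian_pdf(0.0, mean=0.0, std=0.1)
likelihood_wide = gaussian_pdf(0.0, mean=0.0, std=1.0)

print(total_probability)
print(likelihood_narrow)
print(likelihood_wide)
```

The narrow model assigns the observation a higher likelihood than the wide one, so between these two candidates it would be preferred.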

What is Maximum Likelihood Estimation?

After discussing the intuition of the likelihood function, it is clear that a higher
likelihood is desirable: parameters with a higher likelihood explain the observed data
better and so give a more accurate model. The term maximum likelihood therefore means
that we choose the parameter values that maximize the likelihood function.

Let us try to understand the same with an example.

Let us suppose that we have a classification dataset in which the independent column is
the marks that students achieved in a particular exam, and the target or dependent
column is categorical, with Yes and No values representing whether or not each student
was placed in the campus placements.

Now, if we try to solve this problem with the help of maximum likelihood estimation,
the model first calculates, for every data point, the probability of each possible
value of the target variable. It then searches for the line that best divides the
dataset into two parts, which is the line whose parameters give the observed labels the
highest likelihood. This best-fit line is reached iteratively after some epochs, and
once it is found, a new data point is classified according to the side of the line on
which it falls.
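
The procedure can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the marks and placement labels are invented, the model is a one-feature logistic regression, and the likelihood is maximized with simple gradient ascent using an assumed learning rate and epoch count.

```python
import math

# Hypothetical toy data: exam marks and placement outcome (1 = placed).
marks = [35, 42, 50, 55, 61, 68, 74, 82, 90]
placed = [0, 0, 0, 1, 0, 1, 1, 1, 1]

# Standardize the marks so that gradient ascent behaves well.
mean = sum(marks) / len(marks)
std = math.sqrt(sum((m - mean) ** 2 for m in marks) / len(marks))
xs = [(m - mean) / std for m in marks]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b):
    """Sum of log P(y | x) over the dataset for parameters (w, b)."""
    total = 0.0
    for x, y in zip(xs, placed):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

w, b, learning_rate = 0.0, 0.0, 0.1
initial_ll = log_likelihood(w, b)
for epoch in range(2000):
    # Gradient of the average log-likelihood with respect to w and b.
    errors = [y - sigmoid(w * x + b) for x, y in zip(xs, placed)]
    w += learning_rate * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    b += learning_rate * sum(errors) / len(xs)
final_ll = log_likelihood(w, b)

def predict(mark):
    """Classify a new student by the side of the boundary they fall on."""
    return 1 if sigmoid(w * (mark - mean) / std + b) > 0.5 else 0

print(initial_ll, final_ll)
print(predict(30), predict(95))
```

Each epoch moves the parameters in the direction that increases the log-likelihood, so the final value is higher than the starting one.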


Maximum Likelihood: The Base

Maximum likelihood estimation is the basis of several machine learning and deep
learning approaches used for classification problems. One example is logistic
regression, where the best-fit line on the graph is used to classify the data points.
A related approach in deep learning is known as the perceptron trick.

In the accompanying figure, all the data observations are plotted in a two-dimensional
diagram where the x-axis represents the independent column (the training data) and the
y-axis represents the target variable. The line is drawn to separate the positive
observations from the negative ones. According to the algorithm, the observations that
fall above the line are considered positive, and data points below the line are
regarded as negative.


Maximum Likelihood Estimation: Code Example

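We can quickly implement the maximum likelihood estimation technique using logistic
regression on any classification dataset. Let us try to implement the same. The example
uses a small synthetic dataset for illustration.

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only: one feature "X" and a noisy binary target "y".
rng = np.random.default_rng(0)
df = pd.DataFrame({"X": rng.normal(size=200)})
df["y"] = (df["X"] + rng.normal(scale=0.5, size=200) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(df[["X"]], df["y"], random_state=0)

lr = LogisticRegression()  # coefficients are fitted by (L2-regularized) maximum likelihood
lr.fit(X_train, y_train)
lr_pred = lr.predict(X_test)
df_pred = X_test.assign(y=y_test, lr_pred=lr_pred)
# logistic=True asks seaborn to draw a fitted logistic curve; it needs statsmodels installed.
sns.regplot(x="X", y="y", data=df_pred, logistic=True, ci=None)

The above code fits a logistic regression to the dataset and plots the observed test
labels against X together with a fitted S-shaped logistic curve, showing the
distribution of the data and the best fit according to the algorithm.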

Key Takeaways
o Maximum likelihood estimation chooses the model parameters under which the observed
data points are most likely, which gives the best fit.
o Likelihood differs from probability: probability describes the chance of an outcome
for fixed parameters, whereas the likelihood method tries to maximize the likelihood of
the data observations over the parameters of the assumed data distribution.
o Maximum likelihood is an approach used for problems such as density estimation and is
the basis of some algorithms like logistic regression.
o A related approach in deep learning is known as the perceptron trick.


Logistic Regression
Logistic regression is a supervised machine learning algorithm used for classification
tasks, where the goal is to predict the probability that an instance belongs to a given
class. It is a statistical algorithm that analyzes the relationship between a dependent
variable and one or more independent variables. This article explores the fundamentals
of logistic regression, its types, and implementations.

What is Logistic Regression?


Logistic regression is used for binary classification. It applies the sigmoid function
to a combination of the independent variables and produces a probability value between
0 and 1.

For example, suppose we have two classes, Class 0 and Class 1. If the value of the
logistic function for an input is greater than 0.5 (the threshold value), the input
belongs to Class 1; otherwise it belongs to Class 0. It is referred to as regression
because it is an extension of linear regression, but it is mainly used for
classification problems.

Key Points:
o Logistic regression predicts the output of a categorical dependent variable.
Therefore, the outcome must be a categorical or discrete value.
o The outcome can be Yes or No, 0 or 1, True or False, etc., but instead of giving the
exact value 0 or 1, the model gives probabilistic values that lie between 0 and 1.
o In logistic regression, instead of fitting a regression line, we fit an “S”-shaped
logistic function whose output is bounded by the two extreme values 0 and 1.
Logistic Function – Sigmoid Function
o The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
o It maps any real value into another value in the range 0 to 1. The output of logistic
regression must stay between 0 and 1 and cannot go beyond this limit, so it forms a
curve like the “S” form.
o The S-form curve is called the sigmoid function or the logistic function.
o In logistic regression, we use a threshold value to turn the probability into a class
of either 0 or 1: values above the threshold are assigned to 1, and values below the
threshold are assigned to 0.
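
The mapping from a real-valued score to a probability, and then to a class through a threshold, can be sketched as follows. The scores are made-up inputs and the function names are hypothetical; the 0.5 threshold is the value used in the text.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    """Assign class 1 when the probability is above the threshold."""
    return 1 if sigmoid(z) > threshold else 0

scores = [-4.0, -1.0, 0.0, 1.0, 4.0]
probabilities = [sigmoid(z) for z in scores]
labels = [classify(z) for z in scores]
print([round(p, 3) for p in probabilities])
print(labels)
```

Raising or lowering the threshold changes which probabilities are assigned to class 1 without changing the fitted curve itself.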
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three types:
1. Binomial: In binomial logistic regression, the dependent variable can have only two
possible values, such as 0 or 1, Pass or Fail, etc.
2. Multinomial: In multinomial logistic regression, the dependent variable can have 3
or more possible unordered categories, such as “cat”, “dog”, or “sheep”.
3. Ordinal: In ordinal logistic regression, the dependent variable can have 3 or more
possible ordered categories, such as “low”, “medium”, or “high”.
Assumptions of Logistic Regression
Understanding the assumptions of logistic regression is important to ensure that the
model is applied appropriately. The assumptions include:
1. Independent observations: Each observation is independent of the others, meaning the
outcome of one observation does not depend on another. The input variables should also
not be highly correlated with one another.
2. Binary dependent variables: The dependent variable must be binary or dichotomous,
meaning it can take only two values. For more than two categories, the softmax function
is used.
3. Linear relationship between independent variables and log odds: The relationship
between the independent variables and the log odds of the dependent variable should be
linear.
4. No outliers: There should be no outliers in the dataset.
5. Large sample size: The sample size should be sufficiently large.
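
Assumption 2 mentions the softmax function for more than two categories. A minimal sketch of softmax is given here for illustration; the class names and scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn a list of real-valued class scores into probabilities."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three unordered classes.
classes = ["cat", "dog", "sheep"]
probs = softmax([2.0, 1.0, 0.1])
predicted = classes[probs.index(max(probs))]
print([round(p, 3) for p in probs], predicted)
```

The outputs are positive and sum to one, and the class with the largest score receives the largest probability. With two classes and scores (z, 0), softmax gives the same value as the sigmoid of z.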
Terminologies involved in Logistic Regression
Here are some common terms involved in logistic regression:
o Independent variables: The input characteristics or predictor factors used to predict
the dependent variable.
o Dependent variable: The target variable in a logistic regression model, which we are
trying to predict.
o Logistic function: The formula used to represent how the independent and dependent
variables relate to one another. The logistic function transforms the input variables
into a probability value between 0 and 1, which represents the probability that the
dependent variable is 1 rather than 0.
o Odds: The ratio of something occurring to something not occurring. It is different
from probability, as probability is the ratio of something occurring to everything that
could possibly occur.
o Log-odds: The log-odds, also known as the logit function, is the natural logarithm of
the odds. In logistic regression, the log odds of the dependent variable are modeled as
a linear combination of the independent variables and the intercept.
o Coefficient: The logistic regression model’s estimated parameters, which show how the
independent and dependent variables relate to one another.
o Intercept: A constant term in the logistic regression model, which represents the log
odds when all independent variables are equal to zero.
o Maximum likelihood estimation: The method used to estimate the coefficients of the
logistic regression model, which maximizes the likelihood of observing the data given
the model.
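
The relationship between probability, odds, and log-odds can be worked through numerically. The sketch below is illustrative: the probability 0.8 and the intercept and coefficient values are assumptions, not fitted results.

```python
import math

def odds(p):
    """Odds: probability of occurring divided by probability of not occurring."""
    return p / (1.0 - p)

def log_odds(p):
    """Logit: natural logarithm of the odds."""
    return math.log(odds(p))

def sigmoid(z):
    """Inverse of the logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8
print(odds(p))               # 0.8 / 0.2, i.e. about 4 to 1
print(log_odds(p))           # natural log of the odds
print(sigmoid(log_odds(p)))  # recovers p, up to rounding

# Hypothetical model: log-odds = intercept + coefficient * x
intercept, coefficient = -1.0, 0.5
for x in (0.0, 2.0, 4.0):
    z = intercept + coefficient * x
    print(x, z, round(sigmoid(z), 3))
```

At x = 0 the log-odds equal the intercept, which matches the definition of the intercept given above, and each unit increase in x adds the coefficient to the log-odds.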


How does Logistic Regression work?


The logistic regression model takes the continuous output of the linear regression
function and passes it through a sigmoid function, which maps any real-valued input
into a value between 0 and 1. That value is then thresholded to give a categorical
output. This function is known as the logistic function.
Let the independent input features be:

$X = \begin{bmatrix} x_{11} & \dots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \dots & x_{nm} \end{bmatrix}$

and let the dependent variable Y take only binary values, i.e. 0 or 1.

Then apply the multi-linear function to the input variables X:

$z_i = \sum_{j=1}^{m} w_j x_{ij} + b$

Here $x_i = [x_{i1}, \dots, x_{im}]$ is the ith observation of X, $w = [w_1, \dots, w_m]$
is the weights or coefficients, and b is the bias term, also known as the intercept.
More simply, this can be represented as the dot product of the weights and the input,
plus the bias:

$z = w \cdot x + b$

What we have discussed so far is linear regression.

Sigmoid Function
Now we use the sigmoid function, where the input is z, to find a probability between 0
and 1, i.e. the predicted y:

$\sigma(z) = \frac{1}{1 + e^{-z}}$

Figure: Sigmoid function

As shown in the figure, the sigmoid function converts a continuous value into a
probability, i.e. a value between 0 and 1.
o $\sigma(z)$ tends towards 1 as $z \to \infty$
o $\sigma(z)$ tends towards 0 as $z \to -\infty$
o $\sigma(z)$ is always bounded between 0 and 1
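
The two steps, a linear score followed by the sigmoid, can be sketched for a single observation as follows. The weights, bias, and observations are hypothetical values chosen for illustration, and the sigmoid is written in a numerically stable form, which is a design choice of this sketch rather than something the notes specify.

```python
import math

def linear_score(weights, x, bias):
    """z = w . x + b for a single observation x."""
    return sum(w_j * x_j for w_j, x_j in zip(weights, x)) + bias

def sigmoid(z):
    """Numerically stable sigmoid: avoids overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

# Hypothetical weights, bias, and two observations with two features each.
weights = [0.8, -0.4]
bias = 0.1
X = [[1.0, 2.0], [3.0, 0.5]]

for x_i in X:
    z_i = linear_score(weights, x_i, bias)
    print(x_i, round(z_i, 3), round(sigmoid(z_i), 3))

# Limiting behaviour of the sigmoid for large positive and negative z.
print(sigmoid(50.0), sigmoid(-50.0))
```

Branching on the sign of z keeps the argument of the exponential non-positive, so very large scores in either direction do not raise an overflow error.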
