
Class 6

Supervised Learning
• Supervised learning is a type of machine learning in which machines
are trained using well-"labelled" training data, and on the basis of that
data, the machine predicts the output.
• Labelled data means the input data is already tagged with the
correct output.
• Supervised learning is the process of providing input data together
with the correct output data to the machine learning model.
• The aim of a supervised learning algorithm is to find a mapping
function that maps the input variable (x) to the output variable (y).
• In the real world, supervised learning is used for risk assessment,
image classification, fraud detection, spam filtering, etc.
How Does Supervised Learning Work?

• In supervised learning, models are trained using a labelled dataset,
from which the model learns about each type of data.
• Once the training process is complete, the model is tested on test
data (data held out from the dataset, not seen during training), and it
then predicts the output.
• The working of supervised learning can be easily understood from the
example below:
• Suppose we have a dataset of different types of shapes, which
includes squares, rectangles, triangles, and polygons.
• The first step is to train the model on each shape:

• If the given shape has four sides, and all the sides are equal, it
will be labelled as a square.
• If the given shape has three sides, it will be labelled as a triangle.
• If the given shape has six equal sides, it will be labelled as a
hexagon.
• Now, after training, we test the model using the test set, and the
task of the model is to identify the shape.
• The machine has already been trained on all these types of shapes, so
when it encounters a new shape, it classifies the shape on the basis of
its number of sides and predicts the output.
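The shape example above can be sketched as a simple rule-based classifier. This is illustrative logic mirroring the rules stated in the slides, not a trained machine learning model:

```python
# Rules from the slides: four equal sides -> square, three sides ->
# triangle, six equal sides -> hexagon, anything else -> polygon.

def classify_shape(num_sides: int, all_sides_equal: bool) -> str:
    """Predict a shape label from its number of sides."""
    if num_sides == 4 and all_sides_equal:
        return "square"
    if num_sides == 3:
        return "triangle"
    if num_sides == 6 and all_sides_equal:
        return "hexagon"
    return "polygon"

print(classify_shape(4, True))   # square
print(classify_shape(3, False))  # triangle
```

A real supervised model would learn such a decision rule from labelled examples rather than having it hand-coded.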
Steps Involved in Supervised Learning:
• First, determine the type of training dataset.
• Collect/gather the labelled training data.
• Split the dataset into a training set, a test set, and a
validation set.
• Determine the input features of the training dataset, which should
have enough knowledge so that the model can accurately predict the
output.
• Determine the suitable algorithm for the model, such as support
vector machine, decision tree, etc.
• Execute the algorithm on the training dataset. Sometimes a
validation set is needed to tune the control parameters; it is a
subset of the training data.
• Evaluate the accuracy of the model using the test set. If the
model predicts the correct outputs, the model is accurate.
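The steps above can be sketched with scikit-learn (assumed installed), using its bundled iris dataset and a decision tree as the chosen algorithm:

```python
# Gather a labelled dataset, split it into train/test sets, choose an
# algorithm (a decision tree here), train it, then evaluate accuracy
# on the held-out test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                 # labelled data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)          # hold out 20% for testing

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)                       # execute algorithm on training data
preds = model.predict(X_test)
print(f"test accuracy: {accuracy_score(y_test, preds):.2f}")
```

The same pattern applies whichever algorithm is chosen (support vector machine, decision tree, etc.).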
Types of Supervised Machine Learning Algorithms:
1. Regression
• Regression algorithms are used when there is a relationship between
the input variable and the output variable.
• Regression is used for the prediction of continuous variables, as in
weather forecasting, market-trend prediction, etc.
Below are some popular Regression algorithms which come under
supervised learning:
• Linear Regression
• Regression Trees
• Non-Linear Regression
• Bayesian Linear Regression
• Polynomial Regression
2. Classification
Classification algorithms are used when the output variable is
categorical, which means there are two or more classes, such as Yes/No,
True/False, or Spam/Not Spam (as in spam filtering).
Popular classification algorithms include:
• Random Forest
• Decision Trees
• Logistic Regression
• Support Vector Machines
Regression Analysis in Machine learning
• Regression analysis is a statistical method for modelling the
relationship between a dependent (target) variable and one or more
independent (predictor) variables.
• More specifically, regression analysis helps us to understand how
the value of the dependent variable changes in response to one
independent variable when the other independent variables are held
fixed. It predicts continuous/real values such as temperature, age,
salary, price, etc.
We can understand the concept of regression analysis using the below
example:
• Example: Suppose there is a marketing company A, which runs various
advertisements every year and obtains sales accordingly. The company
has a record of the advertising spend over the last 5 years and the
corresponding sales.
• Now, the company wants to spend $200 on advertising in the year 2019
and wants to know the predicted sales for that year.
• To solve such prediction problems in machine learning, we need
regression analysis.
• In regression, we plot a graph between the variables that best fits
the given datapoints; using this plot, the machine learning model can
make predictions about the data.
• In simple words, "Regression finds a line or curve that passes as
close as possible to the datapoints on the target-predictor graph, in
such a way that the vertical distance between the datapoints and the
regression line is minimised."
• The distance between the datapoints and the line indicates whether
the model has captured a strong relationship.
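The advertisement example can be worked numerically. The figures below are hypothetical, since the original five-year table did not survive extraction; we fit the best-fit line by least squares and predict sales for an advertising spend of $200:

```python
# Fit a straight line that minimises the vertical distances to the
# datapoints, then use it to predict sales for a new spend of $200.
import numpy as np

ad_spend = np.array([90.0, 120.0, 150.0, 100.0, 130.0])      # assumed spend
sales = np.array([1000.0, 1300.0, 1800.0, 1200.0, 1380.0])   # assumed sales

slope, intercept = np.polyfit(ad_spend, sales, deg=1)  # least-squares line
predicted_sales = slope * 200 + intercept
print(f"predicted sales for $200 spend: {predicted_sales:.1f}")
```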
Some examples of regression can be as:
• Prediction of rain using temperature and other factors
• Determining Market trends
• Prediction of road accidents due to rash driving.
Terminologies Related to the Regression Analysis:
• Dependent Variable: The main factor in Regression analysis which we
want to predict or understand is called the dependent variable. It is
also called target variable.
• Independent Variable: The factors which affect the dependent
variables or which are used to predict the values of the dependent
variables are called independent variable, also called as a predictor.
• Outliers: An outlier is an observation with either a very low or a
very high value in comparison to the other observed values. An outlier
may distort the results, so it should be handled carefully.
• Multicollinearity: If the independent variables are highly correlated
with each other, the condition is called multicollinearity.
• Underfitting and Overfitting: If our algorithm works well on the
training dataset but not on the test dataset, the problem is called
overfitting. If our algorithm does not perform well even on the
training dataset, the problem is called underfitting.
Below are some other reasons for using Regression analysis:
• Regression estimates the relationship between the target and the
independent variable.
• It is used to find the trends in data.
• It helps to predict real/continuous values.
• By performing regression, we can determine the most important
factor, the least important factor, and how each factor affects the
target.
Types of Regression
• Linear Regression
• Logistic Regression
• Polynomial Regression
• Support Vector Regression
• Decision Tree Regression
• Random Forest Regression
• Ridge Regression
• Lasso Regression
Linear Regression:
• Linear regression is a statistical regression method used for
predictive analysis.
• It is one of the simplest regression algorithms; it shows the
relationship between continuous variables.
• It is used for solving regression problems in machine learning.
• Linear regression shows the linear relationship between the
independent variable (X-axis) and the dependent variable (Y-axis),
hence the name linear regression.
• If there is only one input variable (x), it is called simple linear
regression; if there is more than one input variable, it is called
multiple linear regression.
• The relationship between variables in the linear regression model
can be illustrated by predicting the salary of an employee from years
of experience.
Below is the mathematical equation for linear regression:
• Y = aX + b
• Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (slope and intercept)
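Once a and b are known, the equation above is used directly for prediction. The coefficient values below are hypothetical, chosen only to show how the fitted line is evaluated for the salary-from-experience example:

```python
# Evaluate Y = aX + b: salary predicted from years of experience.
a, b = 9500.0, 25000.0   # assumed slope (salary per year) and base salary

def predict_salary(years_experience: float) -> float:
    """Return Y = a * X + b for a given X."""
    return a * years_experience + b

print(predict_salary(5))   # 72500.0
```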
Some popular applications of linear regression are:
• Analyzing trends and sales estimates
• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic.
Logistic Regression:
• Logistic regression is another supervised learning algorithm which is
used to solve the classification problems. In classification problems,
we have dependent variables in a binary or discrete format such as 0
or 1.
• Logistic regression algorithm works with the categorical variable such
as 0 or 1, Yes or No, True or False, Spam or not spam, etc.
• It is a predictive analysis algorithm which works on the concept of
probability.
• Logistic regression is a type of regression, but it differs from the
linear regression algorithm in how it is used.
• Logistic regression uses the sigmoid function (logistic function) to
map predictions to probabilities. This sigmoid function is used to
model the data in logistic regression. The function can be represented
as:
• f(x) = 1 / (1 + e^(-x))
• f(x) = output between the 0 and 1 value
• x = input to the function
• e = base of the natural logarithm
When we provide the input values (data) to the function, it produces an S-shaped
curve.

It uses the concept of threshold levels: values above the threshold level are rounded
up to 1, and values below the threshold level are rounded down to 0.
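The sigmoid function and the threshold rule described above can be sketched as follows (the 0.5 threshold is the common default, assumed here):

```python
# f(x) = 1 / (1 + e^(-x)) squashes any real x into (0, 1); the
# threshold then turns that probability into a class label.
import math

def sigmoid(x: float) -> float:
    """The logistic function: output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_class(x: float, threshold: float = 0.5) -> int:
    """Round up to 1 above the threshold, down to 0 below it."""
    return 1 if sigmoid(x) >= threshold else 0

print(sigmoid(0.0))         # 0.5
print(predict_class(2.0))   # 1
print(predict_class(-2.0))  # 0
```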
Polynomial Regression:
• Polynomial Regression is a type of regression which models the non-
linear dataset using a linear model.
• It is similar to multiple linear regression, but it fits a non-linear curve
between the value of x and corresponding conditional values of y.
• Suppose there is a dataset whose datapoints are arranged in a
non-linear fashion; in such a case, linear regression will not fit
those datapoints well.
• To cover such datapoints, we need polynomial regression.
• In polynomial regression, the original features are transformed into
polynomial features of a given degree and then modelled using a
linear model, which means the datapoints are best fitted by a
polynomial curve.
The equation for polynomial regression is also derived from the linear
regression equation: the linear regression equation Y = b0 + b1x is
transformed into the polynomial regression equation
Y = b0 + b1x + b2x² + b3x³ + ... + bnxⁿ.
• Here Y is the predicted/target output, b0, b1, ..., bn are the
regression coefficients, and x is the independent/input variable.
• The model is still linear because it remains linear in the
coefficients; only the input features are raised to quadratic and
higher powers.
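Polynomial regression as described above can be sketched with synthetic data. Because the data here is exactly quadratic (an assumption made for illustration), the fit recovers the true coefficients:

```python
# Expand x into polynomial features and fit a model that is linear in
# the coefficients b0..bn. np.polyfit returns the highest degree first.
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 2 * x**2 + 3 * x + 1        # Y = b0 + b1*x + b2*x^2 with b0=1, b1=3, b2=2

b2, b1, b0 = np.polyfit(x, y, deg=2)   # least-squares fit of degree 2
print(round(b0, 6), round(b1, 6), round(b2, 6))  # 1.0 3.0 2.0
```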
