
Unit 2

Supervised Learning: Regression

Pallavi Shukla
Assistant Professor
UCER
Regression
• Regression analysis is a statistical method for modelling the relationship
between a dependent (target) variable and one or more independent
(predictor) variables.
• It helps us understand how the value of the dependent variable changes
with one independent variable while the other independent variables are
held fixed.
• Regression searches for relationships among variables.
• For example, you can observe several employees of some company and try
to understand how their salaries depend on features such as experience,
level of education, role, the city they work in, and so on.
Regression
• In regression, we fit a line or curve that best matches the given data points.
• Using this fitted model, the machine learning model can make predictions
about the data.
• In simple words, "Regression finds a line or curve through the data points
on the target–predictor graph such that the vertical distance between the
data points and the regression line is minimized."
• The distance between the data points and the line tells whether the model
has captured a strong relationship or not.
Examples
• Prediction of rain using temperature and other factors
• Determining market trends
• Prediction of road accidents due to rash driving
Taxonomy
• Dependent Variable: The main factor in regression analysis that we want to
predict or understand is called the dependent variable. It is also called the
target variable.
• Independent Variable: The factors that affect the dependent variable, or
that are used to predict its values, are called independent variables, also
known as predictors.
• Outliers: An outlier is an observation with either a very low or a very high
value in comparison to the other observed values. An outlier may distort
the result, so it should be handled carefully.
Taxonomy
• Multicollinearity: If the independent variables are highly correlated with
each other, the condition is called multicollinearity. It should not be present
in the dataset, because it creates problems when ranking the most
influential variables.
• Underfitting and Overfitting: If our algorithm works well on the training
dataset but not on the test dataset, the problem is called overfitting. If our
algorithm does not perform well even on the training dataset, the problem
is called underfitting.
Common Regression Algorithms
The most common regression algorithms are:
• Simple Linear Regression
• Multiple Linear Regression
• Polynomial Regression
• Multivariate adaptive regression splines
• Logistic Regression
• Maximum likelihood estimation (least squares)
Linear Regression:
• Linear regression is a statistical regression method used for predictive
analysis.
• It is one of the simplest regression algorithms, and it models the
relationship between continuous variables.
• It is used for solving regression problems in machine learning.
• Linear regression shows the linear relationship between the independent
variable (X-axis) and the dependent variable (Y-axis), hence the name
linear regression.
• If there is only one input variable (x), it is called simple linear regression;
if there is more than one input variable, it is called multiple linear
regression.
• The relationship between the variables in a linear regression model can be
illustrated by predicting the salary of an employee on the basis of years of
experience.
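To make this concrete, here is a minimal simple-linear-regression sketch in Python. The salary figures are illustrative numbers, not data from these slides; it uses the closed-form least-squares estimates b1 = cov(x, y)/var(x) and b0 = mean(y) − b1·mean(x).

```python
# Simple linear regression: salary (in thousands) vs. years of experience.
# The data below is made up purely for illustration.
import numpy as np

experience = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 10.0])   # years
salary = np.array([35.0, 42.0, 50.0, 61.0, 74.0, 98.0])  # thousands

# Closed-form least-squares estimates:
# slope b1 = cov(x, y) / var(x); intercept b0 = mean(y) - b1 * mean(x)
b1 = np.cov(experience, salary, bias=True)[0, 1] / np.var(experience)
b0 = salary.mean() - b1 * experience.mean()

print(f"salary ≈ {b0:.2f} + {b1:.2f} × experience")
print(f"predicted salary at 4 years: {b0 + b1 * 4.0:.2f}")
```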
Applications of linear regression:
• Analyzing trends and sales estimates
• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic
Simple Linear Regression
(Figure slides: linear positive slope, curvilinear positive slope, linear
negative slope, curvilinear negative slope, no-relationship graph, error in
simple regression, and a worked example.)
Multiple Linear Regression
Logistic Regression
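Multiple linear regression extends the simple model to several predictors, while logistic regression passes a linear combination of the inputs through the sigmoid function to output a class probability. Below is a minimal logistic-regression sketch trained with gradient descent; the hours-studied data is invented purely for illustration.

```python
# Logistic regression on toy 1-D data: hours studied -> pass (1) / fail (0).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hours studied (illustrative)
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # pass/fail labels

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):                    # batch gradient descent on log-loss
    p = sigmoid(w * x + b)               # predicted P(y = 1 | x)
    w -= lr * np.mean((p - y) * x)       # dL/dw for the cross-entropy loss
    b -= lr * np.mean(p - y)             # dL/db

print(f"P(pass | 3.5 hours) ≈ {sigmoid(w * 3.5 + b):.2f}")  # near 0.5
```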
Bayesian Decision Theory

• It is a framework for choosing actions based on present observations and
known probabilities.
• The approach is attributed to Rev. Thomas Bayes; his essay on the problem
was published posthumously in 1763.
Basic Concepts in Bayes Decision Theory
• Marginal Probability (Simple Probability) P(A):
• The ordinary probability of occurrence of an event A, irrespective of all
other events, is called simple or marginal probability.
• P(A) = (number of successful events) / (total number of events)
• Example: for a fair die, P(even) = 3/6 = 0.5.
Basic Concepts in Bayes Decision Theory
• Conditional Probability P(A|B):
• The probability of the occurrence of an event A, given that event B has
already occurred, is called conditional probability:
P(A|B) = P(A, B) / P(B)
Basic Concepts in Bayes Decision Theory
• Joint Probability P(A, B):
• The probability of two events A and B occurring simultaneously is called
the joint probability, P(A, B) = P(A ∩ B).
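The three quantities above can be checked on a tiny sample; the (weather, activity) pairs below are made up purely for illustration.

```python
# Marginal, joint, and conditional probability over a toy sample.
sample = [("Sunny", "Play"), ("Sunny", "Play"), ("Sunny", "NoPlay"),
          ("Rainy", "Play"), ("Rainy", "NoPlay"), ("Rainy", "NoPlay")]

n = len(sample)
p_sunny = sum(w == "Sunny" for w, _ in sample) / n          # marginal P(A)
p_sunny_and_play = sum((w, a) == ("Sunny", "Play")
                       for w, a in sample) / n              # joint P(A, B)
p_play = sum(a == "Play" for _, a in sample) / n            # marginal P(B)
p_sunny_given_play = p_sunny_and_play / p_play              # P(A|B) = P(A,B)/P(B)

print(p_sunny, p_sunny_and_play, p_sunny_given_play)        # 0.5 0.333 0.667
```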
Basic Concepts in Bayes Decision Theory
• Prior: The prior knowledge or belief about the probabilities of the various
hypotheses in H is called the prior in the context of Bayes' theorem.
• Example: prior knowledge about how often tumours turn out to be
malignant.
Basic Concepts in Bayes Decision Theory
• Posterior: The probability that a particular hypothesis holds for a dataset,
given the prior, is called the posterior probability, or simply the posterior.
• Example: the probability of the hypothesis that the patient has a malignant
tumour, given the prior correctness of the malignancy test.
BAYES' THEOREM
• It is based on conditional probability. It is given as:
P(A|B) = P(B|A) · P(A) / P(B)
SOME MORE QUESTIONS:
• Q1 – Calculate the probability of "fire" given "smoke", with the data:
P(Fire) = prior probability = 0.3, P(Smoke|Fire) = likelihood = 0.5,
P(Smoke) = evidence = 0.7.
• Q2 – (Patient disease problem) Consider the data of a patient where
Effect = the state of the patient having a red dot on the skin, and
Cause = the state of the patient having rubella disease. Given the
probabilities P(Cause) = 0.001, P(Effect) = 0.01, P(Effect|Cause) = 0.9,
use Bayes' rule to find the probability P(Cause|Effect).
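Both questions are direct substitutions into Bayes' theorem; the short script below works them out with the numbers given above.

```python
# Worked solutions to Q1 and Q2 using P(A|B) = P(B|A) * P(A) / P(B).

def bayes(likelihood, prior, evidence):
    return likelihood * prior / evidence

# Q1: P(Fire|Smoke) = P(Smoke|Fire) * P(Fire) / P(Smoke)
print(bayes(likelihood=0.5, prior=0.3, evidence=0.7))     # ≈ 0.214

# Q2: P(Cause|Effect) = P(Effect|Cause) * P(Cause) / P(Effect)
print(bayes(likelihood=0.9, prior=0.001, evidence=0.01))  # = 0.09
```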
Bayes' Theorem in Terms of Posterior Probability
P(h|D) = P(D|h) · P(h) / P(D)
P(h|D) = posterior probability: the conditional probability of the hypothesis
h when the data D is given.
P(D|h) = likelihood: the conditional probability of the data D when the
hypothesis h is given.
P(h) = prior probability of the hypothesis h, i.e. the simple probability of h.
P(D) = prior probability of the data D, i.e. the simple probability of D.
Maximum a Posteriori (MAP) Hypothesis
• The most probable hypothesis is called the maximum a posteriori (MAP)
hypothesis.
• Denoted by hMAP.
• hMAP = argmax P(h|D) = argmax P(D|h) · P(h) / P(D)
• Since P(D) is the same for every hypothesis, the denominator can be
ignored:
hMAP = argmax P(D|h) · P(h)
Maximum Likelihood (ML) Hypothesis
• If all hypotheses are assumed equiprobable a priori, the MAP hypothesis
reduces to the maximum likelihood (ML) hypothesis:
hML = argmax P(D|h)
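A toy comparison of the two rules, with made-up priors and likelihoods over two hypothetical hypotheses, shows how they can disagree:

```python
# MAP vs. ML over two hypothetical hypotheses (all numbers illustrative).
priors = {"h1": 0.8, "h2": 0.2}        # P(h)
likelihoods = {"h1": 0.4, "h2": 0.9}   # P(D|h)

h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
h_ml = max(likelihoods, key=likelihoods.get)

print("hMAP =", h_map)  # h1: 0.4*0.8 = 0.32 beats h2: 0.9*0.2 = 0.18
print("hML  =", h_ml)   # h2: likelihood 0.9 beats 0.4
```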
Difference between Max f(x) and Arg max f(x) in Mathematics
Max f(x): the maximum value attained by the function f(x).
Arg max f(x): the argument (the value of x) at which f(x) attains its maximum.

Example: for f(θ) = sin θ, max f(θ) = 1, while arg max f(θ) = 90°;
that is, sin θ attains its maximum value of 1 at θ = 90°.
BRUTE FORCE BAYESIAN CONCEPT LEARNING
• Also called the Brute Force Algorithm.
• P(h|D) = P(D|h) · P(h) / P(D)
• hMAP = argmax P(h|D)
• Assume a uniform prior: P(h) = 1/|H| for all h in H, where h is a single
hypothesis and H = {h1, h2, h3, …, hn} is the set of all hypotheses.
• Assume noise-free data, so the likelihood is
P(D|h) = 1 if di = h(xi) for every training example, and 0 otherwise,
where di is the observed target value and xi is the corresponding instance.
• For a hypothesis consistent with D:
P(h|D) = (1 · 1/|H|) / P(D)
• But P(D) = |VS H,D| / |H|.
• Substituting this value into the equation above:
P(h|D) = 1 / |VS H,D|
• Here |VS H,D| is the size of the version space of the hypothesis set H with
respect to D: the subset of hypotheses in H that are consistent with D.
Inconsistent hypotheses get posterior 0.
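The following sketch runs brute-force Bayesian concept learning over a tiny hypothetical hypothesis space of threshold rules, assuming the uniform prior and noise-free data stated above; every consistent hypothesis ends up with posterior 1/|VS H,D|.

```python
# Brute-force Bayesian concept learning over a toy hypothesis space.
# H: hypotheses h_t(x) = True iff x >= t, for thresholds t = 1..5.
hypotheses = {t: (lambda x, t=t: x >= t) for t in range(1, 6)}

# Training data D: (x_i, d_i) pairs (illustrative).
D = [(2, False), (4, True)]

# Noise-free likelihood: P(D|h) = 1 if h is consistent with every example.
def likelihood(h):
    return 1.0 if all(h(x) == d for x, d in D) else 0.0

prior = 1.0 / len(hypotheses)                      # uniform P(h) = 1/|H|
unnorm = {t: likelihood(h) * prior for t, h in hypotheses.items()}
p_D = sum(unnorm.values())                         # P(D) = |VS| / |H|

posterior = {t: v / p_D for t, v in unnorm.items()}
print(posterior)   # consistent hypotheses share P(h|D) = 1/|VS|; others get 0
```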
BAYES OPTIMAL CLASSIFIER
• It is a probabilistic model which makes the most probable prediction for a
new example.
• Its equation (in the standard form) is:
vOB = argmax over vj in V of Σ over hi in H of P(vj|hi) · P(hi|D)
• P(vj|D) = probability of value vj when the data is given
• P(vj|hi) = probability of value vj when hypothesis hi is given
• P(hi|D) = probability of hypothesis hi when the data is given
Naïve Bayes Classifier
• It is a supervised learning algorithm.
• Based on Bayes' theorem.
• Used for solving classification problems in machine learning.
• It is a probabilistic classifier.
• It predicts on the basis of the probability of an event.
Naïve Bayes Classifier
• Naïve: the classifier naïvely assumes that the features are independent of
one another given the class.
• Bayes: it is based on Bayes' theorem.

• Question: We are given a weather dataset with two columns: one holds the
weather condition (outlook) and the other records whether the player went
out to play or not. Find the probability of the player going out to play on a
sunny day.
No.  Outlook   Play
0    Rainy     Yes
1    Sunny     Yes
2    Overcast  Yes
3    Overcast  Yes
4    Sunny     No
5    Rainy     Yes
6    Sunny     Yes
7    Overcast  Yes
8    Rainy     No
9    Sunny     No
10   Sunny     Yes
11   Rainy     No
12   Overcast  Yes
13   Overcast  Yes
Solution: Frequency Table

Weather    Yes  No
Overcast   5    0
Rainy      2    2
Sunny      3    2
Total      10   4
Make the Likelihood Table:

Weather    No           Yes           Likelihood
Overcast   0            5             5/14 ≈ 0.36
Rainy      2            2             4/14 ≈ 0.29
Sunny      2            3             5/14 ≈ 0.36
All        4/14 ≈ 0.29  10/14 ≈ 0.71
Apply Bayes' Theorem:
• P(A|B) = P(B|A) · P(A) / P(B)

• P(Yes|Sunny) = P(Sunny|Yes) × P(Yes) / P(Sunny)
• P(Sunny|Yes) = 3/10 = 0.3
• P(Sunny) = 5/14 ≈ 0.36
• P(Yes) = 10/14 ≈ 0.71
• P(Yes|Sunny) = (3/10 × 10/14) / (5/14) = 3/5 = 0.60
• P(No|Sunny) = P(Sunny|No) × P(No) / P(Sunny)
= (2/4 × 4/14) / (5/14) = 2/5 = 0.40
• Since P(Yes|Sunny) > P(No|Sunny), i.e. 0.60 > 0.40,
• we can say that on a sunny day the player will go out to play.
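The same numbers can be reproduced with a few lines of plain Python over the dataset above:

```python
# Re-running the sunny-day example from the slides.
from collections import Counter

data = [("Rainy","Yes"),("Sunny","Yes"),("Overcast","Yes"),("Overcast","Yes"),
        ("Sunny","No"),("Rainy","Yes"),("Sunny","Yes"),("Overcast","Yes"),
        ("Rainy","No"),("Sunny","No"),("Sunny","Yes"),("Rainy","No"),
        ("Overcast","Yes"),("Overcast","Yes")]

n = len(data)
play = Counter(label for _, label in data)            # {"Yes": 10, "No": 4}
sunny = Counter(label for w, label in data if w == "Sunny")

p_sunny = sum(sunny.values()) / n                     # P(Sunny) = 5/14
for label in ("Yes", "No"):
    p_label = play[label] / n                         # P(Yes), P(No)
    p_sunny_given = sunny[label] / play[label]        # P(Sunny | label)
    posterior = p_sunny_given * p_label / p_sunny     # Bayes' theorem
    print(label, round(posterior, 2))                 # Yes 0.6, No 0.4
```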
Advantages of Naïve Bayes Classifier
• It is a fast and easy algorithm for classification.
• It can be used for binary and multi-class classification.
• It is widely used for text classification problems.
Disadvantages of Naïve Bayes Classifier
• Because it assumes the features are independent, it cannot learn
relationships between features.
Applications of Naïve Bayes Classifier
• Real Time Prediction
• Text Classification
• Sentiment Analysis
• Multiclass Classification
• Spam Filtering
• Recommendation System
BAYESIAN BELIEF NETWORKS
• A Bayesian belief network is a probabilistic graphical model. It represents a
set of variables and their conditional dependencies using a directed acyclic
graph.
• Also called a Bayes network, belief network, decision network, or Bayesian
model.
• The Bayesian network consists of two parts:
• a directed acyclic graph, and
• a table of conditional probabilities.
• Bayesian belief networks are based on joint probability and marginal
probability.
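As a toy illustration of that last point (with hypothetical conditional-probability values), the marginal probability of a child node can be obtained from its parent's distribution by summing joint probabilities, i.e. the law of total probability:

```python
# Two-node belief network Cloudy -> Rain with made-up CPT values.
p_cloudy = 0.5                                  # P(Cloudy = true)
p_rain_given_cloudy = {True: 0.8, False: 0.2}   # P(Rain = true | Cloudy)

# Sum the joint probabilities P(Rain, Cloudy) over both parent states
# to get the marginal P(Rain):
p_rain = (p_rain_given_cloudy[True] * p_cloudy +
          p_rain_given_cloudy[False] * (1 - p_cloudy))
print("P(Rain) =", p_rain)                      # 0.5
```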
Support Vector Machine
• It is a popular supervised learning technique that can be used for both
classification and regression tasks.
• It is mainly used for classification problems in machine learning.
• The objective of an SVM algorithm is to find a hyperplane in an
N-dimensional space that distinctly classifies the data points.
Support Vectors
• Support vectors are the individual observations (data points) that lie
closest to the decision boundary.
• The SVM classifier is the frontier (hyperplane or line) that best segregates
the two classes.
Support Vector Machine Terminology
1.Hyperplane: Hyperplane is the decision boundary that is used to
separate the data points of different classes in a feature space. In
the case of linear classifications, it will be a linear equation i.e.
wx+b = 0.
2.Support Vectors: Support vectors are the data points closest to the
hyperplane; they play a critical role in deciding the hyperplane and
margin.
3.Margin: Margin is the distance between the support vectors and the
hyperplane. The main objective of the support vector machine algorithm
is to maximize the margin, as a wider margin indicates better
classification performance.
Support Vector Machine Terminology
1. Kernel: A kernel is a mathematical function used in SVM to map the
original input data points into a high-dimensional feature space, so that a
separating hyperplane can be found even if the data points are not
linearly separable in the original input space. Common kernel functions
include linear, polynomial, radial basis function (RBF), and sigmoid.
2. Hard Margin: The maximum-margin hyperplane or the hard margin hyperplane
is a hyperplane that properly separates the data points of different categories
without any misclassifications.
3. Soft Margin: When the data is not perfectly separable or contains outliers,
SVM permits a soft-margin technique. The soft-margin SVM formulation
introduces a slack variable for each data point, which relaxes the strict
margin requirement and permits certain misclassifications or violations.
It finds a compromise between maximizing the margin and minimizing
violations.
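A minimal working sketch of these ideas, assuming scikit-learn is available (the slides do not prescribe a library): the C parameter controls the soft margin (small C tolerates more violations; large C approaches a hard margin), and the kernel argument selects the mapping.

```python
# Soft-margin SVM with an RBF kernel on the built-in iris dataset.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # soft margin controlled by C
clf.fit(X_train, y_train)

print("support vectors per class:", clf.n_support_)
print("test accuracy:", clf.score(X_test, y_test))
```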
Types of SVM
• Linear SVM
• Non-linear SVM
PROPERTIES OF SVM
• Flexibility in choosing a similarity function
• Sparseness of solution when dealing with large data sets - only support vectors are
used to specify the separating hyperplane
• Ability to handle large feature spaces - complexity does not depend on the
dimensionality of the feature space
• Overfitting can be controlled by soft margin approach
• Nice math property: a simple convex optimization problem which is guaranteed to
converge to a single global solution
• Feature Selection
Advantages of SVM
• Handling high-dimensional data: SVMs are effective in handling high-
dimensional data, which is common in many applications such as image
and text classification.
• Handling small datasets: SVMs can perform well with small datasets,
as they only require a small number of support vectors to define the
boundary.
• Modeling non-linear decision boundaries: SVMs can model non-linear
decision boundaries by using the kernel trick, which maps the data into a
higher-dimensional space where the data becomes linearly separable.
Advantages of SVM
• Robustness to noise: SVMs are robust to noise in the data, as the decision boundary is determined
by the support vectors, which are the closest data points to the boundary.
• Generalization: SVMs have good generalization performance, which means that they are able to
classify new, unseen data well.
• Versatility: SVMs can be used for both classification and regression tasks, and they can be applied
to a wide range of applications such as natural language processing, computer vision, and
bioinformatics.
• Sparse solution: SVMs have sparse solutions, which means that they only use a subset of the
training data to make predictions. This makes the algorithm more efficient and less prone to
overfitting.
• Regularization: SVMs can be regularized, which means that the algorithm can be modified to avoid
overfitting.
Disadvantages of SVM
• Computationally expensive: SVMs can be computationally expensive for large
datasets, as the algorithm requires solving a quadratic optimization problem.
• Choice of kernel: The choice of kernel can greatly affect the performance of an
SVM, and it can be difficult to determine the best kernel for a given dataset.
• Sensitivity to the choice of parameters: SVMs can be sensitive to the choice of
parameters, such as the regularization parameter, and it can be difficult to
determine the optimal parameter values for a given dataset.
• Memory-intensive: SVMs can be memory-intensive, as the algorithm requires
storing the kernel matrix, which can be large for large datasets.
Disadvantages of SVM
• Limited to two-class problems: SVMs are primarily used for two-class
problems, although multi-class problems can be solved by using one-versus-
one or one-versus-all strategies.
• Lack of probabilistic interpretation: SVMs do not provide a probabilistic
interpretation of the decision boundary, which can be a disadvantage in some
applications.
• Not suitable for large datasets with many features: SVMs can be very slow and
can consume a lot of memory when the dataset has many features.
• Not suitable for datasets with missing values: SVMs require complete datasets
with no missing values; they cannot handle missing values directly.
Applications of SVM
1.Face detection – SVMs are used to detect faces by classifying image
regions according to the trained classifier and model.
2.Text and hypertext categorization – Here, the classification technique is
used to find the important (required) information for organizing text.
3.Image classification – SVMs are also used to group images by comparing
pieces of information and acting accordingly.
Applications of SVM
1. Bioinformatics – SVMs are also used in medical science, for example in
laboratory work, DNA analysis, and research.
2. Handwriting recognition – SVMs are used to recognize handwritten
characters.
3. Protein fold and remote homology detection – SVMs are used to classify
proteins into functional and structural classes given their amino acid
sequences. This is one of the standard problems in bioinformatics.
4. Generalized predictive control (GPC) – SVMs are also used in generalized
predictive control, where the controller relies on a predictive model (such
as a multilayer feed-forward network) of the plant.
Applications of SVM
5. Facial expression classification – The SVM is a binary classification
technique. A facial expression classification model determines the precise
facial expression by modelling the differences between two facial images.
Validation techniques include the leave-one-out method and the K-fold
test method.
6. Speech recognition – The transcription of speech into text is called speech
recognition. Mel-frequency cepstral coefficient (MFCC) based features are
used to train support vector machines for recognizing speech. Speech
recognition is a challenging classification problem that is tackled with a
variety of techniques, including support vector machines and other
pattern recognition methods.
• For any query, write to:
[email protected]
