Naive Bayes Classification Numerical Example - Coding Infinite
According to the Bayes theorem, the probability of an event A, given that another event B has occurred, is calculated as follows.

P(A/B)=P(B/A)*P(A)/P(B)

Here, P(A/B) is the posterior probability of A given B, P(B/A) is the likelihood of B given A, P(A) is the prior probability of A, and P(B) is the probability of the evidence B.
The Bayes theorem is directly derived from the formulas of conditional probability. Recall the two conditional probability formulas given below.

P(A/B)=P(A∩B)/P(B)

Here, P(A/B) is the probability of event A occurring given that event B has already occurred, P(A∩B) is the probability of both A and B occurring, and P(B) is the probability of event B.

Similarly,

P(B/A)=P(A∩B)/P(A)

Here, P(B/A) is the probability of event B occurring given that event A has already occurred.

Now, if we extract the probability P(A∩B) from both formulas, we get the following.

P(A∩B)=P(B/A)*P(A)
P(A∩B)=P(A/B)*P(B)

Equating the two expressions for P(A∩B) gives:

P(B/A)*P(A)=P(A/B)*P(B)
From the above equation, we can obtain the posterior probability P(A/B), and symmetrically P(B/A), as shown below.
P(A/B)=P(B/A)*P(A)/P(B)
P(B/A)=P(A/B)*P(B)/P(A)
The above two formulas represent the Bayes theorem in alternate forms.
For example, suppose you are given a standard deck of 52 cards. You have to find the probability of a card being a King if you know that it is a face card.

Let A be the event that the card is a face card and B be the event that the card is a King. We will use the Bayes theorem in the following form.

P(B/A)=P(A/B)*P(B)/P(A)

To apply it, we need the following probabilities.
P(A) i.e. the probability of a card being a face card. As there are 12 face cards out of
52, P(A)=12/52.
P(B) i.e. the probability of a card being a King. As there are 4 Kings, P(B)=4/52.
P(A/B) i.e. the probability of a card being a face card given that it is a King. As all the Kings are face cards, P(A/B)=1.
Now, using Bayes theorem, we can easily find the probability of a card being a King if it is
a face card.
P(B/A)=P(A/B)*P(B)/P(A)
=1*(4/52)/(12/52)
=4/12
=1/3
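The same arithmetic is easy to check in a couple of lines of Python. The snippet below is just a quick sketch of the Bayes theorem formula applied to this card example; the fractions module is used only to keep the numbers exact.

```python
from fractions import Fraction

# A = the card is a face card, B = the card is a King
p_A = Fraction(12, 52)      # P(A): 12 face cards in a 52-card deck
p_B = Fraction(4, 52)       # P(B): 4 Kings in the deck
p_A_given_B = Fraction(1)   # P(A/B): every King is a face card

# Bayes theorem: P(B/A) = P(A/B) * P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A
print(p_B_given_A)          # prints 1/3
```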
Hence, the probability of a card being a King, if it is a face card, is 1/3. I hope that you
have understood the Bayes theorem at this point. Now, let us discuss the Naive Bayes
classification algorithm.
In the Naive Bayes algorithm, we assume that the features in the input dataset are
independent of each other. In other words, each feature in the input dataset
independently decides the target variable or class label and is not affected by other
features. While this assumption doesn’t hold true for most real-world classification problems, Naive Bayes classification is still one of the go-to algorithms for classification due to its simplicity.
The Naive Bayes classification algorithm works in the following steps.
1. First, we calculate the prior probability of each class label in the training dataset.
2. Next, we calculate the conditional probability of each attribute value given each class label in the training data.
3. Finally, we use the Bayes theorem and the calculated probabilities to predict class labels for new data points. For this, we calculate the probability of the new data point belonging to each class. The class for which we get the maximum probability is assigned to the new data point.
To understand the above steps using a naive Bayes classification numerical example, we
will use the following dataset.
Sl. No. Color Legs Height Smelly Species
1 White 3 Short Yes M
2 Green 2 Tall No M
3 Green 3 Tall Yes M
4 White 3 Tall Yes M
5 Green 2 Short No H
6 White 2 Tall No H
7 White 2 Tall No H
8 White 2 Short Yes H
Using the above data, we have to identify the species of a new entity X with the following attributes.

X = {Color=Green, Legs=2, Height=Tall, Smelly=No}
To predict the class label for the above attribute set, we will first calculate the probability
of the species being M or H in total.
P(Species=M)=4/8=0.5
P(Species=H)=4/8=0.5
Next, we will calculate the conditional probability of each attribute value for each class
label.
P(Color=White/Species=M)=2/4=0.5
P(Color=White/Species=H)=3/4=0.75
P(Color=Green/Species=M)=2/4=0.5
P(Color=Green/Species=H)=1/4=0.25
P(Legs=2/Species=M)=1/4=0.25
P(Legs=2/Species=H)=4/4=1
P(Legs=3/Species=M)=3/4=0.75
P(Legs=3/Species=H)=0/4=0
P(Height=Tall/Species=M)=3/4=0.75
P(Height=Tall/Species=H)=2/4=0.5
P(Height=Short/Species=M)=1/4=0.25
P(Height=Short/Species=H)=2/4=0.5
P(Smelly=Yes/Species=M)=3/4=0.75
P(Smelly=Yes/Species=H)=1/4=0.25
P(Smelly=No/Species=M)=1/4=0.25
P(Smelly=No/Species=H)=3/4=0.75
We can tabulate the above calculations in the tables for better visualization.
Color M H
White 0.5 0.75
Green 0.5 0.25

Legs M H
2 0.25 1
3 0.75 0

Height M H
Tall 0.75 0.5
Short 0.25 0.5

Smelly M H
Yes 0.75 0.25
No 0.25 0.75
Now that we have calculated the conditional probabilities, we will use them to calculate the probability of the new attribute set X belonging to each class. For each class, we multiply the class prior by the conditional probabilities of the attribute values of X. (The common denominator P(X) from the Bayes theorem is dropped because it does not affect which class gets the larger score.)
P(M/X)=P(Species=M)*P(Color=Green/Species=M)*P(Legs=2/Species=M)*P(Height=Tall/Species=M)*P(Smelly=No/Species=M)
=0.5*0.5*0.25*0.75*0.25
=0.0117
P(H/X)=P(Species=H)*P(Color=Green/Species=H)*P(Legs=2/Species=H)*P(Height=Tall/Species=H)*P(Smelly=No/Species=H)
=0.5*0.25*1*0.5*0.75
=0.0469
So, the probability of X belonging to Species M is proportional to 0.0117 and the probability of X belonging to Species H is proportional to 0.0469. Since the latter is larger, we will assign the entity X with attributes {Color=Green, Legs=2, Height=Tall, Smelly=No} to species H.
In this way, we can predict the class label for any number of new data points.
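The whole calculation can also be reproduced with a short script. The sketch below is only illustrative: it follows the three steps listed earlier (class priors, per-attribute conditional probabilities, and the product used to score a new data point) on the toy dataset from the table above, without any smoothing or use of a library.

```python
from collections import Counter, defaultdict

# Toy dataset from the table above: (Color, Legs, Height, Smelly, Species)
data = [
    ("White", 3, "Short", "Yes", "M"),
    ("Green", 2, "Tall",  "No",  "M"),
    ("Green", 3, "Tall",  "Yes", "M"),
    ("White", 3, "Tall",  "Yes", "M"),
    ("Green", 2, "Short", "No",  "H"),
    ("White", 2, "Tall",  "No",  "H"),
    ("White", 2, "Tall",  "No",  "H"),
    ("White", 2, "Short", "Yes", "H"),
]
features = ["Color", "Legs", "Height", "Smelly"]

# Step 1: prior probability of each class label
class_counts = Counter(row[-1] for row in data)
priors = {label: count / len(data) for label, count in class_counts.items()}

# Step 2: conditional probability of each attribute value given each class label
cond = defaultdict(lambda: defaultdict(float))
for row in data:
    label = row[-1]
    for name, value in zip(features, row[:-1]):
        cond[label][(name, value)] += 1 / class_counts[label]

# Step 3: score a new data point and assign the class with the largest score
def predict(x):
    scores = {}
    for label, prior in priors.items():
        score = prior
        for name in features:
            score *= cond[label][(name, x[name])]
        scores[label] = score
    return max(scores, key=scores.get), scores

x = {"Color": "Green", "Legs": 2, "Height": "Tall", "Smelly": "No"}
label, scores = predict(x)
print(scores)  # {'M': 0.01171875, 'H': 0.046875}
print(label)   # H
```

The two scores match the hand-calculated values of 0.0117 and 0.0469 before rounding, and the predicted species is H, as above.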
There are three commonly used types of Naive Bayes classifiers, depending on the distribution assumed for the input features.

Gaussian Naive Bayes Classifier: The Gaussian Naive Bayes classifier assumes that the attributes of the dataset follow a normal (Gaussian) distribution. It is used when the attributes have continuous values, which the model assumes are sampled from a Gaussian distribution.
Multinomial Naive Bayes Classifier: When the input data is multinomially
distributed, we use the multinomial naive Bayes classifier. This algorithm is primarily
used for document classification problems like sentiment analysis.
Bernoulli Naive Bayes Classifier: The Bernoulli Naive Bayes classifier works in a similar
manner to the multinomial classification. The difference is that the attributes of the
dataset contain boolean values representing the presence or absence of a particular
attribute in a data point.
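If you use scikit-learn, these three variants are available as ready-made classes (GaussianNB, MultinomialNB, and BernoulliNB). The snippet below is a minimal sketch with small made-up arrays, only to show which kind of feature each variant expects; it is not tied to the dataset used above.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])  # two classes, two training rows each

# GaussianNB: continuous features assumed to be normally distributed per class
X_continuous = np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.9, 3.2]])
print(GaussianNB().fit(X_continuous, y).predict([[6.0, 3.0]]))

# MultinomialNB: count features, e.g. word counts in documents
X_counts = np.array([[2, 0, 1], [3, 1, 0], [0, 4, 2], [1, 3, 3]])
print(MultinomialNB().fit(X_counts, y).predict([[0, 2, 1]]))

# BernoulliNB: binary features marking the presence or absence of an attribute
X_binary = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 1, 0]])
print(BernoulliNB().fit(X_binary, y).predict([[1, 0, 0]]))
```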
The Naive Bayes classification algorithm has the following advantages.

The naive Bayes classification algorithm is one of the fastest and easiest machine learning algorithms for classification.
We can use the Naive Bayes classification algorithm for building binary as well as
multi-class classification models.
The Naive Bayes algorithm performs better than many other classification algorithms on multi-class classification problems.
Apart from its advantages, the naive Bayes classification algorithm also has some
drawbacks. The algorithm assumes that the attributes of the training dataset are
independent of each other. This assumption is not always true. Hence, when there is a
correlation between two attributes in a given training set, the naive Bayes algorithm will
not perform well.
Some popular applications of the Naive Bayes classification algorithm are as follows.

The most popular use of the Naive Bayes classification algorithm is in text classification. We often build spam filtering and sentiment analysis models using the naive Bayes algorithm, as shown in the sketch after this list.
We can use the Naive Bayes classification algorithm to build applications to predict
the credit score and loan worthiness of customers in a bank.
The Naive Bayes classifier is an eager learner. Hence, we can use it for real-time
predictions too.
We can also use the Naive Bayes classification algorithm to implement models for
detecting diseases based on the medical results of the patients.
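As an illustration of the text classification use case mentioned above, here is a minimal spam-filter-style sketch using scikit-learn's CountVectorizer together with MultinomialNB. The messages and labels are made up purely for demonstration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up training messages labelled as spam (1) or not spam (0)
messages = [
    "win a free prize now",
    "limited offer claim your reward",
    "meeting rescheduled to friday",
    "please review the attached report",
]
labels = [1, 1, 0, 0]

# Bag-of-words counts feed the multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["claim your free reward",
                     "see the report before the meeting"]))
# Expected output on this toy data: [1 0]
```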
Conclusion
In this article, we discussed the Bayes theorem and the Naive Bayes classification
algorithm with a numerical example. To learn more about machine learning algorithms,
you can read this article on KNN classification numerical example. You might also like this
article on overfitting and underfitting in machine learning.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
Aditya