
Ex no:5 Naïve Bayesian classifier

Description:
The main idea behind the Naive Bayes classifier is to use Bayes' Theorem to classify data
based on the probabilities of the different classes given the features of the data. It is used
mostly in high-dimensional text classification. The Naive Bayes classifier is a simple
probabilistic classifier with very few parameters, and it builds ML models that can make
predictions faster than many other classification algorithms. Naive Bayes is called "naive"
because it assumes that the features of a data point are independent of each other. The
Naïve Bayes algorithm is used in spam filtration, sentiment analysis, classifying articles, etc.
Bayes' Theorem:
Bayes' theorem, also known as Bayes' Rule or Bayes' Law, is used to determine the
probability of a hypothesis from prior knowledge. It depends on conditional probability.
The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) * P(A) / P(B)

Where,
P(A|B) is the Posterior probability: the probability of hypothesis A given the observed event B.
P(B|A) is the Likelihood: the probability of the evidence B given that hypothesis A is true.
P(A) is the Prior probability: the probability of hypothesis A before the evidence is observed.
P(B) is the Marginal probability: the probability of the evidence B.
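
The theorem can be checked with a one-line helper in Python (a minimal sketch; the
numbers in the call below come from the weather example worked out later in this exercise):

def bayes_posterior(likelihood, prior, evidence):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / evidence

# P(Yes | Overcast) = P(Overcast | Yes) * P(Yes) / P(Overcast)
print(bayes_posterior(4/9, 9/14, 4/14))  # -> 1.0 (up to floating-point rounding)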

Steps in Naïve Bayesian Classifier Algorithm

A Naive Bayes classifier calculates the probability of an event in the following steps:

Step 1: Calculate the prior probability for the given class labels.
Step 2: Find the likelihood probability of each attribute for each class.
Step 3: Put these values into Bayes' formula and calculate the posterior probability.
Step 4: Assign the input to the class with the higher posterior probability, as traced in the
sketch below.
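
These four steps can be traced in a short from-scratch sketch in Python (a minimal
illustration with a single categorical feature; the outlook/play counts are an assumption
that follows the classic play-tennis dataset used in the example below):

from collections import Counter

# Toy training data: (outlook, play) pairs with 9 Yes / 5 No and 4 Overcast days
data = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "No"),
        ("Sunny", "Yes"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
        ("Overcast", "Yes"), ("Rainy", "No")]

n = len(data)
class_counts = Counter(label for _, label in data)   # used for Step 1 (priors)
joint_counts = Counter(data)                         # (outlook, label) counts

def posterior(outlook, label):
    prior = class_counts[label] / n                                    # Step 1
    likelihood = joint_counts[(outlook, label)] / class_counts[label]  # Step 2
    evidence = sum(1 for o, _ in data if o == outlook) / n
    return likelihood * prior / evidence                               # Step 3

for label in class_counts:                                             # Step 4
    print(label, round(posterior("Overcast", label), 2))
# Prints: No 0.0 and Yes 1.0 -> predict "Yes" for an overcast day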
Example:

Given records of weather conditions and whether the players played a sport, calculate the
probability of playing. That is, classify whether the players will play or not, based on the
weather condition.

The frequency table contains the occurrences of each label for every feature value. Two
likelihood tables are derived from it: Likelihood Table 1 shows the prior probabilities of the
labels, and Likelihood Table 2 shows the conditional (likelihood) probability of each weather
value given a label.

Suppose you want to calculate the probability of playing when the weather is overcast.

Probability of playing:

P(Yes | Overcast) = P(Overcast | Yes) P(Yes) / P (Overcast) .....................(1)

Calculate the prior and marginal probabilities:

P(Overcast) = 4/14 = 0.29

P(Yes) = 9/14 = 0.64

Calculate the likelihood:

P(Overcast | Yes) = 4/9 = 0.44

Put these values into equation (1):

P(Yes | Overcast) = 0.44 * 0.64 / 0.29 ≈ 0.97 (Higher; exactly 1 with unrounded fractions)

Probability of not playing:

P(No | Overcast) = P(Overcast | No) P(No) / P (Overcast) .....................(2)

Calculate the prior and marginal probabilities:

P(Overcast) = 4/14 = 0.29

P(No) = 5/14 = 0.36

Calculate the likelihood:

P(Overcast | No) = 0/5 = 0

Put these values into equation (2):

P(No | Overcast) = 0 * 0.36 / 0.29 = 0

The probability of the 'Yes' class is higher, so you can conclude that if the weather is
overcast, the players will play the sport.
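
The same result can be reproduced with scikit-learn (a minimal sketch; the training rows are
the assumed play-tennis distribution from the earlier sketch, with the outlook encoded as an
integer and the smoothing parameter set near zero so the probabilities match the hand
calculation):

import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Outlook encoded as 0 = Sunny, 1 = Overcast, 2 = Rainy (same 14 rows as above)
X = np.array([[0], [0], [1], [2], [2], [2], [1], [0], [0], [2], [0], [1], [1], [2]])
y = np.array(["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes", "Yes",
              "Yes", "Yes", "Yes", "No"])

clf = CategoricalNB(alpha=1e-10)  # near-zero Laplace smoothing
clf.fit(X, y)
print(clf.predict([[1]]))                 # -> ['Yes'] for Overcast
print(clf.predict_proba([[1]]).round(2))  # -> [[0. 1.]]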

Problem statements:

1. Given a collection of text documents belonging to multiple categories (e.g., news
articles, emails, or product reviews), manually classifying them is inefficient and prone to
errors. Develop a Naïve Bayes-based text classification system in Python to automatically
categorize documents into predefined labels. Investigate how different preprocessing
techniques (such as stopword removal, stemming, and TF-IDF vectorization) impact
classification accuracy, precision, and recall. The goal is to build an efficient, scalable, and
interpretable model for real-world applications like spam detection, sentiment analysis,
and topic categorization.
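
A minimal starting point for this problem, assuming a hypothetical toy corpus (a real
solution would load an actual labelled dataset): a TF-IDF vectorizer with stopword removal
feeding a multinomial Naïve Bayes model.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; replace with real documents and labels
train_docs = ["win a free prize now", "cheap meds online", "meeting at 10 am",
              "project report attached", "free lottery ticket", "lunch tomorrow?"]
train_labels = ["spam", "spam", "ham", "ham", "spam", "ham"]

# TF-IDF vectorization with English stopword removal, then Multinomial NB
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(train_docs, train_labels)

print(model.predict(["free prize inside", "see you at the meeting"]))
# Expected with this toy data: ['spam' 'ham']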

2. Given a dataset of breast cancer patient records, the task is to build a Naïve Bayes
Classifier to predict whether a tumor is malignant or benign based on various features
such as tumor size, texture, and other medical measurements.
Breast Cancer dataset
mean_radius  mean_texture  mean_perimeter  mean_area  mean_smoothness  diagnosis
17.99        10.38         122.8           1001       0.1184           0
20.57        17.77         132.9           1326       0.08474          0
19.69        21.25         130             1203       0.1096           0
11.42        20.38         77.58           386.1      0.1425           0
20.29        14.34         135.1           1297       0.1003           0
12.45        15.7          82.57           477.1      0.1278           0
16.13        20.68         108.1           798.8      0.117            0
19.81        22.15         130             1260       0.09831          0
13.54        14.36         87.46           566.3      0.09779          1
13.08        15.71         85.63           520        0.1075           1
9.504        12.44         60.34           273.9      0.1024           1
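
A minimal sketch for this problem, assuming the table above is saved as a CSV file named
breast_cancer.csv (a hypothetical filename) with the column headers shown. Gaussian Naïve
Bayes is the natural variant here because the features are continuous measurements; in
scikit-learn's version of this dataset, a diagnosis of 0 is malignant and 1 is benign.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

# Assumed CSV with the columns shown in the table above
df = pd.read_csv("breast_cancer.csv")
X = df[["mean_radius", "mean_texture", "mean_perimeter",
        "mean_area", "mean_smoothness"]]
y = df["diagnosis"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Gaussian NB models each feature as normally distributed within each class
clf = GaussianNB()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))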
