
Labsheet - 7

Naive Bayes
Machine Learning
BITS F464

I Semester 2023-24

The Naive Bayes (NB) algorithm is a simple application of Bayes' theorem to classification. The algorithm is called "naive" because it makes a couple of simplifying assumptions about the data:
1. All of the features in the dataset are equally important and independent.
2. It assumes class-conditional independence, meaning that features are independent of one another so long as they are conditioned on the same class value (see the formula below).
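Formally, class-conditional independence means that for a class value C and features F1, …, Fn:

P(F1, F2, …, Fn | C) = P(F1 | C) * P(F2 | C) * … * P(Fn | C)

which is what allows the per-feature probabilities to be multiplied together in the calculations below.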
Consider the sample dataset below, comprising the target concept, PlayTennis, and four features: Outlook, Temperature, Humidity, and Windy.

Outlook    Temperature  Humidity  Windy   PlayTennis
sunny      hot          high      weak    no
sunny      hot          high      strong  no
overcast   hot          high      weak    yes
rain       mild         high      weak    yes
rain       cool         normal    weak    yes
rain       cool         normal    strong  no
overcast   cool         normal    strong  yes
sunny      mild         high      weak    yes
sunny      cool         normal    weak    yes
rain       mild         normal    weak    yes
sunny      mild         normal    strong  yes
overcast   mild         high      strong  no
overcast   hot          normal    weak    yes
rain       mild         high      strong  no

Modeling the dataset using the Bayesian concept:

P(YES | X) = P(X | YES) P(YES) / P(X)
P(NO | X) = P(X | NO) P(NO) / P(X)

where X = (Outlook = Rain, Temperature = Mild, Humidity = High, Windy = Strong).
Because the denominator P(X) is the same in both cases, it can be ignored for now. Then,

The overall likelihood of YES:

P(PlayTennis=Yes) * P(Outlook=Rain | PlayTennis=Yes) * P(Temperature=Mild | PlayTennis=Yes) *
P(Humidity=High | PlayTennis=Yes) * P(Windy=Strong | PlayTennis=Yes)

The overall likelihood of NO:

P(PlayTennis=No) * P(Outlook=Rain | PlayTennis=No) * P(Temperature=Mild | PlayTennis=No) *
P(Humidity=High | PlayTennis=No) * P(Windy=Strong | PlayTennis=No)
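As a minimal sketch of this hand calculation (assuming the table above has been saved as Lab3/WeatherData.csv, the same file loaded in the R code later in this lab; cond_prob is a helper written here only for illustration):

# Minimal sketch of the hand calculation above, assuming the 14-row table
# is available as "Lab3/WeatherData.csv"
weather <- read.csv("Lab3/WeatherData.csv", stringsAsFactors = FALSE)

# P(feature = value | PlayTennis = class), estimated from frequency counts
cond_prob <- function(df, feature, value, class) {
  subset_class <- df[df$PlayTennis == class, ]
  sum(subset_class[[feature]] == value) / nrow(subset_class)
}

# Prior probabilities P(Yes) and P(No)
prior_yes <- mean(weather$PlayTennis == "yes")
prior_no  <- mean(weather$PlayTennis == "no")

# Unnormalized likelihoods for X = (rain, mild, high, strong)
like_yes <- prior_yes *
  cond_prob(weather, "Outlook", "rain", "yes") *
  cond_prob(weather, "Temperature", "mild", "yes") *
  cond_prob(weather, "Humidity", "high", "yes") *
  cond_prob(weather, "Windy", "strong", "yes")

like_no <- prior_no *
  cond_prob(weather, "Outlook", "rain", "no") *
  cond_prob(weather, "Temperature", "mild", "no") *
  cond_prob(weather, "Humidity", "high", "no") *
  cond_prob(weather, "Windy", "strong", "no")

# Normalize the two likelihoods to obtain posterior probabilities
c(yes = like_yes, no = like_no) / (like_yes + like_no)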

The Naive Bayes classification algorithm we used in the preceding example can be summarized by the following formula. Essentially, the probability of level L of class C, given the evidence provided by features F1 through Fn, is proportional to the prior probability of the class level multiplied by the product of the probabilities of each piece of evidence conditioned on the class level:

P(CL | F1, …, Fn) ∝ P(CL) ∏i=1..n P(Fi | CL)
# Install (if needed) and load the e1071 package
# install.packages("e1071")
library("e1071")
# Load the dataset
data = read.csv("Lab3/WeatherData.csv", stringsAsFactors = FALSE)
data
# Display the structure of the dataset
str(data)
# Encode the target vector as a factor
data$PlayTennis <- factor(data$PlayTennis)
# Display the frequency of each target class
table(data$PlayTennis)
# Display the proportion of each target class
prop.table(table(data$PlayTennis))
# Build the model (features as predictors, PlayTennis as the target)
classifier <- naiveBayes(data[ , names(data) != "PlayTennis"], data$PlayTennis)
classifier
# Generate test sample (X)
Outlook <- "rain"
Temperature <- "mild"
Humidity <- "high"
Windy <- "strong"
# Load the instance X as a data frame
test_data <- data.frame(Outlook, Temperature, Humidity, Windy)
# Predict the target class for the given instance X
test_pred <- predict(classifier, test_data)
test_pred
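To inspect the posterior probabilities behind the predicted label (rather than only the class), predict() for naiveBayes models also accepts type = "raw"; a brief sketch:

# Posterior probabilities P(Yes | X) and P(No | X) for the same test instance
test_prob <- predict(classifier, test_data, type = "raw")
test_prob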

The Laplace Estimator


An additional issue to be aware of: since naive Bayes uses the product of feature probabilities conditioned on each class, we run into a serious problem when new data includes a feature value that never occurs for one or more levels of a response class. The result is P(xi | Ck) = 0 for that individual feature, and this zero ripples through the entire multiplication over all features, forcing the posterior probability for that class to zero.
A solution to this problem is the Laplace smoother. The Laplace smoother adds a small number to each of the counts in the frequency table for each feature, which ensures that each feature has a nonzero probability of occurring for each class.

Li = (Ci + S) / (N + K)
where
Ci: count of tuples satisfying the test condition
S: Laplace parameter (a small number, usually 1)
N: total number of tuples belonging to that class value
K: number of distinct values of the particular feature
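As a minimal sketch of applying this formula (the function name laplace_estimate and the example counts below are hypothetical, not taken from the lab data):

# Laplace-smoothed estimate: (count + S) / (class total + no. of distinct values)
laplace_estimate <- function(Ci, N, K, S = 1) {
  (Ci + S) / (N + K)
}

# Example: a feature value that never occurs for a class (Ci = 0) now gets a
# small nonzero probability instead of 0
laplace_estimate(Ci = 0, N = 5, K = 3)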

library("e1071")
data1 = read.csv("Lab3/WeatherData.csv",stringsAsFactors = FALSE)
data1
str(data1)
data1$Salary <- factor(data1$Humidity)
table(data1$Humidity)
prop.table(table(data1$Humidity))

classifier1 <- naiveBayes(data1,data1$Humidity,laplace=1)


classifier1
Outlook <- "sunny"
temperature <- "hot"
Windy <- "weak"
test_data1 <- data.frame(Outlook,temperature,Windy)
test_pred1 <- predict(classifier1,test_data1)
test_pred1

Dataset Preparation –Training and Test datasets


We split the dataset into two portions: a training dataset used to build the Naive Bayes model, and a test dataset used to evaluate the model's performance on new data. We will use 80 percent of the data for training and 20 percent for testing; the held-out 20 percent of the records simulates new data for which the Salary class must be predicted.
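Note that sample() draws a random subset, so the split below changes between runs; an optional one-line sketch for making it reproducible (the seed value 123 is arbitrary):

# Fix the random number generator so the 80/20 split is reproducible
set.seed(123)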
library("e1071")
data1 = read.csv("Lab 3/Sales.csv",stringsAsFactors = FALSE)
data1
str(data1)
df=as.data.frame(data1)
df
repseq_len(nrow(df))
repeating_sequence = rep.int(seq_len(nrow(df)),df$Count)
repeating_sequence
dataset = df[repeating_sequence,]
dataset
#We no longer need the frequency, drop the feature
dataset$Count = NULL
dataset
sample_set <- sample(nrow(dataset), round(nrow(dataset)*.80), replace= FALSE)
train_data <-dataset[sample_set, ]
test_data <- dataset[-sample_set, ]
prop.table(table(dataset$Salary))
Naive_classifier = naiveBayes(Salary ~ .,data=train_data,laplace=6)
NB_Prediction = predict(Naive_classifier,test_data)
NB_Prediction
tab2 = table(NB_Prediction,test_data$Salary)
Accuracy = sum(diag(tab2)) / sum(tab2)
Accuracy
Exercise
1. Apply the Naïve Bayes algorithm to the Titanic dataset, doing the preprocessing as required. Fit the model on the complete dataset and predict survival for each instance, i.e. yes or no. Also build a confusion matrix and find the accuracy of the Naïve Bayes model designed for the Titanic survival class.
2. Download the Nursery Data Set (from this link: https://archive.ics.uci.edu/ml/datasets/nursery). Build a Naive Bayes classifier to predict finance. You may only use the categorical variables and ignore the continuous variables. Check the accuracy of the model with and without Laplace smoothing and report which model performed better.
3. Which data preprocessing technique is most suited for NBC and why?
4. What are the pros and cons of doing away with the assumptions of NBC?
5. How many probability calculations are involved in NBC for a classification problem involving attributes Xi with cardinality Ci (i = 1, 2, …, M) and class labels Yj (j = 1, 2, …, P)?
