Foundations of Data Science - Unit 6 - Naive Bayes

The document covers the Naive Bayes classifier, a machine learning algorithm. It begins with a review of probability theory and Bayes' theorem, then explains how the Naive Bayes classifier uses conditional independence assumptions to simplify the calculation of class probabilities. The classifier predicts the class with the highest posterior probability given the attributes, estimating the attribute probabilities for each class from training data. Worked examples demonstrate calculating posterior probabilities with Bayes' theorem.


Data Science
Unit 6

Naïve Bayes Classifier


Outline
▪ Supervised Learning
▪ Revision of Probability Theory and Bayes' Theorem
▪ Naïve Bayes Classifier


Notation

▪ P(A) – Probability of an Event A


▪ P(B|A) – Probability of an Event B given Event A
▪ Also called the Conditional Probability of B given A
▪ So what is the probability of you passing the exam if the teacher is angry at you?
P(Passing | Teacher angry) = ?

Conditional Probability
▪ Independent Events – each event is not affected by any other event
▪ Example – tossing a coin
▪ Dependent Events – an event is affected by previous events
▪ Example – drawing marbles from a bag without replacement


Conditional Probability Example


        B           ~B
        C     ~C    C     ~C
A       12    5     9     2
~A      4     8     20    4

▪ P(A | B, C) = 12 / (12 + 4) = 12/16
▪ P(A, B | ~C) = 5 / (5 + 8 + 2 + 4) = 5/19
▪ P(B | ~A, C) = 4 / (4 + 20) = 4/24

Conditional Independence
▪ Two events A and B are independent if knowing that A has
happened does not say anything about B happening.
P(A, B) = P(A) P(B)
P(A | B) = P(A)
▪ Two events A and B are conditionally independent given a
third event C precisely if the occurrence or non-
occurrence of A and B are independent events in their
conditional probability distribution given C.
P(A, B | C) = P(A | C) P(B | C)
P(A | B, C) = P(A | C)


Bayes Theorem
▪ P(A | B) = P(B | A) P(A) / P(B)
           = P(B | A) P(A) / [P(B | A) P(A) + P(B | ~A) P(~A)]
▪ P(A) is the prior probability and P(A | B) is the posterior probability.

▪ Suppose events A1, A2, …, Ak are mutually exclusive and exhaustive; i.e., exactly one of the events must occur. Then for any event B:
P(Ai | B) = P(B | Ai) P(Ai) / Σj P(B | Aj) P(Aj)
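
The expansion of the denominator is just the law of total probability. A minimal sketch in Python (the function name and signature are illustrative, not from the lecture):

    def posterior(prior_a, p_b_given_a, p_b_given_not_a):
        """Bayes' theorem: P(A | B) from P(A), P(B | A), and P(B | ~A)."""
        # Denominator via the law of total probability:
        # P(B) = P(B | A) P(A) + P(B | ~A) P(~A)
        p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
        return p_b_given_a * prior_a / p_b

The two examples that follow can be checked by calling this function with the given numbers.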


Example I
▪ According to the American Lung Association, 7% of the population has lung disease. Of those having lung disease, 90% are smokers; of those not having lung disease, 25.3% are smokers.
▪ Determine the probability that a randomly selected smoker has lung disease.


Example I Solution
▪ Let L = Lung Disease, S = Smoker
▪ Given that
▪ P(L) = 0.07
▪ P(S | L) = 0.90 P(~S | L) = 0.10
▪ P(S | ~L) = 0.253 P(~S | ~L) = 0.747

▪ Find probability, P(L | S)
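
Working it through (arithmetic from the given numbers):
P(S) = P(S | L) P(L) + P(S | ~L) P(~L) = (0.90)(0.07) + (0.253)(0.93) ≈ 0.298
P(L | S) = P(S | L) P(L) / P(S) = 0.063 / 0.298 ≈ 0.21
So roughly 21% of smokers have lung disease.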


Example II
▪ Assume that about 1 in 1000 individuals in a given organization have
committed a security violation.
▪ Assume that the sensitivity of a routine screening polygraph is about
85%. That is, the probability that the polygraph report will indicate a
concern is about 85% if the individual has committed a security
violation.
▪ Assume the specificity of the polygraph is about 80%. That is, if the
individual has not committed a security violation, there is about an 80%
chance that the polygraph report will not indicate a concern.
▪ What is the posterior probability that an individual whose polygraph
report indicates a concern has committed a security violation?


Example II Solution
▪ Let
▪ S = Security Violation Committed,
▪ T = Test Positive

▪ Given that
▪ P(S) = 0.001
▪ P(T | S) = 0.85 P(~T | S) = 0.15
▪ P(T | ~S) = 0.20 P(~T | ~S) = 0.80

▪ Find probability, P(S | T)
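
Working it through:
P(T) = P(T | S) P(S) + P(T | ~S) P(~S) = (0.85)(0.001) + (0.20)(0.999) ≈ 0.2007
P(S | T) = 0.00085 / 0.2007 ≈ 0.0042
So despite the polygraph's good sensitivity and specificity, only about 0.4% of flagged individuals have actually committed a violation, because the prior is so small.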


Bayesian Classifiers
▪ Consider each attribute and class label as random variables
▪ Given a record with attributes (A1, A2,…,An)
▪ Goal is to predict class C
▪ Specifically, we want to find the value of C that maximizes P(C| A1, A2,…,An )
▪ Can we estimate P(C| A1, A2,…,An ) directly from data?


Bayesian Classifiers
▪ Approach:
▪ Compute the posterior probability P(C | A1, A2, …, An) for all values of C using Bayes' theorem
▪ Choose the value of C that maximizes P(C | A1, A2, …, An)
▪ This is equivalent to choosing the value of C that maximizes P(A1, A2, …, An | C) P(C), since the denominator P(A1, A2, …, An) is the same for every class
▪ How to estimate P(A1, A2, …, An | C)?


Naïve Bayes Classifier


▪ Naïve Bayes classifiers assume that the effect of an attribute value
on a given class is independent of the values of the other
attributes.
▪ This assumption is called class conditional independence.
▪ It is made to simplify the computations involved and, in this sense,
is considered “naïve”.
▪ Remember:
▪ Two events A and B are conditionally independent given a third event C precisely if the
occurrence or non-occurrence of A and B are independent events in their conditional
probability distribution given C.
P(A, B | C) = P(A | C) P(B | C)


Naïve Bayes Classifier

▪ Under this assumption, P(A1, A2, …, An | C) = P(A1 | C) P(A2 | C) … P(An | C)
▪ The classifier therefore predicts the class C that maximizes P(C) P(A1 | C) P(A2 | C) … P(An | C)

How to Estimate Probabilities from Data?

Training data (Refund and Marital Status are categorical, Taxable Income is continuous, Evade is the class):

Tid  Refund  Marital Status  Taxable Income  Evade
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

▪ Class priors: P(C) = Nc / N
▪ e.g., P(No) = 7/10, P(Yes) = 3/10
▪ For discrete attributes: P(Ai | Ck) = |Aik| / Nck
▪ where |Aik| is the number of instances having attribute value Ai and belonging to class Ck
▪ Examples:
P(Status=Married | No) = 4/7
P(Refund=Yes | Yes) = 0

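A minimal sketch of these counting estimates in Python (data transcribed from the table above; the continuous Taxable Income column is omitted, since it would need e.g. a Gaussian model rather than counts; all names are illustrative):

    from collections import Counter

    # (Refund, Marital Status, Evade) for the 10 training records
    records = [
        ("Yes", "Single", "No"), ("No", "Married", "No"),
        ("No", "Single", "No"), ("Yes", "Married", "No"),
        ("No", "Divorced", "Yes"), ("No", "Married", "No"),
        ("Yes", "Divorced", "No"), ("No", "Single", "Yes"),
        ("No", "Married", "No"), ("No", "Single", "Yes"),
    ]

    # Class priors: P(C) = Nc / N
    class_counts = Counter(evade for _, _, evade in records)
    priors = {c: n / len(records) for c, n in class_counts.items()}

    # Discrete conditional probability: P(Ai | Ck) = |Aik| / Nck
    def conditional(attr_index, value, cls):
        matches = sum(1 for r in records if r[attr_index] == value and r[2] == cls)
        return matches / class_counts[cls]

    print(priors)                            # {'No': 0.7, 'Yes': 0.3}
    print(conditional(1, "Married", "No"))   # 4/7 ≈ 0.571
    print(conditional(0, "Yes", "Yes"))      # 0.0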

Naïve Bayes
Classification: Mammals vs. Non-mammals

Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals

▪ Train the model (learn the parameters) using the given data set.
▪ Apply the learned model to new cases, e.g.:

Give Birth  Can Fly  Live in Water  Have Legs  Class
yes         no       yes            no         ?


Naïve Bayes
Classification: Mammals vs. Non-mammals

Using the same training data (7 mammals, 13 non-mammals out of 20), let A denote the attributes of the new case (Give Birth = yes, Can Fly = no, Live in Water = yes, Have Legs = no), M = mammals, N = non-mammals:

P(A | M) = 6/7 × 6/7 × 2/7 × 2/7 ≈ 0.06
P(A | N) = 1/13 × 10/13 × 3/13 × 4/13 ≈ 0.0042

P(A | M) P(M) = 0.06 × 7/20 = 0.021
P(A | N) P(N) = 0.0042 × 13/20 ≈ 0.0027

P(A | M) P(M) > P(A | N) P(N) => Mammals
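The same arithmetic, sketched in Python (counts transcribed from the table; illustrative only):

    # Likelihoods for the query: Give Birth=yes, Can Fly=no, Live in Water=yes, Have Legs=no
    p_a_given_m = (6/7) * (6/7) * (2/7) * (2/7)       # ≈ 0.0600
    p_a_given_n = (1/13) * (10/13) * (3/13) * (4/13)  # ≈ 0.0042

    score_m = p_a_given_m * 7 / 20    # ≈ 0.0210
    score_n = p_a_given_n * 13 / 20   # ≈ 0.0027
    print("mammals" if score_m > score_n else "non-mammals")  # mammals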

Example: Play Tennis

Outlook   Temperature  Humidity  Windy  Class
sunny     hot          high      false  N
sunny     hot          high      true   N
overcast  hot          high      false  P
rain      mild         high      false  P
rain      cool         normal    false  P
rain      cool         normal    true   N
overcast  cool         normal    true   P
sunny     mild         high      false  N
sunny     cool         normal    false  P
rain      mild         normal    false  P
sunny     mild         normal    true   P
overcast  mild         high      true   P
overcast  hot          normal    false  P
rain      mild         high      true   N

Estimated probabilities:

outlook:      P(sunny|p) = 2/9     P(sunny|n) = 3/5
              P(overcast|p) = 4/9  P(overcast|n) = 0
              P(rain|p) = 3/9      P(rain|n) = 2/5
temperature:  P(hot|p) = 2/9       P(hot|n) = 2/5
              P(mild|p) = 4/9      P(mild|n) = 2/5
              P(cool|p) = 3/9      P(cool|n) = 1/5
humidity:     P(high|p) = 3/9      P(high|n) = 4/5
              P(normal|p) = 6/9    P(normal|n) = 2/5
windy:        P(true|p) = 3/9      P(true|n) = 3/5
              P(false|p) = 6/9     P(false|n) = 2/5

Class priors: P(P) = 9/14, P(N) = 5/14

New case:

Outlook  Temperature  Humidity  Windy  Class
rain     hot          high      false  ?
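
Working through the new case (arithmetic from the probabilities above):
P(X | p) P(p) = 3/9 × 2/9 × 3/9 × 6/9 × 9/14 ≈ 0.0106
P(X | n) P(n) = 2/5 × 2/5 × 4/5 × 2/5 × 5/14 ≈ 0.0183
Since 0.0183 > 0.0106, the new case is classified as N (don't play).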


Characteristics of Naïve Bayes Classifier


▪ They are robust to isolated noise points, because such points are averaged out when conditional probabilities are estimated from the data.
▪ Naïve Bayes classifiers can handle missing values by ignoring the example during model building and classification.
▪ They are robust to irrelevant attributes: if Xi is an irrelevant attribute, then P(Xi | Y) becomes almost uniformly distributed.
▪ Correlated attributes can degrade performance, because the conditional independence assumption no longer holds for such attributes.


How Effective are Bayesian Classifiers?

▪ Various empirical studies comparing this classifier with decision tree and neural network classifiers have found it to be comparable in some domains.
▪ In theory, Bayesian classifiers have the minimum error rate in comparison with all other classifiers.
▪ In practice, however, this is not always the case, owing to inaccuracies in the assumptions made for its use, such as class conditional independence, and to the lack of available probability data.
