Data Mining - Module 7
Province of Cotabato
Municipality of Makilala
MAKILALA INSTITUTE OF SCIENCE AND TECHNOLOGY
Makilala, Cotabato
III. REFERENCES
Main Textbook
Tan, P.-N., Steinbach, M., Karpatne, A., & Kumar, V. (2019). Introduction to Data Mining (2nd ed.).
Han, J., Kamber, M., & Pei, J. (2015). Data Mining: Concepts and Techniques (3rd ed.).
Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data Mining: Practical Machine Learning Tools and Techniques (4th ed.).
• Bayesian Rule

P(C|X) = P(X|C) P(C) / P(X)

Posterior = (Likelihood × Prior) / Evidence

[Figure: posterior probabilities P(C1|x) and P(C2|x) plotted against x, each taking values between 0 and 1]

PROFEL 3 – DATA MINING
Lesson 4: Naive Bayes
Bayes’ Theorem is a simple mathematical formula used for calculating conditional probabilities.
Conditional probability is a measure of the probability of an event occurring given that another event has
(by assumption, presumption, assertion, or evidence) occurred.
The formula is:

P(A|B) = P(B|A) P(A) / P(B)

It tells us how often A happens given that B happens, written P(A|B) and also called the posterior probability, when we know: how often B happens given that A happens, written P(B|A); how likely A is on its own, written P(A); and how likely B is on its own, written P(B).
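As a quick numeric sketch of the formula (the probabilities below are hypothetical, chosen only to illustrate the computation), Bayes' Theorem can be evaluated directly:

```python
# Hypothetical example: A = "has the condition", B = "test is positive".
p_a = 0.01              # P(A): prior
p_b_given_a = 0.9       # P(B|A): likelihood
p_b_given_not_a = 0.05  # P(B|not A)

# P(B) via the law of total probability.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))  # 0.1538
```

Note how a rare prior keeps the posterior modest even with a strong likelihood.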
• Learning Phase
P(Play=Yes) = 9/14     P(Play=No) = 5/14
• Test Phase
– Given a new instance,
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
– Look up tables
P(Outlook=Sunny|Play=Yes) = 2/9        P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=Yes) = 3/9     P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=Yes) = 3/9        P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9          P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14                     P(Play=No) = 5/14
– MAP rule
P(Yes|x') ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
P(No|x') ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206
Since 0.0206 > 0.0053, the MAP rule classifies x' as Play = No.
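The test-phase computation above can be reproduced directly. Below is a minimal sketch that multiplies the lookup-table values for x' = (Sunny, Cool, High, Strong) as exact fractions and compares the two class scores:

```python
from fractions import Fraction as F

# Conditional probabilities for x' = (Sunny, Cool, High, Strong),
# taken from the lookup tables in the lesson.
yes = F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9) * F(9, 14)  # P(x'|Yes) P(Yes)
no = F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5) * F(5, 14)   # P(x'|No) P(No)

print(float(yes))  # ~0.0053
print(float(no))   # ~0.0206

# MAP rule: pick the class with the larger score.
print("Play =", "Yes" if yes > no else "No")  # Play = No
```

Using `Fraction` avoids floating-point rounding while multiplying the small probabilities; the final comparison matches the 0.0053 vs. 0.0206 result above.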
Example 2: Car stolen
As per the equations discussed above, we can calculate the posterior probabilities P(Yes|X) and P(No|X).
Since 0.144 > 0.048, given the features Red, SUV, and Domestic, our example gets classified as 'No': the car is not stolen.
Advantages
It is easy and fast to predict the class of a test data set, and it also performs well in multi-class prediction.
When the assumption of independence holds, a Naive Bayes classifier performs better compared to other models such as logistic regression, and it needs less training data.
It performs well with categorical input variables compared to numerical variables. For numerical variables, a normal distribution is assumed (a bell curve, which is a strong assumption).
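Under that normality assumption, each numeric feature is summarized per class by a mean and standard deviation, and its likelihood comes from the Gaussian density. A minimal sketch (the per-class statistics below are illustrative, not taken from this module's tables):

```python
import math

def gaussian_likelihood(x, mean, std):
    """Density of x under a normal distribution with the given mean/std."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

# Hypothetical per-class statistics for a numeric feature (e.g. temperature):
likelihood_yes = gaussian_likelihood(66.0, mean=73.0, std=6.2)
likelihood_no = gaussian_likelihood(66.0, mean=74.6, std=7.9)

# These densities play the role of P(x|class) in the Naive Bayes product.
print(likelihood_yes > likelihood_no)
```

Each class's density value simply replaces the categorical count ratio in the product of likelihoods.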
Disadvantages
If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a zero probability and will be unable to make a prediction. This is often known as the "zero-frequency" problem.
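One common remedy for this zero-probability issue is to add a small count to every category before estimating probabilities (Laplace, or add-one, smoothing). A minimal sketch with hypothetical counts:

```python
def laplace_estimate(count, total, n_categories, alpha=1):
    """P(category|class) with add-alpha (Laplace) smoothing."""
    return (count + alpha) / (total + alpha * n_categories)

# Hypothetical: a category never seen with this class in training (count = 0).
p_unsmoothed = 0 / 9                    # zero: wipes out the whole product
p_smoothed = laplace_estimate(0, 9, 3)  # 1/12: small but non-zero
print(p_smoothed)
```

Smoothing slightly shrinks the observed frequencies but guarantees every category keeps a non-zero probability, so the product of likelihoods never collapses to zero.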
V. ACTIVITY/EXERCISES/EVALUATION
1. The following table lists decision-making factors for buying a computer. Apply the Naive Bayes algorithm to classify.

Age     Income   Student   Credit Rating   Buy Computer
<=30    High     No        Fair            No
<=30    High     No        Excellent       No
31-40   High     No        Fair            Yes
>40     Medium   No        Fair            Yes
>40     Low      Yes       Fair            Yes
>40     Low      Yes       Excellent       No
31-40   Low      Yes       Excellent       Yes
<=30    Medium   No        Fair            No
<=30    Low      Yes       Fair            Yes
>40     Medium   Yes       Fair            Yes
<=30    Medium   Yes       Excellent       Yes
31-40   Medium   No        Excellent       Yes
31-40   High     Yes       Fair            Yes
>40     Medium   No        Excellent       No