5. Data Mining - Bayesian Network


Bayes’ Theorem

 Let’s consider a collection of random variables V1, V2,…., Vk

The joint probability that V1, V2, …, Vk take the values v1, v2, …, vk, respectively, is written p(V1=v1, V2=v2, …, Vk=vk).

 If a fair coin is flipped five times, then we might have p(H, T, T, H, T) = 1/32

 Probability functions must satisfy the following properties:

0 ≤ p(V1, V2, …, Vk) ≤ 1, and Σ p(V1, V2, …, Vk) = 1 (summing over all joint value assignments)

9/2/2024
 Let’s consider 4 binary-valued variables B, M, L & G; there are 16 joint probabilities over these variables, each of the form p(B=b, M=m, L=l, G=g)

B M L G | Joint Prob.
T T T T | 0.5686
T T T F | 0.0299
T T F T | 0.0135
T T F F | 0.0007
… … … … | …

 When we know the values of all of the joint probs. for a set of random variables, we can compute what is called the marginal prob. of one of these random vars.

 E.g. p(B=true) = Σ_{M,L,G} p(B=true, M, L, G)

 p(B=b, M=m) = Σ_{L,G} p(B=b, M=m, L, G)

 Thus, given the full joint prob. function for a collection of random vars., it’s possible to compute all of the marginal & lower-order joint probs.

 However, when the number of random vars. is too large, the task of specifying all of the joint probs. becomes intractable.
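The marginalization step above can be sketched in code. The joint table below is hypothetical (not from the slides), chosen only so that it sums to 1:

```python
# Marginalization sketch: recover p(B=b) from a full joint table by summing
# over the values of the remaining variable. The joint values are
# hypothetical, chosen only so the table sums to 1.

# joint[(b, m)] = p(B=b, M=m) for two binary variables B and M
joint = {
    (True, True): 0.4, (True, False): 0.2,
    (False, True): 0.1, (False, False): 0.3,
}

def marginal_B(b):
    # p(B=b) = sum over m of p(B=b, M=m)
    return sum(joint[(b, m)] for m in (True, False))

print(round(marginal_B(True), 3))  # 0.6
```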
 For any values of the variables Vi & Vj, the conditional prob. function is given by:

p(Vi | Vj) = p(Vi, Vj) / p(Vj)

where p(Vi, Vj) is the joint prob. of Vi & Vj, and p(Vj) is the marginal prob. of Vj.

 So, we can have p(Vi, Vj) = p(Vi | Vj) p(Vj)

 E.g. p(B | M) = p(B, M) / p(M)

 Conditional prob. is thus a normalized version of a joint prob.

 Joint prob. can be expressed in terms of a chain of conditional probs. as follows:

p(V1, V2, …, Vk) = Π_{i=1..k} p(Vi | Vi−1, …, V1)

 e.g. p(B, L, G, M) = p(B|L, G, M)p(L|G, M)p(G|M)p(M)

 Since the order in which we list variables in a joint prob. function is unimportant, we can write:

 p(Vi, Vj) = p(Vi | Vj) p(Vj) = p(Vj | Vi) p(Vi) = p(Vj, Vi)

 Then, p(Vi | Vj) = p(Vj | Vi) p(Vi) / p(Vj)

 This is called Bayes’ rule
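Bayes’ rule can be sketched numerically. The numbers below are a hypothetical diagnostic-test example (90% true-positive rate, 5% false-positive rate, 1% prevalence), not taken from the slides:

```python
# Bayes' rule: p(Vi | Vj) = p(Vj | Vi) p(Vi) / p(Vj).
# Hypothetical numbers for a "sick -> positive test" scenario.
p_pos_given_sick = 0.90   # p(Vj | Vi)
p_sick = 0.01             # p(Vi)
p_pos_given_well = 0.05

# Marginal p(Vj) by total probability over Vi
p_pos = p_pos_given_sick * p_sick + p_pos_given_well * (1 - p_sick)

p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(round(p_sick_given_pos, 4))  # ≈ 0.1538
```

Note how the conditional flips direction: the CPT gives p(test | disease), and Bayes’ rule recovers p(disease | test).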

 Consider the following joint probabilities:

p(P, Q, R) = 0.3, p(P, Q, ¬R) = 0.2, p(P, ¬Q, R) = 0.2, p(P, ¬Q, ¬R)
= 0.1, p(¬P, Q, R) = 0.05, p(¬P, Q, ¬R) = 0.1, p(¬P, ¬Q, R) = 0.05,
p(¬P, ¬Q, ¬R) = 0.0

 Calculate p(Q| ¬R)


 p(Q| ¬R) = 0.75
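A quick way to check this answer is to sum the slide’s joint table directly:

```python
# The joint distribution over (P, Q, R) from the slide
joint = {
    (True, True, True): 0.3,    (True, True, False): 0.2,
    (True, False, True): 0.2,   (True, False, False): 0.1,
    (False, True, True): 0.05,  (False, True, False): 0.1,
    (False, False, True): 0.05, (False, False, False): 0.0,
}

# p(Q | ¬R) = p(Q, ¬R) / p(¬R)
num = sum(v for (p, q, r), v in joint.items() if q and not r)  # 0.2 + 0.1
den = sum(v for (p, q, r), v in joint.items() if not r)        # 0.2 + 0.1 + 0.1 + 0.0
print(round(num / den, 2))  # 0.75
```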

 A variable V is conditionally independent of a set of variables Vi, given a set Vj, if p(V | Vi, Vj) = p(V | Vj)

 This fact is represented by the notation I(V, Vi | Vj)

 The intuition is that if I(V, Vi | Vj), then Vi tells us nothing more about V than we already knew by knowing Vj

 As a generalization of pair-wise independence, we say that the variables V1, V2, …, Vk are mutually conditionally independent, given a set V, if each of the variables is conditionally independent of all of the others, given V.
 Since p(V1, V2, …, Vk | V) = Π_{i=1..k} p(Vi | Vi−1, …, V1, V), and since each Vi is conditionally independent of the others given V, we have the following:

p(V1, …, Vk | V) = Π_{i=1..k} p(Vi | V)
 Conditional independencies can be conveniently represented by Bayes
Networks, also called belief networks
 A Bayes network is a DAG, where nodes are labeled by random variables
 The intuitive meaning of an arrow from a parent to a child is that the parent
directly influences the child.
 These influences are quantified by conditional probabilities
 BNs are graphical representations of joint distributions.
 In a Bayes network, each node Vi , in the graph is conditionally independent of
any subset of the nodes that are not descendants of Vi, given the parents of Vi.
 Let’s consider A(Vi) – any set of nodes that are not descendants of Vi,
P(Vi) – immediate parents of Vi
 With the above assumptions, we can say I(Vi, A(Vi) | P(Vi)), ∀ Vi in the graph
 In turn, the above expression is equivalent to p(Vi | A(Vi), P(Vi)) = p(Vi | P(Vi))

 Let’s assume that V1, V2, …, Vk are the nodes in a Bayes network

 Given the conditional independence assumptions made by the network, the joint prob. of all of the nodes in the network is:

p(V1, V2, …, Vk) = Π_{i=1..k} p(Vi | P(Vi))

 Example network: L and B are root nodes; M has parents B and L; G has parent B

L: p(L) = 0.7
B: p(B) = 0.95
M: p(M|B, L) = 0.9, p(M|B, ¬L) = 0.05, p(M|¬B, L) = 0.0, p(M|¬B, ¬L) = 0.0
G: p(G|B) = 0.95, p(G|¬B) = 0.1

p(G, B, M, L) = p(G|B) p(M|B, L) p(B) p(L)

 Considering the Bayes network of last slide, calculate
p(M|L)


 p(M|L) = p(M|B, L) p(B|L) + p(M|¬B, L) p(¬B|L)
= p(M|B, L) p(B) + p(M|¬B, L) p(¬B)   (since B and L are independent, p(B|L) = p(B))
= 0.9 × 0.95 + 0.0 × 0.05 = 0.855
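A minimal sketch checking this against the network’s CPTs from the slide:

```python
# CPTs from the slide: B and L are independent roots, M depends on (B, L)
p_B = 0.95
p_M_given_BL = {(True, True): 0.9, (True, False): 0.05,
                (False, True): 0.0, (False, False): 0.0}

# p(M|L) = p(M|B,L) p(B) + p(M|¬B,L) p(¬B), using p(B|L) = p(B)
p_M_given_L = (p_M_given_BL[(True, True)] * p_B
               + p_M_given_BL[(False, True)] * (1 - p_B))
print(round(p_M_given_L, 3))  # 0.855
```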

 Network structure: R and Q are root nodes; P has parents R and Q; S has parent Q; U has parent P; V has parent S

R: p(R) = 0.01
Q: p(Q) = 0.05
P: p(P|R, Q) = 0.95, p(P|R, ¬Q) = 0.90, p(P|¬R, Q) = 0.80, p(P|¬R, ¬Q) = 0.01
S: p(S|Q) = 0.95, p(S|¬Q) = 0.05
U: p(U|P) = 0.7, p(U|¬P) = 0.2
V: p(V|S) = 0.99, p(V|¬S) = 0.1

 Calculate p(Q|U).

 From Bayes’ rule we have p(Q|U) = k p(U|Q) p(Q), where k = 1/p(U)

 Now, p(U|Q) = Σ_P p(U|P) p(P|Q)

 p(P|Q) = Σ_R p(P|R, Q) p(R) = p(P|R, Q) p(R) + p(P|¬R, Q) p(¬R) = 0.95 × 0.01 + 0.80 × 0.99 ≈ 0.80

 So p(¬P|Q) ≈ 0.20

 p(U|Q) = p(U|P) p(P|Q) + p(U|¬P) p(¬P|Q) = 0.7 × 0.80 + 0.2 × 0.20 = 0.60

 p(Q|U) = k × 0.60 × 0.05 = k × 0.03 ------ (1)

 Similarly, p(¬Q|U) = k p(U|¬Q) p(¬Q)

 Now, p(U|¬Q) = Σ_P p(U|P) p(P|¬Q) = 0.21

 Therefore p(¬Q|U) = k × 0.21 × 0.95 ≈ k × 0.20 ------ (2)

 Since (1) and (2) must sum to 1, from (1) & (2) we have k = 1/(0.03 + 0.20) ≈ 4.35

 Now from (1), p(Q|U) ≈ 0.13
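The slide’s rounded arithmetic can be verified by brute-force enumeration over R, Q, and P (S and V are irrelevant here, since U depends only on P):

```python
from itertools import product

# CPTs from the slide
p_R, p_Q = 0.01, 0.05
p_P_given_RQ = {(True, True): 0.95, (True, False): 0.90,
                (False, True): 0.80, (False, False): 0.01}
p_U_given_P = {True: 0.7, False: 0.2}

def pr(x, p):  # probability that a Boolean variable with parameter p equals x
    return p if x else 1 - p

num = den = 0.0
for r, q, pv in product([True, False], repeat=3):
    w = pr(r, p_R) * pr(q, p_Q) * pr(pv, p_P_given_RQ[(r, q)]) * p_U_given_P[pv]
    den += w            # accumulates p(U)
    if q:
        num += w        # accumulates p(Q, U)
print(round(num / den, 2))  # 0.13
```

Without rounding, the exact value is about 0.131, so the slide’s 0.13 holds.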
 Let’s consider the following situation:

You have a new burglar alarm installed at home. It is fairly reliable at detecting a burglary, but also responds on occasion to minor earthquakes. You also have two neighbors, John and Mary, who have promised to call you at work when they hear the alarm. John quite reliably calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm and calls then too. Mary, on the other hand, likes loud music and misses the alarm altogether sometimes.

[Figure: Bayes network for this scenario, with Burglary and Earthquake as parents of Alarm, and Alarm as the parent of JohnCalls and MaryCalls]
A generic entry in the joint probability distribution P(X1, …, Xn) is given by:

P(X1, X2, …, Xn) = Π_{i=1..n} P(Xi | Parents(Xi))

 Probability of the event that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both Mary and John call:

 P(J ∧ M ∧ A ∧ ¬B ∧ ¬E)
= P(J|A) P(M|A) P(A|¬B ∧ ¬E) P(¬B) P(¬E)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998
≈ 0.00062
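The product can be checked directly (the priors P(B) = 0.001 and P(E) = 0.002 are implied by the ¬B and ¬E factors above):

```python
# P(J, M, A, ¬B, ¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E)
p = 0.9 * 0.7 * 0.001 * 0.999 * 0.998
print(round(p, 6))  # ≈ 0.00062
```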
 Let A, B, C, D be Boolean random variables. Given that: A and
B are (absolutely) independent. C is independent of B given
A. D is independent of C given A and B.
Prob(A=T) = 0.3, Prob(B=T) = 0.6, Prob(C=T|A=T) = 0.8,
Prob(C=T|A=F) = 0.4, Prob(D=T|A=T,B=T) = 0.7,
Prob(D=T|A=T,B=F) = 0.8, Prob(D=T|A=F,B=T) = 0.1,
Prob(D=T|A=F,B=F) = 0.2

 Compute the following quantities:
1) Prob(D=T)
2) Prob(D=F,C=T)
3) Prob(A=T|C=T)
4) Prob(A=T|D=F)
5) Prob(A=T,D=T|B=F)
1. P(D=T) = P(D=T,A=T,B=T) + P(D=T,A=T,B=F) + P(D=T,A=F,B=T) + P(D=T,A=F,B=F)
= P(D=T|A=T,B=T) P(A=T,B=T) + P(D=T|A=T,B=F) P(A=T,B=F) + P(D=T|A=F,B=T) P(A=F,B=T) + P(D=T|A=F,B=F) P(A=F,B=F)
(since A and B are independent absolutely)
= P(D=T|A=T,B=T) P(A=T) P(B=T) + P(D=T|A=T,B=F) P(A=T) P(B=F) + P(D=T|A=F,B=T) P(A=F) P(B=T) + P(D=T|A=F,B=F) P(A=F) P(B=F)
= 0.7×0.3×0.6 + 0.8×0.3×0.4 + 0.1×0.7×0.6 + 0.2×0.7×0.4 = 0.32

2. P(D=F,C=T) = P(D=F,C=T,A=T,B=T) + P(D=F,C=T,A=T,B=F) + P(D=F,C=T,A=F,B=T) + P(D=F,C=T,A=F,B=F)
= P(D=F,C=T|A=T,B=T) P(A=T,B=T) + P(D=F,C=T|A=T,B=F) P(A=T,B=F) + P(D=F,C=T|A=F,B=T) P(A=F,B=T) + P(D=F,C=T|A=F,B=F) P(A=F,B=F)
(since C and D are independent given A and B)
= P(D=F|A=T,B=T) P(C=T|A=T,B=T) P(A=T,B=T) + P(D=F|A=T,B=F) P(C=T|A=T,B=F) P(A=T,B=F) + P(D=F|A=F,B=T) P(C=T|A=F,B=T) P(A=F,B=T) + P(D=F|A=F,B=F) P(C=T|A=F,B=F) P(A=F,B=F)
(since C is independent of B given A, and A and B are independent absolutely)
= P(D=F|A=T,B=T) P(C=T|A=T) P(A=T) P(B=T) + P(D=F|A=T,B=F) P(C=T|A=T) P(A=T) P(B=F) + P(D=F|A=F,B=T) P(C=T|A=F) P(A=F) P(B=T) + P(D=F|A=F,B=F) P(C=T|A=F) P(A=F) P(B=F)
= 0.3×0.8×0.3×0.6 + 0.2×0.8×0.3×0.4 + 0.9×0.4×0.7×0.6 + 0.8×0.4×0.7×0.4 = 0.3032
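Both answers can be verified by building the full joint from the factorization p(A,B,C,D) = p(A) p(B) p(C|A) p(D|A,B) implied by the stated independencies:

```python
from itertools import product

# Parameters from the problem statement
p_A, p_B = 0.3, 0.6
p_C_given_A = {True: 0.8, False: 0.4}
p_D_given_AB = {(True, True): 0.7, (True, False): 0.8,
                (False, True): 0.1, (False, False): 0.2}

def pr(x, p):  # probability that a Boolean variable with parameter p equals x
    return p if x else 1 - p

# Full joint over (A, B, C, D) via the factorization
joint = {}
for a, b, c, d in product([True, False], repeat=4):
    joint[(a, b, c, d)] = (pr(a, p_A) * pr(b, p_B)
                           * pr(c, p_C_given_A[a]) * pr(d, p_D_given_AB[(a, b)]))

p_D_T = sum(v for (a, b, c, d), v in joint.items() if d)
p_DF_CT = sum(v for (a, b, c, d), v in joint.items() if (not d) and c)
print(round(p_D_T, 2), round(p_DF_CT, 4))  # 0.32 0.3032
```

The same joint table could be reused for quantities 3 to 5 by conditioning on the appropriate variables.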
 Consider the following Bayesian
Network containing 3 Boolean
random variables:

Compute the following quantities:

(i) P(~B, C | A)
(ii) P(A | ~B, C)

 (i) P(~B, C | A) = P(~B | A) P(C | A) = (0.15)(0.75) = 0.1125
 (ii)

An admission committee for a college is trying to determine the probability
that an admitted candidate is really qualified.
The relevant probabilities are given in the following Bayesian network.
Calculate p(A|D).

