PMRslides 03 B
Michael Gutmann
DAG: a → q, z → q, q → e, z → h
Random variables: a, z, q, e, h
Parent sets: pa_a = pa_z = ∅, pa_q = {a, z}, pa_e = {q}, pa_h = {z}.
- Two consequences:
  - For a given DAG, the independencies derived from the ordered
    Markov property with any topological ordering imply the
    independencies derived with any other topological ordering.
  - The insensitivity to the particular topological ordering used
    provides an alternative definition of directed graphical models.
- Definition (via ordered Markov property): A directed graphical
  model based on a DAG with d nodes and associated random
  variables x_i is the set of pdfs/pmfs that satisfy the ordered
  Markov property
      x_i ⊥⊥ (pre_i \ pa_i) | pa_i   for all i
DAG: a → q, z → q, q → e, z → h
Random variables: a, z, q, e, h
Ordering: (a, z, q, e, h) (meaning: x_1 = a, x_2 = z, x_3 = q, x_4 = e, x_5 = h)
Predecessor sets for the ordering:
pre_a = ∅, pre_z = {a}, pre_q = {a, z}, pre_e = {a, z, q}, pre_h = {a, z, q, e}
Parent sets: as before
pa_a = pa_z = ∅, pa_q = {a, z}, pa_e = {q}, pa_h = {z}
All models in the set defined by the DAG satisfy x_i ⊥⊥ (pre_i \ pa_i) | pa_i:
    z ⊥⊥ a        e ⊥⊥ {a, z} | q        h ⊥⊥ {a, q, e} | z
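The statements above can be generated mechanically from the parent sets and the ordering. The following is a minimal sketch, not part of the original slides; the function name ordered_markov_statements and the dictionary representation of the DAG are assumptions made for illustration.

```python
def ordered_markov_statements(parents, ordering):
    """Return (node, independent_of, given) triples for a topological ordering.

    Hypothetical helper: `parents` maps each node to a list of its parents.
    """
    statements = []
    for i, node in enumerate(ordering):
        pre = set(ordering[:i])      # predecessors of the node in the ordering
        pa = set(parents[node])      # parents of the node in the DAG
        assert pa <= pre, "ordering is not topological"
        rest = pre - pa              # pre_i \ pa_i
        if rest:                     # an empty set only gives a trivial statement
            statements.append((node, rest, pa))
    return statements


# DAG of the example, given by its parent sets
parents = {"a": [], "z": [], "q": ["a", "z"], "e": ["q"], "h": ["z"]}
result = ordered_markov_statements(parents, ["a", "z", "q", "e", "h"])
for node, rest, pa in result:
    print(node, "indep. of", sorted(rest), "given", sorted(pa))
```

Calling the same function with a different topological ordering of the same DAG, for instance ["a", "z", "h", "q", "e"], yields a different but equivalent list of statements.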
Michael Gutmann Directed Graphical Models 9 / 45
Example (different topological ordering)
DAG: a → q, z → q, q → e, z → h
Ordering: (a, z, h, q, e)
Predecessor sets for the ordering:
pre_a = ∅, pre_z = {a}, pre_h = {a, z}, pre_q = {a, z, h}, pre_e = {a, z, h, q}
Parent sets: as before
pa_a = pa_z = ∅, pa_h = {z}, pa_q = {a, z}, pa_e = {q}
All models in the set defined by the DAG satisfy x_i ⊥⊥ (pre_i \ pa_i) | pa_i:
    z ⊥⊥ a        h ⊥⊥ a | z        q ⊥⊥ h | a, z        e ⊥⊥ {a, z, h} | q
Note: the models also satisfy those obtained with the previous ordering:
    z ⊥⊥ a        e ⊥⊥ {a, z} | q        h ⊥⊥ {a, q, e} | z
Remarks
- By using different topological orderings you can generate possibly
  different independence relations satisfied by the model.
  (While they imply each other, deriving them from each other from the basic
  definition of independence may not be straightforward.)
- Missing edges in a DAG cause the pa_i to be smaller than the pre_i,
  and thus lead to the independencies.
- The directed graphical model corresponds to a set of probability
  distributions. Two views according to the two definitions: The set
  includes all those distributions that you get
  - by looping over all possible conditionals p(x_i | pa_i),
  - by retaining, from all possible joint distributions over the x_i,
    those that satisfy the independencies given by the ordered
    Markov property.
- Individual pdfs/pmfs in the set are typically also called a directed
  graphical model ("overloading" of the name of the set and its elements).
- Other names for directed graphical models: belief network, Bayesian
  network, Bayes network.
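The two views of the set can be connected numerically. The sketch below is an illustration under stated assumptions: all variables are taken to be binary, the conditionals are drawn at random, and the DAG is the one with parent sets pa_q = {a, z}, pa_e = {q}, pa_h = {z}. It builds one member of the set from its conditionals and checks that it satisfies e ⊥⊥ {a, z} | q.

```python
import itertools
import random

random.seed(0)


def random_pmf():
    """Random pmf of a binary variable as the pair (p(0), p(1))."""
    p = random.uniform(0.1, 0.9)
    return (p, 1.0 - p)


# One random conditional per node and per configuration of its parents.
p_a = random_pmf()
p_z = random_pmf()
p_q = {(a, z): random_pmf() for a in (0, 1) for z in (0, 1)}
p_e = {q: random_pmf() for q in (0, 1)}
p_h = {z: random_pmf() for z in (0, 1)}

# Joint pmf from the factorisation p(a) p(z) p(q|a,z) p(e|q) p(h|z).
joint = {}
for a, z, q, e, h in itertools.product((0, 1), repeat=5):
    joint[a, z, q, e, h] = p_a[a] * p_z[z] * p_q[a, z][q] * p_e[q][e] * p_h[z][h]

NAMES = ("a", "z", "q", "e", "h")


def prob(**fixed):
    """Marginal probability that the named variables take the given values."""
    return sum(p for x, p in joint.items()
               if all(x[NAMES.index(k)] == v for k, v in fixed.items()))


# e indep. of {a, z} given q: p(e=1 | a, z, q) must equal p(e=1 | q).
max_gap = max(
    abs(prob(a=a, z=z, q=q, e=1) / prob(a=a, z=z, q=q)
        - prob(q=q, e=1) / prob(q=q))
    for a, z, q in itertools.product((0, 1), repeat=3)
)
print(max_gap < 1e-12)
```

Changing the seed gives a different member of the set; the independence check passes for each of them because it only relies on the factorisation.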
Example: Markov model
DAG: x1 → x2 → x3 → x4 → x5
[Figure: DAG over x1, x2, x3 and y1, ..., y5]
[Figure: trail between x and y through the node z]
We say that the z node is “closed” and that the trail between
x and y is “blocked” by the instantiated z. In other words,
knowing the value of z blocks the flow of evidence between x
and y.
DAG: power → pc ← cpu
- One day your computer does not start and you bring it to a
  repair shop. You think the issue could be the power unit or
  the cpu.
- Investigating the power unit shows that it is damaged. Is the
  cpu fine?
- Without further information, finding out that the power unit is
  damaged typically reduces our belief that the cpu is damaged:
      power ⊥̸⊥ cpu | pc
- Finding out about the damage to the power unit explains
  away the observed start-issues of the computer.
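The explaining-away effect can be made concrete with a small calculation. All numbers in the sketch below are assumptions chosen for illustration and do not come from the slides: each component is damaged with prior probability 0.1, and the computer fails to start with probability 0.99 if at least one component is damaged and 0.01 otherwise.

```python
import itertools

# Assumed priors; 1 = damaged, 0 = fine
p_power = {1: 0.1, 0: 0.9}
p_cpu = {1: 0.1, 0: 0.9}


def p_fail(power, cpu):
    """Assumed p(pc fails to start | power, cpu)."""
    return 0.99 if (power or cpu) else 0.01


# p(power, cpu, pc fails to start) for all four component states
joint_fail = {(pw, c): p_power[pw] * p_cpu[c] * p_fail(pw, c)
              for pw, c in itertools.product((0, 1), repeat=2)}

# p(cpu damaged | pc fails to start)
before = (joint_fail[0, 1] + joint_fail[1, 1]) / sum(joint_fail.values())

# p(cpu damaged | pc fails to start, power unit damaged)
after = joint_fail[1, 1] / (joint_fail[1, 0] + joint_fail[1, 1])

print(round(before, 3), round(after, 3))
```

Under these assumptions the belief that the cpu is damaged falls back to its prior of 0.1 once the damaged power unit is found, which is lower than the belief held after only observing the start failure.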
Summary
[Figure: DAG over the nodes x1, ..., x6]
For those interested: A proof can be found in Section 2.8 of Bayesian Networks
– An Introduction by Koski and Noble (not examinable)
Important because:
1. the theorem allows us to read out (conditional)
independencies from the graph
2. no restriction on the sets X , Y , Z
3. the theorem shows that independencies detected by
d-separation do always hold. They are “true positives”
(“soundness of d-separation”).
Follows from ordered Markov property, but let us answer it with d-separation.
1. Determine all trails between x1 and x2
2. For trail x1, x4, x2
     i   default state
     ii  conditioning set is empty
     iii ⇒ Trail is blocked
   For trail x1, x3, x5, x4, x2
     i   default state
     ii  conditioning set is empty
     iii ⇒ Trail is blocked
   Trail x1, x3, x5, x6, x4, x2 is blocked too (same arguments).
3. ⇒ x1 and x2 are d-separated.
x1 ⊥⊥ x2 for all probability distributions that factorise over the graph.
[Figure: DAG over the nodes x1, ..., x6]
[Figure: DAG over the nodes x1, ..., x6]
1. Determine all trails between x1 and x2
2. For trail x1, x4, x2
     i   default state
     ii  influence of x6
     iii ⇒ Trail not blocked
No need to check the other trails: x1 and x2 are not d-separated by x6.
x1 ⊥⊥ x2 | x6 does generally not hold for probability distributions that
factorise over the graph.
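The trail-by-trail procedure used in the two examples can be sketched in code. The edge list below is an assumption: the figure is not available in the text, so the edges were chosen to be consistent with the three trails listed and with their blocked or unblocked status. The function names are hypothetical, and enumerating all trails is only practical for small graphs.

```python
def descendants(children, node):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def trails(edges, start, end):
    """All simple paths between start and end, ignoring edge directions."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    def extend(path):
        if path[-1] == end:
            yield path
            return
        for n in sorted(neighbours.get(path[-1], ())):
            if n not in path:
                yield from extend(path + [n])

    yield from extend([start])


def trail_blocked(edges, trail, given):
    """True if some node on the trail is closed given the conditioning set."""
    edge_set = set(edges)
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    for prev, node, nxt in zip(trail, trail[1:], trail[2:]):
        collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if collider:
            # closed by default; opened if the node or a descendant is observed
            if not (({node} | descendants(children, node)) & given):
                return True
        elif node in given:
            # open by default; closed if the node itself is observed
            return True
    return False


def d_separated(edges, x, y, given=frozenset()):
    """True if every trail between x and y is blocked by `given`."""
    given = set(given)
    return all(trail_blocked(edges, t, given) for t in trails(edges, x, y))


# Assumed edges (parent, child), consistent with the trails in the example
edges = [("x1", "x3"), ("x1", "x4"), ("x2", "x4"), ("x3", "x5"),
         ("x4", "x5"), ("x4", "x6"), ("x5", "x6")]

print(d_separated(edges, "x1", "x2"))           # empty conditioning set
print(d_separated(edges, "x1", "x2", {"x6"}))   # conditioning on x6
```

With these assumed edges, conditioning on x6 opens the trail x1, x4, x2 because x6 is a descendant of the collider x4.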
x_i ⊥⊥ (nondesc(x_i) \ pa_i) | pa_i
x_i ⊥⊥ (pre_i \ pa_i) | pa_i  ⇐⇒  x_i ⊥⊥ (nondesc(x_i) \ pa_i) | pa_i
x_i ⊥⊥ (pre_i \ pa_i) | pa_i  ⇐  x_i ⊥⊥ (nondesc(x_i) \ pa_i) | pa_i follows because
{x_1, ..., x_{i−1}} ⊆ nondesc(x_i) for all topological orderings.
For ⇒ consider all trails from x_i to {nondesc(x_i) \ pa_i}.
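The containment used for the ⇐ direction can be checked by brute force on a small graph. The sketch below is an illustration only; it assumes the five-node DAG with parent sets pa_q = {a, z}, pa_e = {q}, pa_h = {z}, and the helper names are hypothetical.

```python
import itertools

# DAG given by its parent sets
parents = {"a": [], "z": [], "q": ["a", "z"], "e": ["q"], "h": ["z"]}


def descendants(node):
    """All nodes reachable from `node` by following edges parent -> child."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for child, pa in parents.items():
            if current in pa and child not in found:
                found.add(child)
                frontier.append(child)
    return found


# nondesc(x_i): every node that is neither x_i itself nor one of its descendants
nondesc = {n: set(parents) - descendants(n) - {n} for n in parents}


def is_topological(order):
    """True if every node appears after all of its parents."""
    return all(set(parents[n]) <= set(order[:i]) for i, n in enumerate(order))


orderings = [o for o in itertools.permutations(parents) if is_topological(o)]

# For every topological ordering, the predecessors of a node are nondescendants.
contained = all(set(o[:i]) <= nondesc[n]
                for o in orderings for i, n in enumerate(o))
print(len(orderings), contained)
```

Since pre_i is always a subset of nondesc(x_i), each ordered Markov statement is obtained from the corresponding local statement by dropping variables from the independent set.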
Given a DAG with nodes (random variables) x_i and parent sets pa_i, we
have the following equivalences:
    p(x) factorises over the DAG:             p(x) = ∏_{i=1}^{d} p(x_i | pa_i)
        ⇕
    p(x) satisfies the ordered MP:            x_i ⊥⊥ (pre_i \ pa_i) | pa_i for all i
        ⇕
    p(x) satisfies the directed local MP:     x_i ⊥⊥ (nondesc(x_i) \ pa_i) | pa_i for all i
        ⇕
    p(x) satisfies the directed global MP:    independencies asserted by d-separation