
CS 188: Artificial Intelligence

Bayes’ Nets

Fall 2022
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Review: Probabilistic Inference
§ Probabilistic inference: compute a desired probability from other known probabilities (e.g. conditional from joint)

§ We generally compute conditional probabilities


§ P(on time | no reported accidents) = 0.90
§ These represent the agent’s beliefs given the evidence

§ Probabilities change with new evidence:


§ P(on time | no accidents, 5 a.m.) = 0.95
§ P(on time | no accidents, 5 a.m., raining) = 0.80
§ Observing new evidence causes beliefs to be updated
Review: Inference by Enumeration
§ General case:
§ Evidence variables: E1 ... Ek = e1 ... ek
§ Query* variable: Q
§ Hidden variables: H1 ... Hr
(together, all the variables)

§ We want: P(Q | e1 ... ek)
(* works fine with multiple query variables, too)

§ Step 1: Select the entries consistent with the evidence
§ Step 2: Sum out H to get the joint of Query and evidence:
P(Q, e1 ... ek) = sum over h1 ... hr of P(Q, h1 ... hr, e1 ... ek)
§ Step 3: Normalize: Z = sum over q of P(q, e1 ... ek), then P(Q | e1 ... ek) = (1/Z) P(Q, e1 ... ek)
Review: Inference by Enumeration
§ P(W)?
§ P(W | winter)?
§ P(W | winter, hot)?

  S       T     W     P
  summer  hot   sun   0.30
  summer  hot   rain  0.05
  summer  cold  sun   0.10
  summer  cold  rain  0.05
  winter  hot   sun   0.10
  winter  hot   rain  0.05
  winter  cold  sun   0.15
  winter  cold  rain  0.20
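Here is a minimal sketch of the three steps in Python, run against the table above (the function name and table encoding are mine, not from the slides):

# Inference by enumeration over the season/temperature/weather table.
# Joint distribution P(S, T, W), encoded as {(s, t, w): probability}.
joint = {
    ('summer', 'hot',  'sun'): 0.30, ('summer', 'hot',  'rain'): 0.05,
    ('summer', 'cold', 'sun'): 0.10, ('summer', 'cold', 'rain'): 0.05,
    ('winter', 'hot',  'sun'): 0.10, ('winter', 'hot',  'rain'): 0.05,
    ('winter', 'cold', 'sun'): 0.15, ('winter', 'cold', 'rain'): 0.20,
}
VARS = ('S', 'T', 'W')

def enumerate_query(query_var, evidence):
    """P(query_var | evidence) by select / sum out / normalize."""
    qi = VARS.index(query_var)
    unnormalized = {}
    for assignment, p in joint.items():
        # Step 1: keep only the entries consistent with the evidence.
        if all(assignment[VARS.index(v)] == val for v, val in evidence.items()):
            # Step 2: accumulating by query value sums out the hidden variables.
            q = assignment[qi]
            unnormalized[q] = unnormalized.get(q, 0.0) + p
    # Step 3: normalize by Z, the total mass consistent with the evidence.
    z = sum(unnormalized.values())
    return {q: p / z for q, p in unnormalized.items()}

print(enumerate_query('W', {}))                           # P(W): sun 0.65, rain 0.35
print(enumerate_query('W', {'S': 'winter'}))              # P(W | winter): sun 0.5, rain 0.5
print(enumerate_query('W', {'S': 'winter', 'T': 'hot'}))  # P(W | winter, hot): sun 2/3, rain 1/3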
Review: Inference by Enumeration

§ Obvious problems:
§ Worst-case time complexity O(d^n)
§ Space complexity O(d^n) to store the joint distribution
(d = domain size, n = number of variables)
Review: The Product Rule
§ Sometimes we have conditional distributions but want the joint:

P(x, y) = P(x | y) P(y)
Review: The Product Rule
§ Example: P(D, W) = P(D | W) P(W)

P(W):
  sun   0.8
  rain  0.2

P(D | W):
  D    W     P
  wet  sun   0.1
  dry  sun   0.9
  wet  rain  0.7
  dry  rain  0.3

P(D, W):
  D    W     P
  wet  sun   0.08
  dry  sun   0.72
  wet  rain  0.14
  dry  rain  0.06
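A few lines of Python reproduce the joint table (the dictionary encoding is mine; printed values match up to floating-point rounding):

# Product rule: P(D, W) = P(D | W) P(W)
p_w = {'sun': 0.8, 'rain': 0.2}
p_d_given_w = {('wet', 'sun'): 0.1, ('dry', 'sun'): 0.9,
               ('wet', 'rain'): 0.7, ('dry', 'rain'): 0.3}

p_dw = {(d, w): p_d_given_w[(d, w)] * p_w[w] for (d, w) in p_d_given_w}
print(p_dw)  # wet/sun 0.08, dry/sun 0.72, wet/rain 0.14, dry/rain 0.06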
Review: The Chain Rule
§ More generally, we can always write any joint distribution as an incremental product of conditional distributions:

P(x1, x2, x3) = P(x1) P(x2 | x1) P(x3 | x1, x2)
P(x1, ..., xn) = product over i of P(xi | x1, ..., x(i-1))

§ Why is this always true?

Probabilistic Models
§ Models describe how (a portion of) the world works

§ Models are always simplifications


§ May not account for every variable
§ May not account for all interactions between variables
§ “All models are wrong; but some are useful.”
– George E. P. Box

§ What do we do with probabilistic models?


§ We (or our agents) need to reason about unknown
variables, given evidence
§ Example: explanation (diagnostic reasoning)
§ Example: prediction (causal reasoning)
§ Example: value of information
Independence
Independence
§ Two variables are independent if:

P(x, y) = P(x) P(y) for all x, y

§ This says that their joint distribution factors into a product of two
simpler distributions
§ Another form:

P(x | y) = P(x) for all x, y

§ We write: X ⊥ Y

§ Independence is a simplifying modeling assumption

§ Empirical joint distributions: at best “close” to independent
§ What could we assume for {Weather, Traffic, Cavity, Toothache}?
Example: Independence?

P(T):
  hot   0.5
  cold  0.5

P(W):
  sun   0.6
  rain  0.4

P(T, W) (observed joint):
  T     W     P
  hot   sun   0.4
  hot   rain  0.1
  cold  sun   0.2
  cold  rain  0.3

P(T) P(W) (what the joint would be if T and W were independent):
  T     W     P
  hot   sun   0.3
  hot   rain  0.2
  cold  sun   0.3
  cold  rain  0.2

The two tables differ (e.g. 0.4 vs. 0.3 for hot, sun), so T and W are not independent here.
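One way to answer numerically, sketched in Python (the encoding and tolerance are my choices): recover the marginals from the observed joint and compare P(t) P(w) against P(t, w) entry by entry.

from itertools import product

# Observed joint P(T, W) from the table above.
joint = {('hot', 'sun'): 0.4, ('hot', 'rain'): 0.1,
         ('cold', 'sun'): 0.2, ('cold', 'rain'): 0.3}

# Marginals P(T) and P(W), obtained by summing out the other variable.
p_t = {t: sum(p for (t2, _), p in joint.items() if t2 == t) for t in ('hot', 'cold')}
p_w = {w: sum(p for (_, w2), p in joint.items() if w2 == w) for w in ('sun', 'rain')}

independent = all(abs(joint[(t, w)] - p_t[t] * p_w[w]) < 1e-9
                  for t, w in product(p_t, p_w))
print(p_t, p_w)     # {'hot': 0.5, 'cold': 0.5} {'sun': 0.6, 'rain': 0.4}
print(independent)  # False: e.g. P(hot, sun) = 0.4 but P(hot) P(sun) = 0.3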
Example: Independence
§ N fair, independent coin flips:

P(X1):     P(X2):     ...   P(Xn):
  H  0.5     H  0.5           H  0.5
  T  0.5     T  0.5           T  0.5

§ P(X1, ..., Xn) = product over i of P(Xi): the full joint has 2^n entries, but independence lets us store just n two-entry tables
Conditional Independence
Conditional Independence
§ P(Toothache, Cavity, Catch)

§ If I have a cavity, the probability that the probe catches in it
doesn't depend on whether I have a toothache:
§ P(+catch | +toothache, +cavity) = P(+catch | +cavity)

§ The same independence holds if I don’t have a cavity:
§ P(+catch | +toothache, -cavity) = P(+catch | -cavity)

§ Catch is conditionally independent of Toothache given Cavity:
§ P(Catch | Toothache, Cavity) = P(Catch | Cavity)

§ Equivalent statements:
§ P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
§ P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
§ One can be derived from the other easily
Conditional Independence
§ Unconditional (absolute) independence is very rare (why?)

§ Conditional independence is our most basic and robust form of knowledge about uncertain environments.

§ X is conditionally independent of Y given Z if and only if:

P(x, y | z) = P(x | z) P(y | z) for all x, y, z

or, equivalently, if and only if

P(x | y, z) = P(x | z) for all x, y, z
Conditional Independence
§ What about this domain:
§ Traffic
§ Umbrella
§ Raining
Conditional Independence
§ What about this domain:
§ Fire
§ Smoke
§ Alarm
Conditional Independence and the Chain Rule
§ Chain rule:

P(X, Y, Z) = P(X) P(Y | X) P(Z | X, Y)

§ Trivial decomposition:

P(Traffic, Rain, Umbrella) = P(Rain) P(Traffic | Rain) P(Umbrella | Rain, Traffic)

§ With assumption of conditional independence:

P(Traffic, Rain, Umbrella) = P(Rain) P(Traffic | Rain) P(Umbrella | Rain)

§ Bayes’ nets / graphical models help us express conditional independence assumptions
Ghostbusters Chain Rule
§ Each sensor depends only on where the ghost is

§ That means, the two sensors are conditionally independent, given the ghost position

§ T: Top square is red
§ B: Bottom square is red
§ G: Ghost is in the top

§ Givens:
P(+g) = 0.5        P(-g) = 0.5
P(+t | +g) = 0.8   P(+t | -g) = 0.4
P(+b | +g) = 0.4   P(+b | -g) = 0.8

P(T, B, G) = P(G) P(T | G) P(B | G):

  T   B   G   P(T, B, G)
  +t  +b  +g  0.16
  +t  +b  -g  0.16
  +t  -b  +g  0.24
  +t  -b  -g  0.04
  -t  +b  +g  0.04
  -t  +b  -g  0.24
  -t  -b  +g  0.06
  -t  -b  -g  0.06
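A short Python sketch (the dictionary encoding is mine) that rebuilds the table from the givens and spot-checks the conditional independence claim:

p_g = {'+g': 0.5, '-g': 0.5}
p_t_given_g = {('+t', '+g'): 0.8, ('+t', '-g'): 0.4,
               ('-t', '+g'): 0.2, ('-t', '-g'): 0.6}
p_b_given_g = {('+b', '+g'): 0.4, ('+b', '-g'): 0.8,
               ('-b', '+g'): 0.6, ('-b', '-g'): 0.2}

# Chain rule with the CI assumption: P(T, B, G) = P(G) P(T | G) P(B | G)
joint = {(t, b, g): p_g[g] * p_t_given_g[(t, g)] * p_b_given_g[(b, g)]
         for t in ('+t', '-t') for b in ('+b', '-b') for g in ('+g', '-g')}
print(joint[('+t', '+b', '+g')])  # 0.16, matching the table (up to rounding)

# CI check: P(+t | +b, +g) should equal P(+t | +g) = 0.8
p_tbg = joint[('+t', '+b', '+g')]
p_bg = sum(joint[(t, '+b', '+g')] for t in ('+t', '-t'))
print(p_tbg / p_bg)               # 0.8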
Bayes’ Nets: Big Picture
Bayes’ Nets: Big Picture
§ Two problems with using full joint distribution tables
as our probabilistic models:
§ Unless there are only a few variables, the joint is WAY too
big to represent explicitly
§ Hard to learn (estimate) anything empirically about more
than a few variables at a time

§ Bayes’ nets: a technique for describing complex joint


distributions (models) using simple, local
distributions (conditional probabilities)
§ More properly called graphical models
§ We describe how variables locally interact
§ Local interactions chain together to give global, indirect
interactions
§ For about 10 min, we’ll be vague about how these
interactions are specified
Example Bayes’ Net: Insurance
Example Bayes’ Net: Car
Graphical Model Notation

§ Nodes: variables (with domains)


§ Can be assigned (observed) or unassigned
(unobserved)

§ Arcs: interactions
§ Similar to CSP constraints
§ Indicate “direct influence” between variables
§ Formally: encode conditional independence
(more later)

§ For now: imagine that arrows mean


direct causation (in general, they don’t!)
Example: Coin Flips
§ N independent coin flips

  X1   X2   ...   Xn

§ No interactions between variables: absolute independence
Example: Traffic
§ Variables:
§ R: It rains
§ T: There is traffic

§ Model 1: independence (R and T are separate nodes, with no arc between them)
§ Model 2: rain causes traffic (R -> T)

§ Why is an agent using model 2 better? Observing rain lets it update its belief about traffic; under model 1 the two variables never inform each other.
Example: Traffic II
§ Let’s build a causal graphical model!
§ Variables
§ T: Traffic
§ R: It rains
§ L: Low pressure
§ D: Roof drips
§ B: Ballgame
§ C: Cavity
Example: Alarm Network
§ Variables
§ B: Burglary
§ A: Alarm goes off
§ M: Mary calls
§ J: John calls
§ E: Earthquake!
Bayes’ Net Semantics
Bayes’ Net Semantics
§ A set of nodes, one per variable X

§ A directed, acyclic graph (parents A1, ..., An pointing into X)

§ A conditional distribution for each node
§ A collection of distributions over X, one for each combination of parents’ values:

P(X | a1, ..., an)

§ CPT: conditional probability table
§ Description of a noisy “causal” process

A Bayes net = Topology (graph) + Local Conditional Probabilities


Probabilities in BNs
§ Bayes’ nets implicitly encode joint distributions
§ As a product of local conditional distributions
§ To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:

P(x1, x2, ..., xn) = product over i of P(xi | parents(Xi))

§ Example: P(+cavity, +catch, -toothache) = P(+cavity) P(+catch | +cavity) P(-toothache | +cavity)
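A minimal sketch of this computation in Python, assuming a simple encoding of my own (one CPT per node, keyed by the node's value followed by its parents' values), illustrated with the R -> T traffic net from earlier:

# Each node maps to (parents, CPT); CPT keys are (value, parent values...).
net = {
    'R': ((), {('+r',): 0.25, ('-r',): 0.75}),
    'T': (('R',), {('+t', '+r'): 0.75, ('-t', '+r'): 0.25,
                   ('+t', '-r'): 0.50, ('-t', '-r'): 0.50}),
}

def full_assignment_prob(assignment):
    """P(x1, ..., xn) = product of P(xi | parents(Xi))."""
    p = 1.0
    for var, (parents, cpt) in net.items():
        key = (assignment[var],) + tuple(assignment[pa] for pa in parents)
        p *= cpt[key]
    return p

print(full_assignment_prob({'R': '+r', 'T': '+t'}))  # 0.25 * 0.75 = 3/16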
Probabilities in BNs
§ Why are we guaranteed that setting

P(x1, ..., xn) = product over i of P(xi | parents(Xi))

results in a proper joint distribution?

§ Chain rule (valid for all distributions):

P(x1, ..., xn) = product over i of P(xi | x1, ..., x(i-1))

§ Assume conditional independences:

P(xi | x1, ..., x(i-1)) = P(xi | parents(Xi))

→ Consequence:

P(x1, ..., xn) = product over i of P(xi | parents(Xi))

§ Not every BN can represent every joint distribution
§ The topology enforces certain conditional independencies
Example: Coin Flips

  X1   X2   ...   Xn

P(X1):     P(X2):     ...   P(Xn):
  h  0.5     h  0.5           h  0.5
  t  0.5     t  0.5           t  0.5

Only distributions whose variables are absolutely independent can be represented by a Bayes’ net with no arcs.
Example: Traffic

R -> T

P(R):
  +r  1/4
  -r  3/4

P(T | R):
  +r:  +t  3/4,  -t  1/4
  -r:  +t  1/2,  -t  1/2
Example: Alarm Network

Burglary -> Alarm <- Earthquake;  Alarm -> John calls,  Alarm -> Mary calls

P(B):              P(E):
  +b  0.001          +e  0.002
  -b  0.999          -e  0.998

P(A | B, E):
  B   E   A   P(A | B, E)
  +b  +e  +a  0.95
  +b  +e  -a  0.05
  +b  -e  +a  0.94
  +b  -e  -a  0.06
  -b  +e  +a  0.29
  -b  +e  -a  0.71
  -b  -e  +a  0.001
  -b  -e  -a  0.999

P(J | A):           P(M | A):
  +a  +j  0.9         +a  +m  0.7
  +a  -j  0.1         +a  -m  0.3
  -a  +j  0.05        -a  +m  0.01
  -a  -j  0.95        -a  -m  0.99
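As a quick worked check (the arithmetic below is mine, using the CPT entries above), here is the probability the network assigns to one full assignment:

# P(+b, -e, +a, +j, +m)
#   = P(+b) P(-e) P(+a | +b, -e) P(+j | +a) P(+m | +a)
p = 0.001 * 0.998 * 0.94 * 0.9 * 0.7
print(p)  # ~0.000591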
Example: Traffic
§ Causal direction

R -> T

P(R):                P(T | R):
  +r  1/4              +r:  +t  3/4,  -t  1/4
  -r  3/4              -r:  +t  1/2,  -t  1/2

Joint P(R, T) = P(R) P(T | R):
  +r  +t  3/16
  +r  -t  1/16
  -r  +t  6/16
  -r  -t  6/16
Example: Reverse Traffic
§ Reverse causality?

T -> R

P(T):                P(R | T):
  +t  9/16             +t:  +r  1/3,  -r  2/3
  -t  7/16             -t:  +r  1/7,  -r  6/7

Joint P(R, T) = P(T) P(R | T):
  +r  +t  3/16
  +r  -t  1/16
  -r  +t  6/16
  -r  -t  6/16
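A short sketch (encoding mine, using exact fractions) showing that the causal and reversed nets encode the same joint: build P(R, T) from the causal CPTs, then recover the reversed CPTs by conditioning.

from fractions import Fraction as F

# Causal net: R -> T
p_r = {'+r': F(1, 4), '-r': F(3, 4)}
p_t_given_r = {('+t', '+r'): F(3, 4), ('-t', '+r'): F(1, 4),
               ('+t', '-r'): F(1, 2), ('-t', '-r'): F(1, 2)}
joint = {(r, t): p_r[r] * p_t_given_r[(t, r)]
         for r in p_r for t in ('+t', '-t')}
print(joint)  # +r+t 3/16, +r-t 1/16, -r+t 6/16, -r-t 6/16

# Reversed net: T -> R, recovered from the same joint
p_t = {t: sum(joint[(r, t)] for r in p_r) for t in ('+t', '-t')}
p_r_given_t = {(r, t): joint[(r, t)] / p_t[t] for (r, t) in joint}
print(p_t)          # +t 9/16, -t 7/16
print(p_r_given_t)  # P(+r | +t) = 1/3, P(+r | -t) = 1/7, etc.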
Causality?
§ When Bayes’ nets reflect the true causal patterns:
§ Often simpler (nodes have fewer parents)
§ Often easier to think about
§ Often easier to elicit from experts

§ BNs need not actually be causal


§ Sometimes no causal net exists over the domain
(especially if variables are missing)
§ E.g. consider the variables Traffic and Drips
§ End up with arrows that reflect correlation, not causation

§ What do the arrows really mean?


§ Topology may happen to encode causal structure
§ Topology really encodes conditional independence
Bayes’ Nets
§ So far: how a Bayes’ net encodes a joint
distribution

§ Next: how to answer queries about that


distribution
§ Today:
§ First assembled BNs using an intuitive notion of
conditional independence as causality
§ Then saw that key property is conditional independence
§ Main goal: answer queries about conditional
independence and influence

§ After that: how to answer numerical queries


(inference)
