L15 Misc Topic Sampling
Semester I, 2024-25
Rohan Paul
Acknowledgement
These slides are intended for teaching purposes only. Some material
has been used/adapted from web sources and from slides by Doina
Precup, Dorsa Sadigh, Percy Liang, Mausam, Parag, Emma Brunskill,
Alexander Amini, Dan Klein, Anca Dragan, Nicholas Roy and others.
Answering Probabilistic Queries Fast:
Approximate Inference
• Exact Inference
• Inference by enumeration, variable elimination
• Exact likelihood for probabilistic queries
• Exact marginal likelihood, e.g., P(Late = Yes)
• Exact conditional likelihood (posterior probability), e.g., P(Late = Yes | Rain = Yes, Traffic = High)
• Problem:
• In many practical applications variable elimination is intractable: it may need to create very large intermediate tables.
• Approximate Inference
• Compute an “approximate” posterior probability
• Principle
• Generate samples from the distribution.
• Use the samples to construct an approximate estimate of the probabilistic query, e.g., P’(Late) or P’(Late | Rain, Traffic).
• Methods
• Prior Sampling
• Rejection Sampling
• Gibbs Sampling
• Advantage
• Generating samples and constructing the approximate distribution is often faster.
• Note that the estimate is approximate, not exact.
How to sample from a distribution?
• Sampling from a given distribution
• Step 1: Get a sample u from the uniform distribution over [0, 1)
• Step 2: Convert this sample u into an outcome for the given distribution by associating each outcome with a sub-interval of [0, 1) whose size equals the probability of the outcome

Example:
C P(C)
red 0.6
green 0.1
blue 0.3
▪ If random() returns u = 0.83, then our sample is C = blue
▪ E.g., after sampling 8 times: (the slide shows the resulting 8 samples)

• Utility? We should be able to sample from a CPT defining the probabilistic model.
• Next, we look at approaches for sampling from a Bayes Net.
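The two steps above can be sketched in code. This is a minimal illustration, not from the slides; the function name and the optional `u` parameter are my own additions. The insertion order of the dictionary fixes the sub-intervals: red → [0, 0.6), green → [0.6, 0.7), blue → [0.7, 1.0), so u = 0.83 lands in blue's interval, matching the slide's example.

```python
import random

def sample_discrete(dist, u=None):
    """Sample an outcome from a discrete distribution {outcome: prob}.
    Step 1: draw u ~ Uniform[0, 1) (or use the supplied u).
    Step 2: return the outcome whose sub-interval of [0, 1) contains u;
    each sub-interval's size equals that outcome's probability."""
    if u is None:
        u = random.random()
    cumulative = 0.0
    for outcome, p in dist.items():
        cumulative += p
        if u < cumulative:
            return outcome
    return outcome  # guard against floating-point round-off at the top end

# P(C) from the slide; insertion order fixes the sub-intervals
p_c = {"red": 0.6, "green": 0.1, "blue": 0.3}
```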
Prior Sampling
Sampling from an empty network (without evidence), called “prior sampling” or “ancestral sampling”:
• For i = 1, 2, …, n
• Sample xi from P(Xi | Parents(Xi))
Example trace (variables sampled in topological order):
C = +c, S = -s, R = +r
C = +c, S = -s, R = +r, W = +w
Prior Sampling
• For i = 1, 2, …, n
• Sample xi from P(Xi | Parents(Xi))

P(C):
+c 0.5
-c 0.5

P(W | S, R) (WetGrass CPT):
+s +r: +w 0.99, -w 0.01
+s -r: +w 0.90, -w 0.10
-s +r: +w 0.90, -w 0.10
-s -r: +w 0.01, -w 0.99

Samples:
+c, -s, +r, +w
-c, +s, -r, +w
…
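Ancestral sampling for the Cloudy → {Sprinkler, Rain} → WetGrass network can be sketched as below. P(C) and P(W | S, R) follow the slides; the CPTs for P(S | C) and P(R | C) are not shown on this slide, so the values used here are assumptions for illustration only.

```python
import random

def bernoulli(p_true):
    """Return True with probability p_true."""
    return random.random() < p_true

def prior_sample():
    """One pass of ancestral sampling: sample each variable in topological
    order from its CPT given the already-sampled parents.
    P(S | C) and P(R | C) are assumed values, not from the slides."""
    c = bernoulli(0.5)                    # P(+c) = 0.5 (slide)
    s = bernoulli(0.1 if c else 0.5)      # assumed P(+s | C)
    r = bernoulli(0.8 if c else 0.2)      # assumed P(+r | C)
    p_w = {(True, True): 0.99, (True, False): 0.90,
           (False, True): 0.90, (False, False): 0.01}
    w = bernoulli(p_w[(s, r)])            # P(+w | S, R) (slide)
    return {"C": c, "S": s, "R": r, "W": w}
```

Each call returns one complete sample such as `{"C": True, "S": False, "R": True, "W": True}`, i.e. +c, -s, +r, +w.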
Approximate Probabilistic Queries with Samples
• Potential samples from the Bayes Net (C → S, C → R, S → W, R → W):
+c, -s, +r, +w
+c, +s, +r, +w
-c, +s, +r, -w
+c, -s, +r, +w
-c, -s, -r, +w
• What can we do with these samples?
• We can empirically estimate the probabilistic queries.
• Estimating P(W)
• We have counts <+w:4, -w:1>
• Normalize to get P(W) ≈ <+w:0.8, -w:0.2>
• Can estimate other probabilistic queries as well:
• P(C | +w)? P(C | +r, +w)? P(C | -r, -w)?
• Note: if an evidence assignment appears in none of the samples, the corresponding conditional cannot be estimated.
Problem: prior sampling is unaware of the types of probabilistic queries that will be asked later.
Can we be more efficient if we knew the queries from the Bayes Net in advance?
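Estimating a query from samples amounts to filtering and counting. A minimal sketch (the function name and dict representation are my own choices, not from the slides) using the five samples above:

```python
def estimate(query_var, evidence, samples):
    """Empirically estimate P(query_var | evidence) from a list of sample
    dicts: keep samples consistent with the evidence, then normalize the
    counts of the query variable's values."""
    consistent = [s for s in samples
                  if all(s[var] == val for var, val in evidence.items())]
    if not consistent:
        return None  # evidence assignment never sampled: cannot estimate
    counts = {}
    for s in consistent:
        counts[s[query_var]] = counts.get(s[query_var], 0) + 1
    total = len(consistent)
    return {value: n / total for value, n in counts.items()}

# The five samples from the slide:
samples = [
    {"C": "+c", "S": "-s", "R": "+r", "W": "+w"},
    {"C": "+c", "S": "+s", "R": "+r", "W": "+w"},
    {"C": "-c", "S": "+s", "R": "+r", "W": "-w"},
    {"C": "+c", "S": "-s", "R": "+r", "W": "+w"},
    {"C": "-c", "S": "-s", "R": "-r", "W": "+w"},
]
```

`estimate("W", {}, samples)` reproduces the slide's <+w:0.8, -w:0.2>, while `estimate("C", {"R": "-r", "W": "-w"}, samples)` returns `None` because no sample matches that evidence.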
Rejection Sampling
• IN: evidence instantiation
• For i = 1, 2, …, n
• Sample xi from P(Xi | Parents(Xi))
• If xi is not consistent with the evidence
• Reject: return, and no sample is generated in this cycle
• Property: in the limit of repeating this infinitely many times, the accepted samples come from the correct distribution.
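The loop above can be sketched for the same Cloudy network. As on the previous slides, P(S | C) and P(R | C) use assumed CPT values; the rejection happens as soon as a sampled variable contradicts the evidence, so that cycle produces no sample.

```python
import random

def rejection_sample(evidence, n_attempts=10000):
    """Rejection sampling sketch on C -> {S, R} -> W. Variables are sampled
    in topological order; a cycle is abandoned the moment a sampled value
    is inconsistent with the evidence. CPTs for S and R are assumed."""
    def bern(p):
        return random.random() < p

    cpts = [("C", lambda s: 0.5),
            ("S", lambda s: 0.1 if s["C"] else 0.5),   # assumed
            ("R", lambda s: 0.8 if s["C"] else 0.2),   # assumed
            ("W", lambda s: {(True, True): 0.99, (True, False): 0.90,
                             (False, True): 0.90,
                             (False, False): 0.01}[(s["S"], s["R"])])]
    accepted = []
    for _ in range(n_attempts):
        sample, ok = {}, True
        for var, p in cpts:
            sample[var] = bern(p(sample))
            if var in evidence and sample[var] != evidence[var]:
                ok = False   # reject: no sample generated this cycle
                break
        if ok:
            accepted.append(sample)
    return accepted
```

Note the inefficiency this slide motivates: with evidence `{"R": True}`, roughly half the cycles are wasted, and the waste grows as the evidence becomes less likely.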
Gibbs Sampling: Example
Estimating P(S | +r)
• Step 1: Fix the evidence
• R = +r
• Step 2: Initialize the other variables
▪ Randomly
• Step 3: Repeat
• Randomly select a non-evidence variable X
• Resample X from P(X | all other variables)
(The slide shows the network C → {S, R} → W with R clamped to +r across successive resampling steps.)
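The three steps can be sketched for this exact query. Each resampling step normalizes the joint over the two values of the chosen variable; as before, the CPTs for P(S | C) and P(R | C) are assumed values, while P(C) and P(W | S, R) follow the slides, and the function name, burn-in, and step counts are illustrative choices.

```python
import random

def gibbs_estimate_s_given_r(n_steps=20000, burn_in=2000):
    """Gibbs sampling sketch for P(+s | +r) on C -> {S, R} -> W.
    R is clamped (Step 1), C/S/W start random (Step 2), then one
    randomly chosen non-evidence variable is resampled per step (Step 3)."""
    p_s = lambda c: 0.1 if c else 0.5            # assumed P(+s | C)
    p_r = lambda c: 0.8 if c else 0.2            # assumed P(+r | C)
    p_w = {(True, True): 0.99, (True, False): 0.90,
           (False, True): 0.90, (False, False): 0.01}

    def bern(p):
        return random.random() < p

    def pv(p, val):  # probability that a Bernoulli(p) variable equals val
        return p if val else 1.0 - p

    r = True                                     # Step 1: fix evidence
    c, s, w = bern(0.5), bern(0.5), bern(0.5)    # Step 2: random init

    count_s = n_kept = 0
    for t in range(n_steps):                     # Step 3: repeat
        x = random.choice(["C", "S", "W"])
        if x == "C":   # P(C | s, r) up to normalization
            a = 0.5 * pv(p_s(True), s) * pv(p_r(True), r)
            b = 0.5 * pv(p_s(False), s) * pv(p_r(False), r)
            c = bern(a / (a + b))
        elif x == "S":  # P(S | c, r, w) up to normalization
            a = p_s(c) * pv(p_w[(True, r)], w)
            b = (1.0 - p_s(c)) * pv(p_w[(False, r)], w)
            s = bern(a / (a + b))
        else:           # W has no children: sample its CPT directly
            w = bern(p_w[(s, r)])
        if t >= burn_in:
            count_s += s
            n_kept += 1
    return count_s / n_kept
```

The fraction of kept states with S = +s approximates P(+s | +r); under the assumed CPTs the exact answer is 0.09 / 0.5 = 0.18.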
Sampling from the conditional
• Sample from P(S | +c, +r, -w)
Sampling from the conditional distribution is needed as a sub-routine for Gibbs sampling. It is typically easy to sample from: the expression is simpler because the other variables are instantiated, and we can even construct the full probability table if needed.
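Why the expression simplifies: writing the joint of the C → {S, R} → W network via the chain rule, every factor not involving S is a constant that cancels in the normalization (a worked sketch using the slide's variables):

```latex
P(S \mid +c, +r, -w)
  = \frac{P(+c)\,P(S \mid +c)\,P(+r \mid +c)\,P(-w \mid S, +r)}
         {\sum_{s} P(+c)\,P(s \mid +c)\,P(+r \mid +c)\,P(-w \mid s, +r)}
  \;\propto\; P(S \mid +c)\,P(-w \mid S, +r)
```

Only two small CPT entries per value of S remain, so the full conditional table over S is cheap to build and sample.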
The Markov Chain
Demo: https://www.cs.cmu.edu/~./15281/demos/gibbsDemo/
Markov Blanket
The Markov blanket (boundary) of a node A in a Bayesian network is the set of nodes composed of A's parents, A's children, and A's children's other parents.
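The definition translates directly into code. A minimal sketch (function name and edge-list representation are my own choices), using the Cloudy network from the earlier slides as the example:

```python
def markov_blanket(node, edges):
    """Markov blanket of `node` in a Bayes net given as directed
    (parent, child) edges: parents, children, and children's other parents."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    co_parents = {p for p, c in edges if c in children and p != node}
    return parents | children | co_parents

# Edges of the Cloudy network: C -> S, C -> R, S -> W, R -> W
edges = [("C", "S"), ("C", "R"), ("S", "W"), ("R", "W")]
```

For example, `markov_blanket("S", edges)` yields {C, W, R}: C is S's parent, W is its child, and R is W's other parent.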
Markov Chain Monte Carlo – General Idea
MCMC is a general technique for obtaining samples from distributions (also applied to continuous distributions).
Demo: https://chi-feng.github.io/mcmc-demo/app.html
Gibbs Sampling with Markov Blanket Sampling
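The identity that makes this efficient is standard for Bayesian networks: conditioned on its Markov blanket, a variable is independent of everything else, so the full conditional used in each Gibbs step reduces to local CPT factors.

```latex
P(X \mid \text{all other variables})
  = P(X \mid \mathrm{MB}(X))
  \;\propto\; P(X \mid \mathrm{Parents}(X))
     \prod_{Y \in \mathrm{Children}(X)} P\bigl(Y \mid \mathrm{Parents}(Y)\bigr)
```

Normalizing this product over the values of X gives the resampling distribution, touching only the CPTs of X and its children rather than the whole network.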