Approximate Inference via Sampling (2)

MCMC algos: MH and Gibbs Sampling

CS698X: Topics in Probabilistic Modeling and Inference


Piyush Rai

Some MCMC Algorithms

Metropolis-Hastings (MH) Sampling (Metropolis et al., 1953; Hastings, 1970)

▪ Suppose we wish to generate samples from a target distribution $p(\mathbf{z}) = \tilde{p}(\mathbf{z})/Z_p$, where $\tilde{p}(\mathbf{z})$ can be evaluated but the normalization constant $Z_p$ may be unknown

▪ Assume a suitable proposal distribution $q(\mathbf{z}|\mathbf{z}^{(\tau)})$, e.g., $\mathcal{N}(\mathbf{z}|\mathbf{z}^{(\tau)}, \sigma^2 \mathbf{I})$

▪ In each step, draw $\mathbf{z}^*$ from $q(\mathbf{z}|\mathbf{z}^{(\tau)})$ and accept $\mathbf{z}^*$ with probability

$A(\mathbf{z}^*, \mathbf{z}^{(\tau)}) = \min\left(1, \dfrac{\tilde{p}(\mathbf{z}^*)\, q(\mathbf{z}^{(\tau)}|\mathbf{z}^*)}{\tilde{p}(\mathbf{z}^{(\tau)})\, q(\mathbf{z}^*|\mathbf{z}^{(\tau)})}\right)$
▪ Favors acceptance of $\mathbf{z}^*$ if it is more probable than $\mathbf{z}^{(\tau)}$ under the target $p(\mathbf{z})$
▪ Favors acceptance of $\mathbf{z}^*$ if the proposal allows reverting to the older state $\mathbf{z}^{(\tau)}$ from $\mathbf{z}^*$
▪ Favors acceptance of $\mathbf{z}^*$ if it had a very low chance of being generated by the proposal but has high probability $\tilde{p}(\mathbf{z}^*)$ under the target

▪ Transition function of this Markov chain: $T(\mathbf{z}^*|\mathbf{z}^{(\tau)}) = A(\mathbf{z}^*, \mathbf{z}^{(\tau)})\, q(\mathbf{z}^*|\mathbf{z}^{(\tau)})$


▪ Exercise: Show that $T(\mathbf{z}^*|\mathbf{z}^{(\tau)})$ satisfies the detailed balance property $p(\mathbf{z})\, T(\mathbf{z}^{(\tau)}|\mathbf{z}) = p(\mathbf{z}^{(\tau)})\, T(\mathbf{z}|\mathbf{z}^{(\tau)})$
The MH Sampling Algorithm
▪ Initialize $\mathbf{z}^{(1)}$ randomly
▪ For $\ell = 1, 2, \ldots, L$
  ▪ Sample $\mathbf{z}^* \sim q(\mathbf{z}^*|\mathbf{z}^{(\ell)})$ and $u \sim \text{Unif}(0,1)$
  ▪ Compute acceptance probability
    $A(\mathbf{z}^*, \mathbf{z}^{(\ell)}) = \min\left(1, \dfrac{\tilde{p}(\mathbf{z}^*)\, q(\mathbf{z}^{(\ell)}|\mathbf{z}^*)}{\tilde{p}(\mathbf{z}^{(\ell)})\, q(\mathbf{z}^*|\mathbf{z}^{(\ell)})}\right)$
  ▪ If $A(\mathbf{z}^*, \mathbf{z}^{(\ell)}) > u$ (i.e., accept $\mathbf{z}^*$ with probability $A(\mathbf{z}^*, \mathbf{z}^{(\ell)})$)
      $\mathbf{z}^{(\ell+1)} = \mathbf{z}^*$
  ▪ Else
      $\mathbf{z}^{(\ell+1)} = \mathbf{z}^{(\ell)}$
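A minimal Python sketch of this loop, assuming the isotropic Gaussian proposal from the previous slide; since that proposal is symmetric, the $q$-ratio in the acceptance probability cancels (the Metropolis special case discussed below). All function and parameter names here are illustrative, not from the lecture.

```python
import numpy as np

def mh_sampler(p_tilde, z_init, num_iters=5000, sigma=0.5, rng=None):
    """Metropolis-Hastings with a symmetric Gaussian random-walk proposal.

    p_tilde: callable returning the *unnormalized* target density p~(z).
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.atleast_1d(np.asarray(z_init, dtype=float))
    samples = np.empty((num_iters, z.size))
    for ell in range(num_iters):
        # Propose z* ~ N(z* | z, sigma^2 I)
        z_star = z + sigma * rng.standard_normal(z.shape)
        # Acceptance probability; the q-ratio is 1 for a symmetric proposal
        A = min(1.0, p_tilde(z_star) / p_tilde(z))
        if rng.uniform() < A:
            z = z_star              # accept the proposed state
        samples[ell] = z            # else keep the previous state
    return samples

# Toy usage: sample from an unnormalized 1-D standard Gaussian
samples = mh_sampler(lambda z: np.exp(-0.5 * np.sum(z ** 2)), z_init=[3.0])
```

In practice one would compute the acceptance ratio with log densities to avoid numerical underflow, and discard an initial burn-in portion of the chain.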
MH Sampling in Action: A Toy Example..
▪ Target distribution and proposal distribution (illustrated with plots on the slide)
MH Sampling: Some Comments
▪ If the proposal distribution is symmetric, we get the Metropolis sampling algorithm (Metropolis et al., 1953) with

$A(\mathbf{z}^*, \mathbf{z}^{(\tau)}) = \min\left(1, \dfrac{\tilde{p}(\mathbf{z}^*)}{\tilde{p}(\mathbf{z}^{(\tau)})}\right)$
▪ Some limitations of MH sampling


▪ Can sometimes have very slow convergence (also known as slow "mixing"), e.g., for a random-walk proposal $q(\mathbf{z}|\mathbf{z}^{(\tau)}) = \mathcal{N}(\mathbf{z}|\mathbf{z}^{(\tau)}, \sigma^2 \mathbf{I})$ exploring a target whose length scale is $L$:
  ▪ $\sigma$ large ⇒ many rejections
  ▪ $\sigma$ small ⇒ slow diffusion, with roughly $(L/\sigma)^2$ iterations required for convergence


▪ Computing the acceptance probability can be expensive*, e.g., if $p(\mathbf{z}) = \tilde{p}(\mathbf{z})/Z_p$ is some target posterior, then evaluating $\tilde{p}(\mathbf{z})$ would require computing the likelihood on all the data points (expensive)
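Concretely, if $\mathbf{z}$ denotes the parameters of a Bayesian model with $N$ i.i.d. observations $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ (a standard setting, assumed here only for illustration), the unnormalized posterior appearing in the acceptance ratio is

$\tilde{p}(\mathbf{z}) = p(\mathbf{z}) \prod_{n=1}^{N} p(\mathbf{x}_n \mid \mathbf{z})$

so every proposed $\mathbf{z}^*$ requires a full pass over the $N$ data points just to decide whether to accept it.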
*Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget (Korattikara et al., 2014); Firefly Monte Carlo: Exact MCMC with Subsets of Data (Maclaurin and Adams, 2015)
Gibbs Sampling (Geman & Geman, 1984)
▪ Goal: Sample from a joint distribution 𝑝(𝒛) where 𝒛 = [𝑧1 , 𝑧2 , … , 𝑧𝑀 ]

▪ Suppose we can’t sample from 𝑝(𝒛) but can sample from each conditional 𝑝(𝑧𝑖 |𝒛−𝑖 )
▪ In Bayesian models, can be done easily if we have a locally conjugate model

▪ For Gibbs sampling, the proposal is the conditional distribution 𝑝(𝑧𝑖 |𝒛−𝑖 )

▪ Gibbs sampling samples from these conditionals in a cyclic order

▪ Gibbs sampling is equivalent to MH sampling with acceptance probability = 1, since only one component is changed at a time; hence there is no need to compute the acceptance probability
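To see this, note that the Gibbs proposal for updating component $i$ is $q(\mathbf{z}^*|\mathbf{z}) = p(z_i^*|\mathbf{z}_{-i})$ with the remaining components unchanged, $\mathbf{z}^*_{-i} = \mathbf{z}_{-i}$, so

$A(\mathbf{z}^*, \mathbf{z}) = \min\left(1, \dfrac{p(\mathbf{z}^*)\, q(\mathbf{z}|\mathbf{z}^*)}{p(\mathbf{z})\, q(\mathbf{z}^*|\mathbf{z})}\right) = \min\left(1, \dfrac{p(z_i^*|\mathbf{z}_{-i})\, p(\mathbf{z}_{-i})\, p(z_i|\mathbf{z}_{-i})}{p(z_i|\mathbf{z}_{-i})\, p(\mathbf{z}_{-i})\, p(z_i^*|\mathbf{z}_{-i})}\right) = 1$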
Gibbs Sampling: Sketch of the Algorithm
▪ 𝑀: Total number of variables, 𝑇: number of Gibbs sampling iterations
Assuming 𝒛 = [𝑧1 , 𝑧2 , … , 𝑧𝑀 ]

▪ The CP of each component of $\mathbf{z}$ uses the most recent values (from this or the previous iteration) of all the other components, as in the sketch below
▪ Each iteration gives us one sample $\mathbf{z}^{(\tau)}$ of $\mathbf{z} = [z_1, z_2, \ldots, z_M]$
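In its standard cyclic-scan form, a sketch of the sampler (following the usual textbook presentation, with the notation above) is:

Initialize $z_1^{(1)}, z_2^{(1)}, \ldots, z_M^{(1)}$
For $\tau = 1, \ldots, T$:
    $z_1^{(\tau+1)} \sim p(z_1 \mid z_2^{(\tau)}, z_3^{(\tau)}, \ldots, z_M^{(\tau)})$
    $z_2^{(\tau+1)} \sim p(z_2 \mid z_1^{(\tau+1)}, z_3^{(\tau)}, \ldots, z_M^{(\tau)})$
    $\vdots$
    $z_M^{(\tau+1)} \sim p(z_M \mid z_1^{(\tau+1)}, z_2^{(\tau+1)}, \ldots, z_{M-1}^{(\tau+1)})$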

▪ Note: Order of updating the variables usually doesn’t matter (but see “Scan Order in Gibbs
Sampling: Models in Which it Matters and Bounds on How Much” from NIPS 2016)
Gibbs Sampling: A Simple Example
▪ Can sample from a 2-D Gaussian using 1-D Gaussians

▪ The conditional distribution of $z_1$ given $z_2$ is Gaussian, and the conditional distribution of $z_2$ given $z_1$ is also Gaussian
▪ Gibbs sampling then looks like doing a coordinate-wise update to generate each successive sample of $\mathbf{z} = [z_1, z_2]$ (the slide illustrates this over the contours of a 2-D Gaussian)
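A minimal Python sketch of this 2-D example, assuming a zero-mean bivariate Gaussian target with unit variances and correlation $\rho$ (these particular parameter values are illustrative, not taken from the slide):

```python
import numpy as np

rho = 0.8                          # assumed correlation of the toy 2-D Gaussian target
rng = np.random.default_rng(0)

def gibbs_2d_gaussian(num_iters=5000):
    """Gibbs sampling for a zero-mean, unit-variance bivariate Gaussian."""
    z1, z2 = 0.0, 0.0              # arbitrary initialization
    samples = np.empty((num_iters, 2))
    for t in range(num_iters):
        # p(z1 | z2) = N(rho * z2, 1 - rho^2): a 1-D Gaussian
        z1 = rng.normal(rho * z2, np.sqrt(1.0 - rho ** 2))
        # p(z2 | z1) = N(rho * z1, 1 - rho^2), using the freshly updated z1
        z2 = rng.normal(rho * z1, np.sqrt(1.0 - rho ** 2))
        samples[t] = (z1, z2)      # one full sweep gives one sample of z = [z1, z2]
    return samples

samples = gibbs_2d_gaussian()
print(samples.mean(axis=0))             # should be close to [0, 0]
print(np.corrcoef(samples.T)[0, 1])     # should be close to rho
```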

Gibbs Sampling: Some Comments
▪ One of the most popular MCMC algorithms

▪ Very easy to derive and implement for locally conjugate models

▪ Many variations exist, e.g.,


▪ Blocked Gibbs: sample more than one component jointly (sometimes possible)
▪ Rao-Blackwellized Gibbs: Can collapse (i.e., integrate out) the unneeded components while
sampling. Also called “collapsed” Gibbs sampling
▪ MH within Gibbs: if the CPs are not easy to sample from, replace the exact conditional draw with an MH step

▪ Instead of sampling from CPs, an alternative is to use the mode of the CPs
▪ Called the “Iterative Conditional Mode” (ICM) algorithm
▪ ICM doesn’t give the posterior though – it’s more like ALT-OPT to get (approx) MAP estimate

Coming Up Next
▪ Using posterior’s gradient info in sampling algorithms
▪ Online MCMC algorithms
▪ Recent advances in MCMC
▪ Some other practical issues (convergence etc)
