
COMMON BAYESIAN ESTIMATION “TRICKS”

AAEC 6984 / SPRING 2014


INSTRUCTOR: KLAUS MOELTNER

Assume throughout that θ1, θ2, and z are random elements (scalars or vectors), and y symbolizes observed data. The letter p will denote a generic distribution or probability.

Breaking a joint density into marginals and conditionals


Example:

p(θ1, θ2, z) = p(θ1) p(θ2, z | θ1) = p(θ1) p(θ2 | θ1) p(z | θ1, θ2)    (1)

Of course, other split-ups are possible as well. The split-up strategy is usually chosen so as to leave as many known densities as possible on the right-hand side. If the original joint density is already conditioned on some other variable, that conditioning is carried through all subsequent components.

Example:

p(θ1, θ2, z | y) = p(θ1 | y) p(θ2, z | θ1, y) = p(θ1 | y) p(θ2 | θ1, y) p(z | θ1, θ2, y)    (2)
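
For instance, the same conditional joint can equally well be split starting from z:

p(θ1, θ2, z | y) = p(z | y) p(θ1 | z, y) p(θ2 | θ1, z, y)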

Turning a marginal into an integrated joint density


Example:

p(θ1) = ∫_{θ2,z} p(θ1, θ2, z) dz dθ2 = ∫_{θ2,z} p(θ1 | θ2, z) p(θ2, z) dz dθ2    (3)

Same example with pre-existing conditioning:

p(θ1 | y) = ∫_{θ2,z} p(θ1, θ2, z | y) dz dθ2 = ∫_{θ2,z} p(θ1 | θ2, z, y) p(θ2, z | y) dz dθ2    (4)

Obtaining draws from an unknown marginal by drawing from a known conditional


Continuing with the above example: if p(θ1 | θ2, z, y) is known, and if draws of θ2 and z from p(θ2, z | y) are available (or, in rare cases, p(θ2, z | y) is itself known), draws from p(θ1 | y) can be obtained by drawing from p(θ1 | θ2,r, zr, y) for many different draws θ2,r, zr from p(θ2, z | y).
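
A minimal sketch of this composition step in Python, dropping z for brevity and assuming, purely for illustration, that draws of θ2 from p(θ2 | y) are already in hand (faked below as standard normals) and that the known conditional is p(θ1 | θ2, y) = N(θ2, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for R available draws of theta2 from p(theta2 | y)
# (hypothetically N(0, 1) in this toy illustration).
R = 10_000
theta2_draws = rng.standard_normal(R)

# Composition step: one draw from the known conditional per theta2 draw.
# Each resulting theta1 draw is a draw from the marginal p(theta1 | y).
theta1_draws = rng.normal(loc=theta2_draws, scale=1.0)

# Under these toy assumptions the implied marginal is N(0, 2):
print(theta1_draws.mean(), theta1_draws.var())  # approx. 0 and 2
```

Each θ1,r is generated against a different θ2,r, so the collection of θ1 draws mixes over p(θ2 | y) exactly as the integral in (4) requires.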

The Gibbs sampler is a special case of this strategy with a built-in reciprocity condition. Dropping z for convenience and without loss of generality, assume we need draws from p(θ1 | y), but we only know the form of p(θ1 | θ2, y). Using the integration trick and the "breaking up a joint" trick, we obtain:
p(θ1 | y) = ∫_{θ2} p(θ1, θ2 | y) dθ2 = ∫_{θ2} p(θ1 | θ2, y) p(θ2 | y) dθ2    (5)

The problem here is that we don't know the other marginal either, i.e. we don't know p(θ2 | y). However, if we have one draw of θ2 (our starting value for the Gibbs sampler), we can take a single draw of θ1 from p(θ1 | θ2, y). By the reasoning above, this will also be a draw from the marginal p(θ1 | y). We can then set up the reverse integration problem for θ2, i.e.

p(θ2 | y) = ∫_{θ1} p(θ1, θ2 | y) dθ1 = ∫_{θ1} p(θ2 | θ1, y) p(θ1 | y) dθ1    (6)

If p(θ2 | θ1, y) is known, we can draw θ2 from it (conditioning on the draw of θ1 we just obtained from the first step). This will also be a draw from p(θ2 | y). This process is then repeated many times to yield draws from the entire support of p(θ1 | y) and p(θ2 | y).
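
A minimal Gibbs sketch in Python, assuming, purely for illustration, that p(θ1, θ2 | y) is bivariate normal with zero means, unit variances, and correlation ρ, so that both full conditionals are known univariate normals, θ1 | θ2, y ~ N(ρθ2, 1 − ρ²) and symmetrically for θ2:

```python
import numpy as np

rng = np.random.default_rng(1)

rho = 0.8                      # posterior correlation in this toy example
R, burn = 20_000, 1_000
theta1 = np.empty(R)
theta2 = np.empty(R)
t2 = 0.0                       # starting value for theta2

for r in range(R):
    # Draw theta1 from p(theta1 | theta2, y), then theta2 from p(theta2 | theta1, y).
    t1 = rng.normal(rho * t2, np.sqrt(1 - rho**2))
    t2 = rng.normal(rho * t1, np.sqrt(1 - rho**2))
    theta1[r], theta2[r] = t1, t2

# Discard burn-in; the retained draws approximate the marginals
# p(theta1 | y) and p(theta2 | y).
theta1, theta2 = theta1[burn:], theta2[burn:]
print(np.corrcoef(theta1, theta2)[0, 1])  # approx. 0.8
```

The reciprocity condition shows up inside the loop: each conditional draw feeds the other, so the chain alternates between (5) and (6).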

Monte Carlo integration


Another flavor of the integration trick arises when we wish to evaluate the marginal (or any other unknown density) at a specific point (say θ̄1, conditional on y), or to evaluate a specific single-valued function of θ1 (say its expectation, E(θ1 | y)). This works well when we already have draws of the remaining model parameters from their respective marginal densities.

Example:

p(θ̄1 | y) = ∫_{θ2} p(θ̄1, θ2 | y) dθ2 = ∫_{θ2} p(θ̄1 | θ2, y) p(θ2 | y) dθ2    (7)


If we know p(θ1 | θ2, y), and we have draws of θ2 from p(θ2 | y), we can approximate p(θ̄1 | y) via:

p(θ̄1 | y) ≈ (1/R) Σ_{r=1}^{R} p(θ̄1 | θ2,r, y)    (8)

using r = 1 . . . R draws of θ2 from p (θ2 |y).
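
Continuing the toy bivariate-normal Gibbs example above, (8) amounts to averaging the known conditional density over the retained θ2 draws; under those assumptions the true marginal of θ1 is N(0, 1), which provides a check:

```python
import numpy as np
from scipy.stats import norm

# Evaluate p(theta1_bar | y) at a chosen point via eq. (8): average the
# conditional density p(theta1_bar | theta2_r, y) over the theta2 draws
# (rho and theta2 come from the Gibbs sketch above).
theta1_bar = 0.5
dens_hat = norm.pdf(theta1_bar, loc=rho * theta2, scale=np.sqrt(1 - rho**2)).mean()
print(dens_hat, norm.pdf(theta1_bar))  # estimate vs. true N(0, 1) density
```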

Similarly, for E(θ1 | y):

E(θ1 | y) = ∫_{θ1} ∫_{θ2} θ1 p(θ1, θ2 | y) dθ2 dθ1 = ∫_{θ2} [∫_{θ1} θ1 p(θ1 | θ2, y) dθ1] p(θ2 | y) dθ2 = ∫_{θ2} E(θ1 | θ2, y) p(θ2 | y) dθ2    (9)

which can be approximated via:

E(θ1 | y) ≈ (1/R) Σ_{r=1}^{R} θ1,r    (10)

using r = 1 . . . R draws of θ1 from p(θ1 | θ2,r, y), which themselves are based on r = 1 . . . R draws of θ2 from p(θ2 | y).
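
In the same toy example, the approximation in (10) is just the sample mean of the retained θ1 draws; averaging the conditional means E(θ1 | θ2,r, y) = ρθ2,r instead approximates the last integral in (9) directly:

```python
# Approximate E(theta1 | y) per eq. (10): the mean of the theta1 draws, each
# of which was drawn from p(theta1 | theta2_r, y) in the Gibbs loop above.
print(theta1.mean())            # approx. 0, the true E(theta1 | y) here

# Equivalent route via the last integral in (9): average the conditional
# means E(theta1 | theta2_r, y) = rho * theta2_r over the theta2 draws.
print((rho * theta2).mean())    # also approx. 0
```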

The same logic holds for any other (smooth, continuous) function g(θ1 | y); this is exploited when generating posterior predictive distributions (PPDs).
