Bayes 2021 Part1
A. Colin Cameron
Univ. of Calif. - Davis
May 2021
A. Colin Cameron Univ. of Calif. - Davis . . () Bayesian Methods: Part 1 May 2021 1 / 44
1. Introduction
Outline
1 Introduction
2 Bayesian Approach
3 Normal-normal Example
4 MCMC Example using Stata command bayes:
5 Markov Chain Monte Carlo Methods
6 Further discussion
7 Appendix: Accept/reject method
8 Some references
A. Colin Cameron Univ. of Calif. - Davis . . () Bayesian Methods: Part 1 May 2021 3 / 44
2. Bayesian Approach
2. Bayesian Approach Posterior density
p(θ | y, X) = p(θ, y, X) / p(y, X)
p(θ | y, X) = L(y | θ, X) π(θ) / m(y | X)
- m(y | X) = ∫ L(y | θ, X) π(θ) dθ is called the marginal likelihood
  - problem: there is usually no tractable expression for m(y | X).
In general
Posterior ∝ Likelihood × Prior
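As an illustration of the role of the normalizing constant, the following sketch (my own example, not from the slides) approximates m(y) by numerical integration for a one-parameter model: a Bernoulli likelihood with hypothetical data of 7 successes in 10 trials and a uniform prior.

```python
# Illustrative sketch: for scalar theta the marginal likelihood can be
# approximated on a grid.  Hypothetical data: 7 successes in 10 Bernoulli
# trials, uniform prior on theta.
y, n = 7, 10
dtheta = 1 / 1000
grid = [i * dtheta for i in range(1, 1000)]          # theta in (0, 1)
prior = [1.0] * len(grid)                            # pi(theta), uniform
like = [t ** y * (1 - t) ** (n - y) for t in grid]   # L(y | theta)

# m(y) = integral of L(y|theta) pi(theta) dtheta, by a Riemann sum
m_y = sum(l * p for l, p in zip(like, prior)) * dtheta

# Posterior = Likelihood x Prior / m(y); its mean is near (y+1)/(n+2)
post = [l * p / m_y for l, p in zip(like, prior)]
post_mean = sum(t * d for t, d in zip(grid, post)) * dtheta
print(post_mean)
```

This grid approach only works in very low dimensions, which is why the later slides turn to MCMC.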
A. Colin Cameron Univ. of Calif. - Davis . . () Bayesian Methods: Part 1 May 2021 5 / 44
3. Normal-normal Example
Prior N[5, 3] and likelihood N[10, 2] yield posterior N[8, 1.2] for θ
[Figure: prior, likelihood, and posterior densities for θ, plotted over (0, 20).]
3. Normal-normal Example
Classical inference: θ̂ = ȳ = 10, with θ̂ ~ N[10, 2]
- A 95% confidence interval for θ is 10 ± 1.96√2 = (7.23, 12.77)
- i.e. if we sampled many times, then 95% of the time a similarly constructed confidence interval would include the unknown constant θ.
Bayesian inference: posterior θ ~ N[8, 1.2]
- A 95% posterior interval for θ is 8 ± 1.96√1.2 = (5.85, 10.15)
- i.e. with probability 0.95 the true value of θ lies in this interval.
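The posterior on this slide comes from the standard precision-weighted normal-normal update; a minimal sketch reproducing the numbers:

```python
import math

# Sketch of the normal-normal update behind this slide: prior N[5, 3] and
# likelihood N[10, 2], with the second bracket entry a variance throughout.
prior_mean, prior_var = 5.0, 3.0
like_mean, like_var = 10.0, 2.0

# Posterior precision is the sum of the precisions; the posterior mean is the
# precision-weighted average of the prior mean and the sample mean.
post_var = 1 / (1 / prior_var + 1 / like_var)
post_mean = post_var * (prior_mean / prior_var + like_mean / like_var)

# 95% intervals: classical (around ybar = 10) and Bayesian (around 8)
ci = (like_mean - 1.96 * math.sqrt(like_var), like_mean + 1.96 * math.sqrt(like_var))
cred = (post_mean - 1.96 * math.sqrt(post_var), post_mean + 1.96 * math.sqrt(post_var))
print(post_mean, post_var)
print(ci, cred)
```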
3. Normal-normal Example Tractable results are rare
- using a conjugate prior is like augmenting the data with a sample from the same distribution
- for the Normal with precision matrix Σ⁻¹, the gamma generalizes to the Wishart.
But in general tractable results are not available
- so use numerical methods, notably MCMC.
- using tractable results in subcomponents of MCMC can speed up computation.
4. MCMC Example using Stata command bayes:
4. MCMC Example using Stata command bayes:
MCMC Example
Burn-in ...
Simulation ...
Model summary
Likelihood:
lnearnings ~ regress(xb_lnearnings,{sigma2})
Priors:
{lnearnings:education age _cons} ~ normal(0,10000) (1)
{sigma2} ~ igamma(.01,.01)
4. MCMC Example using Stata command bayes:
Equal-tailed
Mean Std. dev. MCSE Median [95% cred. interval]
lnearnings
education .0871874 .0217776 .000819 .0868041 .0471493 .1312628
age .008496 .0062873 .000231 .0089316 -.0037933 .0208249
_cons 9.198406 .4482471 .016292 9.196124 8.319206 10.09851
4. MCMC Example using Stata command bayes: Diagnostics
[Figure: bayesgraph diagnostics panels: trace plot over 10,000 iterations, histogram of the draws, autocorrelations up to lag 40, and kernel density estimates for all draws, the first half, and the second half.]
4. MCMC Example using Stata command bayes: Convergence of Chain
Convergence of Chain
There is no formal test.
Can do multiple independent chains and see if the variability of the
posterior mean of θ across chains is small, relative to the variation of
draws of θ within each chain.
Consider the jth of m chains
- θ̂_j = posterior mean and s²_j = posterior variance of chain j
B measures variation between chains
- B = [1/(m−1)] Σ_{j=1}^m (θ̂_j − θ̄)², where θ̄ = (1/m) Σ_{j=1}^m θ̂_j
W measures variation in θ within chains
- W = (1/m) Σ_{j=1}^m s²_j
The Gelman-Rubin statistic Rc ≃ (W + B)/W
- Actually uses an adjustment for the finite number of chains
- A common threshold is Rc < 1.1 (equivalently B/W < 0.1).
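A minimal sketch of the B, W, and Rc computation (using the simple formula from this slide, without the finite-chain adjustment that Stata applies; the simulated chains are my own illustration):

```python
import random

def gelman_rubin(chains):
    """chains: list of m lists of draws of a scalar parameter."""
    m = len(chains)
    means = [sum(c) / len(c) for c in chains]                    # theta_hat_j
    variances = [sum((x - mu) ** 2 for x in c) / (len(c) - 1)    # s_j^2
                 for c, mu in zip(chains, means)]
    grand = sum(means) / m                                       # theta_bar
    B = sum((mu - grand) ** 2 for mu in means) / (m - 1)         # between
    W = sum(variances) / m                                       # within
    return (W + B) / W                                           # Rc

# Chains that mix well give Rc near 1; an off-target chain inflates B and Rc.
random.seed(10101)
good = [[random.gauss(8, 1) for _ in range(5000)] for _ in range(5)]
bad = good[:4] + [[random.gauss(12, 1) for _ in range(5000)]]
print(gelman_rubin(good), gelman_rubin(bad))
```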
4. MCMC Example using Stata command bayes: Convergence of Chain
Equal-tailed
Mean Std. dev. MCSE Median [95% cred. interval]
lnearnings
education .085597 .0222416 .000371 .0855127 .0416117 .12877
age .0079981 .0063156 .000096 .0081201 -.0044435 .0202879
_cons 9.241303 .4537841 .007116 9.23721 8.355778 10.14552
4. MCMC Example using Stata command bayes: Convergence of Chain
Number of chains = 5
MCMC size, per chain = 10,000
Max Gelman–Rubin Rc = 1.002092
Rc
lnearnings
education 1.00161
age 1.001305
_cons 1.002092
sigma2 1.000309
4. MCMC Example using Stata command bayes: Some bayes: code
* Estimation
bayes, rseed(10101): regress y x
* Summary statistics for model parameters
bayesstats summary {y:x}
* Probability that slope is in range 0.4 to 0.6
bayestest interval {y:x}, lower(0.4) upper(0.6)
* Effective sample size
bayesstats ess
* Graphical Diagnostics
bayesgraph diagnostics {y:x}
* Convergence diagnostics
bayes, rseed(10101) nchains(5): regress y x
bayesstats grubin
5. Markov chain Monte Carlo (MCMC) methods
5. Markov chain Monte Carlo (MCMC) methods
Markov Chains
5. Markov chain Monte Carlo (MCMC) methods Metropolis Algorithm
Metropolis Algorithm
We want to draw from the posterior p(·) but usually cannot do so directly.
Metropolis draws from a candidate distribution g(θ^(s) | θ^(s−1))
- these draws are sometimes accepted and sometimes not
- like the accept-reject method, but does not require p(·) ≤ k g(·)
Metropolis algorithm at the sth round
- draw candidate θ* from the candidate distribution g(·)
- the candidate distribution g(θ^(s) | θ^(s−1)) needs to be symmetric
  - so it must satisfy g(θ_a | θ_b) = g(θ_b | θ_a)
- draw u from uniform[0, 1]
- set θ^(s) = θ* if u < p(θ*) / p(θ^(s−1)), and θ^(s) = θ^(s−1) otherwise.
5. Markov chain Monte Carlo (MCMC) methods Metropolis Algorithm
For proof that the Markov chain converges to the desired distribution see, for example, Cameron and Trivedi (2005), p. 451
- the proof requires that the candidate distribution is symmetric.
Taking logs:
- set θ^(s) = θ* if ln u < ln p(θ*) − ln p(θ^(s−1)), and θ^(s) = θ^(s−1) otherwise.
Random-walk Metropolis draws θ* ~ N[θ^(s−1), V] for fixed V
- ideally V is such that 25-50% of candidate draws are accepted.
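A minimal random-walk Metropolis sketch (my illustration, not the slides' code), targeting the N[8, 1.2] posterior of the normal-normal example and using the log form of the acceptance rule; the proposal standard deviation 2.5 is an assumption chosen to land in the suggested 25-50% acceptance range:

```python
import math
import random

def log_p(theta):
    # ln p(theta) up to an additive constant, for a N[8, 1.2] target
    return -0.5 * (theta - 8.0) ** 2 / 1.2

random.seed(10101)
theta, draws, accepted = 0.0, [], 0
for s in range(20000):
    cand = random.gauss(theta, 2.5)          # symmetric candidate N[theta^(s-1), V]
    # accept when ln u < ln p(theta*) - ln p(theta^(s-1))
    if math.log(random.random()) < log_p(cand) - log_p(theta):
        theta, accepted = cand, accepted + 1
    draws.append(theta)

kept = draws[2000:]                          # discard burn-in
print(sum(kept) / len(kept), accepted / 20000)
```

The retained draws have mean near 8, and the acceptance rate falls in the targeted range.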
5. Markov chain Monte Carlo (MCMC) methods Metropolis-Hastings Algorithm
Metropolis-Hastings Algorithm
Metropolis-Hastings is a generalization
- the candidate distribution g(θ^(s) | θ^(s−1)) need not be symmetric
- the acceptance rule is then u < [p(θ*) g(θ^(s−1) | θ*)] / [p(θ^(s−1)) g(θ* | θ^(s−1))]
- the Metropolis algorithm itself is often called Metropolis-Hastings.
Independence chain MH uses g(θ^(s)) not depending on θ^(s−1), where g(·) is a good approximation to p(·)
- e.g. do ML for p(θ), and then g(θ) is multivariate t with mean θ̂ and variance V̂[θ̂]
- multivariate t rather than normal as it has fatter tails.
M and MH are called Markov chain Monte Carlo
- because θ^(s) given θ^(s−1) is a first-order Markov chain
- Markov chain theory proves convergence to draws from p(·) as s → ∞
- a poor choice of candidate distribution leads to a chain that gets stuck in place.
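A sketch of independence chain MH (my illustration): the fixed candidate density is N[7, 2], a rough approximation to the N[8, 1.2] target. The slides suggest a multivariate t candidate; a normal with inflated variance is used here only to keep the one-dimensional example short.

```python
import math
import random

def log_p(theta):
    return -0.5 * (theta - 8.0) ** 2 / 1.2   # ln target, up to a constant

def log_g(theta):
    return -0.5 * (theta - 7.0) ** 2 / 2.0   # ln candidate density

random.seed(10101)
theta, draws = 7.0, []
for s in range(20000):
    cand = random.gauss(7.0, math.sqrt(2.0))  # candidate ignores theta^(s-1)
    # accept when u < [p(cand) g(theta)] / [p(theta) g(cand)], in logs
    if math.log(random.random()) < (log_p(cand) + log_g(theta)
                                    - log_p(theta) - log_g(cand)):
        theta = cand
    draws.append(theta)

kept = draws[2000:]                           # discard burn-in
print(sum(kept) / len(kept))
```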
5. Markov chain Monte Carlo (MCMC) methods Gibbs sampler
Gibbs sampler
Gibbs sampler (a general method for making draws)
- draw (Y1, Y2) by alternating draws from f(y1 | y2) and f(y2 | y1)
- after many draws this gives draws from the joint f(y1, y2), even though only the conditional distributions are ever sampled.
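A sketch of the Gibbs sampler for a standard bivariate normal with correlation 0.8 (my illustration; the value of ρ is an assumption), where each full conditional is itself normal:

```python
import math
import random

# For a standard bivariate normal with correlation rho, the full conditionals
# are Y1 | Y2 = y2 ~ N[rho * y2, 1 - rho^2] and symmetrically for Y2 | Y1.
rho = 0.8
cond_sd = math.sqrt(1 - rho ** 2)

random.seed(10101)
y1, y2, draws = 0.0, 0.0, []
for s in range(20000):
    y1 = random.gauss(rho * y2, cond_sd)   # draw from f(y1 | y2)
    y2 = random.gauss(rho * y1, cond_sd)   # draw from f(y2 | y1)
    draws.append((y1, y2))

kept = draws[2000:]                        # discard burn-in
# Both margins are standard normal, so E[Y1*Y2] equals the correlation rho
corr = sum(a * b for a, b in kept) / len(kept)
print(corr)
```

The empirical correlation of the retained pairs is close to ρ even though only the conditionals were ever used.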
5. Markov chain Monte Carlo (MCMC) methods Correlated Draws
Correlated Draws
5. Markov chain Monte Carlo (MCMC) methods Stata bayes: and bayesmh commands
The following command gives exactly the same results as the earlier
bayes, rseed(10101): regress lnearnings education age
bayesmh command example
bayesmh lnearnings education age, likelihood(normal({sigma2})) ///
prior({lnearnings:education}, normal(0,10000)) ///
prior({lnearnings:age}, normal(0,10000)) ///
prior({lnearnings:_cons},normal(0,10000)) ///
prior({sigma2},igamma(0.01,0.01)) rseed(10101) ///
block({lnearnings: education age _cons}) block({sigma2})
6. Further discussion Specification of prior
6. Further discussion Informative Prior example
Convergence of MCMC
6. Further discussion Bayesian model selection
B12 = [L1(y | θ̂1, X) / L2(y | θ̂2, X)] × N^((k2 − k1)/2)
- Here model 1 is nested in model 2, and due to asymptotics the prior has no influence (so the ratio of posteriors is the ratio of likelihoods)
- This is the Bayesian information criterion (BIC) or Schwarz criterion.
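An arithmetic sketch of this Bayes factor, with hypothetical values of N, the fitted log-likelihoods, and the parameter counts (all made up for illustration):

```python
import math

# Model 1 (k1 = 3 parameters) nested in model 2 (k2 = 5), N observations,
# and hypothetical fitted log-likelihoods (model 2 fits slightly better).
N = 872
loglik1, loglik2 = -548.2, -546.9
k1, k2 = 3, 5

# ln B12 = (ln L1 - ln L2) + ((k2 - k1) / 2) * ln N
ln_B12 = (loglik1 - loglik2) + ((k2 - k1) / 2) * math.log(N)
print(ln_B12)
```

Here ln B12 is positive: the dimension penalty outweighs model 2's better fit, which is exactly the BIC trade-off.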
6. Further discussion What does it mean to be Bayesian?
7. Appendix: Accept/reject method
A draw x* from the candidate density g(·) is accepted when u ≤ f(x*) / [k g(x*)], where u ~ uniform[0, 1] and the constant k satisfies f(x) ≤ k g(x) for all x.
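A sketch of the accept-reject method (my illustration): draws from the Beta(2,2) density f(x) = 6x(1−x) using a uniform[0, 1] candidate g and envelope constant k = 1.5:

```python
import random

# Target f(x) = 6x(1-x) on [0, 1] has maximum 1.5 at x = 0.5, so with the
# uniform candidate g(x) = 1 the envelope constant k = 1.5 gives f <= k g.
def f(x):
    return 6 * x * (1 - x)

k = 1.5
random.seed(10101)
draws = []
while len(draws) < 10000:
    x = random.random()         # candidate draw from g
    u = random.random()
    if u <= f(x) / (k * 1.0):   # accept when u <= f(x) / (k g(x))
        draws.append(x)

# Accepted draws follow f; the mean of Beta(2,2) is 0.5
print(sum(draws) / len(draws))
```

The overall acceptance rate is 1/k, so a tight envelope (small k) wastes fewer candidate draws.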
8. Some References
Chapter 13 “Bayesian Methods” in A. Colin Cameron and Pravin K. Trivedi,
Microeconometrics: Methods and Applications, Cambridge University Press.
Chapter 29 “Bayesian Methods: basics” in A. Colin Cameron and Pravin K.
Trivedi, Microeconometrics using Stata, Second edition, forthcoming.
Bayesian books by econometricians that feature MCMC are
- Geweke, J. (2003), Contemporary Bayesian Econometrics and Statistics, Wiley.
- Koop, G., Poirier, D.J., and J.L. Tobias (2007), Bayesian Econometric Methods, Cambridge University Press.
- Koop, G. (2003), Bayesian Econometrics, Wiley.
- Lancaster, T. (2004), Introduction to Modern Bayesian Econometrics, Wiley.