bayesian-course-2-short
Ben Lambert
[email protected]
4 Sampling
1 Model testing through posterior predictive checks
4 Sampling
Example: Modelling rainfall in Oxford
Measure the average rainfall by month in Oxford.
[Figure: Modelling rainfall in Oxford. Monthly rainfall (mm) against time.]
Scenario: modelling Oxford rainfall for farmers
[Figures: rainfall series against time (axis 0 to 100), followed by a rainfall indicator (0 or 1) against time.]
Choosing a likelihood
Conditions:
Xt ∈ {0, 1} is a discrete random variable (Xt = 1 if rainfall in month t exceeds the monthly average).
Assume independence among the Xt.
Assume an identical distribution for each Xt; the probability of
rainfall exceeding the monthly average is θ.
⟹ Bernoulli likelihood for each individual Xt.
The Bernoulli likelihood
$$L(\theta \mid X_t = 1) = \theta^{1} (1 - \theta)^{0} = \theta \tag{3}$$
Therefore here the sampling distribution is discrete, whereas
the likelihood is a continuous function of θ. The likelihood is not a
probability distribution: for Xt = 1 the area under it is 0.5, not 1.
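The distinction can be checked numerically. The sketch below is illustrative only: the value theta_fixed = 0.75 and the midpoint-rule integration are assumptions chosen for the example, not taken from the slides. It holds θ fixed and sums over the data, then holds the data fixed at Xt = 1 and integrates over θ.

```python
# Bernoulli probability of outcome x (0 or 1) given parameter theta.
def bernoulli(x, theta):
    return theta**x * (1 - theta)**(1 - x)

# Sampling distribution: hold theta fixed, vary the data x.
theta_fixed = 0.75  # hypothetical value, chosen only for illustration
sampling_total = sum(bernoulli(x, theta_fixed) for x in (0, 1))

# Likelihood: hold the data fixed at x = 1, vary theta over [0, 1].
# Midpoint rule with n equal-width strips.
n = 1000
width = 1.0 / n
likelihood_area = sum(bernoulli(1, (i + 0.5) * width) * width for i in range(n))

print("sum over x of p(x | theta):", sampling_total)
print("area under L(theta | x = 1):", likelihood_area)
```

The first quantity is a sum of probabilities over all possible data, so it equals 1; the second is the area under the line L(θ) = θ, which is why the likelihood does not behave like a probability density.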
Likelihood vs sampling distribution
[Figure: sampling distribution, probability against Xt (rainfall above monthly average), for Xt = 0 and Xt = 1.]
[Figure: likelihood against θ (probability monthly rain is above average), for θ from 0 to 1; the area under the likelihood is marked as area = 0.5.]
The overall likelihood
Defined:
“The probability distribution for a new data sample X̃ given
our current data X.”
We obtain this by the following recipe:
1 Sample a value of θi from the posterior:
$$\theta_i \sim p(\theta \mid X) \tag{5}$$
[Figure: real rainfall indicator series against year; longest run of wet months Nmaxreal = 7.]
[Figures: simulated rainfall indicator series against year, 2010 to 2015 ("Another sample.", "A further sample.", "A number of samples."), each labelled with its longest run, for example Nmaxsim = 5, 4, 7 and 2.]
Scenario: p value
[Figure: histogram of the max run of wet months (2 to 12) across simulated series, with counts up to about 2000.]
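A posterior predictive check of this kind can be sketched as follows. Several pieces here are assumptions made for illustration and are not in the slides: the observed series is a made-up indicator sequence (built so that its longest wet run is 7, matching Nmaxreal), the prior on θ is uniform (so the posterior is a Beta distribution), and the second step of the recipe is taken to be the standard one of drawing new data from the Bernoulli sampling distribution given each sampled θi.

```python
import random

def max_run(series):
    """Length of the longest run of consecutive 1s in a 0/1 series."""
    longest = current = 0
    for x in series:
        current = current + 1 if x == 1 else 0
        longest = max(longest, current)
    return longest

def posterior_predictive_p_value(observed, n_sims=2000, seed=1):
    """Fraction of simulated series whose longest run is at least the real one."""
    rng = random.Random(seed)
    n, wet = len(observed), sum(observed)
    t_real = max_run(observed)
    exceed = 0
    for _ in range(n_sims):
        # Step 1: draw theta_i from the posterior.
        # Assumes a uniform prior, giving a Beta(1 + wet, 1 + n - wet) posterior.
        theta = rng.betavariate(1 + wet, 1 + n - wet)
        # Step 2: simulate a new series from the Bernoulli sampling distribution.
        simulated = [1 if rng.random() < theta else 0 for _ in range(n)]
        if max_run(simulated) >= t_real:
            exceed += 1
    return t_real, exceed / n_sims

# Hypothetical data: 60 monthly indicators with one run of 7 wet months.
observed = [1] * 7 + [0] * 8 + [1, 0, 0] * 15

t_real, p_value = posterior_predictive_p_value(observed)
print("longest real run:", t_real)
print("posterior predictive p value:", p_value)
```

A p value near 0 or 1 would suggest the independent Bernoulli model struggles to reproduce the long runs seen in the data; the choice of "longest run" as the test statistic follows the figures above.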
Example problem: paternal discrepancy
$$p(X = 2) = \int_0^1 p(X = 2 \mid \theta) \times p(\theta) \, d\theta \tag{8}$$
[Figure: three panels against θ (PD prevalence), %, from 0 to 100: prior pdf, likelihood, and likelihood × prior.]
The denominator as an area
[Figure: prior pdf, likelihood, and likelihood × prior against θ (PD prevalence), %, from 0 to 100; the area under likelihood × prior is Pr(X = 2) ≃ 0.08.]
Calculating the denominator in 1 dimension
$$p(X = 2) = \int_0^1 p(X = 2 \mid \theta) \times p(\theta) \, d\theta \tag{9}$$
This is equivalent to working out an area under a curve.
[Figure: likelihood × prior against θ (PD prevalence), %, from 0 to 100; area Pr(X = 2) ≃ 0.08.]
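As a rough illustration of the one-dimensional case, the area can be approximated with the trapezium rule. The sample size n = 10 and the uniform prior below are assumptions made for this sketch; the slides do not state them in this extract, so the result is not expected to match the 0.08 shown in the figure.

```python
from math import comb

def likelihood(theta, x=2, n=10):
    """Binomial probability of x cases in a sample of n (n = 10 is assumed)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def prior(theta):
    """Uniform prior density on [0, 1] (an assumption for this sketch)."""
    return 1.0

def denominator(k=1000):
    """Trapezium-rule estimate of the integral of likelihood x prior over [0, 1]."""
    h = 1.0 / k
    ys = [likelihood(i * h) * prior(i * h) for i in range(k + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

print("p(X = 2) is approximately", denominator())
```

With a uniform prior the exact answer is 1 / (n + 1), which gives a handy check on the numerical estimate.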
Calculating the denominator in 2 dimensions
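$$p(X = 2) = \int_0^1 \int_0^1 p(X = 2 \mid \theta_1, \theta_2) \times p(\theta_1, \theta_2) \, d\theta_1 \, d\theta_2 \tag{10}$$
$$p(X = 2) = \int_0^1 \cdots \int_0^1 p(X = 2 \mid \theta_1, \ldots, \theta_d) \times p(\theta_1, \ldots, \theta_d) \, d\theta_1 \cdots d\theta_d \tag{11}$$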
This is equivalent to working out a (d + 1)-dimensional
volume under a d-dimensional hypersurface!
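One way to see why this becomes hard is to count the work a naive numerical integration would need. The sketch below assumes a regular grid with k points per parameter, which is only one possible scheme and is introduced here for illustration; the function name is hypothetical.

```python
def grid_evaluations(k, d):
    """Number of integrand evaluations for a regular grid: k points in each of d dimensions."""
    return k ** d

# With 100 grid points per parameter, the cost grows exponentially in d.
for d in (1, 2, 5, 10, 20):
    print(f"d = {d:2d}: {grid_evaluations(100, d):.3e} evaluations")
```

Each extra parameter multiplies the cost by k, so even moderately sized models are out of reach for this approach.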
“I have no idea what I’m doing.”
The difficult denominator
Arrrghhh!
Other difficult integrals
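$$\mathrm{E}(\theta_1 \mid X) = \int_{\Theta_1} \int_{\Theta_2} \cdots \int_{\Theta_d} \theta_1 \, p(\theta_1, \ldots, \theta_d \mid X) \, d\theta_d \cdots d\theta_1 = \int_{\Theta_1} \theta_1 \, p(\theta_1 \mid X) \, d\theta_1$$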
4 Sampling
What are conjugate priors?
Method:
Convert continuous parameter into k discrete values.
Use discrete version of Bayes’ rule.
As k → ∞ discrete posterior → true posterior.
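The method above can be sketched for a single parameter as follows. The binomial likelihood with x = 2 cases in a sample of n = 10 and the uniform prior are assumptions chosen for illustration; under them the true posterior is Beta(3, 9) with mean 0.25, which gives something to compare the discrete answer against.

```python
from math import comb

def discretised_posterior(k, x=2, n=10):
    """Discrete Bayes' rule on k equally spaced values of theta in (0, 1)."""
    # Step 1: convert the continuous parameter into k discrete values (cell midpoints).
    thetas = [(i + 0.5) / k for i in range(k)]
    # Uniform prior: each discrete value gets probability 1 / k (assumed).
    prior = [1.0 / k] * k
    # Binomial likelihood at each discrete value (n = 10 is assumed).
    likelihood = [comb(n, x) * t**x * (1 - t)**(n - x) for t in thetas]
    # Step 2: discrete Bayes' rule, where the denominator is a sum, not an integral.
    unnormalised = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalised)
    posterior = [u / total for u in unnormalised]
    return thetas, posterior

def posterior_mean(k):
    thetas, posterior = discretised_posterior(k)
    return sum(t * p for t, p in zip(thetas, posterior))

# As k grows, the discrete posterior mean approaches the exact value 0.25.
for k in (5, 20, 100, 1000):
    print(k, posterior_mean(k))
```

The denominator here is a plain sum over k terms, which is what makes the discrete version tractable in one dimension.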
Scenario: discretised Bayesian inference
The problem with discretised Bayes
4 Sampling
Black box die
[Figure: running mean against # shakes (0 to 100), vertical axis 0 to 50; current value = 26.]
Black box die: sampling to estimate a sum
$$\bar{X} \approx \mathrm{E}(X) \tag{19}$$
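The idea that a sample mean approximates an expectation can be sketched with a simulated die. The black box die in the slides has unknown faces, so the fair six-sided die used here is purely an assumption for illustration; its expectation, 3.5, is known and serves as a check.

```python
import random

def running_means(n_shakes, faces, seed=0):
    """Running mean of die values after each shake."""
    rng = random.Random(seed)
    total = 0
    means = []
    for i in range(1, n_shakes + 1):
        total += rng.choice(faces)  # one shake of the die
        means.append(total / i)     # sample mean of the first i shakes
    return means

# Hypothetical die: fair, with faces 1 to 6, so E(X) = 3.5.
faces = [1, 2, 3, 4, 5, 6]
means = running_means(100_000, faces)

for n in (10, 100, 1000, 100_000):
    print(n, means[n - 1])
```

The running mean wanders at first and then settles near the expectation, without the sum over faces and probabilities ever being computed directly.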
An infinitely-sided die as a continuous distribution
[Figure: a die with a growing number of faces (# faces, starting from 6), with the number on each face between 0 and 1.]
[Figure: running mean against # shakes (0 to 100), vertical axis 0.0 to 1.0; current value = 0.725545.]
A stranger distribution
[Figure: distribution over die value, from -10 to 10.]
A stranger distribution: sampling
[Figure: running mean against # shakes (0 to 100), vertical axis -10 to 10; current value = 4.85897.]
A stranger distribution: why does sampling work?
Compare samples...
[Figures: frequency of samples against die value, from -10 to 10, shown over four successive slides.]
What is an independent sample?