Slides 2 V 3

This document discusses Bayesian inference for normal populations. It introduces Bayes' theorem for many parameters and the generalised t-distribution. It then presents the prior and posterior analysis when the data are normally distributed with unknown mean and precision, and describes the marginal distributions that result from this model.

Uploaded by

emily
Copyright
© All Rights Reserved

Chapter 2. Inference for a normal population

2.1 Bayes Theorem for many parameters


Data: x = (x_1, x_2, ..., x_n)^T
Model: depends on many parameters θ = (θ_1, θ_2, ..., θ_p)^T
pdf/pf f(x|θ) → Likelihood function f(x|θ)
Prior beliefs: pdf/pf π(θ)

Combine using Bayes Theorem


Posterior beliefs: pdf/pf π(θ|x )

Posterior distribution summarises all our current knowledge about the parameter θ
Bayes Theorem
The posterior probability (density) function for θ is

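\[
\pi(\theta \mid x) = \frac{\pi(\theta)\, f(x \mid \theta)}{f(x)}
\]

where

\[
f(x) =
\begin{cases}
\displaystyle\int_{\Theta} \pi(\theta)\, f(x \mid \theta)\, d\theta & \text{if } \theta \text{ is continuous,} \\[2ex]
\displaystyle\sum_{\Theta} \pi(\theta)\, f(x \mid \theta) & \text{if } \theta \text{ is discrete.}
\end{cases}
\]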

As before, this can be rewritten as

π(θ|x ) ∝ π(θ) × f (x |θ)


i.e. posterior ∝ prior × likelihood
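A minimal sketch of posterior ∝ prior × likelihood for a two-parameter θ = (µ, τ) restricted to a small grid, so that f(x) is the discrete sum given above. It is written in Python rather than R. The data, the grid values and the helper names normal_pdf and grid_posterior are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mu, tau):
    """N(mu, 1/tau) density, parameterised by the precision tau."""
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)

def grid_posterior(data, mus, taus, prior):
    """Posterior over a discrete grid of theta = (mu, tau).

    prior maps (mu, tau) to a prior probability. The result is
    prior * likelihood divided by f(x), the sum over the grid.
    """
    unnormalised = {}
    for mu in mus:
        for tau in taus:
            likelihood = math.prod(normal_pdf(x, mu, tau) for x in data)
            unnormalised[(mu, tau)] = prior[(mu, tau)] * likelihood
    fx = sum(unnormalised.values())  # f(x): the normalising constant
    return {theta: weight / fx for theta, weight in unnormalised.items()}

# Hypothetical illustration: three made-up observations and a uniform prior.
data = [5.3, 5.5, 5.6]
mus = [5.0, 5.25, 5.5, 5.75]
taus = [5.0, 25.0, 50.0]
prior = {(m, t): 1 / (len(mus) * len(taus)) for m in mus for t in taus}

post = grid_posterior(data, mus, taus, prior)
best = max(post, key=post.get)  # grid point with the largest posterior mass
print(best, round(post[best], 3))
```

Because the prior here is uniform, the posterior is just the normalised likelihood; a non-uniform prior dictionary would reweight the same grid.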
Generalised t distribution: X ∼ t_a(b, c)
Density, for x ∈ R,
\[
f(x \mid a, b, c) = \frac{\Gamma\!\left(\frac{a+1}{2}\right)}{\sqrt{ac\pi}\;\Gamma\!\left(\frac{a}{2}\right)}
\left\{ 1 + \frac{(x-b)^2}{ac} \right\}^{-\frac{a+1}{2}}
\]

Parameters: a > 0, b ∈ R, c > 0

E(X) = Mode(X) = b and Var(X) = ac/(a − 2), if a > 2

Generalisation of the standard t-distribution since (X − b)/√c ∼ t_a
t_a(0, 1) ≡ t_a
lim_{a→∞} t_a(b, c) = N(b, c)
Example 2.1

If X ∼ t_a(b, c) then show that Y = (X − b)/√c ∼ t_a, with density
\[
f(y) = \frac{\Gamma\!\left(\frac{a+1}{2}\right)}{\sqrt{a\pi}\;\Gamma\!\left(\frac{a}{2}\right)}
\left( 1 + \frac{y^2}{a} \right)^{-\frac{a+1}{2}}, \qquad -\infty < y < \infty.
\]

Recall the general result: if X is a random variable with probability density function f_X(x) and g is a bijective (1–1) function then the random variable Y = g(X) has probability density function
\[
f_Y(y) = f_X\!\left\{ g^{-1}(y) \right\} \left| \frac{d}{dy}\, g^{-1}(y) \right|. \tag{2.1}
\]

Solution
...
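The analytic solution is not reproduced here. As a numerical sanity check only, the following Python sketch applies (2.1) with g(x) = (x − b)/√c and compares the result with the standard t density at a few points. The function names are illustrative Python stand-ins, not the nclbayes R functions, and the values a = 5, b = 5.41, c = 0.16 are arbitrary choices.

```python
import math

def gt_density(x, a, b, c):
    """Density of the generalised t distribution t_a(b, c)."""
    log_const = (math.lgamma((a + 1) / 2) - math.lgamma(a / 2)
                 - 0.5 * math.log(a * c * math.pi))
    return math.exp(log_const - (a + 1) / 2 * math.log1p((x - b) ** 2 / (a * c)))

def standard_t_density(y, a):
    """Standard t density with a degrees of freedom, i.e. t_a(0, 1)."""
    return gt_density(y, a, 0.0, 1.0)

def density_of_y(y, a, b, c):
    """Apply (2.1) with g(x) = (x - b)/sqrt(c), so g^{-1}(y) = b + sqrt(c) * y."""
    jacobian = math.sqrt(c)  # |d/dy g^{-1}(y)|
    return gt_density(b + math.sqrt(c) * y, a, b, c) * jacobian

a, b, c = 5, 5.41, 0.16  # arbitrary illustrative values
max_gap = max(abs(density_of_y(y, a, b, c) - standard_t_density(y, a))
              for y in [-3.0, -1.0, 0.0, 0.5, 2.0])
print(max_gap)  # should be at the level of floating-point rounding
```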
Comments
R functions pgt and dgt in the package nclbayes give values for F_X(x) and f_X(x) when X ∼ t_a(b, c)
Relationship between the generalised t distribution and the standard t distribution is similar to that of the normal distribution and the standard normal distribution:
\[
X \sim N(b, c) \implies \frac{X - b}{\sqrt{c}} \sim N(0, 1)
\]
\[
X \sim t_a(b, c) \implies \frac{X - b}{\sqrt{c}} \sim t_a
\]
Inverse Chi distribution: Y ∼ Inv-Chi(a, b)
Density
\[
f(y \mid a, b) = \frac{2\, b^{a}\, y^{-2a-1}\, e^{-b/y^{2}}}{\Gamma(a)}, \qquad y > 0
\]
Parameters: a > 0, b > 0
\[
E(Y) = \frac{\sqrt{b}\;\Gamma(a - 1/2)}{\Gamma(a)}
\]
\[
\mathrm{Var}(Y) = \frac{b}{a-1} - E(Y)^2, \qquad \text{if } a > 1
\]
The name of the distribution comes from the fact that 1/Y² ∼ Ga(a, b) ≡ χ²_{2a}/(2b)
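As a rough check of the link between Inv-Chi(a, b) and Ga(a, b), the Python sketch below simulates T ∼ Ga(a, b), sets Y = 1/√T, and compares the sample mean with √b Γ(a − 1/2)/Γ(a). The values a = 2.5 and b = 0.1, the seed and the number of draws are arbitrary illustrative choices. Note that Python's random.gammavariate takes a shape and a scale, so the rate b enters as 1/b.

```python
import math
import random

def invchi_mean(a, b):
    """E(Y) for Y ~ Inv-Chi(a, b), computed on the log scale for stability."""
    return math.sqrt(b) * math.exp(math.lgamma(a - 0.5) - math.lgamma(a))

def invchi_var(a, b):
    """Var(Y) for Y ~ Inv-Chi(a, b); only finite when a > 1."""
    return b / (a - 1) - invchi_mean(a, b) ** 2

a, b = 2.5, 0.1            # arbitrary illustrative values
rng = random.Random(2024)  # fixed seed so the run is repeatable

# gammavariate(shape, scale): a Gamma with rate b has scale 1/b.
draws = [1 / math.sqrt(rng.gammavariate(a, 1 / b)) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)

print(round(invchi_mean(a, b), 4), round(sample_mean, 4))
```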
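2.2 Prior to posterior analysis
Data: X_i | µ, τ ∼ N(µ, 1/τ), i = 1, 2, ..., n (indep)
Prior: µ|τ ∼ N(b, 1/(cτ)), τ ∼ Ga(g, h), with joint density, for µ ∈ R, τ > 0,
\[
\begin{aligned}
\pi(\mu, \tau) &= \pi(\mu \mid \tau)\, \pi(\tau) \\
&= \left( \frac{c\tau}{2\pi} \right)^{1/2} \exp\left\{ -\frac{c\tau}{2} (\mu - b)^2 \right\} \times \frac{h^{g}\, \tau^{g-1}\, e^{-h\tau}}{\Gamma(g)} \\
&\propto \tau^{g - \frac{1}{2}} \exp\left\{ -\frac{\tau}{2} \left[ c(\mu - b)^2 + 2h \right] \right\}
\end{aligned}
\tag{2.2}
\]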
 
We write (µ, τ)^T ∼ NGa(b, c, g, h)
Determine the posterior distribution for (µ, τ)^T
Hint:
\[
c(\mu - b)^2 + n(\bar{x} - \mu)^2 = (c + n) \left\{ \mu - \frac{cb + n\bar{x}}{c + n} \right\}^2 + \frac{nc(\bar{x} - b)^2}{c + n}
\]
Solution
...
(2.3)
2.2.1 Marginal distributions
 
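If (µ, τ)^T ∼ NGa(b, c, g, h) then τ ∼ Ga(g, h)
The (marginal) density for µ is, for µ ∈ R,
\[
\begin{aligned}
\pi(\mu) &= \int_0^\infty \pi(\mu, \tau)\, d\tau \\
&\propto \int_0^\infty \tau^{g - \frac{1}{2}} \exp\left\{ -\frac{\tau}{2} \left[ c(\mu - b)^2 + 2h \right] \right\} d\tau \\
&\propto \int_0^\infty \tau^{g + \frac{1}{2} - 1} \exp\left\{ -\frac{\tau}{2} \left[ c(\mu - b)^2 + 2h \right] \right\} d\tau \\
&\propto \frac{\Gamma\!\left(g + \frac{1}{2}\right)}{\left[ \{ c(\mu - b)^2 + 2h \} / 2 \right]^{g + \frac{1}{2}}}
\qquad \text{using } \int_0^\infty \theta^{a-1} e^{-b\theta}\, d\theta = \frac{\Gamma(a)}{b^{a}} \\
&\propto h^{-g - 1/2} \left\{ 1 + \frac{c(\mu - b)^2}{2h} \right\}^{-g - 1/2} \\
&\propto \left\{ 1 + \frac{c(\mu - b)^2}{2h} \right\}^{-\frac{2g + 1}{2}}
\end{aligned}
\]
i.e. µ ∼ t_{2g}(b, h/(gc))    (2.4)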
Summary of marginal distributions
 
The prior (µ, τ)^T ∼ NGa(b, c, g, h) has marginal distributions
µ ∼ t_{2g}(b, h/(gc))
τ ∼ Ga(g, h)
Also σ = 1/√τ ∼ Inv-Chi(g, h)

The posterior (µ, τ)^T | x ∼ NGa(B, C, G, H) has marginal distributions
µ|x ∼ t_{2G}(B, H/(GC))
τ|x ∼ Ga(G, H)
Also σ|x ∼ Inv-Chi(G, H)


Example 2.2
The 18th century physicist Henry Cavendish made 23 experimental
determinations of the earth’s density, and these data (in g/cm³) are
5.36 5.29 5.58 5.65 5.57 5.53 5.62 5.29
5.44 5.34 5.79 5.10 5.27 5.39 5.42 5.47
5.63 5.34 5.46 5.30 5.78 5.68 5.85
with sufficient statistics n = 23, x̄ = 5.4848, s = 0.1882
We consider X_i|µ, τ ∼ N(µ, 1/τ), with τ unknown
In Example 1.4 we assumed X_i|µ ∼ N(µ, 0.2²) with µ ∼ N(5.41, 0.4²)
and τ known with τ = 1/0.2² = 25.
Must specify the parameters in the NGa(b, c, g , h) prior distribution
for (µ, τ )T . Suppose we choose Var (τ ) = 250.
Choice of b and c: the NGa prior distribution has
µ|τ ∼ N{b, 1/(cτ)}. Matching the Example 1.4 prior N(5.41, 0.4²) at τ = 25 gives b = 5.41
and 1/(25c) = 0.4², so c = 0.25
Choice of g and h: the NGa prior distribution has τ ∼ Ga(g, h), with E(τ) = g/h and Var(τ) = g/h².
Choosing E(τ) = 25 together with Var(τ) = 250 gives h = 25/250 = 0.1 and g = 25h = 2.5
Summary: take prior
(µ, τ)^T ∼ NGa(b = 5.41, c = 0.25, g = 2.5, h = 0.1)
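The moment matching used to pick the prior parameters can be sketched in a few lines of Python. The function name nga_prior_from_beliefs and its argument names are hypothetical; the only facts used are Var(µ|τ) = 1/(cτ), E(τ) = g/h and Var(τ) = g/h² for a Ga(g, h) distribution with rate h.

```python
def nga_prior_from_beliefs(mu_mean, mu_var_at_tau, tau_ref, tau_mean, tau_var):
    """Turn stated prior beliefs into NGa(b, c, g, h) parameters.

    mu_var_at_tau is the prior variance of mu when tau equals tau_ref.
    """
    b = mu_mean
    c = 1.0 / (mu_var_at_tau * tau_ref)  # from Var(mu | tau) = 1/(c * tau)
    h = tau_mean / tau_var               # from E(tau) = g/h and Var(tau) = g/h^2
    g = tau_mean * h
    return b, c, g, h

# Beliefs taken from the example: N(5.41, 0.4^2) at tau = 25, E(tau) = 25, Var(tau) = 250.
b, c, g, h = nga_prior_from_beliefs(5.41, 0.4 ** 2, 25.0, 25.0, 250.0)
marginal_scale = h / (g * c)  # second parameter of the t_{2g} marginal prior for mu
print(b, round(c, 4), round(g, 4), round(h, 4), round(marginal_scale, 4))
```

The resulting marginal prior for µ is t_5(5.41, 0.16), whose scale 0.16 equals the variance 0.4² used in Example 1.4.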

Is the marginal prior distribution for µ close to what we used in Example 1.4? Yes, see plot below
Figure: Marginal prior density for µ: new version (solid) and previous version (dashed)
Determine the posterior distribution for (µ, τ)^T. Also determine the
marginal prior distribution for τ and for σ, and the marginal posterior
distribution for each of µ, τ and σ.

Solution
...
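The worked solution is not reproduced here. As a hedged sketch, the Python below assumes the usual conjugate Normal-Gamma update (B = (bc + nx̄)/(c + n), C = c + n, G = g + n/2, H = h + cn(x̄ − b)²/{2(c + n)} + ns²/2) and assumes that s² is computed with divisor n. Both assumptions are mine rather than statements from the slides; they are the choices that reproduce the posterior µ|x ∼ t_28(5.484, 0.001561) quoted later in this chapter. The function name nga_posterior is illustrative.

```python
def nga_posterior(b, c, g, h, n, xbar, s):
    """Assumed standard Normal-Gamma update; s**2 is taken to use divisor n."""
    B = (b * c + n * xbar) / (c + n)
    C = c + n
    G = g + n / 2
    H = h + c * n * (xbar - b) ** 2 / (2 * (c + n)) + n * s ** 2 / 2
    return B, C, G, H

# Prior and sufficient statistics as given in Example 2.2.
B, C, G, H = nga_posterior(b=5.41, c=0.25, g=2.5, h=0.1, n=23, xbar=5.4848, s=0.1882)
posterior_scale = H / (G * C)  # second parameter of the t_{2G} marginal for mu | x

print(round(B, 4), C, G, round(H, 4), round(posterior_scale, 6))
```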
Comparison of prior and posterior marginal distributions
Figure: prior and posterior marginal densities, plotted against µ and against τ
Review of contour plots for bivariate distribution
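Bivariate normal:
\[
\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_x^2 & \rho\sigma_x\sigma_y \\ \rho\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} \right\},
\]
with density
\[
\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
\exp\left\{ -\frac{1}{2(1-\rho^2)} \left( \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} - \frac{2\rho x y}{\sigma_x\sigma_y} \right) \right\}.
\]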

Figure: Contour plots for different bivariate normal densities


Comparison of prior and posterior distributions
Can plot their contours using R command NGacontour:
mu=seq(4.5,6.5,len=1000); tau=seq(0,71,len=1000)
NGacontour(mu,tau,b,c,g,h,lty=3); NGacontour(mu,tau,B,C,G,H,add=TRUE)

Figure: Contour plot of the prior (dashed) and posterior (solid) densities for (µ, τ)^T.
Comments
Wikipedia tells us that the actual mean density of the earth is
5.515 g/cm³
What is the (posterior) probability that the mean density is within 0.1
of this value?

µ|x ∼ t_28(5.484, 0.001561) ⇒ Pr(5.415 < µ < 5.615 | x) = 0.9529

using
pgt(5.615,28,5.484,0.001561)-pgt(5.415,28,5.484,0.001561)
Without the data, the only basis for determining the earth’s density is
via the prior distribution

µ ∼ t_5(5.41, 0.16) ⇒ Pr(5.415 < µ < 5.615) = 0.1802

using pgt(5.615,5,5.41,0.16)-pgt(5.415,5,5.41,0.16)
These probability calculations demonstrate that the data have been
very informative and changed our beliefs about the earth’s density
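The two probabilities above can be cross-checked without the nclbayes package. The Python sketch below integrates the generalised t density over (5.415, 5.615) with composite Simpson's rule; this is a numerical stand-in chosen for illustration, not how pgt works internally, and the function names are hypothetical.

```python
import math

def gt_density(x, a, b, c):
    """Density of the generalised t distribution t_a(b, c)."""
    log_const = (math.lgamma((a + 1) / 2) - math.lgamma(a / 2)
                 - 0.5 * math.log(a * c * math.pi))
    return math.exp(log_const - (a + 1) / 2 * math.log1p((x - b) ** 2 / (a * c)))

def prob_between(lo, hi, a, b, c, panels=2000):
    """Pr(lo < X < hi) for X ~ t_a(b, c) by composite Simpson's rule.

    panels must be even.
    """
    width = (hi - lo) / panels
    total = gt_density(lo, a, b, c) + gt_density(hi, a, b, c)
    for i in range(1, panels):
        weight = 4 if i % 2 else 2  # Simpson weights alternate 4, 2, 4, ...
        total += weight * gt_density(lo + i * width, a, b, c)
    return total * width / 3

# Parameter values as quoted in the slides.
posterior_prob = prob_between(5.415, 5.615, 28, 5.484, 0.001561)
prior_prob = prob_between(5.415, 5.615, 5, 5.41, 0.16)
print(round(posterior_prob, 4), round(prior_prob, 4))
```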
2.3 Confidence intervals and regions

Example 2.3
Determine the 100(1 − α)% highest density interval (HDI) for the
population mean µ in terms of quantiles of the standard t-distribution

Solution
...
Calculating 95% posterior confidence intervals using R
µ|x ∼ t_{2G}{B, H/(GC)}
Symmetric distribution → HDI and equi-tailed intervals are the same
c(qgt(0.025,2*G,B,H/(G*C)),qgt(0.975,2*G,B,H/(G*C)))

τ |x ∼ Ga(G , H)
Skewed distribution → HDI and equi-tailed intervals are different
hdiGamma(0.95,G,H)
c(qgamma(0.025,G,H),qgamma(0.975,G,H))

σ|x ∼ Inv-Chi(G , H)
Skewed distribution → HDI and equi-tailed intervals are different
hdiInvchi(0.95,G,H)
c(qinvchi(0.025,G,H),qinvchi(0.975,G,H))
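For readers without the nclbayes package, the equi-tailed interval for µ can be approximated in plain Python. The sketch below builds a CDF from a Simpson integral (using the symmetry of the density about b) and inverts it by bisection. The names gt_cdf, gt_quantile and equi_tailed are hypothetical stand-ins, not the R functions, and the posterior parameters are the rounded values t_28(5.484, 0.001561) quoted in the slides.

```python
import math

def gt_density(x, a, b, c):
    """Density of the generalised t distribution t_a(b, c)."""
    log_const = (math.lgamma((a + 1) / 2) - math.lgamma(a / 2)
                 - 0.5 * math.log(a * c * math.pi))
    return math.exp(log_const - (a + 1) / 2 * math.log1p((x - b) ** 2 / (a * c)))

def gt_cdf(x, a, b, c, panels=400):
    """CDF of t_a(b, c): 0.5 plus a signed Simpson integral from b to x."""
    width = (x - b) / panels  # negative when x < b, so the integral is signed
    total = gt_density(b, a, b, c) + gt_density(x, a, b, c)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * gt_density(b + i * width, a, b, c)
    return 0.5 + total * width / 3

def gt_quantile(p, a, b, c, tol=1e-9):
    """Quantile of t_a(b, c) by bisection on gt_cdf."""
    step = math.sqrt(c)
    lo, hi = b - step, b + step
    while gt_cdf(lo, a, b, c) > p:  # widen until the quantile is bracketed
        lo -= step
    while gt_cdf(hi, a, b, c) < p:
        hi += step
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gt_cdf(mid, a, b, c) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def equi_tailed(level, a, b, c):
    """Equi-tailed interval; for this symmetric density it is also the HDI."""
    tail = (1 - level) / 2
    return gt_quantile(tail, a, b, c), gt_quantile(1 - tail, a, b, c)

prior_interval = equi_tailed(0.95, 5, 5.41, 0.16)
posterior_interval = equi_tailed(0.95, 28, 5.484, 0.001561)
print([round(v, 4) for v in prior_interval], [round(v, 4) for v in posterior_interval])
```

The intervals for τ and σ need Gamma and Inv-Chi quantiles, which are not covered by this sketch.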
Results from this data analysis . . .
      Prior                 Posterior
µ:    (4.3818, 6.4382)      (5.4031, 5.5649)      HDI (same as equi-tailed)
τ:    (1.4812, 55.9573)     (14.0193, 42.2530)    HDI
      (4.1561, 64.1625)     (15.0674, 43.7625)    equi-tailed
σ:    (0.1062, 0.4246)      (0.1466, 0.2505)      HDI
      (0.1248, 0.4905)      (0.1512, 0.2576)      equi-tailed

Posterior HDI and equi-tailed intervals for τ are fairly similar but prior
intervals are not
Ditto for σ
Why?
The prior distributions are quite skewed but the posterior distributions
are fairly symmetric
Confidence regions
We have looked at (marginal) HDIs
Can be useful to also look at (joint) confidence regions

Example 2.4
Determine a joint α confidence region for (µ, τ)^T by calculating k
satisfying P(π(µ, τ) > k) = α or P(π(µ, τ) > k | x) = α.

Solution
...

Can plot these regions using NGacontour by using the appropriate value for the contour level
mu=seq(3.5,7.5,len=1000)
tau=seq(0,80,len=1000)
NGacontour(mu,tau,b,c,g,h,p=c(0.95,0.9,0.8),lty=3)
NGacontour(mu,tau,B,C,G,H,p=c(0.95,0.9,0.8),add=TRUE)
Confidence regions for (µ, τ)^T
Figure: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence
regions for (µ, τ)^T; 95% (outer), 80% (inner).
Focusing on central part of plot . . .
Figure: 95%, 90% and 80% prior (dashed) and posterior (solid) confidence
regions for (µ, τ)^T; 95% (outer), 80% (inner).
Mid-term feedback: A snapshot of your comments

Good notes; well-structured notes; fantabulous notes; super hand-writing; good use of visualiser
Left-handed; learn to be ambidextrous; your left-hand gets in the way
man; hate lefties; your left hand is problematic
Good pace; way too slow; never seem to get through much; took
weeks to start the course; loads of stage 2 content - will this be on
exam?
No more postgrad help; don’t go to China; you abandoned us when
the course got tough; don’t be ill
Be on time; punctuality problems; please start on time (not bothered
really)
Trust issues
Lee >>> Chris; we want more Chris comparisons; Bring Chris back;
Chris’s part of the module is better than yours
What could be improved? The shirts
2.4 Predictive distributions
In this model we can determine the predictive distribution via
\[
f(y \mid x) = \iint f(y \mid \mu, \tau)\, \pi(\mu, \tau \mid x)\, d\mu\, d\tau
\]

or by using Candidate’s formula (as this is a conjugate analysis)


But, for this model, there is a more straightforward way

...
These predictive intervals can be calculated easily using the R
function qgt
In Example 2.2, the prior and posterior predictive HDIs for a new
value Y from the population are (4.2604, 6.5596) and
(5.0855, 5.8825) respectively, calculated using
c(qgt(0.025,2*g,b,h*(c+1)/(g*c)),qgt(0.975,2*g,b,h*(c+1)/(g*c)))
c(qgt(0.025,2*G,B,H*(C+1)/(G*C)),qgt(0.975,2*G,B,H*(C+1)/(G*C)))
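The posterior predictive interval can also be cross-checked by simulation, which is a different route from the analytic one the slides leave elided. The Python sketch below draws τ ∼ Ga(G, H), then µ|τ ∼ N(B, 1/(Cτ)), then Y|µ, τ ∼ N(µ, 1/τ), and reads off empirical quantiles. The posterior values B = 5.484, C = 23.25 and G = 14 follow from the numbers quoted in the slides, and H is backed out of the quoted scale H/(GC) = 0.001561; the function names, seed and number of draws are illustrative assumptions.

```python
import math
import random

def simulate_predictive(b, c, g, h, draws, rng):
    """Sorted draws of a new observation Y under an NGa(b, c, g, h) distribution."""
    ys = []
    for _ in range(draws):
        tau = rng.gammavariate(g, 1 / h)            # shape g, scale 1/h (rate h)
        mu = rng.gauss(b, 1 / math.sqrt(c * tau))   # mu | tau ~ N(b, 1/(c tau))
        ys.append(rng.gauss(mu, 1 / math.sqrt(tau)))  # Y | mu, tau ~ N(mu, 1/tau)
    ys.sort()
    return ys

def central_interval(sorted_draws, level=0.95):
    """Empirical equi-tailed interval from sorted draws."""
    tail = (1 - level) / 2
    n = len(sorted_draws)
    return sorted_draws[int(tail * n)], sorted_draws[int((1 - tail) * n) - 1]

B, C, G = 5.484, 23.25, 14.0
H = 0.001561 * G * C  # backed out of the quoted scale H/(GC)

rng = random.Random(1)  # fixed seed so the run is repeatable
lower, upper = central_interval(simulate_predictive(B, C, G, H, 100_000, rng))
print(round(lower, 3), round(upper, 3))
```

With this many draws the simulated endpoints should agree with (5.0855, 5.8825) to roughly two decimal places; Monte Carlo error prevents exact agreement.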
2.5 Summary
Suppose we have a normal random sample with Xi |µ, τ ∼ N(µ, 1/τ ),
i = 1, 2, . . . , n (independent)

(i) (µ, τ)^T ∼ NGa(b, c, g, h) is a conjugate prior distribution
(ii) The posterior distribution is (µ, τ)^T | x ∼ NGa(B, C, G, H) where the
posterior parameters are given by (2.3)
(iii) The marginal prior distributions are µ ∼ t_{2g}{b, h/(gc)},
τ ∼ Ga(g, h), σ = 1/√τ ∼ Inv-Chi(g, h)
(iv) The marginal posterior distributions are µ|x ∼ t_{2G}{B, H/(GC)},
τ|x ∼ Ga(G, H), σ|x ∼ Inv-Chi(G, H)
(v) Prior and posterior means and standard deviations for µ, τ and σ can
be calculated from the properties of the t, Gamma and Inv-Chi
distributions
(vi) Prior and posterior probabilities and densities for µ, τ and σ can be
calculated using the R functions pgt, dgt, pgamma, dgamma,
pinvchi, dinvchi
(vii) HDIs or equi-tailed CIs for µ, τ and σ can be calculated using qgt,
hdiGamma, hdiInvchi, qgamma, qinvchi
(viii) Contour plots of the prior and posterior densities for (µ, τ)^T can be
plotted using the NGacontour function
(ix) Prior and posterior confidence regions for (µ, τ)^T can be plotted
using the NGacontour function
(x) The predictive distribution for a new observation Y from the
population is Y|x ∼ t_{2G}{B, H(C + 1)/(GC)} and its HDI can be
calculated using the qgt function
2.6 Why do we have so many different distributions?

So far we have used many distributions, some you will have met
before and some will be new
After a while the variety and sheer number of different distributions
can become overwhelming
Why do we need so many distributions and why do we name so many
of them?

Justification
Statistics studies the random variation in experiments, samples and
processes
The variety of applications leads to their randomness being described
by many different distributions
In many applications, bespoke distributions need to be formulated
However, some distributions come up time and time again for
modelling random variation in data and for describing prior beliefs
Justification (continued)
It is helpful for us to be able to refer to these distributions – and so
we give each one a name – and be able to quote known results for
these distributions such as their mean and variance
For example, in this chapter you have been introduced to
a generalisation of the t-distribution
the inverse chi distribution
The Normal-Gamma distribution
We have been able to use results for their mean and variance to study
prior and posterior distributions and have been able to plot these
distributions using functions in the R package
You will meet several other new distributions in the remainder of the
module
Justification (continued)
Obviously it’s useful to have a working knowledge of each of these
distributions but not vital to remember all their properties
The exam paper will contain a list of all the distributions used in the
exam, together with their density (or probability function) and any
useful properties such as their mean and variance (as needed for the
exam)
2.7 Learning objectives
By the end of this chapter, you should be able to:
1. Determine the posterior distribution for (µ, τ)^T
2. Determine and use the univariate prior and posterior distributions
3. Determine confidence intervals, HDIs and confidence regions
4. Determine the predictive distribution of another value from the
population, and its predictive interval
5. Determine the predictive distribution of the mean of another random
sample from the population
both in general and for a particular prior and data set.
Also you should be able to:
6. Appreciate the benefit of naming distributions and of having lists of
properties for these distributions
