
Introduction to

Bayesian Time Series Econometrics

Joshua Chan

Australian National University


6 July 2013
Instructor

Name: Joshua Chan (Josh or Dr. Chan)

Website:

http://people.anu.edu.au/joshua.chan/

Email:
[email protected]
Plan for Today

Four 50-minute Sessions


◦ vector autoregression or VAR
◦ unobserved components model (Kalman filter, precision
sampler)
◦ time-varying parameter VAR
◦ Bayesian model comparison (harmonic mean estimator, Chib’s
method, etc.)
Vector Autoregression

Consider the VARn(p) model:

$$
y_t = b + B_1 y_{t-1} + \cdots + B_p y_{t-p} + \epsilon_t, \qquad \epsilon_t \sim N(0, \Sigma),
$$

where b is an n × 1 vector of intercepts and each B_i is an n × n coefficient matrix

(So basically a regression, but with multiple equations)


An Example

For example, if n = 2 and p = 1:

$$
\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
+
\begin{pmatrix} B_{11,1} & B_{12,1} \\ B_{21,1} & B_{22,1} \end{pmatrix}
\begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}
+
\begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix},
$$

where

$$
\begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix}
\sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{11}^2 & \sigma_{12} \\ \sigma_{21} & \sigma_{22}^2 \end{pmatrix} \right).
$$
Estimation

If we can write the system as

y = Xβ + ǫ,

we can use the Gibbs sampler for the linear regression model

So let’s try to do that

(Work out the previous example first)


An Example (Continued)

Write

$$
\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
+
\begin{pmatrix} B_{11,1} & B_{12,1} \\ B_{21,1} & B_{22,1} \end{pmatrix}
\begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}
+
\begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix}
$$

as

$$
\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix}
=
\begin{pmatrix}
1 & y_{1,t-1} & y_{2,t-1} & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & y_{1,t-1} & y_{2,t-1}
\end{pmatrix}
\begin{pmatrix} b_1 \\ B_{11,1} \\ B_{12,1} \\ b_2 \\ B_{21,1} \\ B_{22,1} \end{pmatrix}
+
\begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \end{pmatrix},
$$

or

$$
y_t = \left( I_2 \otimes [1, y_{t-1}'] \right)\beta + \epsilon_t.
$$
In the Form of a Regression

In general, rewrite the VARn (p) model as:

$$
y_t = X_t\beta + \epsilon_t,
$$

where $X_t = I_n \otimes [1, y_{t-1}', \ldots, y_{t-p}']$ and $\beta = \mathrm{vec}([b, B_1, \ldots, B_p]')$.

Then stack the observations over t to get

$$
y = X\beta + \epsilon,
$$

where

$$
y = \begin{pmatrix} y_1 \\ \vdots \\ y_T \end{pmatrix}, \qquad
X = \begin{pmatrix} X_1 \\ \vdots \\ X_T \end{pmatrix}, \qquad
\epsilon \sim N(0, I_T \otimes \Sigma).
$$
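
The MATLAB code below relies on a helper SURform2 to build this stacked X. The helper itself is not listed in these slides; here is a minimal sketch of what such a function might do, under the assumption that it stacks X_t = I_n ⊗ x_t' for each row x_t' of the raw regressor matrix:

% Sketch of a SUR-form builder (SURform2 itself is not shown in the slides).
% Xraw is T x m, with row t equal to [1, y_{t-1}', ..., y_{t-p}'];
% the output stacks X_t = I_n kron Xraw(t,:) into a sparse (T*n) x (m*n) matrix.
function bigX = SURform_sketch(Xraw, n)
    [T, m] = size(Xraw);
    blocks = cell(T, 1);
    for t = 1:T
        blocks{t} = kron(speye(n), Xraw(t,:));   % X_t = I_n kron x_t'
    end
    bigX = vertcat(blocks{:});                   % X = [X_1; ...; X_T]
end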
VAR: Likelihood

Since

$$(y \mid \beta, \Sigma) \sim N(X\beta,\; I_T \otimes \Sigma),$$

the likelihood function is given by

$$
f(y \mid \beta, \Sigma) = |2\pi (I_T \otimes \Sigma)|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(y - X\beta)'(I_T \otimes \Sigma)^{-1}(y - X\beta)}
= (2\pi)^{-\frac{Tn}{2}} |\Sigma|^{-\frac{T}{2}}\, e^{-\frac{1}{2}(y - X\beta)'(I_T \otimes \Sigma^{-1})(y - X\beta)}.
$$

Here we have used

$$|I_T \otimes \Sigma| = |\Sigma|^T, \qquad (I_T \otimes \Sigma)^{-1} = I_T \otimes \Sigma^{-1}.$$

Also note that since

$$(y_t \mid \beta, \Sigma) \sim N(X_t \beta, \Sigma),$$

another way to write the likelihood is

$$
f(y \mid \beta, \Sigma) = (2\pi)^{-\frac{Tn}{2}} |\Sigma|^{-\frac{T}{2}}\, e^{-\frac{1}{2}\sum_{t=1}^T (y_t - X_t\beta)'\Sigma^{-1}(y_t - X_t\beta)}.
$$
Inverse-Wishart Distribution

An n × n random matrix Z is said to have an inverse-Wishart distribution with shape parameter α > 0 and scale matrix W if its pdf is given by

$$
f(Z; \alpha, W) = \frac{|W|^{\alpha/2}}{2^{n\alpha/2}\,\Gamma_n(\alpha/2)}\, |Z|^{-\frac{\alpha+n+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(W Z^{-1})},
$$

where Γ_n is the multivariate gamma function and tr(·) is the trace function

We write Z ∼ InvWishart(α, W)

Moment: E Z = W/(α − n − 1) for α > n + 1
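
A quick numerical sanity check of the mean formula, assuming MATLAB's iwishrnd (Statistics Toolbox) uses the same (scale, shape) parameterization as above, which is how it is used later in VAR3.m; the parameter values here are illustrative:

% Check E[Z] = W/(alpha - n - 1) by Monte Carlo for illustrative values.
n = 3; alpha = 10; W = eye(n);
R = 5000;
Zbar = zeros(n);
for r = 1:R
    Zbar = Zbar + iwishrnd(W, alpha)/R;   % average of InvWishart(alpha, W) draws
end
disp(Zbar)                                % should be close to W/6 = 0.1667*I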
Priors and Gibbs Sampler

Independent priors for β and Σ:

β ∼ N(β 0 , Vβ ), Σ ∼ InvWishart(ν0 , S0 )

Use Gibbs sampler to estimate the model

We need to derive (1) (Σ | y, β) and (2) (β | y, Σ)


Sample (Σ | y, β)

Recall that for conformable matrices A, B, C

tr(ABC) = tr(BCA) = tr(CAB)

(1) f(Σ | y, β):

$$
f(\Sigma \mid y, \beta) \propto |\Sigma|^{-\frac{\nu_0+n+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(S_0 \Sigma^{-1})}\; |\Sigma|^{-\frac{T}{2}}\, e^{-\frac{1}{2}\sum_{t=1}^T (y_t - X_t\beta)'\Sigma^{-1}(y_t - X_t\beta)}
$$
$$
\propto |\Sigma|^{-\frac{\nu_0+n+T+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(S_0\Sigma^{-1})}\, e^{-\frac{1}{2}\mathrm{tr}\left[\sum_{t=1}^T (y_t - X_t\beta)(y_t - X_t\beta)'\,\Sigma^{-1}\right]}
$$
$$
\propto |\Sigma|^{-\frac{\nu_0+n+T+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}\left[\left(S_0 + \sum_{t=1}^T (y_t - X_t\beta)(y_t - X_t\beta)'\right)\Sigma^{-1}\right]}
$$

Now, compare the generic InvWishart(ν, S) kernel

$$
|\Sigma|^{-\frac{\nu+n+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(S\Sigma^{-1})}
$$

with

$$
|\Sigma|^{-\frac{\nu_0+T+n+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}\left[\left(S_0 + \sum_{t=1}^T (y_t - X_t\beta)(y_t - X_t\beta)'\right)\Sigma^{-1}\right]}
$$

Hence,

$$
(\Sigma \mid y, \beta) \sim \mathrm{InvWishart}\left(\nu_0 + T,\; S_0 + \sum_{t=1}^T (y_t - X_t\beta)(y_t - X_t\beta)'\right)
$$
Sample (β | y, Σ)

(2) f(β | y, Σ):

$$
f(\beta \mid y, \Sigma) \propto e^{-\frac{1}{2}(\beta - \beta_0)'V_\beta^{-1}(\beta - \beta_0)}\, e^{-\frac{1}{2}(y - X\beta)'(I_T \otimes \Sigma^{-1})(y - X\beta)}
$$
$$
\propto e^{-\frac{1}{2}\left[\beta'\left(V_\beta^{-1} + X'(I_T\otimes\Sigma^{-1})X\right)\beta - 2\beta'\left(V_\beta^{-1}\beta_0 + X'(I_T\otimes\Sigma^{-1})y\right)\right]}
$$

Hence,

$$(\beta \mid y, \Sigma) \sim N(\hat\beta, D_\beta),$$

where

$$
D_\beta = \left(V_\beta^{-1} + X'(I_T\otimes\Sigma^{-1})X\right)^{-1}, \qquad
\hat\beta = D_\beta\left(V_\beta^{-1}\beta_0 + X'(I_T\otimes\Sigma^{-1})y\right).
$$
Gibbs Sampler for the VAR

Pick some initial values β (0) = a0 and Σ(0) = B0 > 0. Then,


repeat the following steps from r = 1 to R:
1. Draw Σ(r ) ∼ f (Σ | y, β (r −1) ) (inverse-Wishart).
2. Draw β (r ) ∼ f (β | y, Σ(r ) ) (multivariate normal).
Empirical Example

U.S. data: GDP growth, CPI inflation rate, unemployment rate, and the Fed funds rate

From 1947:Q1 to 2011:Q4

Estimate a VAR4(3), i.e., n = 4 variables and p = 3 lags


MATLAB Code

% VAR3.m
nloop = 11000; burnin = 1000;
load 'USdata.csv';
Z0 = USdata(1:3,:); Z = USdata(4:end,:);   % first 3 rows provide the initial lags
[T n] = size(Z);
longZ = reshape(Z',T*n,1);                 % stack the data: y = (y_1', ..., y_T')'
k = n+3*n^2;                               % number of VAR coefficients (p = 3 lags)

%% prior
temp1 = ones(k,1); temp1(1:(3*n+1):k) = 1/100;   % looser prior (variance 100) on the intercepts
invVbeta = sparse(1:k,1:k,temp1');               % prior precision of beta
nu0 = 10; S0 = 7*eye(n);
MATLAB Code

%% compute and define a few things


X = [ones(T,1) [Z0(3,:); Z(1:end-1,:)] ...
[Z0(2:end,:); Z(1:end-2,:)] [Z0; Z(1:end-3,:)] ];
bigX = SURform2(X,n);
newnu = T + nu0;

%% initialize for storage


store_Sig = zeros(nloop-burnin,n,n);
store_beta = zeros(nloop-burnin,k);

%% initialize the chain


beta = (bigX’*bigX)\(bigX’*longZ);
err = reshape(longZ - bigX*beta,n,T);
Sig = err*err’/T;
invSig = Sig\speye(n);
MATLAB Code

for loop = 1:nloop

%% sample beta
XinvSig = bigX'*kron(speye(T),invSig);
XinvSigX = XinvSig*bigX;
invDbeta = invVbeta + XinvSigX;                       % posterior precision of beta
betahat = invDbeta\(XinvSig*longZ);                   % posterior mean of beta
beta = betahat + chol(invDbeta,'lower')'\randn(k,1);  % precision sampler draw

%% sample Sig
err = reshape(longZ - bigX*beta,n,T);
newS = S0 + err*err';
Sig = iwishrnd(newS,newnu);
invSig = Sig\speye(n);

%% store the draws after burn-in
if loop>burnin
    i = loop-burnin;
    store_beta(i,:) = beta';
    store_Sig(i,:,:) = Sig;
end
end
State Space Models: Overview

Now we study a general class of models called state space models (focus on the linear Gaussian case)

High-dimensional, flexible models

Traditionally estimated by Kalman-filter-based methods

Here we introduce some new algorithms (Chan and Jeliazkov, 2009; McCausland et al., 2011) that do this without using the KF (easier derivation and faster algorithms)
State Space Models: Definition

A state space model consists of two modeling levels

In the first level, observations are related to the latent or unobserved variables, called states, according to the observation or measurement equation

yt = Xt θ t + ǫt , ǫt ∼ N(0, Σ)

In the second level, the evolution of the states is modeled via the state or transition equation

θ t = θ t−1 + ζ t , ζ t ∼ N(0, Ω)
Kalman Filter: Overview

Kalman Filter: a system of recursive equations that is used for two purposes:
◦ sample from the conditional density f (θ | y, Σ, Ω) (typically a
very high-dimensional normal density)
◦ evaluate the integrated or observed-data likelihood f (y | Σ, Ω)
(as opposed to the complete-data likelihood f (y | θ, Σ, Ω))

Turns out we can accomplish both tasks with some new sparse
matrix algorithms without using KF
Unobserved Components Model

We start with the simplest state space model: the unobserved components model

The measurement equation is given by

yt = τ t + ǫ t , ǫt ∼ N(0, σ 2 )

The states, in turn, are initialized with τ1 ∼ N(τ0 , ω02 ) for some
known constants τ0 and ω02 , and evolve according to the transition
equation
τt = τt−1 + ut , ut ∼ N(0, ω 2 )
for t = 2, . . . , T
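
To fix ideas, a minimal sketch that simulates data from this model; the parameter values are illustrative and not taken from the slides:

% Simulate T observations from the unobserved components model.
T = 300; sig2 = 1; omega2 = 0.1; tau0 = 0; omega02 = 1;   % illustrative values
tau = zeros(T,1);
tau(1) = tau0 + sqrt(omega02)*randn;          % tau_1 ~ N(tau_0, omega_0^2)
for t = 2:T
    tau(t) = tau(t-1) + sqrt(omega2)*randn;   % transition: tau_t = tau_{t-1} + u_t
end
y = tau + sqrt(sig2)*randn(T,1);              % measurement: y_t = tau_t + eps_t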
Priors and Estimation

Independent priors for σ 2 and ω 2

σ 2 ∼ InvGamma(νσ2 , Sσ2 ), ω 2 ∼ InvGamma(νω2 , Sω2 ),

A Gibbs sampler is constructed by sampling through


1. (τ | y, σ 2 , ω 2 ),
2. (σ 2 | y, τ , ω 2 ),
3. (ω 2 | y, τ , σ 2 ).

One main difficulty is to sample (τ | y, σ², ω²), which is a T-dimensional density
Sample (τ | y, σ 2, ω 2): Overview

The joint posterior is given by

f (τ , σ 2 , ω 2 | y) ∝ f (y | τ , σ 2 )f (τ | ω 2 )f (σ 2 )f (ω 2 ),

where f (σ 2 ) and f (ω 2 ) are the inverse-Gamma priors

First show that f (τ | y, σ 2 , ω 2 ) is a normal density

Then discuss how one can sample from it efficiently


An Expression for f (y | τ , σ 2)

Rewrite the measurement equation as

y = τ + ǫ, ǫ ∼ N(0, σ 2 IT )

Hence, we have

$$
f(y \mid \tau, \sigma^2) = (2\pi\sigma^2)^{-\frac{T}{2}}\, e^{-\frac{1}{2\sigma^2}(y-\tau)'(y-\tau)}.
$$
An Expression for f (τ | ω 2 )

For simplicity, assume τ0 = 0

Rewrite the transition equation as:

$$
\underbrace{\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
-1 & 1 & 0 & \cdots & 0 \\
0 & -1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 1
\end{pmatrix}}_{H}
\begin{pmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \\ \vdots \\ \tau_T \end{pmatrix}
=
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_T \end{pmatrix}
$$

That is,

$$
H\tau = u, \qquad u \sim N(0, \Omega),
$$

where Ω = diag(ω₀², ω², ..., ω²)
Note that |H| = 1 and hence H is invertible

Since τ = H−1 u, Eτ = 0 and

Var(τ ) = H−1 Ω(H−1 )′

Finally,
τ ∼ N(0, (H′ Ω−1 H)−1 )

(Recall (AB)−1 = B−1 A−1 )


It follows that

$$
f(\tau \mid \omega^2) = |2\pi (H'\Omega^{-1}H)^{-1}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}\tau'(H'\Omega^{-1}H)\tau}
= (2\pi)^{-\frac{T}{2}} |\Omega|^{-\frac{1}{2}}\, e^{-\frac{1}{2}\tau'(H'\Omega^{-1}H)\tau}
= (2\pi)^{-\frac{T}{2}} (\omega_0^2)^{-\frac{1}{2}} (\omega^2)^{-\frac{T-1}{2}}\, e^{-\frac{1}{2}\tau'(H'\Omega^{-1}H)\tau}.
$$
An Expression for f (τ | y, σ 2, ω 2)

Combining f(τ | ω²) and f(y | τ, σ²):

$$
f(\tau \mid y, \sigma^2, \omega^2) \propto f(y \mid \tau, \sigma^2)\, f(\tau \mid \omega^2)
\propto e^{-\frac{1}{2}\left[\frac{1}{\sigma^2}(y-\tau)'(y-\tau) + \tau'(H'\Omega^{-1}H)\tau\right]}
\propto e^{-\frac{1}{2}\left[\tau'\left(H'\Omega^{-1}H + \sigma^{-2}I_T\right)\tau - \frac{2}{\sigma^2}\tau' y\right]}
$$

Hence,

$$(\tau \mid y, \sigma^2, \omega^2) \sim N(\hat\tau, K^{-1}),$$

where K = H'Ω⁻¹H + σ⁻²I_T and τ̂ = σ⁻²K⁻¹y
Sample (τ | y, σ 2, ω 2)

Since f(τ | y, σ², ω²) is multivariate normal, sampling from it might seem easy

Main difficulty: the covariance matrix K⁻¹ is a full T × T matrix (computing its Cholesky factor, e.g., is time-consuming)

However, the precision matrix K = H'Ω⁻¹H + σ⁻²I_T is sparse

Computations involving sparse matrices are much quicker

[Figure: sparsity patterns for T = 257. Left panel: the precision matrix K is tridiagonal with nz = 769 nonzero entries. Right panel: the covariance matrix K⁻¹ is a full matrix with nz = 66049 nonzero entries.]
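
A minimal sketch that reproduces this comparison (the parameter values are illustrative; they do not affect the sparsity pattern):

% Compare the sparsity of K and K^{-1} for the unobserved components model.
T = 257; omega02 = 5; omega2 = .25^2; sig2 = 1;          % illustrative values
H = speye(T) - sparse(2:T,1:T-1,ones(1,T-1),T,T);        % first-difference matrix
invOmega = sparse(1:T,1:T,[1/omega02 1/omega2*ones(1,T-1)]);
K = H'*invOmega*H + speye(T)/sig2;                       % sparse precision matrix
subplot(1,2,1); spy(K);                                  % tridiagonal: 3*T - 2 = 769 nonzeros
subplot(1,2,2); spy(inv(K));                             % full: T^2 = 66049 nonzeros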
Precision Sampler

To generate R independent draws from N(τ̂, K⁻¹) of dimension T, carry out the following steps:
1. Compute the lower Cholesky factorization K = BB'.
2. Generate Z = (Z₁, ..., Z_T)' by drawing Z₁, ..., Z_T ∼ N(0, 1).
3. Output U = τ̂ + (B')⁻¹Z.
4. Repeat Steps 2 and 3 independently R times.
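
A minimal MATLAB sketch of these steps (the function name is illustrative):

% Precision sampler: draw R samples from N(tauhat, K^{-1}) using only
% the sparse precision matrix K.
function draws = precision_sampler(tauhat, K, R)
    T = length(tauhat);
    B = chol(K,'lower');              % Step 1: K = B*B'
    draws = zeros(T,R);
    for r = 1:R
        Z = randn(T,1);               % Step 2: Z ~ N(0, I_T)
        draws(:,r) = tauhat + B'\Z;   % Step 3: U = tauhat + (B')^{-1} Z
    end
end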
Quick Check

Recall that U = τ̂ + (B')⁻¹Z, where Z ∼ N(0, I_T)

U is an affine transformation of normal random variables, so it is normal

Easy to check that EU = τ̂

The covariance matrix is

$$
\mathrm{Var}(U) = (B')^{-1}\,\mathrm{Var}(Z)\,\left((B')^{-1}\right)' = (B')^{-1}B^{-1} = (BB')^{-1} = K^{-1}.
$$
Sample (σ 2 | y, τ , ω 2)

(2) f(σ² | y, τ, ω²):

$$
f(\sigma^2 \mid y, \tau, \omega^2) \propto f(y \mid \tau, \sigma^2)\, f(\sigma^2)
\propto (\sigma^2)^{-(\nu_{\sigma^2}+1)}\, e^{-\frac{S_{\sigma^2}}{\sigma^2}}\, (\sigma^2)^{-\frac{T}{2}}\, e^{-\frac{1}{2\sigma^2}(y-\tau)'(y-\tau)}
\propto (\sigma^2)^{-(\nu_{\sigma^2}+T/2+1)}\, e^{-\frac{1}{\sigma^2}\left(S_{\sigma^2} + (y-\tau)'(y-\tau)/2\right)}
$$

Hence,

$$
(\sigma^2 \mid y, \tau, \omega^2) \sim \mathrm{InvGamma}\left(\nu_{\sigma^2} + T/2,\; S_{\sigma^2} + (y-\tau)'(y-\tau)/2\right)
$$
Sample (ω 2 | y, τ , σ 2)

Recall that the transition equation is given by

τt = τt−1 + ut , ut ∼ N(0, ω 2 )

for t = 2, . . . , T

Hence, another way to write f(τ | ω²) is

$$
f(\tau \mid \omega^2) = f(\tau_1)\,(2\pi\omega^2)^{-\frac{T-1}{2}}\, e^{-\frac{1}{2\omega^2}\sum_{t=2}^T(\tau_t - \tau_{t-1})^2}
$$

(3) f(ω² | y, τ, σ²):

$$
f(\omega^2 \mid y, \tau, \sigma^2) \propto f(\tau \mid \omega^2)\, f(\omega^2)
\propto (\omega^2)^{-(\nu_{\omega^2}+1)}\, e^{-\frac{S_{\omega^2}}{\omega^2}}\,(\omega^2)^{-\frac{T-1}{2}}\, e^{-\frac{1}{2\omega^2}\sum_{t=2}^T(\tau_t-\tau_{t-1})^2}
\propto (\omega^2)^{-\left(\nu_{\omega^2}+\frac{T-1}{2}+1\right)}\, e^{-\frac{1}{\omega^2}\left(S_{\omega^2} + \frac{1}{2}\sum_{t=2}^T(\tau_t-\tau_{t-1})^2\right)}
$$

Hence,

$$
(\omega^2 \mid y, \tau, \sigma^2) \sim \mathrm{InvGamma}\left(\nu_{\omega^2} + \frac{T-1}{2},\; S\right),
$$

where S = S_{ω²} + Σ_{t=2}^T (τ_t − τ_{t−1})²/2
Gibbs Sampler for the Unobserved Components Model

Pick some initial values τ (0) = a0 , σ 2(0) = b0 > 0, and


ω 2(0) = c0 > 0. Then, repeat the following steps from r = 1 to R:
1. Draw τ (r ) ∼ f (τ | y, σ 2(r −1) , ω 2(r −1) ) (multivariate normal).
2. Draw σ 2(r ) ∼ f (σ 2 | y, τ (r ) , ω 2(r −1) ) (inverse-gamma).
3. Draw ω 2(r ) ∼ f (ω 2 | y, τ (r ) , σ 2(r ) ) (inverse-gamma).
MATLAB Code

% UC_RW.m
nloop = 11000; burnin = 1000;
load 'USCPI.csv'; Y = USCPI;
T = length(Y);

%% initialize for storage
store_tau = zeros(nloop-burnin,T);
store_theta = zeros(nloop-burnin,2);

%% prior
invVtau = 1/5;                              % prior precision of tau_1 (tau_0 = 0, omega_0^2 = 5)
nusig0 = 5; Ssig0 = 4*(nusig0-1);           % prior mean of sig2 is 4
nuomega0 = 5; Somega0 = .25^2*(nuomega0-1); % prior mean of omega2 is 0.25^2
MATLAB Code

%% initialize the Markov chain
sig2 = 1; omega2 = 1;

%% compute a few things outside the loop
H = speye(T) - sparse(2:T,1:(T-1),ones(1,T-1),T,T);   % first-difference matrix H

for loop = 1:nloop

%% sample tau
invOmega = sparse(1:T,1:T, ...
    [invVtau 1/omega2*ones(1,T-1)],T,T);   % Omega^{-1}
invDtau = H'*invOmega*H + 1/sig2*speye(T); % precision matrix K
Ctau = chol(invDtau,'lower');
tauhat = invDtau\(Y/sig2);                 % posterior mean of tau
tau = tauhat + Ctau'\randn(T,1);           % precision sampler draw
MATLAB Code

%% sample sig2
newSsig = Ssig0 + sum((Y - tau).^2)/2;
sig2 = 1/gamrnd(nusig0+T/2,1/newSsig);

%% sample omega2
u = tau(2:end) - tau(1:T-1);
newSomega = Somega0 + sum(u.^2)/2;
omega2 = 1/gamrnd(nuomega0+(T-1)/2,1./newSomega);

if loop>burnin
i = loop-burnin;
store_tau(i,:) = tau’;
store_theta(i,:) = [sig2 omega2];
end
end
Time-Varying Parameter VAR

Consider again the VARn (p) but now with time-varying parameters:

yt = bt + B1t yt−1 + · · · + Bpt yt−p + ǫt , ǫt ∼ N(0, Σ)

Or equivalently,

$$y_t = X_t\beta_t + \epsilon_t,$$

where $X_t = I_n \otimes [1, y_{t-1}', \ldots, y_{t-p}']$ and $\beta_t = \mathrm{vec}([b_t, B_{1t}, \ldots, B_{pt}]')$.
The state equation is given by

β t = β t−1 + ut , ut ∼ N(0, Q)

for t = 2, . . . , T

Initialized with β₁ ∼ N(β₀, Q₀) for some known constant matrices β₀ and Q₀

The covariance matrix is typically assumed to be diagonal:

$$Q = \mathrm{diag}(q_1, \ldots, q_k)$$
Priors and Estimation

Independent priors for Σ and Q

Σ ∼ InvWishart(ν0 , S0 ), qi ∼ InvGamma(ν0i , S0i )

A Gibbs sampler is constructed by sampling through


1. (β | y, Σ, Q),
2. (Σ | y, β, Q),
3. (Q | y, β, Σ).
An Expression for f (y | β, Σ)

Rewrite the measurement equation as

$$y = X\beta + \epsilon, \qquad \epsilon \sim N(0, I_T\otimes\Sigma),$$

where β = (β₁', ..., β_T')' stacks the time-varying coefficients and

$$
X = \begin{pmatrix}
X_1 & 0 & \cdots & 0 \\
0 & X_2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & X_T
\end{pmatrix}
$$

Hence, we have

$$
f(y \mid \beta, \Sigma) = (2\pi)^{-\frac{Tn}{2}}|\Sigma|^{-\frac{T}{2}}\, e^{-\frac{1}{2}(y-X\beta)'(I_T\otimes\Sigma^{-1})(y-X\beta)}.
$$
An Expression for f (β | Q)

For simplicity, assume β 0 = 0

Rewrite the transition equation as:

$$
\underbrace{\begin{pmatrix}
I_k & 0 & 0 & \cdots & 0 \\
-I_k & I_k & 0 & \cdots & 0 \\
0 & -I_k & I_k & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -I_k & I_k
\end{pmatrix}}_{H}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_T \end{pmatrix}
=
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_T \end{pmatrix}
$$

That is,

$$H\beta = u, \qquad u \sim N(0,\Omega),$$

where Ω = diag(Q₀, Q, ..., Q)
Again |H| = 1 implies that H is invertible

Since

$$\beta \sim N\left(0, (H'\Omega^{-1}H)^{-1}\right),$$

it follows that

$$
f(\beta \mid Q) = (2\pi)^{-\frac{Tk}{2}}\,|Q_0|^{-\frac{1}{2}}\,|Q|^{-\frac{T-1}{2}}\, e^{-\frac{1}{2}\beta'(H'\Omega^{-1}H)\beta}.
$$
An Expression for f (β | y, Σ, Q)

Combining f(β | Q) and f(y | β, Σ):

$$
f(\beta \mid y, \Sigma, Q) \propto f(\beta \mid Q)\, f(y \mid \beta, \Sigma)
\propto e^{-\frac{1}{2}\left[(y-X\beta)'(I_T\otimes\Sigma^{-1})(y-X\beta) + \beta'(H'\Omega^{-1}H)\beta\right]}
\propto e^{-\frac{1}{2}\left[\beta'\left(H'\Omega^{-1}H + X'(I_T\otimes\Sigma^{-1})X\right)\beta - 2\beta'X'(I_T\otimes\Sigma^{-1})y\right]}
$$

Hence,

$$(\beta \mid y, \Sigma, Q) \sim N(\hat\beta, K^{-1}),$$

where

$$
K = H'\Omega^{-1}H + X'(I_T\otimes\Sigma^{-1})X, \qquad
\hat\beta = K^{-1}X'(I_T\otimes\Sigma^{-1})y
$$
Sample (β | y, Σ, Q)

Note that

$$K = H'\Omega^{-1}H + X'(I_T\otimes\Sigma^{-1})X$$

is again a sparse matrix

Moreover,

$$\hat\beta = K^{-1}X'(I_T\otimes\Sigma^{-1})y$$

can be computed quickly

Use the precision sampler to sample (β | y, Σ, Q)

Sample (Σ | y, β, Q)

(2) f(Σ | y, β, Q):

$$
f(\Sigma \mid y, \beta, Q) \propto |\Sigma|^{-\frac{\nu_0+n+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(S_0\Sigma^{-1})}\;
|\Sigma|^{-\frac{T}{2}}\, e^{-\frac{1}{2}\sum_{t=1}^T (y_t - X_t\beta_t)'\Sigma^{-1}(y_t - X_t\beta_t)}
$$
$$
\propto |\Sigma|^{-\frac{\nu_0+n+T+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(S_0\Sigma^{-1})}\, e^{-\frac{1}{2}\mathrm{tr}\left[\sum_{t=1}^T(y_t - X_t\beta_t)(y_t - X_t\beta_t)'\,\Sigma^{-1}\right]}
$$
$$
\propto |\Sigma|^{-\frac{\nu_0+n+T+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}\left[\left(S_0 + \sum_{t=1}^T(y_t - X_t\beta_t)(y_t - X_t\beta_t)'\right)\Sigma^{-1}\right]}
$$

Hence,

$$(\Sigma \mid y, \beta, Q) \sim \mathrm{InvWishart}(\nu_0 + T,\; S),$$

where S = S₀ + Σ_{t=1}^T (y_t − X_tβ_t)(y_t − X_tβ_t)'

Sample (Q | y, β, Σ)

Recall that Q = diag(q₁, ..., q_k) is diagonal

Can show that

$$
(q_i \mid y, \beta, \Sigma) \sim \mathrm{InvGamma}\left(\nu_{0i} + (T-1)/2,\; S_i\right),
$$

where S_i = S_{0i} + Σ_{t=2}^T (β_{i,t} − β_{i,t−1})²/2
Gibbs Sampler for the TVP-VAR

Pick some initial values β (0) = a0 , Σ(0) = B0 , and Q(0) = C0 .


Then, repeat the following steps from r = 1 to R:
1. Draw β (r ) ∼ f (β | y, Σ(r −1) , Q(r −1) ) (multivariate normal).
2. Draw Σ(r ) ∼ f (Σ | y, β (r ) , Q(r −1) ) (inverse-Wishart).
3. Draw Q(r ) ∼ f (Q | y, β (r ) , Σ(r ) ) (independent
inverse-gammas).
MATLAB Code

for loop = 1:nloop

%% sample beta
XinvSig = bigX'*kron(speye(T),invSig);
XinvSigX = XinvSig*bigX;
invOmega = sparse(1:Tk,1:Tk, ...
    [1./Vbeta; repmat(invQ,T-1,1)]');      % Omega^{-1} = diag(Q_0, Q, ..., Q)^{-1}
invDbeta = H'*invOmega*H + XinvSigX;       % precision matrix K
C = chol(invDbeta,'lower');
betahat = C'\(C\(XinvSig*Y));              % posterior mean of beta
beta = betahat + C'\randn(Tk,1);           % precision sampler draw
MATLAB Code

%% sample Sig
e1 = reshape(Y-bigX*beta,n,T);             % measurement errors
newS1 = S01 + e1*e1';
invSig = wishrnd(newS1\speye(n),newnu1);   % draw Sig^{-1} ~ Wishart, i.e., Sig ~ InvWishart
Sig = invSig\speye(n);

%% sample Q
e2 = reshape(H*beta,k,T);                  % state innovations (first column is beta_1)
newS2 = S02 + sum(e2(:,2:end).^2,2)/2;
invQ = gamrnd(newnu2,1./newS2);            % q_i^{-1} ~ Gamma, i.e., q_i ~ InvGamma
Q = 1./invQ;
end
Evaluating the Integrated Likelihood

The integrated likelihood is defined as

$$
f(y \mid \Sigma, Q) = \int f(y \mid \beta, \Sigma)\, f(\beta \mid Q)\, d\beta
$$

Very high-dimensional integration

It can be evaluated using the Kalman filter (often very slow)

Turns out it can be quickly evaluated using sparse matrix algorithms

By Bayes' Theorem, we have

$$
f(\beta \mid y, \Sigma, Q) = \frac{f(y \mid \beta, \Sigma)\, f(\beta \mid Q)}{f(y \mid \Sigma, Q)}
$$

Or equivalently,

$$
f(y \mid \Sigma, Q) = \frac{f(y \mid \beta, \Sigma)\, f(\beta \mid Q)}{f(\beta \mid y, \Sigma, Q)}
$$

Note that although the RHS appears to depend on β, it does not: the identity holds for every β

Pick any β = β*:

$$
f(y \mid \Sigma, Q) = \frac{f(y \mid \beta^*, \Sigma)\, f(\beta^* \mid Q)}{f(\beta^* \mid y, \Sigma, Q)}
$$

The RHS involves evaluating only normal densities

Used in Chan and Jeliazkov (2009) and Chan and Eisenstat (2013)
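
A minimal sketch of this evaluation in MATLAB, reusing objects from the TVP-VAR sampler above (Y, bigX, H, invSig, invOmega, invDbeta, betahat and the scalars T, n, k); the choice β* = β̂ is for convenience:

% Evaluate log f(y | Sig, Q) = log f(y | beta*, Sig) + log f(beta* | Q)
%                              - log f(beta* | y, Sig, Q), with beta* = betahat.
betastar = betahat;
e = Y - bigX*betastar;                      % stacked measurement errors
llike = -T*n/2*log(2*pi) + T/2*log(det(invSig)) ...
    - .5*e'*kron(speye(T),invSig)*e;        % log f(y | beta*, Sig)
lprior = -T*k/2*log(2*pi) + .5*sum(log(full(diag(invOmega)))) ...
    - .5*betastar'*(H'*invOmega*H)*betastar;             % log f(beta* | Q); uses |H| = 1
CK = chol(invDbeta,'lower');                             % invDbeta is the posterior precision K
lpost = -T*k/2*log(2*pi) + sum(log(full(diag(CK))));     % log f(beta* | y, Sig, Q)
lml = llike + lprior - lpost;               % log integrated likelihood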
Bayesian Model Comparison

Consider the problem of comparing models M1 , . . . , MK

Each model M_k is formally defined by a likelihood function f(y | θ_k, M_k) and a prior distribution f(θ_k | M_k)

We make the model indicator M_k explicit, and θ_k is a model-specific parameter vector
Posterior Odds Ratio

One popular criterion to compare models M_i against M_j is the posterior odds ratio:

$$
\mathrm{PO}_{ij} = \frac{f(M_i \mid y)}{f(M_j \mid y)}
= \underbrace{\frac{f(M_i)}{f(M_j)}}_{\text{prior odds ratio}} \times \underbrace{\frac{f(y \mid M_i)}{f(y \mid M_j)}}_{\text{Bayes factor}},
$$

where

$$
f(y \mid M_k) = \int f(y \mid \theta_k, M_k)\, f(\theta_k \mid M_k)\, d\theta_k
$$

is the marginal likelihood for model M_k


When f(M_i) = f(M_j), i.e., when the models are equally probable a priori, then

$$
\mathrm{PO}_{ij} = \frac{f(y \mid M_i)}{f(y \mid M_j)}
$$

It boils down to computing the marginal likelihood for each model

In general, computing

$$
f(y \mid M_k) = \int f(y \mid \theta_k, M_k)\, f(\theta_k \mid M_k)\, d\theta_k
$$

is non-trivial
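
A tiny numerical illustration (the two log marginal likelihoods below are made-up values, just to show the arithmetic; with equal prior probabilities and only two models under comparison, the posterior odds ratio equals the Bayes factor):

% Work in logs to avoid underflow when converting to posterior probabilities.
lml_i = -1520.3; lml_j = -1525.8;        % hypothetical log marginal likelihoods
logBF_ij = lml_i - lml_j;                % log Bayes factor of M_i against M_j
postprob_i = 1/(1 + exp(-logBF_ij));     % P(M_i | y) when only M_i and M_j are compared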
Marginal Likelihood Estimation

From now on we omit the dependence on the model Mk

So just write f (y) instead of f (y | Mk )

Two popular methods to estimate the marginal likelihood: the


modified harmonic mean estimator (Gelfand and Dey, 1994) and
Chib’s method (Chib, 1995; Chib and Jeliazkov, 2001)
Modified Harmonic Mean Estimator

For any density g(θ), we have

$$
\mathbb{E}\left[\left.\frac{g(\theta)}{f(\theta)\, f(y \mid \theta)}\,\right| y\right]
= \int \frac{g(\theta)}{f(\theta)\, f(y \mid \theta)}\, f(\theta \mid y)\, d\theta
= \int \frac{g(\theta)}{f(\theta)\, f(y \mid \theta)}\, \frac{f(\theta)\, f(y \mid \theta)}{f(y)}\, d\theta
= \frac{1}{f(y)}\int g(\theta)\, d\theta
= \frac{1}{f(y)}
$$

This identity is true for any g such that ∫ g(θ) dθ = 1
Since

$$
\mathbb{E}\left[\left.\frac{g(\theta)}{f(\theta)\, f(y \mid \theta)}\,\right| y\right] = \frac{1}{f(y)},
$$

we consider the estimator

$$
\widehat{f(y)} = \left(\frac{1}{R}\sum_{r=1}^R \frac{g(\theta^{(r)})}{f(\theta^{(r)})\, f(y \mid \theta^{(r)})}\right)^{-1},
$$

where θ^{(1)}, ..., θ^{(R)} are posterior draws
Summary

The modified harmonic mean estimator of Gelfand and Dey (1994):

1. Obtain draws θ^{(1)}, ..., θ^{(R)} from f(θ | y) using, e.g., the Gibbs sampler.
2. Compute f̂(y).
How to choose a 'good' g?

Geweke (1999) suggests a normal approximation to the posterior distribution with a tail truncation

Can prove that the resulting estimator f̂(y) has finite variance

Some comments:
◦ easy to code up
◦ works well in low-dimensional models
◦ bias can be substantial in high-dimensional models (e.g.,
latent data models)
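
A minimal sketch of the estimator with Geweke's truncated-normal g. The inputs theta (an R × d matrix of posterior draws), lprior and llike (R × 1 vectors of log-prior and log-likelihood values at those draws) are assumed to be available from the main run:

% Modified harmonic mean estimator of the log marginal likelihood.
p = 0.95;                                    % truncation probability for g
[R, d] = size(theta);
thetabar = mean(theta)';                     % posterior mean
Qtheta = cov(theta);                         % posterior covariance
dev = theta - repmat(thetabar',R,1);
md2 = sum((dev/Qtheta).*dev,2);              % squared Mahalanobis distances
lg = -log(p) - d/2*log(2*pi) - .5*log(det(Qtheta)) - .5*md2;   % log g(theta)
lg(md2 > chi2inv(p,d)) = -inf;               % g = 0 outside the truncation region
lw = lg - lprior - llike;                    % log of g/(f(theta) f(y|theta))
maxlw = max(lw);
lml = -(maxlw + log(mean(exp(lw - maxlw)))); % log f-hat(y), via log-sum-exp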
Chib’s Method

Chib's Method is based on the identity

$$
f(y) = \frac{f(y \mid \theta)\, f(\theta)}{f(\theta \mid y)},
$$

which is true for any θ (in the support of the posterior)

Taking θ = θ*, we have

$$
\log f(y) = \log f(y \mid \theta^*) + \log f(\theta^*) - \log f(\theta^* \mid y)
$$

The first two terms on the RHS can often be evaluated analytically; only need to estimate the third term
Example: a 3-Block Gibbs Sampler

Suppose we have the output from a 3-block Gibbs sampler:

$$f(\theta_1 \mid y, \theta_2, \theta_3), \qquad f(\theta_2 \mid y, \theta_1, \theta_3), \qquad f(\theta_3 \mid y, \theta_1, \theta_2)$$

The three conditional distributions are fully known (so they can be evaluated exactly)

Goal: estimate

$$
\log f(\theta^* \mid y) = \log f(\theta_1^*, \theta_2^*, \theta_3^* \mid y)
= \log f(\theta_1^* \mid y) + \log f(\theta_2^* \mid y, \theta_1^*) + \log f(\theta_3^* \mid y, \theta_1^*, \theta_2^*)
$$
First, f (θ ∗3 | y, θ ∗1 , θ ∗2 ) can be computed exactly

Second, note that

$$
f(\theta_1^* \mid y) = \int\!\!\int f(\theta_1^*, \theta_2, \theta_3 \mid y)\, d\theta_2\, d\theta_3
= \int\!\!\int f(\theta_1^* \mid y, \theta_2, \theta_3)\, f(\theta_2, \theta_3 \mid y)\, d\theta_2\, d\theta_3
$$

Hence, f(θ₁* | y) can be estimated by

$$
\widehat{f(\theta_1^* \mid y)} = \frac{1}{R}\sum_{r=1}^R f\left(\theta_1^* \mid y, \theta_2^{(r)}, \theta_3^{(r)}\right),
$$

where {θ₂^{(r)}, θ₃^{(r)}} are sampled from f(θ₂, θ₃ | y)
Similarly, note that

$$
f(\theta_2^* \mid y, \theta_1^*) = \int f(\theta_2^*, \theta_3 \mid y, \theta_1^*)\, d\theta_3
= \int f(\theta_2^* \mid y, \theta_1^*, \theta_3)\, f(\theta_3 \mid y, \theta_1^*)\, d\theta_3
$$

Hence, f(θ₂* | y, θ₁*) can be estimated by

$$
\widehat{f(\theta_2^* \mid y, \theta_1^*)} = \frac{1}{R}\sum_{r=1}^R f\left(\theta_2^* \mid y, \theta_1^*, \theta_3^{(r)}\right),
$$

where {θ₃^{(r)}} are sampled from f(θ₃ | y, θ₁*)
Draws from f (θ 3 | y, θ ∗1 ) can be obtained via a reduced run
Initialize θ₂^{(0)} = a₀ and θ₃^{(0)} = b₀. Then, repeat the following steps from r = 1 to R:
1. Draw θ₂^{(r)} ∼ f(θ₂ | y, θ₁*, θ₃^{(r−1)}).
2. Draw θ₃^{(r)} ∼ f(θ₃ | y, θ₁*, θ₂^{(r)}).
Example: Summary

First, consider the identity

log f (y) = log f (y | θ ∗ ) + log f (θ ∗ ) − log f (θ ∗ | y)

Both f(y | θ*) and f(θ*) are often known. It suffices to estimate f(θ* | y) = f(θ₁*, θ₂*, θ₃* | y)

Next, note that

log f (θ ∗ | y) = log f (θ ∗1 | y) + log f (θ ∗2 | y, θ ∗1 ) + log f (θ ∗3 | y, θ ∗1 , θ ∗2 )

◦ f (θ ∗3 | y, θ ∗1 , θ ∗2 ) is known
◦ f (θ ∗1 | y) can be estimated using draws from the main run
◦ f (θ ∗2 | y, θ ∗1 ) can be estimated using draws from a reduced run
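
Putting the pieces together, a schematic MATLAB sketch of the whole procedure. All names here are hypothetical: dens1/dens2/dens3 evaluate the three full conditional densities, draw2/draw3 sample from them, (t1s, t2s, t3s) is the chosen point θ*, and lpri_star, llike_star are log f(θ*) and log f(y | θ*):

% Chib's method for a 3-block Gibbs sampler (schematic).
R = size(draws2,1);                       % draws2, draws3: main-run draws from f(theta2, theta3 | y)
f1hat = 0;
for r = 1:R                               % estimate f(theta1* | y) from the main run
    f1hat = f1hat + dens1(t1s, draws2(r,:), draws3(r,:))/R;
end
t3 = t3s; f2hat = 0;
for r = 1:R                               % reduced run with theta1 fixed at theta1*
    t2 = draw2(t1s, t3);
    t3 = draw3(t1s, t2);
    f2hat = f2hat + dens2(t2s, t1s, t3)/R;   % estimate f(theta2* | y, theta1*)
end
f3 = dens3(t3s, t1s, t2s);                % f(theta3* | y, theta1*, theta2*) is known exactly
logpost = log(f1hat) + log(f2hat) + log(f3);
logml = llike_star + lpri_star - logpost; % log f(y) by Chib's identity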
Comments:
◦ more programming effort is required
◦ more blocks require more reduced runs
◦ even more complicated when MH is involved (Chib and
Jeliazkov, 2001)
◦ works very well for high-dimensional latent data models
