Bayesian Model Averaging of Space Time CAR Models with Application to U.S. House Price Forecasting
DigitalCommons@URI
2018
Recommended Citation
Sun, Caoxin, "Bayesian Model Averaging of Space Time Car Models with Application to U.S. House Price
Forecasting" (2018). Open Access Master's Theses. Paper 1181.
https://digitalcommons.uri.edu/theses/1181
BAYESIAN MODEL AVERAGING OF SPACE TIME CAR MODELS WITH APPLICATION TO U.S. HOUSE PRICE FORECASTING
BY
CAOXIN SUN
MASTER OF SCIENCE
IN
STATISTICS
2018
MASTER OF SCIENCE THESIS
OF
CAOXIN SUN
APPROVED:
Thesis Committee:
Todd Guilfoos
Nasser H. Zawia
DEAN OF THE GRADUATE SCHOOL
2018
ABSTRACT
Forecasting the house price growth rate helps to regulate risks associated with the housing sector and, in turn, helps to stabilize the economy. However, due to the volatility of the housing market, forecasting the house price growth rate has proven to be a difficult task. In this thesis, we built a conditional autoregressive (CAR) model combined with Bayesian Model Averaging (BMA) for the U.S. state-level house price growth rate. The model was trained on quarterly observations from 1976 to 1994 and its forecasting capability was tested over the period from 1995 to 2012. One limitation of our specification is that it does not allow the model or the coefficients to shift over time. Our model is based on a hierarchical structure in which BMA averages over the effects of the predictors, while the CAR component accounts for the remaining spatio-temporal variation in the data.
ACKNOWLEDGMENTS

I would like to thank my thesis advisor, Dr. Gavino Puggioni, who helped to shape the framework of this thesis. After taking his Bayesian Statistics class, I decided to pursue a degree in this department. He is the mentor who guided me through the program and led me through this research. I would also like to thank my committee members (Dr. Ventz and Dr. Todd Guilfoos), who gave me constructive and insightful advice. Finally, I am grateful to everyone who was there for the sleepless nights we were working together before deadlines, and for all of their encouragement.
TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES

CHAPTER
1 Introduction
    List of References
2.5 Forecasting
    List of References
3 Application
3.3 Results
    List of References
    List of References
BIBLIOGRAPHY

LIST OF FIGURES
Figure 1: Time Series of real house price growth rate and all the covariates.

LIST OF TABLES
CHAPTER 1
Introduction
1.1 Studying Housing Market
The housing sector constitutes a significant share of GDP, and housing is the largest component of household wealth in the U.S. [1]. The Bureau of Labor Statistics has reported that a large share of American homeowners' spending goes toward housing [2]. Starting in the late 1990s, the U.S. housing market was in a boom period until 2007, when the sub-prime crisis hit. Forecasting U.S. house prices has therefore received attention from governments, real estate developers and investors. However, it has been a challenging task due to the strong volatility of the housing market.
For the past four decades, many economists have adopted time series approaches to study the relationship between U.S. house prices and socio-economic variables, including dynamic models that allow for time-varying coefficients. These models can better accommodate the instability seen in real estate data. The most common methods include, but are not limited to, regime switching models ([9],[10]) and AR, VAR and GARCH models. Another strand of the literature recognizes that neighboring regions influence each other's housing prices. [2] grouped real house prices into 8 Bureau of Economic Analysis (BEA) regions and found that the within-region correlation is larger than the between-region correlation, with only one exception. Other studies, such as [14], demonstrated the necessity of spatial models that account for the dependent residuals that arise when only ordinary least squares is used. Several studies that account for spatial autocorrelation are summarised in [2]. Spatial error panel data models have also been widely used, though, like STARMA models, they favor a relatively small number of regions. Other authors have pursued alternative approaches to modeling housing prices.
In the recent study [12], the authors implemented Dynamic Model Selection and Dynamic Model Averaging methods and demonstrated improved forecasting performance. One limitation, however, is that the natural spatial structure in the data is not considered in their model; in fact, the analysis is carried out separately for each state. In this thesis we decompose the mean of the house price growth rate into a fixed effect and a random effect. The fixed effect models the influence of the predictors, and the random effect captures the spatio-temporal variation that remains in the data after the fixed effect is removed. We hope to achieve better forecasts by taking this spatial dependence into account.
1.2 Areal Data and Spatial CAR
Unlike point-referenced data, where observations can occur anywhere on a continuous surface, areal data have well-defined boundaries: the observed data are frequently aggregations within those boundaries, or the areal units themselves constitute the units of observation [17]. CAR models are widely used to describe the spatial variation of areal data and have been applied in diverse areas such as demography and epidemiology.
Modeling spatial autocorrelation for areal data requires the creation of a neighbor structure and a corresponding weight matrix. There are multiple ways to define a neighbor, after which the neighbor object is converted into a spatial weight matrix that represents the strength of the spatial link between units S_k and S_s. When little is known about the spatial process, a common approach is a binary representation in which the weight is one for neighbors and zero otherwise [17].

The CAR model is a method for smoothing areal data that was originally proposed by [19]. In a CAR model, the spatial component of a given areal unit is modeled conditionally on the spatial components of all the other units, with weights taken from the previously created weight matrix. If we use y_k to denote the observed value of interest in areal unit k, the full conditional distribution is
\[
y_k \mid y_s,\ s \neq k \ \sim\ N\!\left(\frac{\sum_{s} \omega_{ks}\, y_s}{\omega_{k+}},\ \frac{\tau_k^2}{\omega_{k+}}\right), \tag{1}
\]
where ω_{k+} = Σ_s ω_{ks} and τ_k² can be viewed as the local variance for areal unit k. The CAR model is often used as a prior in a hierarchical model, where we use φ_k to represent the spatial random effect:
\[
\phi_k \mid \phi_s,\ s \neq k \ \sim\ N\!\left(\frac{\sum_{s} \omega_{ks}\, \phi_s}{\omega_{k+}},\ \frac{\tau^2}{\omega_{k+}}\right) \tag{2}
\]
By Brook's Lemma [21], these full conditionals correspond to a joint distribution of the form
\[
\boldsymbol{\phi} \sim N\!\left(\mathbf{0},\ \tau^2 (D - W)^{-1}\right), \tag{3}
\]
where D is the diagonal matrix with elements ω_{k+}. The precision matrix Σ_φ^{-1} = (D − W) is singular, so the joint distribution above is improper [20]. After some modification, a proper CAR model has the following covariance structure [22]:
\[
\sigma^2\, \Sigma_\phi^{-1} = \rho_s (D - W) + (1 - \rho_s) I \tag{4}
\]
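As a minimal numerical sketch of this construction (not code from the thesis; the 4-unit adjacency structure and parameter values are made up for illustration), the snippet below builds a binary weight matrix W, its diagonal matrix D of neighbor counts, and the proper precision of equation (4), and checks that the precision is nonsingular whenever ρ_s < 1:

```python
import numpy as np

# Toy binary contiguity matrix for 4 areal units (hypothetical neighbor structure).
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

D = np.diag(W.sum(axis=1))          # diagonal matrix of neighbor counts (omega_k+)
rho_s, sigma2 = 0.8, 1.0            # spatial dependence and variance parameters

# Proper CAR precision: sigma^2 * Sigma_phi^{-1} = rho_s*(D - W) + (1 - rho_s)*I
Q = rho_s * (D - W) + (1 - rho_s) * np.eye(4)

print(np.linalg.eigvalsh(D - W))    # smallest eigenvalue is 0: improper when rho_s = 1
print(np.linalg.eigvalsh(Q))        # all positive for rho_s < 1, so the prior is proper
cov_phi = sigma2 * np.linalg.inv(Q) # implied covariance of the spatial random effects
```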
As for the fixed effects, a set of predictors is needed to capture the covariate effects, and including too many predictors risks overfitting. This variable selection problem has a natural Bayesian solution [23]: Bayesian Model Averaging (BMA) provides an automatic adjustment for multiple comparisons and efficient model space exploration, as well as lower predictive risk than committing to a single model. Since the number of variables is relatively small in our case, we can exhaust all possible models and compute their posterior probabilities over the model space M. Suppose a set of K models M = {M_1, ..., M_K} is under consideration for data Y. The posterior probability of model M_k is
\[
p(M_k \mid Y) = \frac{p(Y \mid M_k)\, p(M_k)}{\sum_{l=1}^{K} p(Y \mid M_l)\, p(M_l)},
\]
where p(M_k) is the prior for model M_k and p(Y|M_k) is the marginal likelihood of the data under model M_k.
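To make the exhaustive enumeration concrete, here is a generic sketch (not the thesis's implementation, which uses g-prior marginal likelihoods): the marginal likelihood p(Y | M_k) is approximated by exp(-BIC/2), a uniform model prior is assumed, and the data are simulated toy values.

```python
import numpy as np
from itertools import combinations

def bic_linear(y, X):
    """BIC of an OLS fit with Gaussian errors; X already contains the intercept."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + p * np.log(n)

rng = np.random.default_rng(0)
n, p = 200, 4                                    # toy data: 4 candidate predictors
X_full = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X_full[:, 0] - X_full[:, 2] + rng.normal(size=n)

models, log_marg = [], []
for k in range(p + 1):
    for subset in combinations(range(p), k):     # exhaustive enumeration of 2^p models
        X = np.column_stack([np.ones(n)] + [X_full[:, j] for j in subset])
        models.append(subset)
        log_marg.append(-0.5 * bic_linear(y, X)) # BIC approximation to log p(Y | M_k)

log_marg = np.array(log_marg)
post = np.exp(log_marg - log_marg.max())         # uniform model prior cancels
post /= post.sum()
for i in np.argsort(post)[::-1][:3]:             # top three models by posterior probability
    print(models[i], round(post[i], 3))
```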
List of References
[5] S. Holly and N. Jones, “House prices since the 1940s: cointegration, de-
mography and asymmetries,” Economic Modelling, vol. 14, no. 4, pp.
549–565, 1997.
[6] J. M. Quigley, “A simple hybrid model for estimating real estate price
indexes,” Journal of Housing Economics, vol. 4, no. 1, pp. 1–12, 1995.
[9] S. Hall, Z. Psaradakis, and M. Sola, “Switching error-correction models of
house prices in the united kingdom,” Economic Modelling, vol. 14, no. 4,
pp. 517–527, 1997.
[12] L. Bork and S. V. Møller, “Forecasting house prices in the 50 states using
dynamic model averaging and dynamic model selection,” International
Journal of Forecasting, vol. 31, no. 1, pp. 63–78, 2015.
[16] B. Van Dijk, P. H. Franses, R. Paap, and D. Van Dijk, “Modelling regional
house prices,” Applied Economics, vol. 43, no. 17, pp. 2097–2110, 2011.
[19] J. Besag, “Spatial interaction and the statistical analysis of lattice sys-
tems,” Journal of the Royal Statistical Society. Series B (Methodological),
pp. 192–236, 1974.
[20] S. Guha and L. Ryan, “Spatio-temporal analysis of areal data and discov-
ery of neighborhood relationships in conditionally autoregressive models,”
2006.
[21] D. Brook, “On the distinction between the conditional probability and
the joint probability approaches in the specification of nearest-neighbour
systems,” Biometrika, vol. 51, no. 3/4, pp. 481–483, 1964.
[22] B. G. Leroux, X. Lei, and N. Breslow, “Estimation of disease rates in small areas: a new mixed model for spatial dependence,” in Statistical models in epidemiology, the environment, and clinical trials. Springer, 2000, pp. 179–191.
CHAPTER 2

Suppose the study region is partitioned into K areal units S = {S_1, ..., S_K} and data are recorded for each unit at time points t = 1, ..., T. A hierarchical spatio-temporal CAR model is a convenient candidate for modeling this type of data. The hierarchical structure is as follows [1]:
\[
Y_{kt} \mid \mu_{kt} \sim f(y_{kt} \mid \mu_{kt}, \sigma^2)
\]
\[
g(\mu_{kt}) = X_{kt}\beta + \phi_{kt}
\]
We define Y_kt as the observed house price growth rate at time t in areal unit k, X_kt as the corresponding vector of covariates, and β as the vector of regression coefficients, which is invariant to space and time. The stacked vector of random effects φ has length K × T. We expand the second line of the two equations above to show the dimension of each term:
\[
\begin{pmatrix}
\mu_{11} \\ \vdots \\ \mu_{K1} \\ \mu_{12} \\ \vdots \\ \mu_{KT}
\end{pmatrix}
=
\begin{pmatrix}
x_{11,0} & \cdots & x_{11,p} \\
\vdots & & \vdots \\
x_{K1,0} & \cdots & x_{K1,p} \\
x_{12,0} & \cdots & x_{12,p} \\
\vdots & & \vdots \\
x_{KT,0} & \cdots & x_{KT,p}
\end{pmatrix}
\begin{pmatrix}
\beta_0 \\ \vdots \\ \beta_p
\end{pmatrix}
+
\begin{pmatrix}
\phi_{11} \\ \vdots \\ \phi_{K1} \\ \phi_{12} \\ \vdots \\ \phi_{KT}
\end{pmatrix}
\]
(i) Y_kt | μ_kt ∼ f(y_kt | μ_kt, σ²). At the first stage, the observed value of the house price growth rate Y_kt at time t and location k comes from a distribution f with mean μ_kt and variance σ².

(ii) g(μ_kt) = X_kt β + φ_kt. At the second stage, the likelihood was chosen to be Gaussian, so the g function is just the identity link. With the identity link, g(μ_kt) becomes μ_kt, and it is modeled by the fixed effects X_kt β, which capture the influence of the covariates, and the random effects φ_kt, which model the patterns remaining in the data after fitting the fixed effects. We assume here that the second stage of the model is a simple linear mixed effects model, in which both components are additive, and a CAR structure is used as a prior for the random effect φ_kt. The random effect φ_kt can be modeled in several ways:
(i) Linear time trend CAR (CARlinear)
\[
\phi_{kt} = \beta_1 + \phi_k + (\alpha + \delta_k)\,\frac{t - \bar{t}}{T}
\]
\[
\phi_k \mid \phi_{-k}, W \sim N\!\left(\frac{\rho_{int}\sum_{j=1}^{K}\omega_{kj}\phi_j}{\rho_{int}\sum_{j=1}^{K}\omega_{kj} + 1 - \rho_{int}},\ \frac{\tau^2_{int}}{\rho_{int}\sum_{j=1}^{K}\omega_{kj} + 1 - \rho_{int}}\right)
\]
\[
\delta_k \mid \delta_{-k}, W \sim N\!\left(\frac{\rho_{slo}\sum_{j=1}^{K}\omega_{kj}\delta_j}{\rho_{slo}\sum_{j=1}^{K}\omega_{kj} + 1 - \rho_{slo}},\ \frac{\tau^2_{slo}}{\rho_{slo}\sum_{j=1}^{K}\omega_{kj} + 1 - \rho_{slo}}\right)
\]
In this model, a linear trend in time is assumed. Each areal unit k has its own deviation of intercept φ_k and slope δ_k from the mean linear trend.
(ii) Time autoregressive CAR (CARar)

In this model, the joint distribution of the random effects over time is decomposed as:
\[
f(\phi_1, \phi_2, \dots, \phi_T) = f(\phi_1)\prod_{j=2}^{T} f(\phi_j \mid \phi_{j-1}),
\]
where φ_t collects the K random effects at time t, φ_1 is given a CAR prior of the form (4), and each φ_t | φ_{t−1} follows a CAR prior centered at ρ_T φ_{t−1}. Here ρ_T is the temporal autoregressive coefficient: ρ_T = 1 means strong temporal dependence, while ρ_T = 0 means temporal independence.

In preliminary comparisons, CARar and CARadaptive gave comparable results and were superior to CARlinear. Because CARar is the simpler method and it gave slightly better forecasting results, we adopted it for the remainder of this thesis.
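As an illustrative sketch of the CARar structure (assumptions: the transition φ_t | φ_{t−1} ∼ N(ρ_T φ_{t−1}, τ² Q⁻¹) with the Leroux-type precision Q from equation (4); the toy weight matrix and parameter values are made up), the following simulates one realization of the space-time random effects:

```python
import numpy as np

rng = np.random.default_rng(1)
K, T = 4, 50                                    # toy problem: 4 areal units, 50 quarters
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # hypothetical binary contiguity matrix
rho_S, rho_T, tau2 = 0.8, 0.7, 0.5

D = np.diag(W.sum(axis=1))
Q = rho_S * (D - W) + (1 - rho_S) * np.eye(K)   # spatial precision (equation 4)
Sigma = tau2 * np.linalg.inv(Q)                 # spatial covariance of each innovation

phi = np.zeros((T, K))
phi[0] = rng.multivariate_normal(np.zeros(K), Sigma)
for t in range(1, T):
    # AR(1) in time, CAR in space: phi_t | phi_{t-1} ~ N(rho_T * phi_{t-1}, Sigma)
    phi[t] = rng.multivariate_normal(rho_T * phi[t - 1], Sigma)

print(phi.shape)                                # (T, K) simulated random-effect surface
```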
We now turn to the model averaging component. Imagine we have M models with model index m ∈ {1, 2, ..., M}, where in total M = 2^p models can be formed from the p candidate predictors. Each model can be represented by a vector of inclusion indicators γ = (γ_1, ..., γ_p), in which case q_γ = Σ_j γ_j is the number of non-zero parameters under model M_m.

We used the g-prior to simplify our posteriors for the model averaging process.
Consider the linear regression model
\[
y = X\beta + \epsilon, \qquad \epsilon \sim N(0, \sigma^2 I).
\]
If we do not use the g-prior, we need to sample from the full conditionals using a Markov chain Monte Carlo procedure. If we give β and 1/σ² a multivariate normal(β_0, Σ_0) prior and a gamma(ν_0/2, ν_0σ_0²/2) prior respectively, we obtain the following full conditional for β:
\[
\beta \mid y, \sigma^2 \sim N(m, V),
\]
\[
m = \left(\Sigma_0^{-1} + X^{\top}X/\sigma^2\right)^{-1}\left(\Sigma_0^{-1}\beta_0 + X^{\top}y/\sigma^2\right),
\qquad
V = \left(\Sigma_0^{-1} + X^{\top}X/\sigma^2\right)^{-1}
\]
The full conditional for σ² involves the sum of squared residuals
\[
SSR(\beta) = \sum_{i=1}^{n} \left(y_i - \beta^{\top} x_i\right)^2.
\]
Under Zellner's g-prior [9], β | σ² ∼ N(0, g σ²(X^⊤X)^{-1}), so that the amount of information contained in n/g observations is assumed for the prior, and we can simplify the full conditional of β to the following:
\[
\beta \mid y, \sigma^2 \sim N(m, V),
\qquad
m = \frac{g}{g+1}\,(X^{\top}X)^{-1}X^{\top}y,
\qquad
V = \frac{g}{g+1}\,\sigma^2 (X^{\top}X)^{-1}
\]
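A small numerical sketch of these formulas (illustrative only; the data, the choice g = n and all variable names are assumptions for the example, not the thesis's settings):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design matrix with intercept
beta_true = np.array([0.5, 1.0, -2.0, 0.0])
sigma2 = 1.0
y = X @ beta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

g = n                                   # a common default choice for the g-prior
XtX_inv = np.linalg.inv(X.T @ X)
shrink = g / (g + 1.0)

m = shrink * XtX_inv @ X.T @ y          # posterior mean: shrunken OLS estimate
V = shrink * sigma2 * XtX_inv           # posterior covariance (conditional on sigma^2)

beta_draw = rng.multivariate_normal(m, V)   # one draw from beta | y, sigma^2
print(np.round(m, 2), np.round(beta_draw, 2))
```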
Now we would like to apply the g-prior to our model. The value of g is commonly set equal to the sample size n, and the design matrix X contains the selected covariates including the intercept. The other parameters are given the following priors:
\[
\rho_S, \rho_T \sim \text{Uniform}(0, 1)
\]
Here the prior for σ² has the underlying form of an inverse-gamma distribution.
The CAR random effect models the spatial autocorrelation that remains in the data after the covariate effects are accounted for [10]. CARBayesST is the first dedicated software package for spatio-temporal areal unit modeling with conditional autoregressive priors. We adapted part of the code from the package and added Bayesian Model Averaging for our model. We used Gibbs sampling to draw from the full conditional distributions, since the joint posterior distribution is not available in closed form. One sweep of the sampler includes the following steps:
(i) Sample the model index m ∈ {1, 2, ..., M} according to the posterior probabilities p(M_m | Y), which under the g-prior are available in closed form and depend on β̂_m, the ordinary least squares estimate under model M_m.
(iii) Sample β from the full conditional:
\[
\beta \mid Y, \sigma^2 \sim N\!\left(\frac{g}{g+1}\big((X^m)^{\top}X^m\big)^{-1}(X^m)^{\top}Y,\ \ \frac{g}{g+1}\,\sigma^2\big((X^m)^{\top}X^m\big)^{-1}\right)
\]
2.5 Forecasting
2.5.1 Bayesian Prediction
Under iteration s and model γ, we set the estimate of φ_kt at the last time point of the training set as the initial value of φ_t for prediction, and then propagate the model forward to produce forecasts. Forecast accuracy is derived from the squared forecast error (SFE). The SFE for model j is computed as:
\[
SFE_{k,t,j} = \frac{1}{S}\sum_{s=1}^{S}\left(\hat{Y}^{(s)}_{k,t,j} - Y_{k,t}\right)^2
\]
where Ŷ^{(s)}_{k,t,j} is the prediction of the house price growth rate at location k and time t from iteration s and model j, and Y_{k,t} is the observed value at the same time and location.
The benchmark model was chosen as an expanding-window OLS fit with only an intercept:
\[
\hat{Y}_{MEAN,\,t+s} = \frac{\sum_{i=1}^{t+s-1} Y_i}{t+s-1}
\]
Forecast performance for each state k is summarized by the mean squared forecast error (MSFE) and by the ratio r_k between the MSFE from model j and the MSFE from the MEAN benchmark:
\[
MSFE_{k,MEAN} = \frac{1}{T}\sum_{t=1995:1}^{2012:4} SFE_{k,t,MEAN} \tag{10}
\]
\[
r_k = \frac{MSFE_{k,j}}{MSFE_{k,MEAN}}
\]
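A small sketch of these accuracy measures (the array shapes and simulated values are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
K, T_test, S = 48, 72, 500                   # states, test quarters (1995:1-2012:4), MCMC draws

y_obs = rng.normal(size=(K, T_test))                                # observed growth rates (toy)
y_pred = y_obs[None] + rng.normal(scale=0.8, size=(S, K, T_test))   # model j draws (toy)
y_mean_bench = np.zeros((K, T_test))                                # benchmark forecasts (toy)

# SFE averaged over posterior draws, then MSFE averaged over the test period
sfe_model = ((y_pred - y_obs[None]) ** 2).mean(axis=0)   # (K, T_test)
sfe_bench = (y_mean_bench - y_obs) ** 2                  # (K, T_test)

msfe_model = sfe_model.mean(axis=1)                      # per-state MSFE, model j
msfe_bench = sfe_bench.mean(axis=1)                      # per-state MSFE, MEAN benchmark

ratio = msfe_model / msfe_bench                          # r_k < 1: model j beats the benchmark
print(ratio.round(2)[:5])
```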
List of References
[3] B. G. Leroux, X. Lei, and N. Breslow, “Estimation of disease rates in small areas: a new mixed model for spatial dependence,” in Statistical models in epidemiology, the environment, and clinical trials. Springer, 2000, pp. 179–191.
[4] A. Rushworth, D. Lee, and R. Mitchell, “A spatio-temporal model for
estimating the long-term effects of air pollution on respiratory hospital
admissions in greater london,” Spatial and spatio-temporal epidemiology,
vol. 10, pp. 29–38, 2014.
[9] A. Zellner, “On assessing prior distributions and bayesian regression anal-
ysis with g-prior distributions,” Bayesian inference and decision tech-
niques, 1986.
[10] D. Lee, “Carbayes version 4.6: An r package for spatial areal unit mod-
elling with conditional autoregressive priors,” University of Glasgow,
Glasgow, 2017.
[11] L. Bork and S. V. Møller, “Forecasting house prices in the 50 states using
dynamic model averaging and dynamic model selection,” International
Journal of Forecasting, vol. 31, no. 1, pp. 63–78, 2015.
CHAPTER 3
Application
Data are retrieved from the appendix provided with the article [1]. The same data were used in order to make the modeling and forecasting comparable. The dependent variable is the quarterly real house price growth rate,
\[
y_{k,t} = 400\,\big[\ln(P_{k,t}) - \ln(P_{k,t-1})\big], \tag{11}
\]
where P_{k,t} denotes the level of real house prices in state k and time t. The original dataset contains the real house price growth rate along with the other 9 predictors for each state. We divided the whole dataset into a training set (1976:1 to 1994:4) and a testing set (1995:1 to 2012:4). We split the data in this way to stay consistent with [1]'s methodology, so that our forecasts are directly comparable. Both state-level predictors (price-to-income ratio, unemployment rate, real per capita income growth and labor force growth) and national-level ones (30-year mortgage rate, the spread between 10-year and 3-month Treasury rates, industrial production growth, real consumption growth and housing starts) were used. Table 1 summarizes the covariates and the abbreviations used later in this chapter. Figure 1 shows the time series plots of both the dependent variable (house price growth rate) and the independent variables (all 9 predictors) to show their evolution over the sample period.
Figure 1: Time Series of real house price growth rate and all the
covariates.
Table 1: Predictors and Abbreviations
Since Alaska and Hawaii do not share borders with the other states, they would generate rows of zeros in the spatial neighbor matrix, which can lead to computational errors. Therefore, these two states were excluded from our analysis; modeling and forecasting in this thesis are based on the remaining 48 contiguous states.

A neighbor structure was then defined over the 48 states. Out of the multiple ways of defining neighbors, we used a contiguity-based method to create our neighbor matrix. We explored several other neighbor structures, but they did not exhibit significant improvement in our final model; hence we skip the discussion of neighbor selection here. The contiguity rule allows any two states sharing at least one point on the boundary to be neighbors. The neighbor structure is then converted into a binary weight matrix W, where ω_ij = 1 if the jth area is a neighbor of the ith area, and ω_ij = 0 otherwise.
We first tabulated the correlations among the BEA regions and plotted the time series for each region to gain a first insight into the spatial and temporal structure of our data. Previous studies have explored the correlations within and between BEA regions to check for spatial interactions (the composition of each BEA region is shown in Figure 3). By studying the correlation matrix of house prices between and within the BEA regions for 1984 to 2011 [3] and for 1975 to 2003 [4], it was found that the within-region correlations are larger than the between-region correlations, with few exceptions. A similar feature is shown in our study using data from 1975 to 2012 (see Table 2): the diagonal correlation coefficients are generally larger than the off-diagonal ones. Exceptions include Great Lakes - Far West and Southwest - Far West.
Table 2: Correlation of real house price growth rates within and between the eight BEA regions, 1975-2012.

       NE     ME     GL     PL     SE     SW     RM     FW
NE   0.417    -      -      -      -      -      -      -
ME   0.398  0.400    -      -      -      -      -      -
GL   0.289  0.345  0.400    -      -      -      -      -
PL   0.164  0.212  0.323  0.429    -      -      -      -
SE   0.250  0.306  0.372  0.243  0.458    -      -      -
SW   0.147  0.211  0.315  0.261  0.329  0.375    -      -
RM   0.090  0.148  0.339  0.243  0.299  0.374  0.400    -
FW   0.272  0.372  0.449  0.262  0.367  0.383  0.352  0.375
Figure 4 shows that there are similar patterns across the country, as well as regional similarities, in the time series of real house price growth rates. Large fluctuations occurred in the earlier part of the sample, which roughly corresponds to the 1975 - 1993 period. Since then, the housing market growth rate stabilized and showed a slight upward trend until near the 110th quarter (2003), when large uphill and downhill movements occurred. Therefore, forecasting over the last half of the time series by studying the first half is not an easy task, since the test set contains features that are absent from the training set. We next checked the spatial and temporal autocorrelation existing in the data by using Moran's test and the time series autocorrelation function (ACF).
Moran’s I test is one of the commonly used tests for spatial autocorrelation:
\[
I = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} \omega_{ij}(y_i - \bar{y})(y_j - \bar{y})}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n}\omega_{ij}\right)\sum_{i=1}^{n}(y_i - \bar{y})^2}
\]
where the observations at the areal units are denoted y_i, y_j and the spatial structure is encoded by ω_{ij} [2]. Moran's I was computed at each time point and the resulting p-values are plotted in Figure 5. More than 70% of the time there was significant spatial autocorrelation present. This finding further supports the necessity of using a CAR model to capture the spatial dependence.
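A direct implementation of this statistic (illustrative only; a permutation test is used for the p-value rather than any specific package routine, and the toy data at the end are made up):

```python
import numpy as np

def morans_i(y, W):
    """Moran's I for one time point: y is a length-n vector of observations,
    W an n x n spatial weight matrix (binary contiguity here)."""
    n = len(y)
    z = y - y.mean()
    num = n * (z @ W @ z)              # n * sum_ij w_ij (y_i - ybar)(y_j - ybar)
    den = W.sum() * (z @ z)            # (sum_ij w_ij) * sum_i (y_i - ybar)^2
    return num / den

def morans_i_pvalue(y, W, n_perm=999, seed=0):
    """One-sided permutation p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    obs = morans_i(y, W)
    perm = np.array([morans_i(rng.permutation(y), W) for _ in range(n_perm)])
    return obs, (1 + np.sum(perm >= obs)) / (n_perm + 1)

W_toy = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)
print(morans_i_pvalue(np.array([2.0, 1.8, -0.5]), W_toy))
```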
The autocorrelation function ρ_{t,s} gives the correlation between time points t and s:
\[
\rho_{t,s} = \mathrm{Corr}(Y_t, Y_s)
\]
The ACF plot, which shows the correlation coefficient against the lag, was used to check the temporal autocorrelation embedded in the data (Figure 6). For the first five lags, the coefficients for most states exceeded the approximate confidence bounds given by ±0.16. The mean of the coefficients across states also went above this range for the first five lags, with the first and fifth lags right at the boundary. The ACF plot of the original data therefore exhibits temporal autocorrelation within the data.
Figure 4: Time Series of real house price growth rate. 48 states are
grouped in the eight BEA regions. Dashed line separates training data
from test data. Note that the data was transformed by a factor of 400
(see equation 11).
Figure 5: P-value from Moran's I test of observed house price growth rate over time. Red dashed line is the 0.05 significance level.
Figure 6: ACF plot of observed house price growth rate against time.
Thick black line is the mean across the 48 states. Two dashed lines
are ±0.16.
3.3 Results
3.3.1 Model Selection Results
We first examine the posterior distribution of the selected model sizes. Figure 7 and Figure 8 show the distribution of the selected model size for BMA-CAR without and with the dilution prior, respectively. The two dilution prior models we tested gave rather similar results, so only one of them is shown here. The two bar charts both show a concentration around model sizes of four and five, with frequencies falling off towards the two ends. The former model (BMA-CAR without dilution prior) sampled models of size five slightly more often than the latter (BMA-CAR with dilution prior).

We next examine the most frequently visited models. Table 3 and Table 4 summarize the top five selected models without and with the dilution prior h(γ) = γ. Results from the two tables are almost identical, with the most visited models having four or five covariates: price-income ratio, unemployment rate, labor force growth, mortgage rate and housing starts. These five models account for > 73% of all the models visited, and in fact the top two models, PIR + UNE + LFG + MOR + HOS and UNE + LFG + MOR + HOS, alone account for > 60%. The combination of these variables therefore dominates the posterior over the model space.
[1] plotted the median, along with the 16th and 84th percentiles, of the model sizes selected over the testing period across the 50 states. The model sizes exhibited an increasing trend, with the median slowly climbing from two to four over the span of the forecasting period. The 16th to 84th percentile band was about one variable away from the median and its width stayed stable over time. The variation of model sizes across states and time was therefore high, which is not too surprising considering the boom and bust cycle that occurred in our forecasting period. Their finding of variation in model sizes across time and space explains why our model, BMA-CAR, which uses global covariates across states and assumes the coefficients are constant over time, did an excellent job of prediction in some states and not so well in others.
Figure 8: Posterior model size distribution after fitting BMA-CAR
with dilution prior and power of 1.
Table 4: Top 5 Most Visited Models with dilution prior and power of
1.
Moran's I tests were also conducted on the residuals after fitting BMA-CAR, to check whether spatial autocorrelation remained and, if so, to what extent. BMA-CAR with dilution priors gave comparable results, so only BMA-CAR is discussed here. Most of the significant spatial autocorrelation was removed: in fact, only 11% of the time points showed significant spatial autocorrelation in the residuals, a drastic decline compared to 70% before model fitting. The ACF plot of the residuals is displayed in Figure 10. Except for one line (California) that showed significant coefficients for the first five lags, all other states exhibited oscillations centered around zero after the first lag. Also, the amplitude of most of the coefficients was reduced, and a much higher proportion of the coefficients fell within the bounds when compared to the original data.
Figure 10: ACF plot of residuals against time. Thick black line is the
mean across the 48 states. Two dashed lines are ±0.16.
We first check the overall forecasting performance over time by using the cumulative difference in squared forecast errors (CDSFE) between the MEAN benchmark and our model [5],
\[
CDSFE_{k,t} = \sum_{\tau \le t} \left( SFE_{k,\tau,MEAN} - SFE_{k,\tau,j} \right),
\]
with a small modification in that we sum the CDSFE across states at each time point. In Figure 11 we plot the total over the 48 states,
\[
\sum_{k=1}^{48} CDSFE_{k,t}.
\]
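A sketch of this computation (the per-state squared forecast errors here are simulated toy values; the cumulative-sum form follows the equation above):

```python
import numpy as np

rng = np.random.default_rng(5)
K, T_test = 48, 72
sfe_bench = rng.gamma(2.0, 1.0, size=(K, T_test))   # toy per-state SFE, MEAN benchmark
sfe_model = rng.gamma(1.8, 1.0, size=(K, T_test))   # toy per-state SFE, our model

# CDSFE_{k,t}: running total of (benchmark SFE - model SFE); a positive slope
# at time t means the model beats the benchmark at that point in time.
cdsfe = np.cumsum(sfe_bench - sfe_model, axis=1)     # (K, T_test)

total_cdsfe = cdsfe.sum(axis=0)                      # summed over the 48 states (Figure 11)
print(total_cdsfe[-1].round(2))
```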
[1] plotted CDSFE against time for DMA, DMS, BMA, EW and AR1, where EW is a model that uses equal weights across K OLS regression models and AR1 is a lag-1 time series model. As pointed out in [1], plotting CDSFE against time reveals the periods in which our model is superior to the MEAN benchmark (positive slope) and those in which the MEAN benchmark predicts better (negative slope). Compared to DMA, DMS, BMA, EW and AR1, our model showed excellent performance in the later bust period (after the 60th quarter, or 2010:Q1), comparable to DMA/DMS and much better than BMA. However, during the boom period from the mid-1990s to 2006 and the initial housing market meltdown of 2007-2008, the CDSFE from our model fell short of that of DMA and DMS.

Regardless of the scale of the forecast errors, it is worth noting that similar patterns appear in almost all of the CDSFE lines, where 2008:4, 2010:1 and one further quarter stand out as turning points, even for the best performing models DMA and DMS. [1] did not give an explanation; however, we attribute the observation to policy changes. These time points are in accordance with the timeline of the Home Affordable Refinance Program (HARP). This program was initiated by the U.S. government to help homeowners with mortgage problems refinance their homes. There were three main time points that marked the initiation and subsequent modifications of this policy, namely HARP 1.0, HARP 2.0 and HARP 3.0, which correspond to loan-to-value (LTV) thresholds of 105%, 125% and no restriction [6]. Given the close accordance between the timeline of the HARP program and the turning points in forecast performance, we view these policy changes as a plausible explanation.
Figure 11: Cumulative squared forecast error difference.
After checking the overall picture, we would like to compare the forecasting results across the states for four of our models (BMS-CAR, BMA-CAR, and the two BMA-CAR variants with dilution priors) with the corresponding models (DMA, DMS, BMA, BMS) from [1]. The columns of Table 5 report summaries of the MSFE ratio (Equation 10) across the states for each model, including its average and standard deviation relative to the benchmark MEAN model. Note that the results from [1], namely DMA, DMS, BMA and BMS in Table 5, were based on 50 states rather than 48. BMS-CAR is a model similar to BMA-CAR, except that only the model with the highest posterior probability (PIR + UNE + LFG + MOR + HOS) was used for modeling.

For the BMA-CAR model, the average MSFE ratio of 0.818 shown in Table 5 is smaller than the reported MSFE ratio for BMA (0.858)
and BMS (0.903), indicating that incorporating spatial dynamics did improve the forecast accuracy compared to purely static Bayesian Model Selection and Bayesian Model Averaging [1]. Adding the dilution prior did not improve the forecasting performance compared to the other specifications. Since the BMA-CAR model gave the best forecasting performance, we focus on it in the remainder of this section.
To take a closer look at which states generated better and worse forecasts, we examine the state-by-state results mapped in Figure 12. Most of the worst-predicted states are centered around the middle of the country, while the coastal states tend to be predicted reasonably well. We ranked the states by their MSFE ratios and examined the top four and bottom four states (the best including Idaho and Georgia, the worst including Iowa and Kentucky; see Figures 13 and 14). To discuss the reason behind this forecasting pattern, we first need to introduce the concept of volatility, which comes from [7]. Volatility was defined there as a measure of how variable each state's house price growth rate is over time, and coastal states tend to have higher volatility than the interior states. The relationship
between volatility and the MSFE level implies better forecasts, in absolute terms, in the interior states. However, when it comes to the MSFE ratio, coastal states tend to have smaller ratios, which represent a larger performance gain over the MEAN benchmark. As a matter of fact, the MSFE ratio is only a relative number and is not directly linked to absolute forecast accuracy. For the interior (stable) states, using the MEAN benchmark is not a bad choice, since the historical average may not be too different from the current observation. As a result, using very complicated models (such as BMA-CAR) will not yield a large gain in performance. In fact, two of the four least volatile states (Iowa and Kentucky; see Figure 15) are among the worst MSFE-ratio states for our model BMA-CAR (Figure 13).
Figures 15 and 16 illustrate the difference between the most and least volatile states. We also plotted the best and worst performing states from our BMA-CAR model (Figures 13 and 14) to discuss how the forecast performance varies across states. In Figure 13, the forecasts are plotted against the observed values. In the bottom four panels, which show the worst predicted states, BMA-CAR tends to overestimate the growth rate before the recession and underestimate it during the bust period.
Figure 12: Forecasting results across the 48 states.
List of References
[1] L. Bork and S. V. Møller, “Forecasting house prices in the 50 states us-
ing dynamic model averaging and dynamic model selection,” International
Journal of Forecasting, vol. 31, no. 1, pp. 63–78, 2015.
[5] A. Goyal and I. Welch, “Predicting the equity premium with dividend
ratios,” Management Science, vol. 49, no. 5, pp. 639–654, 2003.
Figure 13: Forecast vs. observed: best predicted states (top 4) vs.
worst (bottom 4). Red dashed line represents forecast from BMA-
CAR; blue dashed line represents forecast from MEAN benchmark.
Blue line is the historic data. Vertical line is the separation of the
boom (1995:1-2006:4) and bust period (2007:1-2012:4 ).
Figure 14: MSFE ratio: best predicted states (top 4) vs. worst (bot-
tom 4). Red dashed line represents the MSFE for BMA-CAR; black
line is MSFE for MEAN benchmark. Vertical line represents the sep-
aration of the boom and bust period. Notice the different scale on the
y-axis.
Figure 15: Forecast vs. observed: top 4 volatility states (top) vs.
bottom 4 (bottom).
Figure 16: MSFE ratio: top 4 volatility states (top) vs. bottom 4
(bottom).
[6] https://web.archive.org/web/20120111193528/http://www.fhfa.gov/webfiles/22721/HARP_release_102411_Final.pdf, [Online; accessed 3-April-2018].
CHAPTER 4

Overall, our model forecast well for the coastal states and not so well for the interior states. We compare our forecasting results with those from [1]. Here we discuss the results in the context of the MSFE ratio, which represents the forecast gain over the MEAN benchmark. Even though the forecasts from DMA [1] do not come from the true model (there is no true model anyway), DMA is a more flexible model that allows for both state-wise and time-wise model variation. Moreover, DMA gave smaller forecasting errors than our model.
Our posterior distribution over the model space favored models of four and five variables. However, the findings in [1] suggest that for many of the low-volatility states only a few variables were needed in their model, and that larger model sizes occurred more often around the end of the forecasting period, when there was a financial crisis. Consequently, our model potentially selected too many variables for these low-volatility states, which leads to overfitting and poor forecasting performance.
[1] pointed out that the housing market is segmented, so that no single variable drives the whole housing market. They plotted the inclusion probability (median, 16th and 84th percentiles) of each predictor against time and found that the recent economic recession caused the model to change. Most of the variables had inclusion probabilities between 30% and 50% across time, with noticeable changes right after 2007. In addition, there was a large spread of variable inclusion across states: for example, the inclusion of housing starts at the beginning of 2008 ranged from below 25% to near 100%. Thus, in the bust period, housing starts was an important factor only for some states (Arizona and Nevada). This high degree of space and time variation in the choice of variables near the time of the financial crisis makes it difficult to predict with globally shared variables, as in our model. The important drivers, such as housing starts and the price-income ratio, also dominate our posterior model distribution, being selected 58% and 99% of the time, respectively. The high inclusion of housing starts and the price-income ratio helped the performance for certain states that do have these predictors as underlying factors, such as Idaho and Georgia (refer to Figures 15 and 16). Nevertheless, including these predictors suppressed the model performance in states like Iowa and Kentucky.
To conclude, the relatively large model sizes chosen by our model, together with the inclusion of predictors that are unimportant for the stable states, give rise to a smaller forecast gain in the interior states than in the coastal states. Our model has globally shared fixed effects and coefficients that are constant across states and time. This restricts its ability to capture the spatial variation in model composition as well as the temporal fluctuations seen in the housing market.
The original analysis in [1] modeled house prices for each state individually, neglecting the fact that the house prices of neighboring states are correlated. By modeling the spatial dependence explicitly, we were able to "borrow strength" across the states to come up with improved estimates of the house price growth in each state [2]. As to the degree to which forecast performance improved, the lower and upper bounds are set by BMA/BMS and DMA/DMS respectively: our space-time BMA-CAR model improved on static BMA/BMS but was not able to beat the models with time-varying features. One of the challenges in our study comes from the great difference in the house price growth rate data between the training period and the testing period. The house price growth rate underwent significant fluctuations in the 1970s (see Figure 4). The magnitudes of these variations are even larger than those observed near the recent economic recession. Several authors have noted a marked decline in the volatility of real activity and in the volatility and persistence of inflation since the early 1980s [3], [4], [5]. This structural change in the economy may have implications for the housing market.
Hence, relying on training data from a period with such different dynamics may not be feasible in our case. The states with high and low volatility also showed very different MSFE ratios, and explicitly accounting for volatility in our model could help reduce this difference and improve the forecasts. Interestingly, much of what was achieved by DMA can be approached by running a space-time CAR model with a very simple fixed-effects specification. As shown in Figure 17, all states except Nebraska had an MSFE ratio smaller than one, so that in 47/48 states the MEAN benchmark was beaten. Contrary to the forecasting results from DMA or BMA-CAR, the results from CAR show generally better predictions near the coast than inland, with the exceptions of South Dakota and Nebraska.
Our model could also be improved at the data acquisition step, in that data aggregated at the state level may not be reflective of the housing market at smaller scales. By averaging the housing data up to the state level, we lose information about variation at finer scales; for example, there are very different housing market structures in New York City and in the rest of New York State.
We could instead use data collected at the county level, which is closer to the scale at which housing prices are studied in practice. Data collected at the metropolitan statistical area (MSA) level are also available and can be a good candidate if the primary interest is to study the housing market in metropolitan areas: an MSA is a region with a high population density at its core and close economic ties throughout the area. Currently, the United States Office of Management and Budget (OMB) has defined 382 Metropolitan Statistical Areas (MSAs) for the United States [6]. One of the most populous MSAs is New York-Newark-Jersey City, which spans three states (New York, New Jersey and Pennsylvania) [7]. Although this MSA covers three states, people tend to live and commute to work within this one area, so it makes sense to study it as a whole. Real-life applications include real estate investors who use MSA data to study housing trends and guide investment decisions.

Another possible refinement concerns the neighborhood structure. Our model currently uses a geometric definition of neighbors, in which states sharing common borders are called "neighbors". This may not work well for areas such as New England, where the states tend to be very small but strongly interconnected. An alternative, as sketched below, is to define neighbors based on the distance between their centroids. In this way, we are not limited to physically adjacent states when constructing the weight matrix.
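A minimal sketch of such a distance-based neighbor definition (the centroid coordinates, the cutoff value and the small set of states are made up for illustration; a real application would use actual state centroids and projected coordinates):

```python
import numpy as np

# Hypothetical centroids (longitude, latitude) for a handful of states.
centroids = {
    "CT": (-72.7, 41.6), "RI": (-71.5, 41.7), "MA": (-71.8, 42.3),
    "NY": (-75.5, 43.0), "PA": (-77.6, 41.0),
}
names = list(centroids)
xy = np.array([centroids[s] for s in names])

# Pairwise distances approximated by Euclidean distance in degrees;
# projected coordinates or great-circle distances would be used in practice.
d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

cutoff = 3.0                                   # neighbor if centroids are within the cutoff
W = ((d > 0) & (d < cutoff)).astype(float)     # binary weight matrix, zero diagonal

for i, s in enumerate(names):
    print(s, [names[j] for j in np.where(W[i] == 1)[0]])
```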
Future research includes creating state-specific variables for each predictor and carrying out the variable selection separately for each state. This allows for variable and coefficient shifts between states and further adds to the flexibility of BMA-CAR, which assumes global fixed effects. Due to the large model space (2^{9×48}) to sample from, a Metropolis-Hastings algorithm is more suitable in this case, so that only a portion of the models needs to be visited.
List of References
[1] L. Bork and S. V. Møller, “Forecasting house prices in the 50 states us-
ing dynamic model averaging and dynamic model selection,” International
Journal of Forecasting, vol. 31, no. 1, pp. 63–78, 2015.
[3] O. Blanchard and J. Simon, “The long and large decline in us output
volatility,” Brookings papers on economic activity, vol. 2001, no. 1, pp.
135–164, 2001.
[5] J. H. Stock and M. W. Watson, “Has the business cycle changed? evi-
dence and explanations,” Monetary policy and uncertainty: adapting to a
changing economy, pp. 9–56, 2003.
BIBLIOGRAPHY
Crawford, G. W. and Fratantoni, M. C., “Assessing the forecasting perfor-
mance of regime-switching, arima and garch models of house prices,” Real
Estate Economics, vol. 31, no. 2, pp. 223–243, 2003.
Goyal, A. and Welch, I., “Predicting the equity premium with dividend ratios,”
Management Science, vol. 49, no. 5, pp. 639–654, 2003.
Guha, S. and Ryan, L., “Spatio-temporal analysis of areal data and discov-
ery of neighborhood relationships in conditionally autoregressive models,”
2006.
Guirguis, H. S., Giannikos, C. I., and Anderson, R. I., “The us housing market:
asset pricing forecasts using time varying coefficients,” The Journal of real
estate finance and economics, vol. 30, no. 1, pp. 33–53, 2005.
Hall, S., Psaradakis, Z., and Sola, M., “Switching error-correction models of
house prices in the united kingdom,” Economic Modelling, vol. 14, no. 4,
pp. 517–527, 1997.
Holly, S. and Jones, N., “House prices since the 1940s: cointegration, demog-
raphy and asymmetries,” Economic Modelling, vol. 14, no. 4, pp. 549–565,
1997.
Kuethe, T. H. and Pede, V. O., “Regional housing price cycles: a spatio-
temporal analysis using us state-level data,” Regional studies, vol. 45,
no. 5, pp. 563–574, 2011.
Lee, D., “Carbayes version 4.6: An r package for spatial areal unit modelling
with conditional autoregressive priors,” University of Glasgow, Glasgow,
2017.
Lee, D., Rushworth, A., and Napier, G., “Spatio-temporal areal unit modelling
in r with conditional autoregressive priors using the carbayesst package.”
Leroux, B. G., Lei, X., and Breslow, N., “Estimation of disease rates in small areas: a new mixed model for spatial dependence,” in Statistical models in epidemiology, the environment, and clinical trials. Springer, 2000, pp. 179–191.
Quigley, J. M., “A simple hybrid model for estimating real estate price in-
dexes,” Journal of Housing Economics, vol. 4, no. 1, pp. 1–12, 1995.
Rushworth, A., Lee, D., and Mitchell, R., “A spatio-temporal model for esti-
mating the long-term effects of air pollution on respiratory hospital admis-
sions in greater london,” Spatial and spatio-temporal epidemiology, vol. 10,
pp. 29–38, 2014.
Rushworth, A., Lee, D., and Sarran, C., “An adaptive spatiotemporal smooth-
ing model for estimating trends and step changes in disease risk,” Journal
of the Royal Statistical Society: Series C (Applied Statistics), vol. 66,
no. 1, pp. 141–157, 2017.
Stock, J. H. and Watson, M. W., “Has the business cycle changed? evidence
and explanations,” Monetary policy and uncertainty: adapting to a chang-
ing economy, pp. 9–56, 2003.
Valentini, P., Ippoliti, L., and Fontanella, L., “Modeling us housing prices
by spatial dynamic structural equation models,” The Annals of Applied
Statistics, pp. 763–798, 2013.
Van Dijk, B., Franses, P. H., Paap, R., and Van Dijk, D., “Modelling regional
house prices,” Applied Economics, vol. 43, no. 17, pp. 2097–2110, 2011.
Zellner, A., “On assessing prior distributions and bayesian regression analysis
with g-prior distributions,” Bayesian inference and decision techniques,
1986.