Applications of Higher-Order Hidden Markov Models in Exotic Commodity Markets
Supervisor
Mamon, Rogemar.
The University of Western Ontario
Recommended Citation
Xiong, Heng, "Some applications of higher-order hidden Markov models in the exotic commodity markets"
(2018). Electronic Thesis and Dissertation Repository. 5226.
https://ir.lib.uwo.ca/etd/5226
Abstract
The liberalisation of regional and global commodity markets over the last several decades resulted in certain commodity price behaviours that require new modelling and estimation approaches. Such new approaches have important implications for the valuation and utilisation of commodity derivatives. Derivatives are becoming increasingly crucial for market participants in hedging their exposure to volatile price swings and in managing the risks associated with derivative trading. The modelling of commodity-based variables is an integral part of risk management and optimal-investment strategies for commodity-linked portfolios. The characteristics of commodity price evolution cannot be captured sufficiently by one-state driven models, even with the inclusion of multiple factors. This inspires the adoption of regime-switching methods to rectify the one-state multi-factor modelling inadequacies. In this research, we aim to employ higher-order hidden Markov models (HOHMMs) in order to take advantage of the latent information in the observed process recorded in the past. This hugely enhances and complements the regime-switching features of our approach in describing certain variables that virtually determine the value of some commodity derivatives, such as contracts dependent on temperature, electricity spot price, and fish-price dynamics. Our use of the change-of-probability-measure technique facilitates the derivation of recursive filtering algorithms and thereby establishes a self-tuning dynamic estimation procedure. Both the data-fitting and forecasting performances of various model settings are investigated.
This research work emerged from four related projects detailed as follows. (i) We start with an HMM to model the behaviour of daily average temperatures (DATs) geared towards the analysis of weather derivatives. (ii) The model in (i) is extended naturally by showcasing the capacity of an HOHMM-based approach to simultaneously describe the DATs' salient properties of mean reversion, seasonality, memory and stochasticity. (iii) An HOHMM-driven jump process augments the HOHMM-based de-seasonalised process to capture price spikes, and the ensuing filtering algorithms under this modelling framework are constructed to provide optimal parameter estimates. (iv) Finally, a multi-dimensional HOHMM-modulated set-up is built for futures price-curve dynamics pertinent to financial product valuation and risk management in the aquaculture sector. We examine the performance of this new modelling set-up by considering goodness-of-fit and out-of-sample forecasting metrics with a detailed numerical demonstration using a multivariate data set compiled by Fish Pool ASA.
This research offers a collection of more flexible stochastic modelling approaches for pricing and risk analysis of certain commodity derivatives on weather, electricity and fish prices. The novelty of our techniques is the powerful capability to automate the parameter estimation. Consequently, we contribute to the development of financial tools that aid in selecting the appropriate and optimal model on the basis of some information criteria, within current technological advancements in which continuous flows of observed data are now readily accessible in real time.
Co-Authorship Statement
I hereby declare that this thesis incorporates materials that are direct results of my main
efforts during my doctoral study. All research outputs (jointly authored with Dr Rogemar
Mamon) led to two published papers and two manuscripts under review in refereed jour-
nals, and these are detailed below.
The content of chapter 2 appeared as a full paper in the journal Computational Manage-
ment Science; see reference [44] in chapter 4.
The results of chapter 3 were published in the Journal of Computational Science; see ref-
erence [37] in chapter 4.
Chapter 4 is based on a manuscript that is currently under review in the journal Applied
Energy.
The source of chapter 5 is a manuscript under consideration for publication in the Journal
of Economic Dynamics and Control.
I certify that this thesis is fully a product of my own work. It was conducted from Septem-
ber 2014 to present under the supervision of Dr Mamon at the University of Western On-
tario.
London, Ontario
To my Dad in Heaven
“I miss you, my first & forever hero.”
Acknowledgements
First and foremost, I would like to express my deepest gratitude to my supervisor, Dr Rogemar Mamon, for his invaluable guidance throughout my graduate study. This doctoral research enjoyed steady progress and built momentum for potential impact owing to his constructive comments and excellent advice. More importantly, his infinite patience, thorough knowledge and deep insights have carried me through some of the toughest times of my academic life since day one at Western.
I would also like to sincerely thank Dr Matt Davison for his invaluable help and support. He
is always generous with his time to discuss some of my academic and professional issues.
I am extremely appreciative of his well-thought out suggestions and tenacious encourage-
ment throughout my graduate studies.
I gratefully acknowledge the useful feedback, precious time, and energy of my other thesis
committee members: Dr Marcos Escobar-Anel, Dr Lars Stentoft, and Dr Francois Watier.
Their helpful comments and suggestions certainly strengthened this dissertation at various levels. Furthermore, many thanks are due to the outstanding faculty and staff of the Department of Statistical and Actuarial Sciences (DSAS) for their support in various ways. I also sincerely acknowledge the financial support provided by the DSAS and the Ontario Graduate Scholarship/Queen Elizabeth II Scholarship Program.
I take this chance to thank all my colleagues and friends in Canada. I am so blessed to
be in your great company, affording moral support in the accomplishment of this lifetime
milestone.
Last but not least, I would like to express my special thanks to my family, most especially
my parents and grandma, for their selfless love, unconditional support, and unfailing belief in my ability to reach greater heights and possibilities. I will do my best to be a better man
and make you all proud.
Contents
Abstract i
Dedication iv
Acknowledgements v
List of Figures x
List of Appendices xv
1 Introduction 1
1.1 Research motivation and objectives . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Overview of HMMs . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 Extension to HOHMMs . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Evaluation with forward-backward algorithm . . . . . . . . . . . . 6
1.2.4 Decoding with Viterbi algorithm . . . . . . . . . . . . . . . . . . . 7
1.2.5 Training with expectation-maximisation algorithm . . . . . . . . . 8
1.2.6 Change of measure method in HOHMMs . . . . . . . . . . . . . . 11
1.3 Structure of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.1 Putting a price tag on temperature . . . . . . . . . . . . . . . . . . 13
1.3.2 A self-updating model driven by a higher-order hidden Markov
chain for temperature dynamics . . . . . . . . . . . . . . . . . . . 13
1.3.3 A higher-order Markov chain-modulated model for electricity spot-
price dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.4 Modelling and forecasting futures-prices curves in the Fish Pool
market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.2.1 HMM-modulated OU process . . . . . . . . . . . . . . . 68
3.2.2.2 HOHMM-modulated Ornstein-Uhlenbeck process . . . . 69
3.3 Recursive filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.3.1 Reference probability measure . . . . . . . . . . . . . . . . . . . . 70
3.3.2 Calculation of recursive filters . . . . . . . . . . . . . . . . . . . . 72
3.4 Optimal parameter estimation . . . . . . . . . . . . . . . . . . . . . . . . 75
3.5 Numerical implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.5.1 Analysis of the deterministic component . . . . . . . . . . . . . . . 78
3.5.2 Analysis of the stochastic component . . . . . . . . . . . . . . . . 79
3.5.3 Validity of model evaluation . . . . . . . . . . . . . . . . . . . . . 80
3.5.4 Implementation aspects of the HOHMM-OU filtering . . . . . . . . 81
3.5.4.1 Initial values for the parameter estimation . . . . . . . . . 82
3.5.4.2 Evolution of parameter estimates and comparison under
HMM and HOHMM settings . . . . . . . . . . . . . . . 83
3.5.5 Model selection and other diagnostics . . . . . . . . . . . . . . . . 88
3.5.5.1 Assessment of predicted temperatures . . . . . . . . . . . 88
3.5.5.2 Error analysis and model selection . . . . . . . . . . . . . 91
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.5.2.2 Implementing the filtering procedure . . . . . . . . . . . 121
4.5.3 Discussion of model performance . . . . . . . . . . . . . . . . . . 127
4.5.3.1 Forecasting and error analysis . . . . . . . . . . . . . . . 127
4.5.3.2 Selection of suitable model setting . . . . . . . . . . . . . 132
4.5.3.3 The valuation of expected future spot at delivery . . . . . 135
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5 Modelling and forecasting futures-prices curves in the Fish Pool market 144
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.2 Model setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.3 Filtering and parameter estimation . . . . . . . . . . . . . . . . . . . . . . 152
5.4 Numerical application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.4.1 Fish pool exchange and data description . . . . . . . . . . . . . . . 156
5.4.2 Implementation of filters and estimation . . . . . . . . . . . . . . . 160
5.4.3 Model performance and selection . . . . . . . . . . . . . . . . . . 162
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6 Conclusion 180
6.1 Summary of research contributions . . . . . . . . . . . . . . . . . . . . . . 180
6.2 Further research directions . . . . . . . . . . . . . . . . . . . . . . . . . . 182
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
List of Figures
3.6 Evolution of parameter estimates for κ, ϑ, and % under a 2-state HOHMM-
based model with 95% confidence level . . . . . . . . . . . . . . . . . . . 87
3.7 One-step ahead forecasts under a 3-state HOHMM-based model . . . . . . 89
3.8 Comparison of the expected HDD and actual HDD in a 3-state HOHMM-
based model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.9 Comparison of one-step-ahead forecasts in 1-, 2-, and 3-state HOHMM-
based models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.10 Evolution of AICs for the 1-, 2-, and 3-state HMM- and HOHMM-based
models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.6 AIC, AICc and BIC for the 1-, 2-, 3-state HMM/HOHMM-based model for
salmon futures prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
5.7 One-step ahead and out-of-sample predictions under the 3-state HOHMM
for futures prices with expiries on 31 Jan 2017, 28 Feb 2017, and 31 Mar
2017 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
List of Tables
4.1 Descriptive statistics for daily electricity spot price (DESP) . . . . . . . . . 118
4.2 Parameter estimates for the seasonal component . . . . . . . . . . . . . . . 118
4.3 Interval of standard errors for parameter estimates under 1-, 2-, 3-state
HMM- and HOHMM-based models . . . . . . . . . . . . . . . . . . . . . 128
4.4 Error analysis of HMM- and HOHMM-based models . . . . . . . . . . . . 132
4.5 Bonferroni-corrected p-values for the t-test performed on the RMSEs in-
volving the DSP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.6 Number of estimated parameters under HMM-OU with jumps and HOHMMs-
OU with jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.7 Comparison of selection criteria AIC . . . . . . . . . . . . . . . . . . . . . 134
4.8 Optimal parameter estimates for the 2-state HOHMM . . . . . . . . . . . . 136
5.1 The data periods of future contracts (maturities of 1–6 months) covered by
the moving window in the filtering procedure . . . . . . . . . . . . . . . . 158
5.2 Futures contracts with maturity up to 6 months and expiration on 29 Jan 2016 . . . 158
5.3 Descriptive statistics for log-futures prices (maturities of 1 – 6 months) . . . 159
5.4 Error analysis of HMM- and HOHMM-based models under 1-, 2-, and 3-
state settings for salmon futures prices . . . . . . . . . . . . . . . . . . . . 167
5.5 Bonferroni-corrected p-values for the paired t-test performed on the RM-
SEs involving salmon futures prices . . . . . . . . . . . . . . . . . . . . . 168
5.6 Number of estimated parameters under HMM and HOHMM settings for
salmon futures prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.7 Out-of-sample error analysis of HMM- and HOHMM-based models under
1-, 2-, and 3- state settings for futures prices . . . . . . . . . . . . . . . . . 171
List of Appendices
Chapter 1
Introduction
The financial solutions considered in this thesis are exotic commodity-based derivatives whose values depend on weather measurements and on prices in the electricity and fish markets. Taking suitable positions, comprising the correct number of contracts in derivatives trading, forms strategies for managing risk exposure to certain pertinent factors. Many research studies in the aforementioned markets are based on one-state models. However, such models might not accurately capture the stylised behaviours of price evolution, especially during periods of financial crisis or when positive news affects market sentiment and drives prices abruptly. These situations require models with more capability to account for abrupt fluctuations in various statistics, and one simple but powerful way is to embed a regime-switching approach enriched by a memory-capturing mechanism.
The general mechanism of HMMs in finance can be concisely explained through signal theory, which is well known in engineering. Outputs produced by a process, in both theory and practice, are normally regarded as signals. In finance, we can then treat collected data, such as prices and indices of underlying assets, as observed signals. However, these observations might contain noise or be distorted by various errors arising from the procedures that generate them, so the measurements of the real source might be corrupted. It is then not possible to recover the hidden state of the model from the observed symbols alone. Fortunately, an HMM is capable of filtering out the noise and unravelling the distortion, so that the hidden variables and the information in our observations can be efficiently estimated and further investigated.
Following Cappé et al. [3], an HMM comprises a bivariate discrete-time process $\{X_t, O_t\}$ with $t \in \mathbb{Z}_0^+$. Let $\{X_t\}$ be a Markov chain encapsulated in an observed process $\{O_t\}$, which is a version of $\{X_t\}$ corrupted by some noise. Even though the latent process $\{X_t\}$ is hidden from the outside, it can be partially observed through $\{O_t\}$ under an HMM. As in Rabiner [29], a set of parameters $(N, n, \Pi, \Lambda, P)$ needs to be introduced in order to define an HMM completely.
• $N$, the number of hidden states in the model. Even though the states are latent, a physical significance can often be attached to them in practical applications. We denote the state of the model at time $t$ by $q_t$, and the set of individual states by
$$S_t = \{s_1, s_2, \ldots, s_N\}.$$
• $n$, the number of distinct observation signals per hidden state at time $t$. For instance, "rainy day" and "non-rainy day" can be treated as two distinct observed symbols. We denote the signal at time $t$ by $Y_t$. The individual signals are represented by
$$Y_t = \{y_1, y_2, \ldots, y_n\}.$$
• $\Pi = \{\pi_{ji}\}$ stands for the state transition probability distribution that governs transitions among states, where
$$\pi_{ji} = P\left(q_{t+1} = s_i \mid q_t = s_j\right), \quad 1 \le i, j \le N,$$
with $\sum_{i=1}^{N}\pi_{ji} = 1$ and $\pi_{ji} \ge 0$.
• $P = \{p_i(m)\}$ denotes the probability distribution of observed symbols in hidden state $i$, where
$$p_i(m) = P\left(O_t = y_m \mid q_t = s_i\right), \quad 1 \le i \le N, \; 1 \le m \le n.$$
We assume the Markov property holds. As its definition shows, the current state $s_t$ is independent of all the states prior to $t-1$, given $s_{t-1}$. The observed symbols satisfy an analogous conditional-independence assumption with respect to their corresponding states. The intuitive explanation for this assumption is that the transition distribution of the next state, given the current state, does not change over time.
Once proper values are assigned to the set of parameters $(N, n, \Pi, \Lambda, P)$ under the aforementioned assumptions, the HMM can be utilised to produce an observation sequence $O$; the general procedure includes the following steps. We start by setting an initial state according to $\Lambda$ and then choose $O_t = y_m$ according to $p_i(m)$. The transition probabilities $\pi_{ji}$ are applied when moving to another state $q_{t+1} = s_i$. Incrementing $t$ by 1 until $T$, we produce $O$ through an appropriate HMM.
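For concreteness, the following minimal R sketch generates an observation sequence by exactly these steps; the function name, the argument names and the two-state, two-symbol example are our own illustrative assumptions rather than anything specified in the thesis.

```r
# Minimal sketch: simulate an observation sequence O_1, ..., O_T from a discrete HMM.
# Lambda: length-N vector of initial state probabilities
# Pi:     N x N matrix, Pi[j, i] = P(q_{t+1} = s_i | q_t = s_j); each row sums to 1
# p:      N x n matrix, p[i, m]  = P(O_t = y_m | q_t = s_i)
simulate_hmm <- function(T_len, Lambda, Pi, p) {
  N <- length(Lambda)
  n <- ncol(p)
  q <- integer(T_len)
  O <- integer(T_len)
  q[1] <- sample.int(N, 1, prob = Lambda)            # initial state drawn from Lambda
  O[1] <- sample.int(n, 1, prob = p[q[1], ])         # emit a symbol from the initial state
  for (t in 2:T_len) {                               # assumes T_len >= 2
    q[t] <- sample.int(N, 1, prob = Pi[q[t - 1], ])  # transition governed by Pi
    O[t] <- sample.int(n, 1, prob = p[q[t], ])       # emission governed by p
  }
  list(states = q, observations = O)
}

# Example: 2 hidden states, 2 symbols ("rainy day", "non-rainy day")
Pi <- matrix(c(0.9, 0.1,
               0.2, 0.8), nrow = 2, byrow = TRUE)
p  <- matrix(c(0.7, 0.3,
               0.3, 0.7), nrow = 2, byrow = TRUE)
out <- simulate_hmm(100, c(0.5, 0.5), Pi, p)
```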
Various studies also document long-memory properties in many financial series, such as the long-term dependence in the volatility of S&P 500 returns [12] and in percentage changes of Treasury debt security yields [13]. For this sort of financial series, a growing body of evidence suggests that a regular HMM cannot effectively capture their stylised behaviours ([17] and [21]), thereby raising a need to expand the HMM literature. This has prompted some scholars to examine applications of HOHMMs in finance. For instance, Siu et al. proposed a higher-order Markov-switching model, with the drift and the volatility modulated by a discrete-time higher-order Markov chain, for measuring the risk of a risky portfolio [28]. Xi et al. introduced an analysis of asset allocation strategies under both HMM and HOHMM settings, and concluded that the HOHMM-based approach outperforms the HMM-based strategy for certain levels of transaction costs [20].
Equations (1.1) and (1.2) show that the number of parameters grows exponentially with the order of the HOHMM setting. This leads to more involved procedures for addressing the three classic problems originally posed for HMMs, and the corresponding results are elaborated in the following sections.
$$
\alpha_{t+1}(i_{t-k+2},\ldots,i_{t+1}) = \sum_{i_{t-k+1}=1}^{N} \alpha_t(i_{t-k+1},\ldots,i_t)\, P\!\left(O_{t+1} \mid S, q_{t+1}=s_{i_{t+1}}\right) P\!\left(q_{t+1}=s_{i_{t+1}} \mid S, q_{t-k+1}=s_{i_{t-k+1}},\ldots,q_t=s_{i_t}\right)
$$
$$
= \sum_{i_{t-k+1}=1}^{N} \alpha_t(i_{t-k+1},\ldots,i_t)\, \pi_{i_{t-k+1},\ldots,i_{t+1}}\, p_{i_{t+1}}(O_{t+1}), \tag{1.3}
$$
where $\alpha_k(i_1,\ldots,i_k) = \Lambda_{i_1,\ldots,i_k} \cdot \prod_{j=1}^{k} p_{i_j}(O_j)$ for $1 \le j \le k$. The required probability is then obtained as $P(O \mid S) = \sum_{i_{T-k+1},\ldots,i_T=1}^{N} \alpha_T(i_{T-k+1},\ldots,i_T)$, the sum of all forward variables at the terminal time.
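To make recursion (1.3) concrete, the sketch below evaluates $P(O \mid S)$ for the second-order case ($k = 2$); the array conventions, the function name and the argument names are our own illustrative choices, not the thesis's implementation.

```r
# Minimal sketch of the forward recursion (1.3) for a second-order HMM (k = 2).
# Pi:     N x N x N array, Pi[i1, i2, i3] = P(q_{t+1} = s_{i3} | q_{t-1} = s_{i1}, q_t = s_{i2})
# p:      N x n matrix,    p[i, m]        = P(O_t = y_m | q_t = s_i)
# Lambda: N x N matrix,    Lambda[i1, i2] = P(q_1 = s_{i1}, q_2 = s_{i2})
# obs:    vector of observed symbol indices, length T >= 2
forward_hohmm2 <- function(Pi, p, Lambda, obs) {
  N <- nrow(p)
  TT <- length(obs)
  # Initialisation: alpha_2(i1, i2) = Lambda_{i1,i2} * p_{i1}(O_1) * p_{i2}(O_2)
  alpha <- Lambda * outer(p[, obs[1]], p[, obs[2]])
  # Induction: alpha_{t+1}(i2, i3) = sum_{i1} alpha_t(i1, i2) * Pi[i1, i2, i3] * p_{i3}(O_{t+1})
  for (t in seq_len(TT - 2) + 1) {
    new_alpha <- matrix(0, N, N)
    for (i3 in 1:N) {
      for (i2 in 1:N) {
        new_alpha[i2, i3] <- sum(alpha[, i2] * Pi[, i2, i3]) * p[i3, obs[t + 1]]
      }
    }
    alpha <- new_alpha
  }
  # Termination: P(O | S) is the sum of the final forward variables
  # (for long sequences a scaled or log-domain version is preferable to avoid underflow)
  sum(alpha)
}
```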
As in the case of the forward algorithm, this joint conditional probability can be obtained
through a recursive equation given by
$$
\beta_t(i_1,\ldots,i_k) = \sum_{j=1}^{N} P\!\left(O_{t+k+1},\ldots,O_T \mid S, q_{t+1}=s_{i_2},\ldots,q_{t+k}=s_{j}\right) P\!\left(O_{t+k} \mid S, q_{t+k}=s_{j}\right) P\!\left(q_{t+k}=s_{j} \mid S, q_t=s_{i_1},\ldots,q_{t+k-1}=s_{i_k}\right)
$$
$$
= \sum_{j=1}^{N} \beta_{t+1}(i_2,\ldots,i_k,j)\, \pi_{i_1,\ldots,i_k,j}\, p_j(O_{t+k}). \tag{1.4}
$$
$$
\theta_k(i_1,\ldots,i_k) = P\!\left(q_1 = s_{i_1},\ldots,q_k = s_{i_k} \mid S\right)\cdot\prod_{j=1}^{k} P\!\left(O_j \mid S, q_j = s_{i_j}\right) = \Lambda_{i_1,\ldots,i_k}\prod_{j=1}^{k} p_{i_j}(O_j).
$$

Eventually, we can find $\operatorname{argmax}_{1\le q_{T-k+1},\ldots,q_T\le N}\,\theta_T(i_{T-k+1},\ldots,i_T)$ through the Viterbi iteration. To recover the entire most likely state sequence, we further define a back-tracking array $\phi_t(i_{t-k+1},\ldots,i_t)$ that records, for each $t$ and each tuple $(i_{t-k+1},\ldots,i_t)$, the maximising preceding index of $\theta_t(i_{t-k+1},\ldots,i_t)$. Since $\phi_T(i_{T-k+1},\ldots,i_T) = \operatorname{argmax}_{1\le q_{T-k+1},\ldots,q_T\le N}\,\theta_T(i_{T-k+1},\ldots,i_T)$, the remaining $q_t$ can then be obtained recursively, with the convention $\phi_k(i_1,\ldots,i_k) = 0$.
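The corresponding decoding step can be sketched in the same spirit for $k = 2$: the array $\theta$ is propagated as above and a back-pointer array plays the role of $\phi$. The function and variable names below are our own illustrative choices.

```r
# Minimal sketch of Viterbi decoding for a second-order HMM (k = 2);
# Pi, p, Lambda and obs follow the same conventions as in the forward sketch above.
viterbi_hohmm2 <- function(Pi, p, Lambda, obs) {
  N <- nrow(p)
  TT <- length(obs)
  theta <- Lambda * outer(p[, obs[1]], p[, obs[2]])     # theta_2(i1, i2)
  phi <- vector("list", TT)                             # phi[[t]][i2, i3]: maximising i1
  for (t in seq_len(TT - 2) + 1) {
    new_theta <- matrix(0, N, N)
    back <- matrix(0L, N, N)
    for (i3 in 1:N) {
      for (i2 in 1:N) {
        scores <- theta[, i2] * Pi[, i2, i3]            # candidates over the earliest index
        back[i2, i3] <- which.max(scores)
        new_theta[i2, i3] <- scores[back[i2, i3]] * p[i3, obs[t + 1]]
      }
    }
    theta <- new_theta
    phi[[t + 1]] <- back
  }
  # Backtracking: start from the maximising terminal pair and trace the pointers
  q <- integer(TT)
  best <- which(theta == max(theta), arr.ind = TRUE)[1, ]
  q[TT - 1] <- best[1]
  q[TT] <- best[2]
  if (TT > 2) for (t in (TT - 2):1) q[t] <- phi[[t + 2]][q[t + 1], q[t + 2]]
  q                                                     # most likely state sequence
}
```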
Generally, there are two main applications of the EM algorithm. One arises when the likelihood function is not analytically tractable but becomes so once additional missing (latent) variables are posited. The other arises when data are incomplete or missing owing to limitations of the observation process. In our study of
HOHMMs, the state sequence Q is hidden and viewed as the missing data, whilst the
process O is observed. The EM algorithm is presented in the discrete density case for
the subsequent discussion of the Baum-Welch algorithm and to remain consistent with the
previous discussion in evaluation and decoding. We assume the complete-data likelihood
function L(S | Q, O) = P(Q, O | S ), and have
P (Q, O | S ) = P (Q | O, S ) P (O | S ) . (1.5)
Let $\widehat{S}$ be the set of re-estimated parameters that generates an improved model compared with that of $S$. Considering equation (1.5) and the log-likelihood function $l(\widehat{S} \mid O) = \log P(O \mid \widehat{S})$, we get
$$
\log P\!\left(Q, O \mid \widehat{S}\right) = \log P\!\left(Q \mid O, \widehat{S}\right) + \log P\!\left(O \mid \widehat{S}\right). \tag{1.6}
$$
By taking a sum over $Q$ and multiplying equation (1.6) by $P(Q \mid O, S)$, we can further transform it into
$$
\log \frac{P(O \mid \widehat{S})}{P(O \mid S)} = \sum_{Q} P(Q \mid O, S)\log P\!\left(O, Q \mid \widehat{S}\right) - \sum_{Q} P(Q \mid O, S)\log P(O, Q \mid S) + \sum_{Q} P(Q \mid O, S)\log \frac{P(Q \mid O, S)}{P(Q \mid O, \widehat{S})}. \tag{1.7}
$$
It is straightforward to show that $\sum_{Q} P(Q \mid O, S)\log\frac{P(Q \mid O, S)}{P(Q \mid O, \widehat{S})} \ge 0$. Let $A(S, \widehat{S}) = \sum_{Q} P(Q \mid O, S)\log P(O, Q \mid \widehat{S})$. With the previous notation, we can rewrite equation (1.7) as
$$
l(\widehat{S} \mid O) - l(S \mid O) \ge A(S, \widehat{S}) - A(S, S), \tag{1.8}
$$
where the inequality is strict unless $P(Q \mid O, \widehat{S}) = P(Q \mid O, S)$ or $\widehat{S} = S$. The EM procedure is iteratively constructed from the sequence $\{\widehat{S}^{(k)}\}_{k \ge 1}$, initiating with some value $\widehat{S}^{(0)}$. Every iteration contains two steps:
1. Expectation step: calculate $A(S, \widehat{S}^{(k)})$.
2. Maximisation step: determine $\widehat{S}^{(k+1)} = \operatorname{argmax}_{S} A(S, \widehat{S}^{(k)})$.
These two steps are repeated until convergence is reached.
Two auxiliary variables need to be defined first, in terms of the forward and backward variables, in order to present the Baum–Welch algorithm properly. We denote by $\varepsilon_t(i_1,\ldots,i_{k+1})$ the probability of being in states $s_{i_1},\ldots,s_{i_{k+1}}$ at times $t,\ldots,t+k$, respectively, given the model and the observation sequence. By summing $\varepsilon_t(i_1,\ldots,i_{k+1})$ over $t$, we get the expected number of transitions through the state sequence $s_{i_1},\ldots,s_{i_{k+1}}$. The expected number of transitions through the state sequence $s_{i_1},\ldots,s_{i_k}$ can be obtained similarly by summing $\xi_t(i_1,\ldots,i_k)$ over $t$. The re-estimation formulas can then be given by
$$
\widehat{\varepsilon}_t(i_1) = \widehat{\xi}_t(i_1), \tag{1.11}
$$
$$
\widehat{\pi}_{i_1,\ldots,i_{k+1}} = \frac{\sum_{t=1}^{T-k}\varepsilon_t(i_1,\ldots,i_{k+1})}{\sum_{t=1}^{T-k}\sum_{i_{k+1}=1}^{N}\varepsilon_t(i_1,\ldots,i_{k+1})}, \tag{1.12}
$$
$$
\widehat{p}_{j}(y_k) = \frac{\sum_{t=1,\,O_t=y_k}^{T-k}\xi_t(j)}{\sum_{k=1}^{n}\sum_{t=1,\,O_t=y_k}^{T-k}\xi_t(j)}. \tag{1.13}
$$
The Baum–Welch training process can thereby be performed by assuming a starting set of parameters $S_0$. The forward and backward variables, $\alpha_t$ and $\beta_t$, are then computed according to equations (1.3) and (1.4) respectively, followed by the calculation of the auxiliary variables $\varepsilon_t$ and $\xi_t$ using equations (1.9) and (1.10). The parameters are re-estimated iteratively by equations (1.11) to (1.13) until the estimates converge. As mentioned before, the Baum–Welch algorithm is viewed as a special case of the EM algorithm with the same goal of maximising $P(O \mid S)$ by adjusting $S$ for a given HOHMM. In dealing with hidden states and incomplete data, the EM algorithm is a particularly powerful device for the parameter estimation and filtering of HOHMMs.
To address the filtering problems of HOHMMs in our research, the relevant processes and optimal filters can generally be derived in two ways: the semi-martingale method and the change-of-probability-measure approach. The former is direct but extremely complicated, whilst the latter is indirect but efficient in filtering applications. Zakai [22] pioneered the application of the change-of-measure technique in stochastic filtering. Elliott et al. [6] derive optimal filters with this method for HMMs based on Girsanov's Theorem. Mamon et al. [14] develop closed-form solutions of the recursive filters for estimating optimally the parameters of a commodity price model via a change of measure.
Considering the wide use of this approach mostly in HMMs, one additional but critical
step ought to be implemented for our HOHMM setting. That is to transform a higher-order
Markov chain into a first-order Markov chain rather than to derive the change of probability
measure for HOHMMs directly. Kriouile et al. [10] develop an equivalent model specific to converting a second-order discrete HMM into a regular HMM. A detailed example of converting a second-order two-state Markov chain into a regular one, with efficient recursive algorithms, is described in [15]. Building on this earlier work for second-order HMMs, Du Preez extends order-reducing algorithms to a general HOHMM setting, which enables any HOHMM to be transformed into its corresponding first-order HMM [5].
The essential idea is similar to the common method that converts higher-order differential
equations into a system of first-order differential equations. We follow their computing
algorithms by introducing a mapping variable that transforms a higher-order Markov chain
into a regular one. Then the change of probability measure method can be utilised to find
our filtering algorithms.
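As a small illustration of this idea for the second-order case, the pairs $(q_{t-1}, q_t)$ can be relabelled as the states of a first-order chain on $N^2$ composite states; the sketch below is our own minimal rendering of this mapping, not the order-reducing algorithm of [5] verbatim.

```r
# Minimal sketch: recast a second-order chain on N states as a first-order chain on the
# N^2 composite states (q_{t-1}, q_t).
# Pi2[i1, i2, i3] = P(q_{t+1} = s_{i3} | q_{t-1} = s_{i1}, q_t = s_{i2}); the name is ours.
reduce_to_first_order <- function(Pi2) {
  N <- dim(Pi2)[1]
  A <- matrix(0, N^2, N^2)                        # first-order transition matrix on pairs
  pair_index <- function(i, j) (i - 1) * N + j    # composite state (i, j) -> single index
  for (i1 in 1:N) for (i2 in 1:N) for (i3 in 1:N) {
    # the pair (i1, i2) can only move to a pair of the form (i2, i3)
    A[pair_index(i1, i2), pair_index(i2, i3)] <- Pi2[i1, i2, i3]
  }
  A                                  # rows sum to 1 whenever Pi2 is a proper transition kernel
}
```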
Equivalent to the real-world measure $P$, an ideal measure $\widetilde{P}$ is defined as a reference probability via a discrete-time version of Girsanov's Theorem [9]. Under $\widetilde{P}$, the observations are independent and identically distributed random variables, and the Markov chain follows the same dynamics as under $P$. To find the optimal filters, an easy-to-compute framework can then be carried out with the aid of Fubini's Theorem, which allows the interchange of expectations and summations [11]. The results and calculations under the ideal measure $\widetilde{P}$ can be traced back to the real-world measure $P$ by invoking a reverse measure change. The derivation of the reference-probability optimal filters is illustrated in Figure 1.1.
This thesis consists of six chapters. An overview of HMMs and the underpinnings of HOHMMs is given in this chapter; see the previous section. The main contents focus on the results of four related projects. We develop HMM and HOHMM settings for the dynamics of daily average temperatures (DATs) for weather derivatives in Chapters 2 and 3, respectively. The modelling, filtering and estimation problems for electricity-spot and salmon-futures prices are addressed in Chapters 4 and 5, respectively. Some concluding remarks are given in Chapter 6. The synopses of the projects are briefly given below.
A model for the evolution of DATs is put forward to support the analysis of weather deriva-
tives. The goal is to capture simultaneously the mean-reversion, seasonality and stochas-
ticity properties of the DATs process. An OU process modulated by an HMM is proposed to model both the mean reversion and stochasticity of the deseasonalised component. The
seasonality part is modelled by a combination of linear and sinusoidal functions. OU-
HMM filtering algorithms are established for the evolution of switching model parameters.
Consequently, adaptive parameter estimates are obtained. Numerical implementation of
the estimation technique using a data set compiled by the National Climatic Data Center
(NCDC) was conducted. A sensitivity analysis of the option prices with respect to model
parameters is included.
We develop a model for the evolution of DATs that could benefit the analysis of weather
derivatives in finance and economics as well as the modelling of time series data in meteo-
rology, hydrology and other branches of the sciences and engineering. Our focus is to cap-
ture the mean-reverting, seasonality, memory and stochastic properties of the temperature
movement and other time series exhibiting such properties. To model both mean-reversion
and stochasticity, a deseasonalised component is assumed to follow an OU process mod-
ulated by a higher-order hidden Markov chain, which takes into account short/long range
dependence in the data. The seasonality part is modelled through a combination of linear
and sinusoidal functions with appropriate coefficients and arguments. Furthermore, we put
forward a parameter estimation approach that establishes recursive HOHMM filtering algo-
rithms customised for the regime-switching evolution of model parameters. Quantities that
are functions of HOHMC characterise these filters. Utilising the EM method in conjunction
with the change of measure technique, optimal and self-updating parameter estimates are
obtained. We illustrate the numerical implementation of our model and estimation tech-
nique using a 4-year Toronto DATs data set compiled by the NCDC. We perform pertinent
model selection and validation diagnostics to assess the performance of our methodology.
It is shown that a 2-state HOHMM-based model best captures the empirical character-
istics of the temperature data under examination on the basis of various error-based and
information-criterion metrics.
Over the last three decades, the electricity sector worldwide underwent massive deregula-
tion. Power market participants have encountered a growing number of challenges due to
competition and other pertinent factors. As electricity is a non-storable commodity, its price
is extremely sensitive to changes in supply and demand. The evolution of electricity prices
exhibits pronounced mean reversion and cyclical patterns, possesses extreme volatility and
relatively frequently occurring spikes, and manifests presence of memory property. These
observed features necessitate the development of models aimed to simultaneously capture
such price characteristics for forecasting, risk management, and valuation of electricity-
driven derivatives. This study tackles the modelling and estimation problems under a new
paradigm that integrates the deterministic calendar seasons and stochastic factors govern-
ing electricity prices. The de-seasonalised component of our proposed model has both the
jump and mean-reverting properties to account for spikes and periodic cycles alternating
between lower price returns and compensating periods of higher price returns. The pa-
rameters of the de-seasonalised model components are also modulated by a higher-order
hidden Markov chain (HOHMC) in discrete time. This provides a mechanism to extract
latent information from historical data. The HOHMC’s state is interpreted as the “state of
the world” resulting from the interaction of various forces impacting the electricity market.
Filters are developed to generate optimal estimates of HOHMC-relevant quantities using
the observation process, and these provide online estimates of model parameters. Empiri-
cal demonstrations using the daily electricity spot prices, compiled by the Alberta Electric
System Operator (AESO), show that our HOHMM approach has considerable merits in
terms of price data fitting and forecasting metrics. Implications of our model to the pricing
of an electricity forward contract are also examined.
This project aims to capture the evolution and salient features of fish-futures prices with a
flexible and dynamic approach via a higher-order hidden Markov model (HOHMM). The
parameters of a proposed futures-price model under a multivariate setting are governed by
a discrete-time HOHMM to account for the random switching of market or economic states
over time. Multi-dimensional filters derived, along with the application of the expectation-
[2] W. Ching. Markov Chains: Models, Algorithms and Applications, 2nd edition (2013).
[3] O. Cappé, E. Moulines and T. Ryden. Inference in Hidden Markov Models (2006).
[5] J. A. Du Preez. Efficient training of high-order hidden markov models using first-order
representations. Computer Speech & Language, 12(1)(1998), 23–39.
[6] R. J. Elliott, L. Aggoun and J. B. Moore. Hidden Markov Models: Estimation and
Control (Corr. 2nd. printing. ed.) (1997).
[7] R. J. Elliott and H. Yang. Forward and backward equations for an adjoint process,
Festschrift for G. Kallianpur, Springer Verlag, Berlin, Heidelberg, New York. (1992).
[8] J. D. Hamilton. A new approach to the economic analysis of nonstationary time series
and the business cycle. Econometrica 57(2)(1989), 357–384.
[10] A. Kriouile, J. Mari and J. Haon. Some improvements in speech recognition algo-
rithms based on HMM (1990).
[11] M. Loeve. Probability Theory, fourth edn, Springer Verlag, Berlin, Heidelberg, New
York. (1978).
[12] N. Lobato and N. E. Savin. Real and spurious long-memory properties of stock-
market data. Journal of Business & Economic Statistics 16(3) (1998), 261-268.
[13] J. McCarthy. Tests of long-range dependence in interest rates using wavelets. Quar-
terly Review of Economics and Finance 44(1) (2004), 261–268.
[14] R. Mamon, C. Erlwein, B. Gopaluni, Adaptive signal processing of asset price dy-
namics with predictability analysis, Information Sciences, 178 (1) (2008), 203–219.
[15] I. L. MacDonald and W. Zucchini. Hidden Markov and Other Models for Discrete-
Valued Time Series (1997).
[16] L. Rabiner, A tutorial on hidden Markov models and selected applications in speech
recognition, Proceedings of the IEEE, 77 (2) (1989), 257–286.
[17] T. Rydén, T. Teräsvirta, and S. Åsbrink. Stylized facts of daily return series and the hidden Markov model. Journal of Applied Econometrics, 13(3) (1998), 217–244.
[18] T. Siu, W. Ching, E. Fung, M. Ng, X. Li, A high-order markov-switching model for
risk measurement, Computers and Mathematics with Applications, 58 (1) (2009), 1–
10.
[19] A. Viterbi. Error bounds for convolutional codes and an asymptotically optimum de-
coding algorithm. IEEE Transactions on Information Theory 13(2)(1967), 260-269.
[21] S. Yu, Z. Liu, M. S. Squillante, C. Xia and L. Zhang. A hidden semi-markov model
for web workload self-similarity (2002).
2.1 Introduction
We are all familiar with car and home insurance policies. The cost and overall value of these
contracts are determined by the likelihood of occurrence of particular calamities, disasters
and accidents, amongst other risk factors. But, how does one insure against the vagaries of
weather? Specifically, what is the price of a contract whose value depends on how hot or
cold it will be on a given day in the future?
We shall consider a modelling approach that provides support for temperature-linked con-
tracts and weather-related derivatives. It is estimated that about 39.1% of the U.S. gross do-
mestic product is weather-sensitive [13], and over 90% of weather derivatives are temperature-
based [14]. The first weather-derivative transaction took place in 1997 executed by Aquila
Energy and embedded in a power contract [8], and in 1999 the first exchange-traded tem-
perature derivative was launched in the Chicago Mercantile Exchange (CME). The volume
of weather derivatives, since then, has grown rapidly both in the exchange and over-the-
counter (OTC) markets.
Weather-driven futures and options traded on the CME are written on the following temperature indices: heating degree days (HDD), cooling degree days (CDD) and cumulative average temperature (CAT). These indices are not tradeable or storable. Hence, the usual no-arbitrage valuation methodology is not necessarily valid in pricing such contracts [10].
Assuming the existence of an appropriate pricing measure, our aim is to develop a model
that could accurately capture the salient features of the temperature data aiding the accurate
pricing of weather derivatives.
In the literature, several studies were conducted to deal with the valuation of weather deriva-
tives. The historical burn analysis (HBA) is adopted by most investors given its ease of
replication [12]. This is because the HBA assumes that the distribution of the expected value of the temperature-based derivatives simply follows the historical data; no stochastic model is constructed or fitted for the dynamics of the underlying variables. It is argued, however, in [25] that HBA is bound to be biased and prone to large
pricing errors. As a suitable alternative, time-series methods are proposed [25] and the
marginal substitution value principle of mathematical economics could be employed for
pricing [10]. Indifference pricing was also considered in [38], and a comparison of HBA,
Black-Scholes-Merton approximation, and an equilibrium Monte Carlo simulation was ex-
amined in [32].
A one-state stochastic process cannot describe the behaviour of temperature with great pre-
cision and flexibility, especially when there are drastic changes in regimes attributed to
occasional climatic changes. Thus, the regime-switching OU model is considered in [14],
which seems to be the first paper embedding a regime-switching approach to capture the
evolution of the time series temperature data. However, no dynamic estimation, not even
a static one, was provided in [14] leaving a big gap in implementing a regime-switching
approach involving observed data until the publication of [37] that employs a higher-order
HMM. We note though that [37], extending the developments in [36], focuses only estima-
tion and model fitting, and the pricing and risk measurement of pertinent contracts under
such a model setting remain unaddressed. In this chapter, we apply hidden Markov model
(HMM) filtering algorithms to provide optimal parameter model estimates for a successful
implementation of the valuation and risk management of weather futures and option prod-
ucts.
Our exposition is organised in the following way. Section 2.2 presents a self-contained for-
mulation of the temperature modelling using a discrete-time HMM modulating the model
parameters. In Section 2.3, we derive recursive filtering equations for various quantities
of HMM via the change of reference probability measure method. Our self-updating pa-
rameter estimation scheme is laid out in Section 2.4. Numerical work demonstrating the
applicability of our proposed model and estimation technique to a four-year Toronto tem-
perature data set is detailed in Section 2.5; in that section, the mechanics of best-model selection are also outlined through forecasting and penalised log-likelihood criteria. In Section 2.6,
the valuation of a temperature option is discussed and a sensitivity analysis of the option
price with respect to parameters of our proposed model is included. Section 2.7 gives some
concluding remarks.
Notation: All vectors and matrices will be denoted by bold English/Greek letters in lower-
case and bold capitalised English/Greek letters, respectively.
As weather derivatives are written on HDD, CDD and CAT, we put forward a model to
capture these measurements. HDD and CDD quantify the respective demands of heating
and cooling of a particular location. Since temperature could be observed at any time of a
given day, given the time interval [τ1 , τ2 ] with τ1 < τ2 , we can regard it as an Itô process so
that the respective continuous-time functional forms of HDD, CDD and CAT are
$$
\mathrm{HDD} = \int_{\tau_1}^{\tau_2} \max\left(T_{\mathrm{base}} - T_s,\, 0\right) ds, \qquad
\mathrm{CDD} = \int_{\tau_1}^{\tau_2} \max\left(T_s - T_{\mathrm{base}},\, 0\right) ds, \qquad
\mathrm{CAT} = \int_{\tau_1}^{\tau_2} T_s\, ds, \tag{2.1}
$$
where $T_{\mathrm{base}}$ is the base temperature, usually set at 65 °F (18 °C) in the market. In practice, the daily average temperatures (DATs) are measured on a daily basis, so the indices are computed from the discrete analogues
$$
\mathrm{HDD} = \sum_{t=\tau_1}^{\tau_2} \max\left(T_{\mathrm{base}} - T_t,\, 0\right), \qquad
\mathrm{CDD} = \sum_{t=\tau_1}^{\tau_2} \max\left(T_t - T_{\mathrm{base}},\, 0\right), \tag{2.2}
$$
where $T_t$ is the DAT on day $t$, computed as $\dfrac{T_{\max} + T_{\min}}{2}$. Clearly, CAT is simply the sum of DATs over the contract period. Weather contracts typically mature in a month or season. An HDD contract's period spans October of one year to April of the next year. A CDD contract's period covers the warm season, i.e., from April until October of the same year. The overlapping months, October and April, in the CDD and HDD contracts are termed transition or shoulder months. Following the rationale in [5], the needed indices as well as the dynamic representation of DATs will be obtained from $T_t$, which we are going to model.
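As a minimal numerical illustration of the discrete indices in (2.2), the following R sketch computes HDD, CDD and CAT from a vector of DATs, assuming the 18 °C base temperature quoted above; the function name and the example temperatures are our own.

```r
# Minimal sketch: discrete HDD, CDD and CAT over a contract period, as in (2.2).
degree_day_indices <- function(dats, Tbase = 18) {
  list(HDD = sum(pmax(Tbase - dats, 0)),   # heating degree days
       CDD = sum(pmax(dats - Tbase, 0)),   # cooling degree days
       CAT = sum(dats))                    # cumulative average temperature
}

# Example: DATs computed as (Tmax + Tmin) / 2 over a five-day period
Tmax <- c(21, 19, 25, 28, 17)
Tmin <- c(11,  9, 15, 18,  7)
degree_day_indices((Tmax + Tmin) / 2)
```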
Suppose $(\Omega, \mathcal{F}, P)$ is an underlying probability space for the process $T_t$ given by
$$
T_t = X_t + S_t, \tag{2.3}
$$
where $B_t$ is a standard Brownian motion under $P$. The solution, by Itô's lemma, of (2.5) is
$$
X_t = X_s e^{-\alpha(t-s)} + \left(1 - e^{-\alpha(t-s)}\right)\theta + \xi e^{-\alpha t}\int_s^t e^{\alpha u}\, dB_u. \tag{2.6}
$$
The discretised versions of (2.6) and (2.7), by approximating the distributions of their re-
spective stochastic-integral components, are given by
$$
X_{k+1} = e^{-\alpha \Delta t_{k+1}} X_k + \left(1 - e^{-\alpha \Delta t_{k+1}}\right)\theta + \xi\sqrt{\frac{1 - e^{-2\alpha \Delta t_{k+1}}}{2\alpha}}\, z_{k+1}, \tag{2.8}
$$
where $\Delta t_{k+1} = t_{k+1} - t_k$ and $\{z_{k+1}\}$ is a sequence of independent and identically distributed (IID) standard normal random variables.
As indicated in Section 2.1, having constant parameters in the OU process might not be
adequate in describing the dynamic switches of regimes resulting from the interactions of
various market and economic factors. In the succeeding discussion, the parameters in (2.8)
will be governed by a homogeneous Markov chain yk with finite states in discrete time to
model the behaviour of Xt and hence, T t .
To simplify the ensuing algebraic calculations, the state space is associated with the canonical basis of $\mathbb{R}^N$, which is $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N\}$, $\mathbf{e}_i = (0, \ldots, 0, 1, 0, \ldots, 0)^{\top}$ with 1 in the $i$th position. Here, $N$ is the total number of states and $\top$ denotes the transpose of a matrix. The dynamics of $\mathbf{y}_k$ are $\mathbf{y}_{k+1} = \boldsymbol{\Pi}\mathbf{y}_k + \boldsymbol{\zeta}_{k+1}$, where $\boldsymbol{\Pi} = (\pi_{ji}) \in \mathbb{R}^{N \times N}$, $\boldsymbol{\zeta}_{k+1}$ is a martingale increment, and $\pi_{ji} = P(\mathbf{y}_k = \mathbf{e}_j \mid \mathbf{y}_{k-1} = \mathbf{e}_i)$ with $\sum_{j=1}^{N}\pi_{ji} = 1$. Let $\mathbf{P} = (p_{ji}) \in \mathbb{R}^{N \times N}$ denote the intensity matrix for the continuous-time Markov chain process at time $t$, where $p_{ji} = \lim_{\Delta t \to 0}\dfrac{\pi_{ji}(t, t+\Delta t)}{\Delta t}$ for $j \neq i$ (cf. Cox and Miller [7]), with $p_{ji} \ge 0$ for $j \neq i$ and $\sum_{j=1}^{N} p_{ji} = 0$.
Given that our HMM filtering methodology is under the discrete-time framework, it is nec-
essary to recall the connection of the rate matrix P to the transition probability matrix Π,
which is its discrete-time counterpart providing inputs to the recursive filtering equations
in Section 2.3 and eventually to the option pricing formulae in Section 2.6. From Grimmett
and Stirzaker [27] (see as well Siu, et al [33]), Π is the exponential matrix of P, i.e.,
$$
\boldsymbol{\Pi} = \exp(\mathbf{P}) = \sum_{k=0}^{\infty}\frac{\mathbf{P}^k}{k!}. \tag{2.9}
$$
Both (2.9) and (2.10) can be evaluated numerically by almost all mathematical or statistical
software packages nowadays.
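For instance, (2.9) can be evaluated with a truncated series, as in the minimal R sketch below; in practice a library routine (e.g. from the 'expm' package) may be used instead. The function name and the illustrative rate matrix are our own assumptions.

```r
# Minimal sketch of (2.9): the transition matrix as a truncated exponential series of
# the rate matrix P (adequate for the small rate matrices considered here).
matrix_exp <- function(P, nterms = 50) {
  N <- nrow(P)
  result <- diag(N)          # k = 0 term
  term <- diag(N)
  for (k in 1:nterms) {
    term <- term %*% P / k   # P^k / k!
    result <- result + term
  }
  result
}

# Example: a 2-state rate matrix whose columns sum to zero, as required in the text
P  <- matrix(c(-0.3, 0.3, 0.5, -0.5), nrow = 2)   # filled by column; each column sums to 0
Pi <- matrix_exp(P)                               # columns of Pi then sum to 1
```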
and
$$
T_{k+1} = S_{k+1} + e^{-\alpha(\mathbf{y}_k)\Delta t_{k+1}} X_k + \left(1 - e^{-\alpha(\mathbf{y}_k)\Delta t_{k+1}}\right)\theta(\mathbf{y}_k) + \xi(\mathbf{y}_k)\sqrt{\frac{1 - e^{-2\alpha(\mathbf{y}_k)\Delta t_{k+1}}}{2\alpha(\mathbf{y}_k)}}\, z_{k+1}. \tag{2.12}
$$
With $\mathbf{y}_k$'s state space, $\alpha_k := \alpha(\mathbf{y}_k) = \langle\boldsymbol{\alpha}, \mathbf{y}_k\rangle$, $\theta_k := \theta(\mathbf{y}_k) = \langle\boldsymbol{\theta}, \mathbf{y}_k\rangle$, and $\xi_k := \xi(\mathbf{y}_k) = \langle\boldsymbol{\xi}, \mathbf{y}_k\rangle$, where $\langle\cdot,\cdot\rangle$ is the inner product in $\mathbb{R}^N$. Clearly, $\mathbf{y}_k$ makes the models in (2.11) and (2.12) regime-switching.
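To illustrate these regime-switching dynamics, the following minimal R sketch simulates the chain $\mathbf{y}_k$ and the deseasonalised component under the discretisation above (i.e., (2.8) with state-dependent parameters, as in (2.11)); the column-stochastic convention for $\boldsymbol{\Pi}$ follows this chapter, while the function name, argument names and parameter values are purely illustrative assumptions.

```r
# Minimal sketch: simulate X_k under the regime-switching discretised OU dynamics,
# with (alpha, theta, xi) selected by the hidden chain y_k.
# Pi[j, i] = P(y_k = e_j | y_{k-1} = e_i), so each column of Pi sums to 1.
simulate_rs_ou <- function(K, Pi, alpha, theta, xi, x0 = 0, dt = 1) {
  N <- nrow(Pi)
  y <- integer(K)
  x <- numeric(K + 1)
  y[1] <- sample.int(N, 1)                 # initial regime drawn uniformly, for simplicity
  x[1] <- x0
  for (k in 1:K) {
    a <- alpha[y[k]]; th <- theta[y[k]]; xs <- xi[y[k]]
    sd_k <- xs * sqrt((1 - exp(-2 * a * dt)) / (2 * a))
    x[k + 1] <- exp(-a * dt) * x[k] + (1 - exp(-a * dt)) * th + sd_k * rnorm(1)
    if (k < K) y[k + 1] <- sample.int(N, 1, prob = Pi[, y[k]])   # column y[k] of Pi
  }
  list(X = x, regimes = y)
}

# Example: two regimes with different mean-reversion speed, level and volatility
Pi <- matrix(c(0.95, 0.05, 0.10, 0.90), nrow = 2)   # filled by column; columns sum to 1
path <- simulate_rs_ou(K = 1460, Pi = Pi, alpha = c(0.8, 0.3),
                       theta = c(0, 1), xi = c(0.5, 1.0))
```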
are independent of the observed data series; see the underlying principle utilised in Elliott
et al. [19].
A discrete-time version of Girsanov's theorem enables us to back out $P$ from $\widetilde{P}$ [19]. The real-world measure $P$ is recovered by considering the Radon–Nikodym derivative
$$
\left.\frac{dP}{d\widetilde{P}}\right|_{\mathcal{F}_k} = \Psi_k = \prod_{l=1}^{k}\varphi_l, \quad k \ge 1,
$$
where
$$
\varphi_l = \frac{\phi\!\left(\xi(\mathbf{y}_{l-1})^{-1}\left[\dfrac{1 - e^{-2\alpha(\mathbf{y}_{l-1})\Delta t_{l-1}}}{2\alpha(\mathbf{y}_{l-1})}\right]^{-\frac{1}{2}}\beta(\mathbf{y}_{l-1})\right)}{\xi(\mathbf{y}_{l-1})\sqrt{\dfrac{1 - e^{-2\alpha(\mathbf{y}_{l-1})\Delta t_{l-1}}}{2\alpha(\mathbf{y}_{l-1})}}\;\phi(X_l)}. \tag{2.13}
$$
So, for instance, to estimate a scalar quantity $U$ under $P$, Bayes' theorem for conditional expectation gives
$$
E[U_k \mid X_k] = \frac{\widetilde{E}[\Psi_k U_k \mid X_k]}{\widetilde{E}[\Psi_k \mid X_k]}, \tag{2.14}
$$
where $E[\cdot]$ and $\widetilde{E}[\cdot]$ are the conditional expectations under $P$ and $\widetilde{P}$, respectively. As in [18], we consider an $\mathcal{F}_l$-adapted $U_l$ having the form in (2.15). Write $\gamma_k(U_k) := \widetilde{E}[\Psi_k U_k \mid X_k]$. Therefore, equation (2.14) can be expressed as $\dfrac{\gamma_k(U_k)}{\gamma_k(1)}$. Since $\gamma_k(U_k) = \gamma_k(U_k\langle\mathbf{1}, \mathbf{y}_k\rangle) = \langle\mathbf{1}, \gamma_k(U_k\mathbf{y}_k)\rangle$, the filter for any adapted process $U$ has the representation
$$
E[U_k \mid X_k] = \frac{\langle\mathbf{1}, \gamma_k(U_k\mathbf{y}_k)\rangle}{\langle\mathbf{1}, \gamma_k(\mathbf{y}_k)\rangle}.
$$
Invoking Theorem (5.3) of [15] and tailoring it to our modelling framework and notation,
the recursion for γk (Uk yk ) is
$$
\gamma_k(U_k\mathbf{y}_k) = \sum_{i=1}^{N}\Lambda^i(X_k)\Big[\langle\mathbf{e}_i, \gamma_{k-1}(U_{k-1}\mathbf{y}_{k-1})\rangle\boldsymbol{\Pi}\mathbf{e}_i + \langle\mathbf{e}_i, \gamma_{k-1}(g_k\mathbf{y}_{k-1})\rangle\boldsymbol{\Pi}\mathbf{e}_i + \big(\mathrm{diag}(\boldsymbol{\Pi}\mathbf{e}_i) - \boldsymbol{\Pi}\mathbf{e}_i\otimes\boldsymbol{\Pi}\mathbf{e}_i\big)\gamma_{k-1}(h_k\langle\mathbf{e}_i, \mathbf{y}_{k-1}\rangle) + \gamma_{k-1}(r_k\langle\mathbf{e}_i, \mathbf{y}_{k-1}\rangle)f(X_k)\boldsymbol{\Pi}\mathbf{e}_i\Big], \tag{2.16}
$$
where $\otimes$ stands for the tensor product of vectors, $\mathrm{diag}(\cdot)$ is a diagonal matrix, and $\Lambda^i(X_l)$ is defined by
$$
\Lambda^i(X_l) = \exp\!\left[\frac{\left(e^{-\alpha_i\Delta t_{l-1}}X_{l-1} + \left(1 - e^{-\alpha_i\Delta t_{l-1}}\right)\theta_i\right)X_l}{\xi_i^2\,\dfrac{1 - e^{-2\alpha_i\Delta t_{l-1}}}{2\alpha_i}} - \frac{\left(e^{-\alpha_i\Delta t_{l-1}}X_{l-1} + \left(1 - e^{-\alpha_i\Delta t_{l-1}}\right)\theta_i\right)^2}{2\,\xi_i^2\,\dfrac{1 - e^{-2\alpha_i\Delta t_{l-1}}}{2\alpha_i}}\right]. \tag{2.17}
$$
For the estimation of model parameters in the next section, we require the recursive filters
of the following scalar quantities.
(i) The number of jumps from state $\mathbf{e}_r$ to state $\mathbf{e}_s$ up to time $k$:
$$
J_k^{sr} = \sum_{l=1}^{k}\langle\mathbf{y}_{l-1}, \mathbf{e}_r\rangle\langle\mathbf{y}_l, \mathbf{e}_s\rangle = J_{k-1}^{sr} + \langle\mathbf{y}_{k-1}, \mathbf{e}_r\rangle\langle\mathbf{y}_k, \mathbf{e}_s\rangle. \tag{2.18}
$$
(ii) The number of occupations up to time $k$, i.e., the amount of time that $\mathbf{y}_k$ spent in state $\mathbf{e}_r$:
$$
O_k^r = \sum_{l=1}^{k}\langle\mathbf{y}_{l-1}, \mathbf{e}_r\rangle = O_{k-1}^r + \langle\mathbf{y}_{k-1}, \mathbf{e}_r\rangle. \tag{2.19}
$$
(iii) The auxiliary process dependent on $\mathbf{y}_k$ for the function $f$ up to time $k$ in state $\mathbf{e}_r$:
$$
T_k^r(f) = \sum_{l=1}^{k} f(X_l)\langle\mathbf{y}_{l-1}, \mathbf{e}_r\rangle = T_{k-1}^r(f) + f(X_k)\langle\mathbf{y}_{k-1}, \mathbf{e}_r\rangle. \tag{2.20}
$$
$$
\gamma_k(J_k^{sr}\mathbf{y}_k) = \sum_{i=1}^{N}\Lambda^i(X_k)\langle\gamma_{k-1}(J_{k-1}^{sr}\mathbf{y}_{k-1}), \mathbf{e}_i\rangle\boldsymbol{\Pi}\mathbf{e}_i + \Lambda^r(X_k)\langle\gamma_{k-1}(\mathbf{y}_{k-1}), \mathbf{e}_r\rangle\,\pi_{sr}\mathbf{e}_s. \tag{2.22}
$$
For the auxiliary process $T_k^r$ in (2.20), where $f$ has the form $f(X) = X$, $f(X) = X^2$, or $f(X) = X_{l+1}X_l$, we utilise (2.16) with $U_k = T_k^r(f)$, $U_0 = g_k = h_k = 0$, and $r_k = \langle\mathbf{y}_{k-1}, \mathbf{e}_r\rangle$ to have
$$
\gamma_k(T_k^r(f)\mathbf{y}_k) = \sum_{i=1}^{N}\Lambda^i(X_k)\langle\gamma_{k-1}(T_{k-1}^r(f)\mathbf{y}_{k-1}), \mathbf{e}_i\rangle\boldsymbol{\Pi}\mathbf{e}_i + \Lambda^r(X_k)\langle\gamma_{k-1}(\mathbf{y}_{k-1}), \mathbf{e}_r\rangle f(X_k)\boldsymbol{\Pi}\mathbf{e}_r. \tag{2.24}
$$
The goal is to find the estimate of $\upsilon$ that maximises $L(\upsilon)$: $\widehat{\upsilon} \in \operatorname{argmax}_{\upsilon\in\Upsilon} L(\upsilon)$. Now, consider
$$
Q(\upsilon; \widehat{\upsilon}_m) = E_{\widehat{\upsilon}_m}\!\left[\left.\log\frac{dP_{\upsilon}}{dP_{\widehat{\upsilon}_m}}\,\right|\, X_k\right]. \tag{2.26}
$$
The EM algorithm is implemented as follows. First, let $m = 0$ and choose the initial value $\widehat{\upsilon}_0$ for $\upsilon$. The first iteration computes $Q(\upsilon; \widehat{\upsilon}_0) = E_{\widehat{\upsilon}_0}[\log\frac{dP_{\upsilon}}{dP_{\widehat{\upsilon}_0}} \mid X_k]$, and $\widehat{\upsilon}_1$ is found such that $Q(\widehat{\upsilon}_1; \widehat{\upsilon}_0) \ge Q(\upsilon; \widehat{\upsilon}_0)$. Second, the E-step computes (2.26). Third, the M-step finds $\widehat{\upsilon}_{m+1}$ such that $Q(\widehat{\upsilon}_{m+1}; \widehat{\upsilon}_m) \ge Q(\upsilon; \widehat{\upsilon}_m)$, i.e., $\widehat{\upsilon}_{m+1} \in \operatorname{argmax}_{\upsilon\in\Upsilon} Q(\upsilon; \widehat{\upsilon}_m)$. The E-step and M-step are repeated until some stopping criterion is met, e.g., $|\widehat{\upsilon}_{m+1} - \widehat{\upsilon}_m| < \varepsilon$, where $\varepsilon$ is sufficiently small.
Let $\delta(\mathbf{y}_k) = e^{-\alpha(\mathbf{y}_k)\Delta t_k}$ and $\eta(\mathbf{y}_k) = \left(1 - e^{-\alpha(\mathbf{y}_k)\Delta t_k}\right)\theta(\mathbf{y}_k)$. From (2.8), $X_{k+1}$ has a normal distribution with
$$
\text{mean } \mu(\mathbf{y}_k) = \delta(\mathbf{y}_k)X_k + \eta(\mathbf{y}_k) \quad \text{and variance } \varrho^2(\mathbf{y}_k) = \xi^2(\mathbf{y}_k)\,\frac{1 - e^{-2\alpha(\mathbf{y}_k)\Delta t_k}}{2\alpha(\mathbf{y}_k)}.
$$
Hence, at state $i$,
$$
\alpha_i = -\frac{1}{\Delta t_k}\log\delta_i, \qquad \theta_i = \frac{\eta_i}{1 - \delta_i}, \qquad \xi_i^2 = \frac{-2\varrho_i^2\log\delta_i}{\Delta t_k\left(1 - \delta_i^2\right)}. \tag{2.27}
$$
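A minimal R sketch of the back-transformation in (2.27), mapping the working parameters $(\delta_i, \eta_i, \varrho_i^2)$ to the OU parameters $(\alpha_i, \theta_i, \xi_i)$; the function name is ours and the numbers in the example are merely the one-state benchmark values reported later in Section 2.5.

```r
# Minimal sketch of (2.27): recover (alpha_i, theta_i, xi_i) from (delta_i, eta_i, varrho_i^2).
ou_params_from_working <- function(delta, eta, varrho2, dt = 1) {
  alpha <- -log(delta) / dt
  theta <- eta / (1 - delta)
  xi2   <- -2 * varrho2 * log(delta) / (dt * (1 - delta^2))
  list(alpha = alpha, theta = theta, xi = sqrt(xi2))
}

# Example with the one-state benchmark estimates quoted in Section 2.5
ou_params_from_working(delta = 0.6518, eta = -0.001624, varrho2 = 0.4271^2)
```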
The complete parameter estimation of (2.8) is attained by applying the EM algorithm and
using the adaptive filters (2.18), (2.19) and (2.20). We then have the next result.
$$
\widehat{\varrho}_i^{\,2} = \frac{\widehat{T}_k^i(X_k^2) + \delta_i^2\,\widehat{T}_k^i(X_{k-1}^2) + \eta_i^2\,\widehat{O}_k^i + 2\eta_i\delta_i\,\widehat{T}_k^i(X_{k-1}) - 2\delta_i\,\widehat{T}_k^i(X_{k-1}, X_k) - 2\eta_i\,\widehat{T}_k^i(X_k)}{\widehat{O}_k^i}, \tag{2.30}
$$
and
$$
\widehat{\pi}_{ji} = \frac{\widehat{J}_k^{\,ji}}{\widehat{O}_k^i}. \tag{2.31}
$$
Remark 1: Having calculated $\widehat{\delta}_i$, $\widehat{\eta}_i$ and $\widehat{\varrho}_i^{\,2}$, and in conjunction with (2.27), our proposed model in (2.8) is fully determined as $\{\widehat{\alpha}_i, \widehat{\xi}_i, \widehat{\theta}_i\}$ is explicitly available. The establishment of the recursive filters implies that parameter estimates are self-calibrating, i.e., they are updated automatically every time there is new information.
Remark 2: The filtering and parameter estimation procedure, as laid out above, is distinct
from and more clear-cut than the one given in Erlwein and Mamon [19]. In this research
work, the EM estimates are provided directly for each parameter appearing in the mean of
the OU process; whereas in [19], the EM estimates rely first on the estimate of the entire
drift component.
Table 2.1 shows the DATs’ descriptive statistics, which could guide in selecting initial val-
ues. Table 2.2 presents the parameter estimates of the deterministic S t with an adjusted
R-squared (coefficient of determination) of 0.835. This means 83.5% of the variation in
the response T t can be explained well by the model given the regressor variables in equa-
tion (2.4) except for e2 and e3 . Therefore, these variables were eliminated. The fitted
seasonality component S t is presented in Figure 2.1 together with the actual DATs. The
characteristics of the plotted S t are congruous with the temperature movement in both the
summer and winter seasons. With the given values of T t and S t , Xt is calculated and then
treated as the observation process for our filtering and parameter estimation implementa-
tion.
The seasonal trend of the actual data is adequately captured even in the presence of Xt
and some noise. The deseasonalised stochastic component Xt = T t − S t from (2.3) is also
depicted in Figure 2.2. We process $X_t$ in 73 groups with 20 data points in each processing window, where $\Delta t = 1$ day. So, we are updating roughly every 3 weeks with the aid of the
recursive filtering equations. Other filtering window sizes were also tested and we find that
the size only slightly affects the results; similar outcomes are produced even with different
window sizes.
[Figure 2.1: Fitted seasonal component and actual observations (temperature in Celsius against time, 2011–2014). Figure 2.2: the deseasonalised component $X_t$ over the same period.]
where
$$
\delta = e^{-\alpha\Delta t_k}, \qquad \eta = \left(1 - e^{-\alpha\Delta t_k}\right)\theta, \qquad \varrho^2 = \xi^2\,\frac{1 - e^{-2\alpha\Delta t_k}}{2\alpha}.
$$
Using equation (2.32), the likelihood function of $X_k$ is
$$
L(X_k; \delta, \eta, \varrho) = \prod_{k=1}^{m}\frac{1}{\sqrt{2\pi}\,\varrho}\exp\!\left(-\frac{(X_k - \eta - \delta X_{k-1})^2}{2\varrho^2}\right), \tag{2.33}
$$
We use, with appropriate adjustments in our problem formulation, the R function 'optim' to solve equation (2.34). The results are $\widehat{\delta} = 0.6518$, $\widehat{\eta} = -0.001624$ and $\widehat{\varrho} = 0.4271$. These are employed as benchmarks to select initial values for parameter estimates in the modelling framework with more than one regime. They serve as a good guide in launching the filtering recursions producing optimal values for various quantities leading to the EM parameter estimates of the model. We must also ensure that $\varrho > 0$ to avoid bizarre outcomes when equation (2.34) is applied.
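The following minimal sketch shows one way to carry out such a fit with R's 'optim', maximising the Gaussian likelihood in (2.33) through its negative logarithm; the reparameterisation via $\log\varrho$ (which enforces $\varrho > 0$), the function names and the simulated example series are our own choices and not necessarily the thesis's exact implementation of (2.34).

```r
# Minimal sketch: one-state maximum likelihood for (delta, eta, varrho) via optim,
# based on the Gaussian likelihood (2.33).
neg_loglik <- function(par, x) {
  delta <- par[1]; eta <- par[2]; varrho <- exp(par[3])   # exp() keeps varrho > 0
  resid <- x[-1] - eta - delta * x[-length(x)]
  -sum(dnorm(resid, mean = 0, sd = varrho, log = TRUE))
}

fit_one_state <- function(x, start = c(0.5, 0, log(sd(x)))) {
  opt <- optim(start, neg_loglik, x = x, method = "BFGS")
  c(delta = opt$par[1], eta = opt$par[2], varrho = exp(opt$par[3]))
}

# Example on a simulated AR(1)-type path (in the thesis, x would be the deseasonalised X_k)
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = 0.65), n = 1000, sd = 0.43))
fit_one_state(x)
```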
To process the data, we apply recursive filtering equations (2.22)-(2.24), and then use the
results in 73 passes to feed into (2.28)-(2.31) to find the optimal parameter estimates. The
implementation was made on two-state and three-state HMCs, and Figure 2.3 displays the
movement, through each algorithm pass, of the optimal estimates for θ, α, and ξ under a
2-state set up. All three parameters exhibit convergence regardless of the choice of initial
values provided they do not substantially deviate from the values suggested in our one-state
ML estimation.
The choice of initial values could impact the speed of convergence, but eventually all pa-
rameter estimates approach certain unique values. The parameter estimates in the second
state are characterised by lower mean-reverting speed, but higher mean-reverting level and
volatility in comparison to those in state 1. Figure 2.4 depicts the transition probabilities
with stable evolutions. A similar pattern in the evolution of parameter estimates for θ, α,
and ξ under the 3-state model is shown in Figure 2.5. In this case, we see that the mean-
reverting level and volatility in state 1 are the highest but with the lowest speed. The prob-
ability transition estimates π ji are displayed in Figure 2.6. The behaviours of the parameter
evolutions in a two-state setting are similar. Stability in the evolution of estimates for θ, α, ξ
and Π is attained by the self-calibrating HMM algorithm after approximately 26 algorithm
passes in both the 2-state and 3-state HMM settings. To quantify the variance of the various estimators, we calculate the Fisher information matrix $I(v) = -E_v\!\left[\frac{\partial^2}{\partial v^2}\log L(X; v)\right]$, which bounds the asymptotic variance of the MLEs; here $v$ is a vector of parameters. The MLE is consistent and possesses an asymptotically normal sampling distribution [36]. Hence, we utilise the limiting distribution of the MLE for $\widehat{v}$ to dynamically obtain the 95% confidence interval. For a generic (scalar) MLE $\widehat{v}_i$ of $v$, this is given by $\widehat{v}_i \pm 1.96\,\dfrac{1}{\sqrt{I(\widehat{v})_{ii}}}$.
The entries of the Fisher information matrix are derived from the log-likelihood expressions
given in the Appendix A and the results are summarised below.
$$
I(\pi_{ji}) = \frac{\widehat{J}_k^{\,ji}}{\pi_{ji}^2}, \qquad
I(\delta_i) = \frac{\widehat{T}_k^i(X_{k-1}^2)}{\varrho_i^2}, \qquad
I(\eta_i) = \frac{\widehat{O}_k^i}{\varrho_i^2}, \qquad \text{and}
$$
$$
I(\varrho_i) = -\frac{\widehat{O}_k^i}{\varrho_i^2} + \frac{3}{\varrho_i^4}\Big[\widehat{T}_k^i(X_k^2) + \widehat{O}_k^i\,\eta_i^2 + \delta_i^2\,\widehat{T}_k^i(X_{k-1}^2) - 2\widehat{T}_k^i(X_k)\,\eta_i - 2\widehat{T}_k^i(X_{k-1}, X_k)\,\delta_i + 2\widehat{T}_k^i(X_{k-1})\,\eta_i\delta_i\Big].
$$
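For a scalar estimate, the evolving 95% interval then follows directly from the corresponding information entry, as in the minimal sketch below; the function name and the plugged-in numbers are purely illustrative.

```r
# Minimal sketch: 95% confidence interval for a scalar estimate from its Fisher information.
ci_from_info <- function(estimate, info) {
  se <- 1 / sqrt(info)
  c(lower = estimate - 1.96 * se, estimate = estimate, upper = estimate + 1.96 * se)
}

# Illustrative example using the entry I(eta_i) = O_k^i / varrho_i^2 with made-up values
ci_from_info(estimate = -0.0016, info = 250 / 0.43^2)
```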
The standard errors (SEs) of all parameter estimates for the 1-, 2- and 3-state models are
getting smaller as more algorithm passes are being run. The narrow ranges of the SEs
shown in Table 2.3 indicate that precise estimates are achieved by the proposed estimation
technique here (via HMM filtering and the EM algorithm).
[Figure 2.3: Evolution of parameter estimates for θ, α, and ξ under a 2-state model, plotted against algorithm steps.]
[Figure 2.4: Evolution of the transition probability estimates (π11, π21, π12, π22) against algorithm steps under the 2-state model.]
Table 2.3: Interval of standard errors for parameter estimates under 1-, 2- and 3-state models
[Figure 2.5: Evolution of parameter estimates for θ, α, and ξ under a 3-state model, plotted against algorithm steps.]
[Figure 2.6: Evolution of the transition probability estimates (π11, …, π33) against algorithm steps under the 3-state model.]
We evaluate the one-step ahead predictions for DATs as an additional diagnostic for model fitting. The expected value of $X_{k+1}$, given the available information up to time $k$, provides the forecast. Figure 2.7 illustrates the one-step ahead forecasts for the deseasonalised process $X_k$ vis-à-vis the DATs under a 2-state HMM set-up, in accordance with (2.35).
A comparison of the observed seasonal HDD and the expected seasonal HDD for the en-
tire period (2011-2014) under the 2-state model is presented in Figure 2.8. Clearly, the
expected seasonal HDD obtained from the model follows closely the actual seasonal HDD.
Furthermore, we examine an upward-trending 6-month period of the DATs, in a magnified view of forecasting, under all three state settings, as shown in Figure 2.9. Visually, all forecasts are
quite close to the actual DATs and the temperatures’ trends and dynamics are captured well
by our proposed self-calibrating estimation approach.
[Figure 2.7: One-step ahead forecasts of the deseasonalised observations and of the DATs under the 2-state HMM, shown against the actual deseasonalised observations (temperature in Celsius against time, 2011–2014).]
Cumulative HDD
Temperature (Celsius)
38
5
10
15
20
25
30
Oct
Nov
04/01/2011
Dec
04/08/2011
Jan
04/15/2011
2011-2012
Feb
04/22/2011
Mar
actual HDD
04/29/2011
predicted HDD
05/06/2011 Apr
05/13/2011
Cumulative HDD
05/20/2011
05/27/2011
0 1000 2000 3000 4000
06/03/2011 Oct
06/10/2011 Nov
06/17/2011 Dec
06/24/2011
Jan
07/01/2011
2012-2013
Feb
times
07/08/2011
Mar
actual HDD
07/15/2011
predicted HDD
07/22/2011 Apr
07/29/2011
Cumulative HDD
08/05/2011
0 1000 2000 3000 4000
08/12/2011
Oct
08/19/2011
08/26/2011 Nov
09/02/2011 Dec
actual DATs
09/09/2011 Jan
2013-2014
09/16/2011
Feb
09/23/2011
Figure 2.9: Comparison of one step ahead forecasts in 1-, 2-, and 3-state HMMs
Mar
actual HDD
Figure 2.8: Comparison of the expected HDD and actual HDD in a 2-state model
09/30/2011
predicted HDD
Table 2.4: Results of error analysis for Xk and T k under different model settings
Visual inspection is not sufficient in deciding which model does the job well especially
when the actual values are very close to the predicted values under several competing mod-
els. We, therefore, provide quantitative approaches in gauging the ‘best-performing’ model
on the basis of information-criterion and minimised-error metrics. Following [19], we examine
the mean-squared error (MSE), root-mean-squared error (RMSE), mean-absolute error (MAE),
and relative-absolute error (RAE) in assessing the one-step ahead predictions under the 1-, 2-
and 3-state HMMs. Suppose that for k = 1, 2, . . . , K, X_k is the actual value of the process
at time k, X̂_k is the predicted value at time k given available data up to time k − 1, X̄ is the
mean of all X_k's, and K = 1460 is the total number of predicted observations. Then
the error metrics are computed as
MSE = (1/K) Σ_{k=1}^{K} (X̂_k − X_k)²,    RMSE = √[ (1/K) Σ_{k=1}^{K} (X̂_k − X_k)² ],

MAE = (1/K) Σ_{k=1}^{K} |X̂_k − X_k|,    and    RAE = [ Σ_{k=1}^{K} |X̂_k − X_k| ] / [ Σ_{k=1}^{K} |X_k − X̄| ].
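The four metrics are straightforward to compute once the predictions are stored; a minimal R sketch (assuming a vector x of actual values and x_hat of one-step ahead predictions) is given below.

```r
# Minimal sketch: MSE, RMSE, MAE and RAE for actual values x and predictions x_hat.
error_metrics <- function(x, x_hat) {
  mse <- mean((x_hat - x)^2)
  c(MSE  = mse,
    RMSE = sqrt(mse),
    MAE  = mean(abs(x_hat - x)),
    RAE  = sum(abs(x_hat - x)) / sum(abs(x - mean(x))))
}
```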
Table 2.4 displays the various error values concerning the one-step ahead predictions for
the model’s stochastic component Xk and observed DATs T k . Whilst the differences in er-
rors are generally small, the 2-state model outperforms the 1- and 3-state models under all
4 error measures. We also investigated a 4-state HMM but found no evidence of further
improvement, even a minimal one. The cost of increasing the number of regimes may outweigh
the benefit of added model flexibility, as the number of parameters to be estimated increases
correspondingly.
To evaluate the statistical significance for the mean difference of errors, a t-test involving
three pairs of model settings is conducted on the RMSEs generated using the bootstrapping
method. Firstly, a bootstrapped sample of the same size 1460 is generated by sampling
with replacement from the original squared errors for each model setting. Secondly, the
square root of the average squared errors for every bootstrapped sample is computed. After
running the two-step bootstrapped procedure 10,000 times, the bootstrapped RMSEs are
then produced for the paired t-test.
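A minimal R sketch of this two-step bootstrap (assuming stored vectors of squared one-step errors sq_err_a and sq_err_b for two competing settings) is as follows; the Bonferroni correction simply multiplies the raw p-value by the number of pairwise comparisons.

```r
# Minimal sketch: bootstrapped RMSEs and a paired t-test for two model settings.
boot_rmse <- function(sq_err, B = 10000) {
  replicate(B, sqrt(mean(sample(sq_err, replace = TRUE))))   # resample squared errors, take RMSE
}
# rmse_a <- boot_rmse(sq_err_a); rmse_b <- boot_rmse(sq_err_b)
# p_raw  <- t.test(rmse_a, rmse_b, paired = TRUE)$p.value
# p_bonf <- min(1, 3 * p_raw)    # Bonferroni correction over the three pairwise settings
```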
We report in Table 2.5 the Bonferroni-corrected p-values accounting for the pairwise set-
tings. With a significance level of 5%, we see that the results for the 1- and 2-state settings
are statistically different and the same conclusion holds for the results involving 1- and 3-
state settings. The results for the 2- and 3-state settings though are not statistically different.
This implies two things: (1) there is a benefit in adding regime-switching features to the
model setting; and (2) a 2-state model is sufficient for the DATs series under examination.
These findings support the observations in Section 2.5.3.1.
To balance the model’s goodness of fit and complexity, we also adopt the Akaike Infor-
mation Criterion (AIC) in choosing the ‘best-performing’ model. The AIC is calculated
as
where ω is the number of estimated parameters included in the model, and log L (X; v)
denotes the model’s loglikelihood evaluated using the data X and a set of parameters v. The
preferred model has the lowest AIC value. In terms of the observed Xk ’s and parameter set
v, the loglikelihood is given by
log L(X_k; v) = Σ_{k=1}^{B} [ log( 1 / (√(2π) ζ(y_{k−1})) ) − (X_k − η(y_{k−1}) − δ(y_{k−1}) X_{k−1})² / (2 ζ²(y_{k−1})) ],   (2.37)
From equation (2.36), a model that has high loglikelihood is favoured and a complex model
(in the sense of having many parameters) is penalised. Looking at (2.11) and Π, we obtain
the number of parameters to be estimated for each model setting and this is shown in Table
2.6. These inputs are necessary to complete the calculation of (2.36).
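A minimal R sketch of the AIC computation is shown below; it assumes that eta, delta and zeta hold, for each k, the parameters of the regime occupied at time k − 1 (for instance, read off the filtered state path), which is our own simplification rather than the thesis' exact routine.

```r
# Minimal sketch: regime-switching Gaussian log-likelihood of (2.37) and AIC of (2.36).
# x is the observed series; eta, delta, zeta are vectors aligned with x[-1] holding the
# parameter values of the regime at k-1; n_params is the count in Table 2.6.
aic_hmm <- function(x, eta, delta, zeta, n_params) {
  ll <- sum(dnorm(x[-1], mean = eta + delta * x[-length(x)], sd = zeta, log = TRUE))
  2 * n_params - 2 * ll
}
```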
With the aid of equation (2.37) and Table 2.6, we calculated the AIC values in Table 2.7
for the HMM settings as well as the setting with no regime-switching. Table 2.7 shows that
both the 2- and 3-state HMM settings have lower AIC values than that of the 1-state model.
It indicates that there is merit in embedding the regime-switching feature into the model.
Also, the 2-state HMM model produces the lowest AIC value. So, this result together with
the error analysis leads us to conclude that the 2-state HMM setting is the most suitable for
our data set under examination.
2.6 Application to the pricing of temperature-dependent contracts

In this section, we derive pricing formulae for temperature-dependent contracts and present a
simulation-based pricing example; sensitivity analyses are also implemented to examine the
extent of each parameter's influence on the value of a temperature-based derivative.
In the United States and Canada, the CME weather contracts are mainly written on the HDD
or CDD index with a monthly or seasonal duration, whilst weather derivatives in Europe are
normally based on the CAT index. Table 2.8 summarises the specifications of typical CME
temperature-based futures and options. These products written on accumulated indices
over a calendar duration are financially settled at the end of the index or contract period.
The most common option style so far is a European-type option in both the OTC market
and CME, which may only be exercised at its expiration date. A cap on the maximum
payout is normally set for OTC weather contracts, whereas there is currently no such limit
for those trading in the CME. In contrast to temperature-based option contracts, regular
options traded in the CME typically specify index futures as the underlying variables. A
standardised temperature-based contract normally contains five basic elements listed in
Table 2.9.
Table 2.9: Basic elements of a standardised temperature-based contract

Element                Description
Underlying variable    A temperature-based index or index futures
Contract period        Accumulated and consecutive calendar period
Contract (tick) size   A dollar amount attached to each HDD, CDD or CAT point
Land-based station     A specific station that reports observations, including temperature, for a particular area
Payoff function        A function that converts a tradeable temperature-based index into a cash flow in terms of the above four elements
In the literature, there are two mainstream approaches in modelling the underlying geared
towards the pricing of temperature-driven contracts. The first approach is to employ a
continuous-time process modelling the DATs. For instance, Bellini [3] proposed an extended
OU model for DATs to price HDD/CDD contracts assuming a Lévy-type noise
process. Benth and Šaltytė-Benth [5] computed futures and option prices using a continuous-
time autoregressive model for the temperature dynamics under a normality assumption.
The second approach is to adopt a discrete-time process to model DATs considering that
temperature indices are defined as discrete sums of DATs rather than integrals over a spe-
cific period. Moreno [30] demonstrated that it is not necessary to estimate DATs continu-
ously and the goodness of fit using a discrete-time process is deemed satisfactory.
The first approach facilitates various computations for valuation via stochastic calculus and
the latter one is mostly used in the actuarial field under a discrete-time framework. The
pricing conducted in this chapter employs a continuous-time OU process but such a pro-
cess is discretised in order to estimate the model parameters. Once parameter estimates are
available, as in our case, any temperature-based derivative can be theoretically valued by
taking the expectation of its discounted payoff at contract’s maturity. We derive the pricing
formulae for temperature-based futures and options, and include a specific pricing example
via simulation.
The market for weather derivatives is relatively illiquid, and this complication is exacerbated
by the fact that temperature itself can be neither stored nor traded. A zero risk premium cannot
necessarily be assumed because weather contract prices are not independent of the regime
shifts in our model and of investors' risk aversion, especially in such a somewhat illiquid
market. Hence, the price of risk process λ_t of the underlying
variable ought to be considered in order to determine a fair price for a temperature-based
contract.
We first note that there is also an issue regarding the risk-neutral valuation for illiquid and
incomplete markets such as the market for weather derivatives. This arises from the lack
of observed market prices that supposedly could determine the risk-neutral parameters and
the martingale measure Q. In any event, the price obtained under Q can be regarded as a
benchmark similar to the idea of pricing other exotic and structured products not yet avail-
able in the market. This “benchmarking principle” was espoused in Gao et al. [25] in which
the risk-neutral valuation was still employed to get no-arbitrage prices for a guaranteed an-
nuity option; this was done even though market information is insufficient to reflect certain
conditions concerning insurance-related contingent claims. Moreover, the risk-neutral val-
uation was applied to price a product dependent on an underlying (i.e., mortality rate) that
is also non-tradeable and non-storable.
Akin to the pricing under Q is the concept of the market price of risk λt at time t, which
links the P to the Q−dynamics of the underlying variable. The quantity λt could be assumed
zero if only to attain simplicity and tractability. But, some recent findings indicate that λt
associated with the temperature variable is significantly different from 0; see Xu et al. [38],
Benth et al. [4], and Elias et al. [14]. In this chapter, we shall investigate the impact of λt in
the pricing of weather contracts. This in turn requires an explicit expression for T t under Q.
The formal justification for a risk-neutral measure Q utilised for weather-derivative pricing
in our set up follows from Weron’s work [31], which is anchored intimately on λt that is
assumed a real-valued measurable and bounded process.
Through the Girsanov’s theorem, there are probability measures Q and Qλ equivalent to P,
such that
Z t
λ
BtQ = Bt + λ s ds = Bt + λt t, (2.38)
0
where B^{Q^λ} is a standard Brownian motion under Q^λ and B_t is a P-Brownian motion driving
our deseasonalised process X_t in equation (2.5). Benth and Šaltytė-Benth [5] stated the
existence of a flexible class of risk-neutral probabilities (e.g., Q^λ) that can be obtained from
the time-varying λ_t to fit the forward curves. As Q^λ is a risk-adjusted or risk-neutral probability
measure (see page 151 of [31] and page 173 of [2]), this signifies that Q^λ coincides
with the needed Q for the purpose of our valuation.
Referring to the construction arguments in [31], we obtain the condition θ^Q = θ − λ_t ξ/α, where
θ^Q is defined under Q. So, extending [31] to our framework, λ_t links the parameters of our
proposed model through

θ*(y_t) = θ(y_t) − λ_t ξ(y_t) / α(y_t),   (2.39)

where α(y_t), ξ(y_t), and θ(y_t) are given in equation (2.11). Here, θ*(y_t) is the new parameter
under Q that replaces θ(yt ) in the OU process after taking into account λt , which encapsu-
lates both the (i) price of risk due to regime switching and (ii) market price of risk (due to
the shocks from Bt ).
investigate the effect of its variation in the succeeding sensitivity analysis. For weather-derivative
valuation, we utilise the resulting Q-dynamics of the deseasonalised process and of the DATs
given in equations (2.40) and (2.41), respectively.
As asserted in Geman and Leonardi [26], the hypothesis that accumulated degree days are
normally distributed is accepted by both researchers and practitioners in pricing weather
contracts on temperature indices. In Benth and Šaltytė-Benth [5], the DATs are mod-
elled by an OU process having a seasonal volatility, and the HDD/CDD-index pricing was
accomplished under the normality assumption. More recently, Alexandridis and Zapranis
[2] verified the suitability of such a normality assumption. Under the same assumption
together with the principles employed in [2] and [5], we derive pricing formulae for HDD
futures and options under the OU-HMM settings.
It is assumed that the contract’s period is [τ1 , τ2 ], the contract size is 1 monetary unit for
each index point, and the risk-free interest rate r f is constant with continuous compound-
ing. The HDD futures price F H (t, τ1 , τ2 ) is evaluated under two scenarios: (i) at time t
before the contract period, i.e., 0 ≤ t ≤ τ1 < τ2 , and (ii) at a time t within the contract
period, i.e., 0 ≤ τ1 ≤ t < τ2 .
From Benth and Šaltytė-Benth's reasoning [5], the futures price F_H(t, τ1, τ2) is set so that the
contract has zero value at inception, i.e., e^{−r_f(τ2−t)} E^Q[ ∫_{τ1}^{τ2} max(T_base − T_v, 0) dv − F_H(t, τ1, τ2) | F_t ] = 0,
and hence it must be defined as

F_H(t, τ1, τ2) = E^Q[ ∫_{τ1}^{τ2} max(T_base − T_v, 0) dv | F_t ].   (2.43)
For the European HDD call option, we consider two types: a call dependent on the HDD
itself and a call dependent on the HDD futures price. The pricing results, whose proofs
are given in Appendix B, are presented in Propositions 2.6.1– 2.6.3 below under the model
described in (2.41).
Proposition 2.6.1 The HDD futures price for the contract period 0 ≤ τ1 < τ2 is given by

F_H(t, τ1, τ2) = E^Q[ H | F_t ]
  = ∫_{τ1}^{τ2} A(t, v) D( M(t, v, X_t) / A(t, v) ) dv,    for 0 ≤ t ≤ τ1 < τ2,
  = ∫_{τ1}^{t} max(T_base − T_u, 0) du + ∫_{t}^{τ2} A(t, v) D( M(t, v, X_t) / A(t, v) ) dv,    for 0 ≤ τ1 ≤ t < τ2,

where
H = ∫_{τ1}^{τ2} max(T_base − T_v, 0) dv,
with Φ and φ the cumulative distribution and probability density functions of a standard normal
distribution, respectively.
Proposition 2.6.2 The price of a European call option written on an HDD futures price at
time t is given by

C_{F_H}(t, K, τ_T) = e^{−r(τ_T − t)} ∫_{K}^{F_m} (F_H − K) g(F_H) dF_H,

where
F_H = ∫_{τ1}^{τ2} A(t, v) D( M(t, v, X_t) / A(t, v) ) dv,    F_m = max F_H,
τ_T is the expiration date, t ≤ τ_T ≤ τ1 ≤ τ2, K is the strike price, r is the risk-free interest
rate, and g(x) is the probability density function of a random variable X.
Proposition 2.6.3 The price of a European call option written on HDD at time t is given
by

C_H(t, K, τ_T) = e^{−r(τ_T − t)} ∫_{K}^{H_m} (H − K) g(H) dH,

where
H = ∫_{t}^{τ_T} max(T_base − T_v, 0) dv,    H_m = max(H),
τ_T is the expiration date, t ≤ τ_T, K is the strike price, r is the risk-free interest rate, and
g(x) is the probability density function of a random variable X.
The normal distribution for modelling HDD provides a good fit in several winter months
for almost all locations [28]. However, it is almost impossible to obtain a good fit in mod-
elling the HDD in other months with the use of the normal distribution most especially in
December when cold temperatures give rise to heavy tails in the empirical distribution of
the HDD. Therefore, this assumption is not quite realistic for Proposition 2.6.3 in pricing a
one-month weather option contract. Although an analytic valuation formula can be gener-
ated due to the simplicity of a Gaussian distribution, such distribution is probably going to
be adequate only in valuing a seasonal HDD option rather than a contract for one month.
To compute the prices of futures and options written on HDD, we note the path-dependent
nature of the expected payoff, which can be calculated efficiently using the Monte-Carlo
(MC) simulation method. We perform the valuation under a 2-state HMM model and
include as well the price of the risk process. In particular, we evaluate
F_{H^d}(t, τ1, τ2) = V E^Q[ H^d | F_t ],   (2.45)

C_{H^d}(t, K, τ_T) = e^{−r(τ_T − t)} V E^Q[ max(H^d − K, 0) | F_t ],   (2.46)

where V is the contract size, and H^d is defined as Σ_{τ1}^{τ2} max(T_base − T, 0) in (2.45) and
Σ_{t}^{τ_T} max(T_base − T, 0) in (2.46), consistent with the respective notations employed in
Propositions 2.6.1 and 2.6.3.
We examine closely the valuation of HDD call options through a sensitivity analysis. Also,
in order to make a valid comparison between the option prices generated under our model
and the results in Elias et al. [14], we make certain conditions uniform between the con-
tract in our study and that in [14]. Consider a European call with effectivity of 01 – 31 Dec
2012, and t = 0, V = $20, and K = $580 in equation (2.46). We set r = 1%, which is
the 1-year T-bill’s yield as published by the Bank of Canada in 2012, to proxy the interest
rate. Having optimal estimates for θ (yk ), α (yk ), and ξ (yk ) in Section 2.5, and granting λt is
known or assigned, a value for θ∗ (yt ) in accordance with (2.39) is immediate. Moreover, T
and H d can be calculated based on equations (2.41) and (2.46), respectively. The optimal
estimates for parameters to compute T and H d are shown in Table 2.10.
Assuming that the risk premium solely depends on the market price of risk associated
with the underlying variable, the average of the two risk premium values for the two cities
(Chicago and New York) located close to Toronto, which is λt = 6.68%, is utilised in our
illustrative example.
In their pricing calculations, Elias et al. [14] assumed r = 5% in 2008, and used an expected
Table 2.10: Optimal parameter estimates for the 2-state HMM model with and without the
market price of risk

             δ̂        η̂         ζ̂        θ̂         α̂        ξ̂        θ̂*
Regime 1    0.7146    0.0390    0.6229    0.1367    0.3361    0.7301   -0.0084
Regime 2    0.5166   -0.0889    0.6929   -0.1839    0.6604    0.9301   -0.2779
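As a quick check, the θ̂* column of Table 2.10 can be reproduced from equation (2.39) with λ_t = 6.68%; a one-line R illustration is:

```r
# Sketch: theta* = theta - lambda * xi / alpha, eq. (2.39), using the Table 2.10 estimates.
theta  <- c(0.1367, -0.1839); alpha <- c(0.3361, 0.6604); xi <- c(0.7301, 0.9301)
lambda <- 0.0668
theta - lambda * xi / alpha   # approximately -0.0084 and -0.2779
```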
HDD in their payoff function. In our case, after simulating the underlying variable H d using
the optimal estimates in Table 2.10, we perform 10,000 simulations to evaluate the expected
payoff in (2.46). Then, by embedding λt into a 2-state HMM-based model, we get the call
option price written on HDD as
C_{H^d}(0, 580, 31/365) = e^{−0.01·(31/365)} · 20 · E^Q[ max(H^d − 580, 0) | F_0 ] = $861.33.   (2.47)
If λt is not considered in equation (2.47), the price is just $507.53. The SEs of the MC
simulations with and without λt are 8.9369 and 6.3323 respectively, and the corresponding
95% confidence intervals are [843.81, 878.85] and [495.12, 519.94]. In this case, 10,000
simulations are sufficient to obtain a desired level of accuracy at the significance level of
5%.
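A stripped-down version of this Monte-Carlo valuation is sketched below. It is not the thesis code: the function and argument names are ours, the December seasonal values S_dec and the transition matrix Pi must be supplied by the user, and the discrete-time regime parameters eta, delta and zeta are those implied by Table 2.10 (with the level parameter recomputed from θ* when the market price of risk is switched on).

```r
# Minimal sketch: Monte-Carlo price of the HDD call in (2.46) under a 2-state HMM.
price_hdd_call <- function(n_sims = 10000, S_dec, Pi, eta, delta, zeta,
                           T_base = 18, K = 580, V = 20, r = 0.01, x0 = 0, state0 = 1) {
  n_days  <- length(S_dec)
  payoffs <- numeric(n_sims)
  for (m in 1:n_sims) {
    s <- state0; x <- x0; hdd <- 0
    for (k in 1:n_days) {
      x   <- eta[s] + delta[s] * x + zeta[s] * rnorm(1)   # deseasonalised dynamics
      hdd <- hdd + max(T_base - (x + S_dec[k]), 0)        # daily HDD contribution
      s   <- sample(1:2, 1, prob = Pi[, s])               # column-stochastic regime switch
    }
    payoffs[m] <- max(hdd - K, 0)
  }
  exp(-r * n_days / 365) * V * mean(payoffs)              # discounted expected payoff
}
```

Running this once with the risk-adjusted level parameters and once without them mimics the comparison between the prices with and without λ_t reported above.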
Figure 2.10 exhibits the variation of the HDD option prices with and without the price of
risk process, over a range of strike HDD. As the strike price K increases from 540 to 620,
both option prices decline. They reach zero at K = 623.10 and K = 605.49 respectively.
Indeed, the price of risk process does make a substantial difference in pricing an HDD call
option for each strike HDD.
Figure 2.11 shows the influence of λ_t on the HDD call option price by changing its value
from 0 to 20%. The option price rises with an increasing λ_t, which is consistent with the
results from Elias et al. [14], and it constitutes a significant portion of the temperature-based
derivative's price.

[Figure 2.10: One-month call option price with initial state 1 versus HDD strike (legend: monthly option price with λ_t, monthly option price without λ_t)]
[Figure 2.11: HDD call option price with initial state 1 versus market price of risk]
To easily interpret our numerical results, we denote states 1 and 2 as "hot" and "cold"
regimes, respectively, corresponding to the hot and cold fronts in the 2-state HMM-temperature
model. To probe the sensitivity of option prices to the switching behaviour, the generator of
the chain is taken to be

P = [ −p   p ; p   −p ],

where p, the intensity parameter, is the rate of entering into/leaving from one state to another.
The use of the matrix P facilitates probing the option-price sensitivity associated
with the jump probability of the Markov chain as we only have one parameter to focus on,
which is p in this case. For such a purpose, a convenient and practical representation of
matrix Π, defined in (2.9), is

Π = exp(P) = (1/2) [ 1   1 ; 1   1 ] + (1/2) e^{−2p} [ 1   −1 ; −1   1 ].   (2.48)
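The closed form in (2.48) is easy to verify numerically; the R sketch below (our own illustration, assuming the symmetric generator P above) compares it with the matrix exponential computed from the eigendecomposition of P.

```r
# Sketch: Pi = exp(P) for the symmetric intensity matrix P = [[-p, p], [p, -p]].
Pi_from_p <- function(p) 0.5 * matrix(1, 2, 2) + 0.5 * exp(-2 * p) * matrix(c(1, -1, -1, 1), 2)
P  <- matrix(c(-0.5, 0.5, 0.5, -0.5), 2)                    # generator with p = 0.5
ev <- eigen(P)
ev$vectors %*% diag(exp(ev$values)) %*% solve(ev$vectors)   # matches Pi_from_p(0.5)
```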
In our numerical demonstration, the values of p are set to 0.1, 0.3, 0.5, 0.7, and 0.9 and our
simulations consider starting states 1 and 2 at t = 0 separately. The tremendous impact of
the intensity parameter is shown in Table 2.11; a comparison with the pricing results for
the HDD call options under a no-regime-switching model is included. When the initial regime
is state 2 (cold), the intensity negatively affects the HDD call option, i.e., a higher intensity
implies a faster rate of switching into a “hot” state. This leads to a smaller HDD, and so
a lower option price based on equation (2.46). When we start with a “cold” regime and
during the month of December as considered in our example, HDD calculated from the
two-state HMM model ought to be generally higher than the ones generated from simula-
tion with initial state 1 (hot).
Put differently, when the initial regime is state 2 (cold), lower p's produce higher op-
tion prices. This is because when p is low, the probability of remaining in the initial state
is high. Option prices produced by a no-regime switching model are very close to the ones
generated from the 2-state HMM model with p = 0.1 and initial state 2 in Table 2.11. This
is expected since a very small p indicates a very slow switching from the “cold” state to a
“hot” state within the context of the December contract period.
We utilise an increment level of 0.2 for the sensitivity analyses involving θ∗ , α, and ξ.
The values of optimal estimates listed in Table 2.10 are regarded as starting benchmarks
when adding or lowering increment levels. Figure 2.12 depicts the option prices computed
through varying values of θ1∗ . It is apparent that a change in θ1∗ greatly influences the option
price. With initial state 1, as θ1∗ increases, the option price dramatically decreases and with
initial state 2, we have the opposite conclusion for the option price. This is because when
the mean-level θ∗ is low (cold), HDD is high giving a high positive HDD payoff. Figure
2.13 depicts the impact of varying θ2∗ on the option price. Even though the prices decline
along with the increase of θ2∗ similar to the impact trend of θ1∗ , the prices under θ2∗ are a
bit more spread out upwards compared to those under θ1∗ . This is because θ2∗ represents the
mean level under a cold state. Once we increase θ2∗ , even though option prices decline, any
subsequent switches to the hot state are not sufficient to drastically perturb the price level
under state 2.
Figures 2.14 and 2.15 display the respective plots of option prices by varying the values of
α1 and α2 . Although a slight increase in option price can be identified with a corresponding
increase in α1 , no drastic fluctuations appear under both “cold” and “hot” initial regimes.
The same can be observed in the option prices when α2 is varied in Figure 2.15. However,
the option prices can be seen to be a wee bit higher under initial state 2 for both α1 and α2 .
The impact of ξ1 on the option price starting with the respective “hot” and “cold” states
is illustrated in Figure 2.16. The effect of varying ξ2 on the option price is also presented
in Figure 2.17. The option price decreases with increasing ξ1 irrespective of whether the
initial state is 1 or 2, whilst the opposite is true for the option price when ξ2 increases.
Moreover, ξ1 seems to have a slightly larger effect on the variability of option prices than
ξ2 does given the same increment. An increasing volatility under state 1 (hot) leads to a
rising temperature level, which in turn yields a decreasing option price. The results also
suggest that the variation of ξ2 under initial state 2 contributes more to the variation of the
option prices when compared to those obtained from the same variation levels of ξ2 under
the initial state 1, as shown in Figures 2.16 and 2.17. Higher volatility ξ2 under the initial
"cold" state results in higher option prices, in line with the fact that the sensitivity analysis
is conducted for the cold month of December in our illustrative case.
Remark 4: The standard errors of option prices calculated in (2.47) of subsection 2.6.4,
as well as of those displayed in Figure 2.11 and Table 2.11 of subsection 2.6.5 fall in the
range [$6.67, $9.61].
[Figures 2.12–2.17: HDD call option prices versus strike HDD under initial states 1 and 2, obtained by varying θ*_1, θ*_2, α_1, α_2, ξ_1 and ξ_2, respectively, around the benchmark estimates in Table 2.10]
2.7 Conclusion
We implemented the parameter estimation of a temperature model, comprising seasonal
and stochastic components with mean reversion, designed to accurately capture the characteristics
unique to the dynamics of the DATs. As the model parameters are modulated by an
HMM, the HMM-based filtering technique, via change of reference probability measure,
along with the EM algorithm, was employed to provide self-tuning parameters. We tested
the model and estimation method on a 4-year series of Toronto DATs. The one-step ahead fore-
casting errors and the AIC metrics were generated, and on their bases, the 2-state HMM is
deemed to adequately model the 4-year Toronto DATs data set. Furthermore, we derived
an expression for the price of HDD futures and option contracts by taking into account the
price of risk process. Sensitivity analyses, under a 2-state HMM setting, were also per-
formed by probing the behaviour of the option price as the intensity of the transition matrix
and other model parameters were varied. We found that the mean-reverting level θ and
volatility ξ have significant effects on pricing temperature-linked options.
Our work complements that in [14] and further elevates the use of regime-switching models
for temperature modelling to a level that supports an interactive platform, given a data set,
for the pricing and risk management of weather derivatives. A natural extension of this
work is to investigate dynamic risk management under our setting, using filtered and
optimal parameter estimates obtained from our estimation approach. The results could
then be benchmarked against those from current estimation methodologies for weather-linked
contracts. A stress-testing comparison using the framework put forward here and other
modelling approaches could be pursued as well in the context of portfolio optimisation,
in the spirit of [24] for instance; or risk measurement similar to the objectives of [25]
but for weather derivatives instead; and modelling of weather futures prices constituting a
multivariate time series resembling the analysis in [12].
References
[3] F. Bellini, The weather derivatives market: modelling and pricing temperature, Ph.D.
Thesis, (2005) University of Lugano, Switzerland.
[4] F. Benth, W. Hardle and B. Cabrera, Pricing of Asian temperature risk, Springer,
Berlin, Heidelberg, (2011) 163–199.
[8] S. Campbell, F. Diebold, Weather forecasting for weather derivatives, Journal of the
American Statistical Association, 100 (469) (2005) 6–16.
[9] M. Cao, J. Wei, Weather derivatives valuation and market price of weather risk, Jour-
nal of Futures Markets, 24 (11) (2004) 1065-1089.
[10] W. Ching, T. Siu, L. Li, Pricing exotic options under a high-order Markovian regime
switching model, Journal of Applied Mathematics and Decision Sciences, (2007) 1–
15, doi:10.1155/2007/18014.
[12] P. Date, R. Mamon, A. Tenyakov, Filtering and forecasting commodity futures prices
under an HMM framework, Energy Economics, 40 (2013) 1001–1013.
[13] M. Davis, Pricing weather derivatives by marginal value, Quantitative Finance, 1 (3)
(2001) 305–308.
[14] B. Dischel, At least: a model for weather risk, Weather Risk Special Report, Energy
and Power Risk Management, March issue (1998) 3032.
[15] G. Dorfleitner, M. Wimmer, The pricing of temperature futures at the Chicago Mer-
cantile Exchange, Journal of Banking and Finance, 34(6) (2010) 1360–1370.
[16] J. Dutton, Opportunities and priorities in a new era for weather and climate services,
Bulletin of the American Meteorological Society, 83 (9) (2002) 1303-131.
[18] R. Elliott, Exact adaptive filters for Markov chains observed in Gaussian noise, Auto-
matica, 30 (1994) 1399–1408.
[19] R. Elliott, L. Aggoun and J. Moore, Hidden Markov Models: Estimation and Control,
Springer, New York (1995).
[20] R. Elliott, T. Siu, A. Badescu, Bond valuation under a discrete-time regime switching
term-structure model and its continuous-time extension, Managerial Finance, 37 (11)
(2011) 1025-1047.
[22] C. Erlwein, R. Mamon, An online estimation scheme for a Hull-White model with
HMM-driven parameters, Statistical Methods and Applications, 18 (1) (2009) 87–107.
[23] C. Erlwein, F. Benth, R. Mamon, HMM filtering and parameter estimation of an elec-
tricity spot price model, Energy Economics, 32 (5) (2010) 1034–1043.
[25] H. Gao, R. Mamon, X. Liu, Risk measurement of a guaranteed annuity option under
a stochastic modelling framework, Mathematics and Computers in Simulation, 132
(2017) 100–119.
[30] M. Moreno, Riding the temp, Weather Derivatives, FOW Special Supplement, De-
cember (2000).
[31] J. Norris, Markov chains, Cambridge University Press, Cambridge & New York
(1997).
[33] T. Siu, C. Erlwein, R. Mamon, The pricing of credit default swaps under a Markov-
modulated Merton’s structural model, North American Actuarial Journal, 12 (1)
(2008) 19–46.
[34] A. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge and
New York (1998).
[35] R. Weron, Modeling and Forecasting Electricity Loads and Prices: A Statistical
Approach, John Wiley and Sons, Hoboken, NJ (2006).
[36] X. Xi, R. Mamon, Parameter estimation of an asset price model driven by a weak
hidden Markov chain, Economic Modelling, 28 (1) (2011), 36–46.
3.1 Introduction
Over $2.4 trillion in economic losses and nearly 2 million deaths globally have been re-
ported as a result of weather-related hazards since 1971; see [34]. Catastrophes, such as
floods, tornadoes and hurricanes, are well-publicised as a result of the heavy casualties and
enormous property losses that they cause. But, whilst these weather-driven disasters are
getting more attention, we note that even some slight aberrations of weather conditions, al-
though non-catastrophic, could undeniably still be matters of consequence too. They may
not produce immediately tremendous losses, yet their accompanying potential risks could
materialise later in the future with more frequency as well as lead to more widespread im-
pact on business and economy. Weather risk, for instance, is best depicted by crop yields
that may drop substantially due to prolonged droughts or rainy seasons. Poor weather hugely
affects planned construction schedules. Cold winters increase operational costs of energy
consumers.
Dutton [13] concluded that weather and climate risks exist in nearly 33.3% of
private-industry activities and that 39.1% of the U.S. gross domestic product is weather-sensitive.
In order to hedge the risks associated with such non-catastrophic weather condi-
tions, the so-called weather derivatives were created. Their values depend on underlying
weather indices.
In contrast to the traditional financial derivatives whose underlying assets are typically
bonds and stocks, weather derivatives are dependent on non-negotiable underlying indices
that have no price themselves. Such indices are introduced to quantify weather phenomena
but casually employed to construct the basis of a weather derivative contract. As stated
by Elias et al. [14], over 90% of weather derivatives are temperature-based. Weather fu-
tures and options traded in the CME are written on temperature heating degree day (HDD),
cooling degree day (CDD) and cumulative average temperature (CAT) indices. The most
distinctive feature of these temperature-based indices is that, unlike conventional financial
underlying assets, they are neither tradeable nor storable. Therefore, the traditional no-arbitrage
pricing approach, utilised in the Black-Scholes modelling framework, is not applicable to
pricing weather derivatives [10]. Without delving into the challenges of derivative valua-
tion concerning the choice of an appropriate pricing measure, which are outside the scope
of this research, our aim is to develop a model that could accurately capture the salient
features of the evolution of temperature data. Our proposed model then may be useful in
accurately pricing weather derivatives.
In the literature, several studies were conducted to address the pricing of weather deriva-
tives. It is asserted in Dorfleitner and Wimmer [12] that the historical burn analysis (HBA)
is normally adopted by most investors given its straightforwardness and ease of replication.
Nevertheless, Jewson and Brix [25] argued that HBA is bound to be biased and prone to
large pricing errors, and so, time-series methods are proposed in [25] instead. Davis [10]
dealt with weather derivative pricing by relying on the marginal substitution value principle
of mathematical economics.
More recent studies considered the Ornstein-Uhlenbeck (OU) process in reproducing the
dynamic behaviour of the temperature series. In comparison to simulating temperature
indices in the HBA method, modelling the daily average temperature (DAT) directly gen-
erates more accurate outcomes since it contains more information under both regular and
extreme conditions. The estimated DAT model is then utilised to develop corresponding
temperature indices. These indices are used to determine the price of various weather
derivatives together with their risk management. Dischel [11] first proposed the concept
of adopting a continuous-time stochastic process to capture the dynamics of temperature.
Alaton et al. [1] improved Dischel’s model by incorporating seasonalities in the mean with
a sinusoidal function. Benth and Saltyte-Benth [3] studied DAT variations with an OU pro-
cess, where the noise follows a generalised hyperbolic Lévy process.
However, the application of a one-state stochastic process in almost all of the above-
mentioned papers may not describe the behaviour of temperature with great accuracy, es-
pecially during drastic changes in regimes or emergence of entirely different states as in
the occurrence of occasional climatic changes. This consideration inspires the extension of
the typical OU mean-reverting process with the addition of the regime-switching feature. It
seems that the research of Elias et al. [14] is so far the only paper that employs a regime-
switching approach to model the stochastic behaviour of temperature. They applied lattices
to construct corresponding models and concluded that the HDD and CDD generated by a
two-state process were closer to the observed data than those obtained by a single stochas-
tic process.
To take the research progress several steps beyond the accomplishments in [14],
the purpose of this work is to develop self-calibrating higher-order hidden Markov model
(HOHMM) filtering algorithms that provide improved accuracy in capturing the behaviour
and features of temperature, which will be beneficial for better valuation and risk man-
agement of weather futures and option products. An HOHMM of order k is an HMM
that takes into account the HMM values up to lag k; in the literature, HOHMM is also
termed as a weak HMM as the memoryless assumption is weakened or relaxed to account
for information revealed in the last k steps. The usual HMM has found many financial
and economic applications, and pioneering developments, with the illustration of a two-
state regime-switching model, are highlighted in Hamilton [22]. Elliott et al. [6] made
significant landmarks in the estimation methodology of HMM via the change of measure
technique for model identification after processing batches of data. Since then, researchers
have extended these filtering and estimation methods to a wide range of applications in finance
and economics.
The departure of our contributions from the current state of the art in Markov-switching
temperature modelling is highlighted by the following accomplishments in the study: (i)
Our proposed modelling approach simultaneously captures four important stylised facts of
temperature evolution, which are the mean-reversion, seasonality, memory and stochas-
ticity. (ii) Although adopted from the literature in Markovian regime-switching models
for economic and financial variables, to the best of our knowledge, this research is the
first to put forward a dynamic estimation procedure that readily gives parameter updates
whenever a new set of data becomes available. (iii) The embedding of the HOHMM in
the OU-modelling framework is also new in an attempt to capture the memory property in
the temperature data series. (iv) Within the OU-modelling set up, we develop the filtering
results relying on a transformation that converts a second-order HOHMM into a regular
HMM [26]; hence, our approach enables the estimation of higher-order HOHMM to be
implemented via an HOHMM-order reduction. (v) Finally, systematic and sufficient model
implementation details are laid out including various diagnostics and validation based on
techniques from statistical analysis and inference. Such details are beneficial for practition-
ers in building efficient computing platforms and interfaces with their business software as
well as to other scientists in constructing a modelling framework that examines the movements
of, and the ability to control, the dynamics of similar or related phenomena.
The remaining parts of this chapter are structured as follows. Section 3.2 presents the for-
mulation of temperature modelling with a discrete-time higher-order hidden Markov chain
governing the model parameters. In Section 3.3, the change of probability measure is ap-
plied to derive recursive filtering equations for quantities that are functions of HOHMM and
necessary to carry out an online parameter estimation scheme in Section 3.4. The numer-
ical implementation of our proposed model and estimation method is detailed in Section
3.5 involving a data set of daily temperatures collected at the Toronto Pearson Airport. The
determination of the most appropriate model setting is given in Section 3.5 by comparing
the respective forecasting performance and penalised-log likelihood of different competing
set ups. Section 3.6 concludes.
3.2 Model description

The CME temperature indices over a contract period are the accumulated degree days

HDD = Σ_t max(T_base − T_t, 0)    and    CDD = Σ_t max(T_t − T_base, 0),

where T_base is the base temperature, usually given as 65°F, or 18°C, in the market, and T_t is the
daily average temperature on day t, computed as (T_max + T_min)/2. The contracts written on
these indices have a weekly, monthly or seasonal duration. HDD contracts typically cover
the cold months from October in one year to April of the following year, whilst the period
of the CDD contracts is the warm season that runs from April to October of the same year.
The overlapping months, October and April, are called the transition or shoulder months.
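In code, accumulating these indices over a contract period amounts to a single line each; a small R sketch (assuming a vector T_daily of daily average temperatures in Celsius) is:

```r
# Sketch: accumulated HDD and CDD over a contract period.
hdd <- function(T_daily, T_base = 18) sum(pmax(T_base - T_daily, 0))
cdd <- function(T_daily, T_base = 18) sum(pmax(T_daily - T_base, 0))
```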
Some authors of previous papers (e.g., Dorfleitner and Wimmer [12]) attempted to price
contracts covering these shoulder months as well.
Suppose (Ω, F , P) is an underlying probability space that supports the modelling of the
observed process T t , the DAT on day t. Following Benth et al. [2],
T t = Xt + S t . (3.1)
Equation (3.1) has two components: X_t, which is assumed to follow an OU process with
HOHMM-governed parameters, and S_t, which is a deterministic function devised to pin
down the seasonal trends and DATs’ mean reversion. As in Campbell and Diebold [5], the
seasonal component is given by
S_t = at + b + Σ_{h=1}^{3} [ c_h sin( (2π/365) d_h t ) + e_h cos( (2π/365) d_h t ) ],   (3.2)
Consider the stochastic differential equation (SDE) for the OU process X_t given by

dX_t = α(θ − X_t) dt + ξ dB_t,   (3.3)

where B_t is a standard Brownian motion on some probability space. By Itô's lemma and
for s ≤ t, the SDE in (3.3) has the solution

X_t = X_s e^{−α(t−s)} + (1 − e^{−α(t−s)}) θ + ξ e^{−αt} ∫_s^t e^{αu} dB_u.   (3.4)
Discretising (3.4) over times t_k, the observation process becomes

X_{k+1} = θ (1 − e^{−α Δt_{k+1}}) + e^{−α Δt_{k+1}} X_k + ξ √( (1 − e^{−2α Δt_{k+1}}) / (2α) ) z_{k+1},   (3.5)

where Δt_{k+1} = t_{k+1} − t_k and {z_{k+1}} is a sequence of independent and identically distributed
(IID) standard normal random variables.
We assume that we have a homogeneous Markov chain yk with finite states in discrete time.
Its state space is associated with the canonical basis {e1 , e2 , . . . , eN }, where ei = (0, . . .
, 0, 1, 0, . . . , 0)⊤ ∈ R^N with 1 in the ith position, N stands for the total number of states, and
⊤ denotes the transpose of a matrix. In a typical OU-HMM setting (e.g., Date et al. [9]),
X_k with representation in (3.5) has parameters dependent on the Markov chain y_k, i.e.,

X_{k+1} = θ(y_k)(1 − e^{−α(y_k) Δt_{k+1}}) + e^{−α(y_k) Δt_{k+1}} X_k + ξ(y_k) √( (1 − e^{−2α(y_k) Δt_{k+1}}) / (2α(y_k)) ) z_{k+1}.   (3.6)

In equation (3.6), the speed of mean-reversion α, the mean level of the process θ, and
the volatility ξ are all governed by y_k, making the model regime switching. Given the
representation of the Markov chain's state space, α_k := α(y_k) = ⟨α, y_k⟩, θ_k := θ(y_k) =
⟨θ, y_k⟩, and ξ_k := ξ(y_k) = ⟨ξ, y_k⟩, where ⟨·, ·⟩ is the inner product in R^N.
The second-order hidden Markov chain will be used as a prototype to develop the HOHMM
setting; such prototype’s simplicity will facilitate the discussion of concepts associated with
a generalised Markov chain of lag k (k = 1, 2, . . .). Under this setting, the parameters α, θ
and ξ are governed by a second-order Markov chain ywk , which is defined on the stochastic
basis (Ω, F , {Fk }, P) with the canonical basis {e1 , e2 , . . . , eN }. Write Fk := Fkw ∨ FkB for
the global filtration, where Fkw is the filtration generated by σ{yw0 , . . . , ywk } and FkB is the
filtration generated by Bt .
A discrete-time second-order hidden Markov chain ywk at current time k depends on states
that occurred at two prior lag times k − 1 and k − 2. The key principle in the filtering
and estimation of HOHMM is the utility of a mapping that transforms an HOHMM into a
regular HMM. In our case, a mapping, say, ζ converts the second-order Markov chain into
the usual Markov chain. The transformation ζ is defined as

ζ(e_b, e_c) = e_{bc},   b, c ∈ {1, 2, . . . , N},   (3.7)

where e_{bc} denotes a unit vector in R^{N²} with 1 in its ((b−1)N + c)th position. Then, the new
Markov chain satisfies the relation

E^P[ ζ(y^w_{k+1}, y^w_k) | F^w_k ] = Π ζ(y^w_k, y^w_{k−1}),   (3.8)

or, equivalently, the semi-martingale representation

ζ(y^w_{k+1}, y^w_k) = Π ζ(y^w_k, y^w_{k−1}) + δ^w_{k+1},   (3.9)

where {δ^w_{k+1}}_{k≥1} is a martingale increment. So, E^P[δ^w_{k+1} | F^w_k] = 0 and Π is the associated
N² × N² probability transition matrix.
Suppose y^w_k is a second-order hidden Markov chain that drives the model parameters of X_k.
Write

H_{dcb} := P( y^w_{k+1} = e_d | y^w_k = e_c, y^w_{k−1} = e_b ),

where k ≥ 1 and d, c, b ∈ {1, 2, . . . , N}. As in Siu et al. [28], H is an associated N × N²
probability transition matrix given by

H = ( h_{111}  h_{112}  · · ·  h_{11N}   · · ·   h_{1N1}  h_{1N2}  · · ·  h_{1NN}
      h_{211}  h_{212}  · · ·  h_{21N}   · · ·   h_{2N1}  h_{2N2}  · · ·  h_{2NN}
        ...      ...    · · ·    ...     · · ·     ...      ...    · · ·    ...
      h_{N11}  h_{N12}  · · ·  h_{N1N}   · · ·   h_{NN1}  h_{NN2}  · · ·  h_{NNN} ).
The corresponding N² × N² transition matrix Π of the transformed chain ζ(y^w_k, y^w_{k−1}) is sparse:
its ((d−1)N + c, (c−1)N + b) entry equals h_{dcb}, and all of its remaining entries are zero.
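A small R sketch of this embedding (our own illustration, using the column-stochastic convention and the state ordering (b − 1)N + c described above) is given below.

```r
# Sketch: embed the N x N^2 second-order matrix H into the N^2 x N^2 matrix Pi of the
# transformed chain, i.e. Pi[(d-1)N + c, (c-1)N + b] = h_{dcb}, zero elsewhere.
embed_hohmm <- function(H) {
  N  <- nrow(H)
  Pi <- matrix(0, N^2, N^2)
  for (d in 1:N) for (c in 1:N) for (b in 1:N) {
    Pi[(d - 1) * N + c, (c - 1) * N + b] <- H[d, (c - 1) * N + b]
  }
  Pi
}
```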
Under this set-up, the HOHMM-modulated observation process is written as

X_{k+1} = ϑ(y^w_k) + κ(y^w_k) X_k + %(y^w_k) z_{k+1},   (3.10)

where κ(y^w_k) = e^{−α(y^w_k) Δt_{k+1}}, ϑ(y^w_k) = θ(y^w_k)(1 − e^{−α(y^w_k) Δt_{k+1}}), and
%(y^w_k) = ξ(y^w_k) √( (1 − e^{−2α(y^w_k) Δt_{k+1}}) / (2α(y^w_k)) ).
As we do not observe the state of ywk , it needs to be estimated and to this end, we work as
noted above under an ideal reference probability measure Q whose existence is justified by
Kolmogorov's extension theorem. Again, under Q, {z_k}_{k≥1} is a sequence of IID N(0, 1)
random variables and also independent of ywk .
Let X_k = σ(X_0, X_1, . . . , X_k). We introduce a Girsanov-type density to back out P from Q
given the observations.
We establish adaptive filtering processes, under P, for estimators of quantities that are
functions of ζ(ywk+1 , ywk ) as per equations (3.8) and (3.9), but by doing the analysis and
computations under measure Q. Write
s_k(cb) := P( y^w_k = e_c, y^w_{k−1} = e_b | X_k ) = E^Q[ Ψ^w_k | X_k ].   (3.15)
Along with equations (3.14) and (3.15), the conditional expectation of ζ(y^w_{k+1}, y^w_k) under the
real-world probability measure P could be written explicitly as

s_k = v_k / Σ_{c,b=1}^{N} ⟨v_k, e_{cb}⟩ = v_k / ⟨v_k, 1⟩.   (3.16)
As noted in Xi and Mamon [32], a diagonal matrix is needed to put recursive processes
in place under an HOHMM. Let D_k be an N² × N² diagonal matrix used to estimate several
functions of ζ(y^w_{k+1}, y^w_k), where

D_k = diag( d^1_k, . . . , d^N_k, d^1_k, . . . , d^N_k, . . . , d^1_k, . . . , d^N_k ),

i.e., the block (d^1_k, . . . , d^N_k) is repeated N times along the diagonal.
1. The number of jumps of the chain to state e_t from the pair (e_s, e_r) up to time k, for
2 ≤ l ≤ k and r, s, t = 1, . . . , N, is given by

A^{tsr}_k := Σ_{l=2}^{k} ⟨y^w_{l−2}, e_r⟩ ⟨y^w_{l−1}, e_s⟩ ⟨y^w_l, e_t⟩.   (3.18)
2. The number of occupations up to time k, which is the length of time that the Markov
chain y^w spent in state e_t, 2 ≤ l ≤ k and t = 1, . . . , N, is given by

B^t_k := Σ_{l=2}^{k} ⟨y^w_{l−1}, e_t⟩ = B^t_{k−1} + ⟨y^w_{k−1}, e_t⟩.   (3.19)
3. The number of occupations up to time k with the length of time that the Markov chain
y^w spent in state (e_t, e_s), 2 ≤ l ≤ k and s, t = 1, . . . , N, is calculated as

B^{ts}_k := Σ_{l=2}^{k} ⟨y^w_{l−1}, e_t⟩ ⟨y^w_{l−2}, e_s⟩.   (3.20)
4. The auxiliary process related to the Markov chain y^w for the function f up to time k
in state e_t, 2 ≤ l ≤ k, t = 1, . . . , N, is computed as

C^t_k(f) := Σ_{l=2}^{k} f(X_l) ⟨y^w_{l−1}, e_t⟩ = C^t_{k−1}(f) + f(X_k) ⟨y^w_{k−1}, e_t⟩.   (3.21)

5. The conditional expectation of ζ(y^w_{k+1}, y^w_k) in Equation (3.16) can be written recursively
in terms of the diagonal matrix D_{k+1} as

v_{k+1} = Π D_{k+1} v_k.   (3.22)
In order to get dynamic updates for G_k every time an observed value X_k arrives or a batch
of X_k's becomes available, we provide recursions via the vector quantity G_k ζ(y^w_k, y^w_{k−1}) that
will ultimately lead to an efficient updating of the scalar quantity Ĝ_k.
Write

γ^w_k(G_k) := E^Q[ Ψ^w_k G_k | X_k ].   (3.24)
By equation (3.14), the scalar γ^w_k(G_k) can be calculated using the vector quantity
G_k ζ(y^w_k, y^w_{k−1}), which shows that Ĝ_k can be obtained purely from calculations under Q and
can be updated dynamically if we have recursions for γ^w_k(G_k). So, to implement (3.26) for the
quantities A^{tsr}_k, B^t_k, B^{ts}_k, and C^t_k(f), we utilise the semi-martingale representation (3.9) to
derive their unnormalised recursive filtered estimates.
Let K_t, 1 ≤ t ≤ N, be an N² × N² matrix with e_{it} in its ((i − 1)N + t)th column, for
i = 1, . . . , N, and 0 elsewhere. Then we have the following result.
Proposition 1: The vector recursions involving the respective quantities in equations (3.18)–(3.21) are

γ^w_{k+1}( A^{tsr}_{k+1} ζ(y^w_{k+1}, y^w_k) ) = Π D_{k+1} γ^w_k( A^{tsr}_k ζ(y^w_k, y^w_{k−1}) ) + ⟨s_k, e_{sr}⟩ d^t_{k+1} ⟨Π e_{sr}, e_{ts}⟩ e_{ts},   (3.27)

γ^w_{k+1}( B^t_{k+1} ζ(y^w_{k+1}, y^w_k) ) = Π D_{k+1} γ^w_k( B^t_k ζ(y^w_k, y^w_{k−1}) ) + K_t d^t_{k+1} Π s_k,   (3.28)

γ^w_{k+1}( B^{ts}_{k+1} ζ(y^w_{k+1}, y^w_k) ) = Π D_{k+1} γ^w_k( B^{ts}_k ζ(y^w_k, y^w_{k−1}) ) + ⟨s_k, e_{ts}⟩ d^t_{k+1} Π e_{ts},   (3.29)

γ^w_{k+1}( C^t_{k+1}(f) ζ(y^w_{k+1}, y^w_k) ) = Π D_{k+1} γ^w_k( C^t_k(f) ζ(y^w_k, y^w_{k−1}) ) + f(X_{k+1}) K_t d^t_{k+1} Π s_k.   (3.30)
Proof The derivations are similar to those given in Mamon et al. [26].
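To make the recursions concrete, a minimal R sketch of one filtering step for a 2-state chain (N = 2, so the transformed chain has four states) is given below. It is not the thesis implementation; in particular, the per-regime density ratios d^t_{k+1} are assumed to take the usual Elliott-type form built from the regime parameters κ, ϑ and %.

```r
# Minimal sketch: one step of the unnormalised HOHMM state filter v_{k+1} = Pi D_{k+1} v_k,
# followed by the normalisation of eq. (3.16). kappa, vartheta, varrho are length-2 vectors.
hohmm_filter_step <- function(v, Pi, x_old, x_new, kappa, vartheta, varrho) {
  d <- dnorm((x_new - vartheta - kappa * x_old) / varrho) /
       (varrho * dnorm(x_new))                 # assumed per-regime density ratios d^1, d^2
  D <- diag(rep(d, times = 2))                 # block (d^1, d^2) repeated along the diagonal
  v_new <- Pi %*% D %*% v                      # unnormalised recursion
  list(v = as.vector(v_new), s = as.vector(v_new) / sum(v_new))
}
```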
Since direct maximisation of the log-likelihood functions is not straightforward for such
functions with complicated structures like ours, we employ the EM algorithm, which is an
effective iterative method for parameter estimation, especially for an exponential family of
models [29].
We shall show that our EM results give optimal estimates in terms of the filters in (3.22)
and (3.27)–(3.30). Consider a family of probability measures {P^{υ^w}, υ^w ∈ Υ^w} on (Ω, F^w).
The algorithm is implemented by setting the initial P^{υ^w_0} and then changing from P^{υ^w_0} to P^{υ^w},
thereby giving Π updated entries through the filtering algorithms of the relevant quantities.
Recall that y^w is an HOHMC with transition matrix H = (h_{tsr}) under P^{υ^w}. Hence, a new
probability measure P̂^{υ^w} must be constructed, under which y^w is still an HOHMC but with
transition matrix Ĥ = (ĥ_{tsr}).
From Elliott et al. [6], the appropriate density in conjunction with the EM algorithm for
our successive estimation of the transition probabilities is

dP^{υ^w} / dP^{υ^w_0} |_{X_k} = Λ_k,   Λ_0 = 1, and

Λ_k = Π_{l=2}^{k} Π_{t,s,r=1}^{N} ( ĥ_{tsr} / h_{tsr} )^{ ⟨y^w_{l−2}, e_r⟩ ⟨y^w_{l−1}, e_s⟩ ⟨y^w_l, e_t⟩ }.   (3.31)

When h_{tsr} = 0, ĥ_{tsr} = 0; in this case, we set ĥ_{tsr}/h_{tsr} = 1. The resulting expression for ĥ_{tsr} and
those for the rest of the parameters, computed as well via the EM algorithm, are given in
the succeeding summary.
Proposition 2: The EM estimates, at state t and based on a data series up to time k (k ≥ 1), for
κ_t, ϑ_t, %_t and h_{tsr} are given by

κ̂_t = [ Ĉ^t_k(X_{k−1}, X_k) − ϑ_t Ĉ^t_k(X_{k−1}) ] / Ĉ^t_k(X²_{k−1})
    = [ γ^w_k(Ĉ^t_k(X_{k−1}, X_k)) − ϑ_t γ^w_k(Ĉ^t_k(X_{k−1})) ] / γ^w_k(Ĉ^t_k(X²_{k−1})),   (3.32)

ϑ̂_t = [ Ĉ^t_k(X_k) − κ_t Ĉ^t_k(X_{k−1}) ] / B̂^t_k
    = [ γ^w_k(Ĉ^t_k(X_k)) − κ_t γ^w_k(Ĉ^t_k(X_{k−1})) ] / γ^w_k(B̂^t_k),   (3.33)

%̂_t² = [ Ĉ^t_k(X²_k) + κ²_t Ĉ^t_k(X²_{k−1}) + ϑ²_t B̂^t_k + 2 ϑ_t κ_t Ĉ^t_k(X_{k−1})
        − 2 κ_t Ĉ^t_k(X_{k−1}, X_k) − 2 ϑ_t Ĉ^t_k(X_k) ] / B̂^t_k,   (3.34)

ĥ_{tsr} = Â^{tsr}_k / B̂^{sr}_k,   for all pairs (t, s), t ≠ s.   (3.35)
Thus, when a temperature series is available at time k, we could get automatically new
parameters κt , ϑt , %t , and htsr by running the filtering recursions of the Markov chain given
in Proposition 1.
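A compact R sketch of how these updates are applied per regime is given below; the argument names are ours, and the inputs are the normalised filtered quantities appearing in (3.32)–(3.34).

```r
# Minimal sketch: EM-style updates of kappa_t, vartheta_t and varrho_t for one regime t.
# C_x = C_k^t(X_{k-1}), C_y = C_k^t(X_k), C_xx = C_k^t(X_{k-1}^2), C_yy = C_k^t(X_k^2),
# C_xy = C_k^t(X_{k-1}, X_k), B = B_k^t (all filtered and normalised).
update_regime_params <- function(C_x, C_y, C_xx, C_yy, C_xy, B, vartheta_old) {
  kappa    <- (C_xy - vartheta_old * C_x) / C_xx                      # eq. (3.32)
  vartheta <- (C_y - kappa * C_x) / B                                  # eq. (3.33)
  varrho2  <- (C_yy + kappa^2 * C_xx + vartheta^2 * B +
               2 * vartheta * kappa * C_x - 2 * kappa * C_xy -
               2 * vartheta * C_y) / B                                 # eq. (3.34)
  c(kappa = kappa, vartheta = vartheta, varrho = sqrt(varrho2))
}
```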
Table 3.1 shows some descriptive statistics for the DAT, which guided the choice of initial
values in the implementation process. The values of estimated
parameters from best fitting our data to the specified S t are exhibited in Table 3.2. The
adjusted coefficient of determination R2 shows that 83.5% of the variation in the response
T t can be well explained by the model with all the regressor variables in equation (3.2)
except for e2 and e3 . Consequently, these predictors are eliminated during the process of
variable selection on the basis of the scoring criteria built in the function ‘step’, including
the adjusted R2 and Akaike information criterion (AIC).
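A minimal R sketch of this fitting step is shown below; it assumes d_h = h (annual harmonics) and a data frame holding the DAT series with a day index, which is our reading of (3.2) rather than the thesis' exact script.

```r
# Minimal sketch: fit the seasonal component S_t of eq. (3.2) by least squares and prune
# weak regressors (such as e2, e3) with step(), as done with the R functions 'lm' and 'step'.
make_harmonics <- function(t, H = 3) {
  out <- data.frame(t = t)
  for (h in 1:H) {
    out[[paste0("s", h)]] <- sin(2 * pi * h * t / 365)
    out[[paste0("c", h)]] <- cos(2 * pi * h * t / 365)
  }
  out
}
# df  <- cbind(temp = dat$temp, make_harmonics(seq_along(dat$temp)))
# fit <- step(lm(temp ~ ., data = df))       # AIC-based variable selection
# S_t <- fitted(fit); X_t <- df$temp - S_t   # deseasonalised component
```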
The fitted seasonality component S_t is plotted in Figure 3.1 together with the actual DAT,
which displays a pronounced seasonal pattern. The plot depicts four crests and four troughs
occurring in summer and winter, respectively, of the 4-year period. The main characteristics
of the two graphs neatly jibe with the temperature movements in the four seasons of each
year. The remaining component Xt is treated as our observation process which we shall use
[Figure 3.1: Fitted seasonal component S_t plotted against the actual DAT observations (temperature in Celsius), 01 Jan 2011 – 31 Dec 2014]
The cyclical component of the model captures very well the underlying seasonal trend of
the actual data amidst the noisy values of Xt . The deseasonalised stochastic component
Xt = T t − S t is displayed in Figure 3.2.
We process the data set in 73 batches, and so there are 20 data points in each group with
4t = 1 day. This means that the parameters are updated roughly every 3 weeks. Other
filtering-window sizes can certainly be adopted in the data processing. Our experiment
shows that the choice of the window size has little effect on the numerical outcomes. A
batch of 20 data points is fairly adequate to cover the arrival of new information that might
influence temperature changes, such as winds, ocean currents, and other meteorological
events.
[Figure 3.2: Deseasonalised stochastic component X_t = T_t − S_t (temperature in Celsius), 01 Jan 2011 – 31 Dec 2014]
Our online filtering method is consistent with the above statistical principle because we
use the past and current data through Xt to obtain parameter updates that encapsulate the
information up to present time t. Such parameter updates are used to process a new batch
of accumulated information in order to obtain a succeeding new set of parameter updates.
Also, utilising the same parameter updates, we obtain our predicted values for the calcu-
lation of error analysis metrics. Hence, the data set, used in prediction and calculation of
log-likelihood measures for testing goodness of fit and assessing model complexity in Section
3.5.5, is different from and does not overlap with that used in model estimation.
Table 3.3: Minimum and maximum parameter estimates for S t ’s coefficients covering
twelve 4-year moving windows as described in Section 3.5.3
Our method may appear not in agreement though with the above principle if S t is taken
into account. This is because the S t function must be calculated first using the entire data
set before the Xt data series can be produced. We argue that there is no inconsistency here
whenever the coefficients of the S t component remain almost constant as time evolves.
Such is exactly the case for the S_t's coefficients estimated from our daily temperature read-
ings covering the 4-year period under study (01 Jan 2011 – 31 Dec 2014) and coefficients
estimated from the twelve prior 4-year period moving windows going backwards (i.e., Data
Window I: 01 Jan 2007– 31 Dec 2010; Data Window II: 01 Dec 2006 – 30 Nov 2010; Data
Window III: 01 Nov 2006 – 31 Oct 2010; etc). Table 3.3 displays the [minimum, max-
imum] values of the 12 estimates for each coefficient. Clearly, the variation is so small.
Thus, we take the average of each set of 12 estimates as a proxy for a given coefficient
value so that the use of values in Table 3.2 (as close approximations to the corresponding
proxies) for S t to determine Xt is justified in adherence to the above statistical principle.
Admittedly, the appropriate memory of the data may extend beyond a two-step lag, and our
current HOHMM approach still does not include a mechanism to provide an optimal lag
estimate based on the data series. However, for the purpose of illustrating our filtering
implementation, we verified that our temperature data series exhibits memory, validating the
appropriateness of the HOHMM.
Remark: Indeed, if for a given data set, a lag order of greater than 2 is formally nec-
essary (in a statistical-inference sense), then the transformation in (3.7) can be repeatedly
utilised until a 2nd-order HOHMM set up is obtained, and therefore, our current filtering
and estimation results could be applied in a straightforward manner. Despite the capac-
ity for sophistication of being able to include as many lags as needed, there is also the
practical consideration, especially from the perspective of implementation in the industry,
to balance between the benefit of having a flexible but complex model and the associated
formidable computational cost. Of course, given the power of supercomputers, we envision
that our optimal processing results in this chapter for large lags can be efficiently imple-
mented someday and the dreaded curse of dimensionality can be appreciably alleviated.
With the rapid continuing development in computing architectures, we hope to see that
the complicated part of re-coding and extension of filtering algorithms as the lag becomes
bigger could be facilitated with much ease.
The filtering algorithms are implemented by first finding initial parameter estimates under the
assumption that X_t is a single-state process. A value of 1/N is given to each non-zero element
of the matrix Π. We detail the procedure for setting initial values for κ, ϑ, and ϱ. Given that
{z_{k+1}} in (3.10) are IID standard normal, the likelihood function of X_{k+1} is

L(X_{k+1}; κ, ϑ, ϱ) = ∏_{k=1}^{m} (1/(√(2π) ϱ)) exp( −(X_{k+1} − ϑ − κ X_k)² / (2ϱ²) ),   (3.36)

where 1 ≤ m ≤ 1460 in our case. Equivalently, our task is to seek the maximisers of the sum of
log-likelihoods, i.e.,

∑_{k=1}^{m} log L(X_{k+1}; κ, ϑ, ϱ) = ∑_{k=1}^{m} [ −log(√(2π) ϱ) − (X_{k+1} − ϑ − κ X_k)² / (2ϱ²) ].   (3.37)
[Figure: evolution of the estimated transition probabilities h_111, ..., h_222 over the algorithm steps]
We employ the R function ‘optim’ to solve the optimisation problem in (3.37), obtaining
κ = 0.6518, ϑ = −0.001624 and ϱ = 0.4271; these act as benchmarks in selecting initial
values for the parameter estimation of frameworks with more than one regime.
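The following minimal R sketch indicates how such single-regime benchmark estimates can be obtained with ‘optim’; the simulated series x is only a stand-in for our de-seasonalised data, and the starting values are arbitrary.

    # Single-regime benchmark MLE of (kappa, vartheta, varrho) in
    # X_{k+1} = vartheta + kappa*X_k + varrho*z_{k+1}, cf. (3.36)-(3.37).
    set.seed(1)
    x <- as.numeric(arima.sim(list(ar = 0.65), n = 1460, sd = 0.43))  # stand-in data
    neg_loglik <- function(par, x) {
      kappa <- par[1]; vartheta <- par[2]; varrho <- exp(par[3])      # exp() keeps varrho > 0
      resid <- x[-1] - vartheta - kappa * x[-length(x)]
      -sum(dnorm(resid, mean = 0, sd = varrho, log = TRUE))
    }
    fit <- optim(c(0.5, 0, log(0.5)), neg_loglik, x = x, method = "BFGS")
    c(kappa = fit$par[1], vartheta = fit$par[2], varrho = exp(fit$par[3]))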
3.5.4.2 Evolution of parameter estimates and comparison under HMM and HOHMM
settings
Propositions 1 and 2 aid in getting numerical results from running the self-tuning filtering
algorithms (3.27)-(3.30) on batches of data. Filtered estimates are then fed correspondingly
into parameter estimate representations in (3.32)-(3.35). The filters process new informa-
tion, and in turn, optimally update the parameter estimates. One algorithm run constitutes
an algorithm pass, and we have 73 passes in total. In every algorithm pass beyond the initial
pass, the parameter estimates from the previous pass serve as initial parameter values for
the succeeding pass. The filtering algorithm is implemented on the process Xt driven by
two-state and three-state HOHMCs. Figures 3.4 and 3.5 display the evolution of the parameter
estimates, which converge gradually to unique values after approximately 25 passes in both
the 2-state and 3-state HOHMM set-ups.
Akin to the reliability of the parameter estimates is the quantification of their variability.
Therefore, we examine the variance of the various estimators via the Fisher information

I(v_w) = −E_{v_w}[ ∂²/∂v_w² log L(X; v_w) ],

which bounds the asymptotic variance of the ML estimates. The MLE is consistent and has an
asymptotically normal sampling distribution [36]. So, we utilise the limiting distribution of
the ML estimator v̂_w to obtain the 95% confidence interval for the estimated v_w in the form
v̂_w ± 1.96 / √(I(v̂_w)). The Fisher information involved in each estimator is derived
straightforwardly from the log-likelihood functions in the EM-algorithm calculations detailed
in Appendix C, and the final results of such derivations are listed below.
I(ĥ_tsr) = Â_k^{tsr} / h_tsr²,   (3.38)

I(κ̂_t) = Ĉ_k^t(X_{k−1}²) / ϱ_t²,   (3.39)

I(ϑ̂_t) = B̂_k^t / ϱ_t²,   (3.40)

I(ϱ̂_t) = −B̂_k^t / ϱ_t² + (3/ϱ_t⁴) [ Ĉ_k^t(X_k²) + B̂_k^t ϑ_t² + κ_t² Ĉ_k^t(X_{k−1}²)
         − 2 Ĉ_k^t(X_k) ϑ_t − 2 Ĉ_k^t(X_{k−1} X_k) κ_t + 2 Ĉ_k^t(X_{k−1}) ϑ_t κ_t ].   (3.41)
The 95% confidence intervals for the parameters κ, ϑ, and ϱ in the 2-state HOHMM-based
model are shown in Figure 3.6.
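To illustrate how these expressions translate into interval estimates, a minimal R sketch follows; all numerical inputs (occupation time and regime estimates) are hypothetical and serve only to show the computation v̂_w ± 1.96/√(I(v̂_w)).

    # 95% confidence interval for vartheta_t from its Fisher information, eq. (3.40).
    B_k     <- 850      # filtered occupation time of regime t (hypothetical)
    varrho  <- 0.43     # current estimate of the regime's volatility (hypothetical)
    theta_h <- -0.02    # current estimate of vartheta_t (hypothetical)
    I_theta <- B_k / varrho^2
    theta_h + c(-1, 1) * 1.96 / sqrt(I_theta)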
The 95% confidence intervals generated for the estimated parameters throughout the 73
algorithm passes are extremely narrow, which is attributed to the declining standard errors.
Standard errors of all parameter estimates for the 1-, 2- and 3-state models under both the
HMM and HOHMM settings were examined, and they all become smaller as the number of
algorithm passes increases. The narrow ranges exhibited in Table 3.4 reflect that precise
estimates are achieved by the EM algorithm with our filtering approach.
Figure 3.4: Evolution of parameter estimates for κ, ϑ, and ϱ under a 2-state HOHMM-based
model
Figure 3.5: Evolution of parameter estimates for κ, ϑ, and ϱ under a 3-state HOHMM-based
model
Figure 3.6: Evolution of parameter estimates for κ, ϑ, and ϱ under a 2-state HOHMM-based
model, with 95% confidence intervals
HMM (bounds of the standard errors of the estimates)
Parameter   1-state: Lower / Upper        2-state: Lower / Upper        3-state: Lower / Upper
δ_i         3.62152e-90 / 0.84905         2.88315e-93 / 5.69605e-2      1.23351e-89 / 0.50619
η_i         3.13262e-90 / 0.81819         2.49571e-93 / 5.04276e-2      1.28888e-89 / 0.50535
(·)_i       2.21510e-90 / 0.54026         1.08302e-93 / 2.36474e-2      7.17870e-90 / 0.34230
p_ji        4.79291e-90 / 1.08185         1.24810e-93 / 3.60257e-2      3.52893e-90 / 0.34041

HOHMM (bounds of the standard errors of the estimates)
Parameter   1-state: Lower / Upper        2-state: Lower / Upper        3-state: Lower / Upper
κ_t         3.62152e-90 / 0.84905         1.21738e-96 / 6.33495e-2      1.53624e-90 / 7.84591e-1
ϑ_t         3.13262e-90 / 0.81819         8.94466e-97 / 5.67838e-2      1.33044e-90 / 7.57304e-1
ϱ_t         2.21510e-90 / 0.54026         6.32678e-97 / 3.59050e-2      9.40836e-91 / 4.96797e-1
h_tsr       4.79291e-90 / 1.08185         4.74425e-97 / 7.03355e-2      6.79239e-91 / 8.85929e-1
Table 3.4: Interval of standard errors for the parameter estimates under 1-, 2-, 3-state
HMM- and HOHMM-based models
To make predictions of the DAT values over a one-step-ahead time interval, we evaluate the
expected value of the observation process at time k + 1. Given X_{k+1} in (3.10), we have

X̂_{k+1} = E[X_{k+1} | X_k] = ⟨κ, ŷ_k^w⟩ X_k + ⟨ϑ, ŷ_k^w⟩.   (3.42)
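A minimal R sketch of this one-step-ahead forecast is given below; the filtered state probabilities and per-regime estimates are hypothetical values for a 2-state set-up.

    # One-step-ahead forecast as in (3.42).
    g_k      <- c(0.7, 0.3)       # filtered probabilities of regimes 1 and 2 (hypothetical)
    kappa    <- c(0.62, 0.48)     # per-regime estimates (hypothetical)
    vartheta <- c(-0.01, -0.20)
    X_k      <- 1.4               # current de-seasonalised observation
    sum(g_k * kappa) * X_k + sum(g_k * vartheta)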
Figure 3.7 depicts the one-step-ahead forecasts both for the de-seasonalised data X_k and the
DATs under the 3-state HOHMM-based model using (3.42). The very short-term predictions
for X_k and the DATs are very close to the actual data series, and the same can also be said for
X_k and the DATs under the HMM-based models.
[Figure 3.7: one-step-ahead forecasts of the de-seasonalised series X_k and of the DATs (degrees Celsius) versus the actual values, 2011–2014]
We also present a comparison
between the observed seasonal HDD and the predicted seasonal HDD covering the entire
period (2011-2014) under the 3-state HOHMM-based settings in Figure 3.8. Similar com-
parison was performed, though not shown, under the HMM-based settings. The seasonal
HDD forecasts obtained from the proposed models follow closely the actual seasonal HDD.
Furthermore, the magnified view of the predicted DATs under the HOHMM set ups is given
in Figure 3.9 covering a 6-month period; DATs forecasts under the HMM setting are also
generated but not shown here. All forecasts are relatively close to the actual DATs. The
dynamics and trends of the temperature series are captured well by our filtering algorithms
and self-calibrating parameter estimation method.
Figure 3.8: Comparison of the expected HDD and actual HDD in a 3-state HOHMM-based
model
Figure 3.9: Comparison of one-step-ahead forecasts in 1-, 2-, and 3-state HOHMM-based
models
We perform an error analysis to quantify the goodness of fit of the various HMM and
HOHMM settings using the criteria put forward in Erlwein et al. [19]: the mean square error
(MSE), root mean square error (RMSE), mean absolute error (MAE) and relative absolute
error (RAE). Suppose X_k denotes the actual value at time k, X̂_k stands for the one-step-ahead
prediction at time k, X̄ is the mean of the X_k's, and n = 1460 is the total number of predicted
values.
MSE = (1/n) ∑_{k=1}^{n} (X̂_k − X_k)²,   RMSE = √( (1/n) ∑_{k=1}^{n} (X̂_k − X_k)² ),

MAE = (1/n) ∑_{k=1}^{n} |X̂_k − X_k|,   and   RAE = ∑_{k=1}^{n} |X̂_k − X_k| / ∑_{k=1}^{n} |X_k − X̄|.
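For completeness, a minimal R sketch computing the four metrics is given below, with simulated stand-ins for the actual and predicted series.

    set.seed(1)
    x    <- rnorm(1460)                 # stand-in for the actual values
    xhat <- x + rnorm(1460, sd = 0.1)   # stand-in for the one-step-ahead predictions
    mse  <- mean((xhat - x)^2)
    rmse <- sqrt(mse)
    mae  <- mean(abs(xhat - x))
    rae  <- sum(abs(xhat - x)) / sum(abs(x - mean(x)))
    c(MSE = mse, RMSE = rmse, MAE = mae, RAE = rae)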
Table 3.5 displays the error-analysis results involving Xk and DATs data under the HMM
setting whilst Table 3.6 contains the error-analysis results under the HOHMM setting. Al-
though the errors are generally small, the error metrics illustrate that the 2-state model
outperforms the 1- and 3-state models under both HMM and HOHMM frameworks. In
addition, the 1-, 2- and 3-state HOHMM-based models produce better forecasts than those
generated by their corresponding HMM settings as far as error measures are concerned.
The 4-state HMM-based and HOHMM-based settings were also examined, but no evidence of
even minimal improvement was achieved by making the model more complex, for instance,
by including 64 parameters in its probability transition matrix.
To determine whether the error-mean differences are statistically significant in each pairwise
setting, we perform a t-test using a bootstrap method. We generate RMSEs for all possible
paired HOHMM settings, and then calculate, using the R function ‘p.adjust’, the adjusted
p-values with Bonferroni's method to control the familywise error rate. For the 1-state
HOHMM versus the 2-state HOHMM, the p-value is smaller than 0.01, so we can conclude
that there is sufficient evidence of a significant difference after adding one more regime to the
one-state model. For the 2-state HOHMM vis-à-vis the 3-state HOHMM, the p-value is much
greater than 0.05, meaning that we cannot reject the null hypothesis of no difference. The
estimated p-values covering the three pairs of model settings are shown in Table 3.7. On the
basis of a 5% significance level, the 1- and 2-state and the 1- and 3-state models in both the
HMM and HOHMM frameworks are statistically different, whilst the 2- and 3-state settings
are not. This suggests that there is benefit in using a regime-switching model, and that the 2-
and 3-state HMM and HOHMM settings possess similar capability to capture the DAT
dynamics. That is, a two-state model is sufficient for our data.
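A minimal R sketch of this pairwise testing procedure is shown below; the bootstrapped RMSE samples are simulated placeholders, so the resulting p-values are purely illustrative.

    set.seed(1)
    rmse_1 <- rnorm(500, 0.52, 0.02)   # bootstrapped RMSEs, 1-state model (hypothetical)
    rmse_2 <- rnorm(500, 0.47, 0.02)   # bootstrapped RMSEs, 2-state model (hypothetical)
    rmse_3 <- rnorm(500, 0.47, 0.02)   # bootstrapped RMSEs, 3-state model (hypothetical)
    p_raw <- c("1 vs 2" = t.test(rmse_1, rmse_2)$p.value,
               "1 vs 3" = t.test(rmse_1, rmse_3)$p.value,
               "2 vs 3" = t.test(rmse_2, rmse_3)$p.value)
    p.adjust(p_raw, method = "bonferroni")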
We complement our error analysis with a likelihood-based model selection analysis via the
AIC, which estimates the Kullback–Leibler information under the ML paradigm and is given
by

AIC = 2g − 2 log L(X; v),

where g is the number of estimated parameters in the model, and log L(X; v) denotes the
log-likelihood function of the model given the data X and a set of parameters v. A model is
deemed the best and is chosen if it balances fitness (maximising the log-likelihood) against
complexity (minimising the penalty from too many parameters), yielding the lowest AIC
value. As functions of the observations X_t and the parameter sets v* and v_w, the
log-likelihood functions under the respective HMM and HOHMM settings are
log L(X_k; v*) = ∑_{k=1}^{β} [ log( 1/(√(2π) σ(y_k)) ) − (X_{k+1} − µ(y_k))² / (2σ²(y_k)) ],   (3.43)

log L(X_k; v_w) = ∑_{k=1}^{β} ∑_{t=1}^{N} ⟨y_k^w, e_t⟩ [ log( 1/(√(2π) ϱ(y_k^w)) ) − (X_{k+1} − κ(y_k^w) X_k − ϑ(y_k^w))² / (2ϱ²(y_k^w)) ],   (3.44)
where β denotes the number of observations in each pass, and N is the number of regimes.
From equation (3.10) and the matrix of transition probabilities, we can determine the total
number of parameters in each of the HMM and HOHMM settings and these are presented
in Table 3.8.
Table 3.8: Number of estimated parameters under the HMM and HOHMM settings
Employing equations (3.43) and (3.44), and Table 3.8, we compute the AIC values for
each model as we run through the algorithm passes. Figure 3.10 illustrates the evolution
of the computed AIC values for all candidate models. As the number of regimes grows
in a model, there is a substantial increase in the number of parameters especially coming
from the number of transition probabilities. Our results show that the 2-state HOHMM has
the smallest AIC value and a more stable pattern in the entire data period. Therefore the
2-state model is the best-fitting model for the dynamics of our data set, which agrees with
our error analysis.
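As a worked illustration of the AIC comparison, the short R sketch below uses hypothetical log-likelihoods and parameter counts (the actual counts are those of Table 3.8).

    logL <- c(`1-state` = -21.0, `2-state` = -12.5, `3-state` = -11.8)  # hypothetical
    g    <- c(`1-state` = 3,     `2-state` = 14,    `3-state` = 33)     # hypothetical counts
    2 * g - 2 * logL                                                    # AIC = 2g - 2 log L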
Figure 3.10: Evolution of AIC for 1-, 2-, and 3-state HMM- and HOHMM-based models
More specifically, the benchmark models considered do not have the capability to capture the
important property of mean reversion in temperature data and thus, they are inordinately
disadvantaged in comparison to mean-reverting models. Hence, the
only meaningful benchmark not involving any Markov chain in our case is the one-state
OU process, where there is no switching of regimes. From Figure 3.10, the 1-state model
(i.e., no Markov chain) is performing poorly relative to the 2- and 3-state models. This is
also supported by the HDD error-analysis metrics shown in Table 3.9.
Table 3.9: Results of error analysis for the HDD on HMM- and HOHMM-based models
3.6 Conclusion
The major contribution of this chapter is the development of a model flexible enough to
describe a data set that exhibits seasonality, randomness, mean reversion and memory. Such is
the case for the dynamics of the DATs series, which is the most utilised underlying variable in
the creation of weather-dependent derivatives. In particular, we put forward a model
comprising a deterministic seasonality component and a mean-reverting part governed by an
OU process. The stochasticity is generated by the Brownian motion in the diffusion
component. Embedding an HOHMM facilitated the switching of parameters amongst various
conceivable regimes and also captured short- or long-range dependence.

Our estimation procedure for all the model parameters is successfully achieved through the
extension of the HMM-OU filtering techniques applied to a proposed transformed HOHMM.
With the change of probability measure and the EM algorithm, self-tuning HOHMM
recursive filters were obtained that support online parameter estimation. Our results also
encompass the special case of an HOHMM with lag 1, which corresponds to the usual HMM
recursive filters. Our modelling methodology was tested on Toronto's DATs covering a 4-year
period. Our post-modelling diagnostics reveal reliable one-step-ahead forecasts under various
settings. We must note, though, that the 2-state HOHMM provides the best framework for the
data that we investigated. This is validated by the AIC analysis and the accompanying error
analyses.
The current HMM-modulated models have been employed in many areas of finance and
economics, such as the modelling of commodity futures prices, the development of
asset-allocation strategies, the valuation of interest-rate products, and so on. HMM- and
HOHMM-based methods in weather derivatives were virtually nonexistent until the
publication of the 2-state regime-switching temperature model of Elias et al. [14]. This work
further elevates the development of temperature modelling for weather derivatives in three
respects: (i) provision of dynamic estimation with efficient algorithms, (ii) generalisation of
the usual HMM to take advantage of information beyond lag one, and (iii) empirical results
that reinforce the choice of the optimal regime dimension. We believe that our proposed
method offers an effective alternative to available approaches for efficiently modelling and
forecasting temperature dynamics, with easily interpretable results.
Future work could consider the construction of a pricing measure and the linking of this
pricing measure to the optimal estimates produced from our proposed filtering procedure.
Whilst this is beyond the objectives of this study, we have laid down some groundwork to
motivate that kind of research exploration. It would also be worth examining these
HOHMM-based filtering algorithms and the parameter estimates in relation to the modelling
of other weather measurements, such as wind speed and level of rainfall, either for the
purpose of financial pricing or for meteorological modelling and forecasting. Finally, an
apparent weakness of this research is the modeller's prescription of the HOHMM's lag order
before the filtering algorithms can be implemented. It is hoped that this could be rectified in
future research through the construction of statistical-inference techniques that estimate the
correct lag order as implied by the data.
References
[4] J. Beran, Statistics for Long-Memory Processes, Chapman and Hall, New York
(1994).
[5] S. Campbell, F. Diebold, Weather forecasting for weather derivatives, Journal of the
American Statistical Association, 100 (469) (2005) 6–16.
[6] M. Cao, J. Wei, Weather derivatives: a new class of financial instruments, University
of Toronto Web, http://www.rotman.utoronto.ca/ wei/research/JAI.pdf (2003), Ac-
cessed March 2016.
[7] W. Ching, T. Siu, L. Li, Pricing exotic options under a high-order Markovian regime
switching model, Journal of Applied Mathematics and Decision Sciences., (2007)
1–15, doi:10.1155/2007/18014.
[9] P. Date, A. Tenyakov, R. Mamon, Filtering and forecasting commodity futures prices
under an HMM framework, Energy Economics, 40 (2013) 1001-1013.
[10] M. Davis, Pricing weather derivatives by marginal value, Quantitative Finance, 1 (3)
(2001) 305–308.
[11] B. Dischel, At last: a model for weather risk, Weather Risk Special Report, Energy
and Power Risk Management, March issue (1998).
[12] G. Dorfleitner, M. Wimmer, The pricing of temperature futures at the Chicago Mer-
cantile Exchange, Journal of Banking and Finance, 34(6) (2010) 1360–1370.
[13] J. Dutton, Opportunities and priorities in a new era for weather and climate services,
Bulletin of the American Meteorological Society, 83 (9) (2002) 1303-131.
[15] R. Elliott, Exact adaptive filters for Markov chains observed in Gaussian noise, Auto-
matica, 30 (1994) 1399–1408.
[15] R. Elliott, L. Aggoun, J. Moore, Hidden Markov Models: Estimation and Control,
Springer, New York (1995).
[18] C. Erlwein, R. Mamon, An online estimation scheme for a Hull–White model with
HMM-driven parameters, Statistical Methods and Applications, 18 (1) (2009) 87–
107.
[19] C. Erlwein, F. Benth, R. Mamon, HMM filtering and parameter estimation of an elec-
tricity spot price model, Energy Economics, 32 (5) (2010) 1034–1043.
[20] D. Gujarati, Basic Econometrics (4th ed), McGraw-Hill, New York (2003).
[21] C. Granger, R. Joyeux, An introduction to long memory time series models and frac-
tional differencing, Journal of Time Series Analysis, 1 (1980) 49–64.
[27] R. Mamon, C. Erlwein, B. Gopaluni, Adaptive signal processing of asset price dy-
namics with predictability analysis, Information Sciences, 178 (1) (2008) 203–219.
[28] Price Waterhouse Coopers, 2011 Weather risk derivative survey, Weather Risk
Management Association,
http://library.constantcontact.com/download/get/file/1101687496358-
153/PwC+Survey+Final+Presentation+20110519PRESS.pdf
(2011), Accessed March 2016
[29] L. Rabiner, A tutorial on hidden Markov models and selected applications in speech
recognition, Proceedings of the IEEE, 77 (2) (1989) 257–286.
[31] T. Siu, W. Ching, E. Fung, M. Ng, X. Li, A high-order Markov-switching model for
risk measurement, Computers and Mathematics with Applications, 58 (1) (2009) 1–
10.
[33] A. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge and
New York (1998).
[34] World Meteorological Organization (WMO) and the Centre for Research on the Epi-
demiology of Disasters (CRED) of the Catholic University of Louvain (UCL), Atlas
of Mortality and Economic Losses from Weather, Climate and Water Extremes 1970-
2012, WMO Press, Belgium (2014).
[36] X. Xi, R. Mamon, Parameter estimation of an asset price model driven by a weak
hidden Markov chain, Economic Modelling, 28 (1) (2011) 36–46.
Chapter 4
A higher-order Markov chain-modulated model for electricity spot-price dynamics

4.1 Introduction
Owing to the distinctive characteristics of electricity spot prices, many financial models
designed for regular commodities cannot necessarily be adapted to electricity markets.
Compared to common and tangible assets in the capital markets, electricity is non-storable;
when produced, it must be consumed almost instantly. This non-storability feature makes
its spot price highly sensitive to demand and supply in real time. As a result, electricity
has more dramatic price evolutions than those of other energy sources, such as crude oil
and natural gas. Since electricity cannot be stored, electricity prices are largely dependent
upon supply and demand, which exhibit pronounced seasonality patterns. Somewhat different
from general seasonality, a marked multi-cyclical pattern of prices is evident in electricity
markets. Annual and quarterly patterns, for example, are attributed principally to the
variations in temperature and duration of daylight, especially during winter and summer.
Cyclical patterns also occur weekly, daily and even intra-daily, caused by a quantity
demanded that varies in time.
Approaches to modelling electricity prices can be divided roughly into two categories. The
first focuses on modelling the entire forward curve, geared towards the valuation of associated
derivatives and the short-term prediction of futures prices; see Islyaev and Date [19] and
Fanelli et al. [12]. The second concentrates on the formulation of spot-price models taking
into account only the most relevant fundamental price drivers observed in electricity markets;
see Ziel et al. [47]. The emphasis of our research is on
the latter approach. Early research works mainly put forward stochastic models that only
capture seasonality and mean reversion in prices. The seasonal systematic patterns are typ-
ically described by sinusoidal functions [23] and the Ornstein-Uhlenbeck (OU) process is
prevalently utilised to model the mean reversion as noted in Schwartz [31]. A two-factor
model in an OU-inspired setting was developed by Schwartz and Smith [32] in an effort
to explicitly portray seasonal patterns and short-term mean-reverting variations in prices at
the equilibrium level. However, most of this early literature did not take into account the
peculiarities of electricity prices or examine them thoroughly.
Motivated by the phenomenon of frequent but large price jumps observed in the electric-
ity markets, scholars began to develop modelling capabilities that can handle spikes in
electricity-price fluctuations. Deng [6] pioneered the use of jump-diffusion models with
a mean-reverting process to replicate the distinct characteristics of electricity-spot prices;
such models were fitted to data from the American markets. Benth et al. [1] added jumps
to a general exponential multi-factor mean-reverting model and performed calibration em-
ploying data from the Nord-Pool market. In Seifert and Uhrig-Homburg [34], different
models for the jump component in the electricity markets were explored and the effec-
tiveness of different jump specifications compared; such models were applied to the Euro-
pean Energy Exchange (EEX) market. Many research proponents advanced the utility of
Poisson-jump models, whilst others, in recent literature, argued that modelling the ‘jumpy’
attribute could be achieved better by introducing a regime-switching approach. Huisman
and Mahieu [18] proposed a regime-shifting jump process with a recovery state and con-
cluded that it performed better than a stochastic jump model whilst a two-regime-switching
model with ‘abnormal’ and ‘normal’ states was shown superior to a Poisson-jump model
in De Jong [7] in capturing electricity-price dynamics. Wu et al. [37] studied the Al-
berta electricity-spot market via a hidden Markov model with a multi-model identification
approach. Alberta electricity pool prices are divided into five classes, and the last three
classes with prices over $100/MWh are deemed high-pool-price regions. It was found that
price-forecasting performance could be improved substantially, especially in high-pool-
price regions, by incorporating a Markovian regime-switching mechanism.
In the context of regime-switching approaches, hidden Markov models (HMMs) have been
widely successful in many engineering, economic and financial applications; see Mamon
and Elliott [20, 25]. HMMs are beneficial for modelling processes that exhibit
regime-switching dynamics via Markov chains with latent states. This approach was previously
applied in the examination of electricity-price behaviour. A non-stationary model based
on input-output HMM was developed to model time series of spot prices in the Spanish
electricity market [15]. As well, Yu and Sheblé [46] utilised HMMs to efficiently model
price movements in the electricity market and provide good forecasts adjudged from the
perspective of accuracy and dynamic information in the US market.
The assumption of first-order state-transition dependency in the HMM renders this model
deficient in taking advantage of the information content of historical data. Thus, researchers
recently introduced higher-order hidden Markov models (HOHMMs), also called weak
hidden Markov models. An HOHMM is a doubly embedded stochastic process having
an observation series and an underlying unobserved Markov chain. The probability dis-
tribution of the Markov chain’s state transition at present depends not only on the most
recent state but also on states at prior epochs into the past. The primary aim in practice
is to obtain the best estimate of the Markov chain’s current or future state, which repre-
sents the ‘filtered’ or ‘predicted’ state of the market or economic system. This estimation is
performed by taking the conditional expectations of functions of the Markov chain and ob-
served electricity spot prices that are regarded as offshoot of the interaction of many latent
factors, such as market participants’ actions, production and consumption, weather condi-
tions, transmission network, etc. In our context, HOHMM setting is designed to capture the
presence of memories in electricity market prices that will provide additional information
in the estimation and forecasting of economic regimes and other model parameters.
To the best of our knowledge, this research is the first to build an HOHMM-modulated
electricity spot-price model. The regular HMM approach is extended, and the HOHMM's
prediction performance is examined. Our work can be viewed as both an update and an
extension of Erlwein et al.'s model construction [7], which was based on an exponential-OU
process with a jump component under the HMM setting, albeit Erlwein et al.'s empirical
application covers the Nord-Pool market whilst ours investigates the Alberta electricity
market. The structure of the electricity industry in Alberta is unique in North
America and apparently different from that of Norway, as the former developed from a set of
uniquely distinguishable circumstances. In particular, the Alberta Electric System Operator
(AESO) relies on wind farms, and to avoid power shortages when the wind drops off or does
not blow consistently, coal and natural-gas plants have to take up the slack.
Modelling the behaviour of electricity spot prices is of utmost importance because they serve
as the underlying variable for the values of many traded electricity contracts, and they are key
indicators in the strategic planning and investment decision-making of various stakeholders in
the electricity market. Our main research contribution is the creation of an
HOHMM-OU-Poisson framework that simultaneously delineates five salient properties of
electricity spot prices. The implementation portion of this work highlights model validation
and post-modelling diagnostics. Capitalising on Erlwein et al.'s use of the HMM in electricity
spot-price modelling [7], this research further highlights HOHMC-modulated model
parameters to address memory in time-series data. Our recursive filters are also expressed
more compactly in matrix notation.
This chapter is organised as follows. In Section 4.2, we build an electricity spot-price model
whose parameters are governed by an HOHMM in discrete time. Through a change of
reference probability technique, adaptive filters are presented in Section 4.3 for the states of
the HOHMC and related quantities of the observation process. We outline the self-calibrating
parameter estimation scheme in terms of the recursive filters via the EM algorithm in Section
4.4. Numerical implementation, which includes an assessment of the model's goodness of fit
and prediction ability, is performed on 4 years of AESO data in Section 4.5. The implication
of our proposed HOHMM-modulated electricity spot-price model for forward pricing is also
illustrated. Section 4.6 provides some concluding remarks.
models to the Alberta electricity market; however, these models could not produce spikes in
the dynamics of electricity spot prices. Invoking Benth et al. [1], we suppose the electricity
spot price S_t at time t, defined on some probability space (Ω, F, P), is given by

S_t = D_t e^{X_t}.   (4.1)
Equation (4.1) comprises two components to handle seasonality, and stochasticity coming
from normal perturbations and jump behaviours. The deterministic function Dt accounts
for regularities in the price evolutions and periodic trends. The stochastic part Xt is as-
sumed to be an OU process (to rationalise the tendency to return to a long-run mean) plus
a compound Poisson process that deals with excessive volatilities and spikes.
component; these equip the model with the capability to capture mean reversion, price variations and spike
dynamics. The process X_t in equation (4.1) satisfies the stochastic differential equation (SDE)

dX_t = α(θ − X_t) dt + ξ dW_t + dJ_t,   (4.3)
where α, θ, and ξ stand for the speed of mean reversion, mean-reverting level, and volatility,
respectively. A Brownian motion {Wt } models the random price fluctuations under a stable
market condition, whilst {Jt } is designed to pick up the price spikes. A compound Poisson
process affords the capacity to model jumps, as adduced in [11, 20]; similar to their settings,
we let dJ_t = β dP_t, where P_t is a Poisson process with a constant intensity ς and a normally
distributed jump size β ∼ N(µ_β, σ_β²). For s ≤ t, the continuous-time solution of equation
(4.3), by Itô's lemma, is

X_t = X_s e^{−α(t−s)} + (1 − e^{−α(t−s)}) θ + ξ e^{−αt} ∫_s^t e^{αu} dW_u + ∫_s^t e^{−α(t−u)} dJ_u.   (4.4)
In discrete time, equation (4.4) yields

X_{k+1} = X_k e^{−α Δt_{k+1}} + (1 − e^{−α Δt_{k+1}}) θ + ξ √((1 − e^{−2α Δt_{k+1}})/(2α)) z_{k+1}
          + ∑_{m=1}^{P_{Δt_{k+1}}} e^{−α(Δt_{k+1} − υ_m)} β_m,   (4.5)

where Δt_{k+1} = t_{k+1} − t_k for k ∈ Z* := Z⁺ ∪ {0}, υ_m is the occurrence time of the mth jump,
and {z_{k+1}} is a sequence of independent and identically distributed (IID) standard normal
random variables.
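To make the discretised dynamics concrete, a minimal single-regime simulation sketch in R is given below; all parameter values are purely illustrative, the within-interval discounting of individual jump times is ignored, and D_t is a toy seasonal function.

    set.seed(1)
    n <- 1460; dt <- 1
    alpha <- 0.6; theta <- -0.15; xi <- 0.6        # OU parameters (hypothetical)
    lam <- 0.05; mu_b <- 1.0; sd_b <- 0.3          # jump intensity and jump-size law (hypothetical)
    D <- 30 + 5 * sin(2 * pi * (1:n) / 365)        # toy seasonal component D_t
    X <- numeric(n)
    for (k in 1:(n - 1)) {
      jumps <- rpois(1, lam * dt)
      J <- if (jumps > 0) sum(rnorm(jumps, mu_b, sd_b)) else 0
      X[k + 1] <- X[k] * exp(-alpha * dt) + theta * (1 - exp(-alpha * dt)) +
        xi * sqrt((1 - exp(-2 * alpha * dt)) / (2 * alpha)) * rnorm(1) +
        exp(-alpha * dt) * J
    }
    S <- D * exp(X)                                # spot price, as in equation (4.1)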
To create a regime-switching model for electricity spot prices, we begin with a homogeneous
Markov chain y_k with a finite state space {e_1, e_2, ..., e_N}, where e_i = (0, ..., 0, 1, 0, ..., 0)⊤ ∈ R^N
with 1 in the ith position, ⊤ is the matrix transpose operator, and N is the state-space
dimension. Model parameters in equation (4.5) switch randomly, in accordance with the
dictates of y_k, amongst different electricity-market regimes as time progresses. With the
canonical basis as y_k's state space, we have α_k := α(y_k) = ⟨α, y_k⟩, θ_k := θ(y_k) = ⟨θ, y_k⟩,
ξ_k := ξ(y_k) = ⟨ξ, y_k⟩, and β_k := β(y_k) = ⟨β, y_k⟩, where ⟨·, ·⟩ is the inner product in R^N.
To completely characterise the parameter estimation and filtering under the HOHMM set-
ting, we concentrate on a second-order hidden Markov chain. Of course, in theory and
principle, the estimation and filtering for the generalised Markov chain with order greater
than 2 can be extended in a straightforward manner, notwithstanding the corresponding
computational challenge. Suppose yw is a second-order hidden Markov chain regulating
the evolution of α, θ, ξ, and β. We define Fk := Fkw ∨ Fkz ∨ FkJ as the global filtration,
where Fkw , Fkz , and FkJ are filtrations generated by {ywk }, {Wt } and {Jt }, respectively. Under
the probability space (Ω, F , {Fk }, P), the discrete-time hidden Markov chain ywk at the cur-
rent step k depends on the information revealed at two prior steps k − 1 and k − 2. Following
[28], write T for the R^{N×N²} transition probability matrix, specified as

T := [ p_111  p_112  ···  p_11N  ···  p_1N1  p_1N2  ···  p_1NN
       p_211  p_212  ···  p_21N  ···  p_2N1  p_2N2  ···  p_2NN
         ⋮      ⋮            ⋮            ⋮      ⋮            ⋮
       p_N11  p_N12  ···  p_N1N  ···  p_NN1  p_NN2  ···  p_NNN ],

where p_dcb := P(y^w_{k+1} = e_d | y^w_k = e_c, y^w_{k−1} = e_b) with k ≥ 1 and d, c, b ∈ {1, 2, ..., N},
which stands for the probability that the Markov chain will be in state d at time k + 1, given
that it is in state c at time k and in state b at time k − 1.
A vital strategy in the estimation and filtering of HOHMMs is a mapping that converts an
HOHMM into an HMM, so that the usual HMM filtering methods can then be conveniently
applied. For our second-order HMC, let ϖ be the transformation defined by

ϖ(y^w_{k+1}, y^w_k) = e_bc   whenever y^w_{k+1} = e_b and y^w_k = e_c,

where e_bc is the unit vector in R^{N²} with 1 in its ((b − 1)N + c)th position. We obtain a new
Markov chain ϖ(y^w_{k+1}, y^w_k), whose state space is the canonical basis of R^{N²} and whose
dynamics admit the semi-martingale representation

ϖ(y^w_{k+1}, y^w_k) = A ϖ(y^w_k, y^w_{k−1}) + m^w_{k+1},   (4.9)

where {m^w_{k+1}}_{k≥1} is a sequence of martingale increments with E[m^w_{k+1} | F^w_k] = 0 under
the real-world measure P. The matrix A = (a_ji) is the R^{N²×N²} transition probability matrix
with entries a_ji := p_dcb = P(y^w_{k+1} = e_d | y^w_k = e_c, y^w_{k−1} = e_b) for j = (d − 1)N + c and
i = (c − 1)N + b, and a_ji = 0 otherwise.
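The following R sketch shows, for a toy two-state chain, how second-order transition probabilities p_dcb can be embedded into the N² × N² matrix A with a_ji = p_dcb for j = (d − 1)N + c and i = (c − 1)N + b; the probabilities themselves are arbitrary illustrative numbers.

    N <- 2
    p <- array(0, dim = c(N, N, N))                 # p[d, c, b] = P(y_{k+1}=d | y_k=c, y_{k-1}=b)
    p[, 1, 1] <- c(0.9, 0.1); p[, 1, 2] <- c(0.8, 0.2)
    p[, 2, 1] <- c(0.3, 0.7); p[, 2, 2] <- c(0.2, 0.8)
    A <- matrix(0, N^2, N^2)
    for (d in 1:N) for (c in 1:N) for (b in 1:N) {
      A[(d - 1) * N + c, (c - 1) * N + b] <- p[d, c, b]
    }
    colSums(A)                                      # each column sums to 1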
To aid the filtering computations in our HOHMM setting, we re-express X_{k+1} in equation
(4.6) as

X_{k+1} = ϑ(y^w_k) + κ(y^w_k) X_k + ϱ(y^w_k) z_{k+1} + τ(y^w_k),   (4.10)

where

κ(y^w_k) = e^{−α(y^w_k) Δt_{k+1}},   (4.11)

ϑ(y^w_k) = (1 − e^{−α(y^w_k) Δt_{k+1}}) θ(y^w_k),   (4.12)

ϱ(y^w_k) = ξ(y^w_k) √( (1 − e^{−2α(y^w_k) Δt_{k+1}}) / (2α(y^w_k)) ),   (4.13)

τ(y^w_k) = ∑_{h=1}^{P_{Δt_{k+1}}} e^{−α(y^w_k)(Δt_{k+1} − υ_h)} β_h(y^w_k).   (4.14)
Akin to the change of measure is an ideal reference measure P̃, whose construction is justified
by Kolmogorov's extension theorem [3]. As noted earlier, the corresponding dynamics of J_k
and y^w_k are unaltered by the measure change. We may recover P from P̃, given an
F^w_k-adapted process, for k ≥ 1, through the Radon–Nikodym derivative

Ψ^w_k = dP/dP̃ |_{F^w_k} = ∏_{l=1}^{k} ϕ^w_l,   (4.15)

and

ϕ^w_l = φ( ϱ(y^w_{l−1})^{−1} [ X_l − ϑ(y^w_{l−1}) − κ(y^w_{l−1}) X_{l−1} − τ(y^w_{l−1}) ] ) / ( ϱ(y^w_{l−1}) φ(X_l) )
      = exp( [ϑ(y^w_{l−1}) + κ(y^w_{l−1}) X_{l−1} + τ(y^w_{l−1})] X_l / ϱ²(y^w_{l−1})
             − [ϑ(y^w_{l−1}) + κ(y^w_{l−1}) X_{l−1} + τ(y^w_{l−1})]² / (2ϱ²(y^w_{l−1})) ),

where φ is the probability density function of a standard normal random variable, Ψ^w_0 = 1,
and {Ψ^w_l, l ∈ Z⁺} is an F^w_l-adapted martingale under P.
In particular, let g_k = (g_k(1), g_k(2), ..., g_k(cb), ..., g_k(NN))⊤ ∈ R^{N²}, where
g_k(cb) := P[y^w_k = e_c, y^w_{k−1} = e_b | X_k] = E[⟨ϖ(y^w_k, y^w_{k−1}), e_cb⟩ | X_k]. A filter for
ϖ(y^w_k, y^w_{k−1}) under P is given by

g_k = E[ϖ(y^w_k, y^w_{k−1}) | X_k] = E_P̃[Ψ^w_k ϖ(y^w_k, y^w_{k−1}) | X_k] / E_P̃[Ψ^w_k | X_k].   (4.16)

Write γ_k := E_P̃[Ψ^w_k ϖ(y^w_k, y^w_{k−1}) | X_k]. Since ∑_{c,b=1}^{N} ⟨ϖ(y^w_k, y^w_{k−1}), e_cb⟩ =
⟨ϖ(y^w_k, y^w_{k−1}), 1⟩ = 1, where 1 is an R^{N²} vector with 1 in all of its entries, we have

∑_{c,b=1}^{N} ⟨γ_k, e_cb⟩ = ∑_{c,b=1}^{N} ⟨E_P̃[Ψ^w_k ϖ(y^w_k, y^w_{k−1}) | X_k], e_cb⟩
                        = E_P̃[ Ψ^w_k ∑_{c,b=1}^{N} ⟨ϖ(y^w_k, y^w_{k−1}), e_cb⟩ | X_k ]
                        = E_P̃[Ψ^w_k | X_k].   (4.17)

Therefore, the filter of ϖ(y^w_{k+1}, y^w_k) in equation (4.16) under P has the explicit representation

g_k = γ_k / ∑_{c,b=1}^{N} ⟨γ_k, e_cb⟩ = γ_k / ⟨γ_k, 1⟩.   (4.18)
To derive recursive processes and estimate relevant quantities in terms of ϖ(y^w_{k+1}, y^w_k), we
need to construct two N² × N² matrices by following Xi and Mamon [38]. These are K_t, with
e_it on its ((i − 1)N + t)th column and 0 elsewhere for 1 ≤ i, t ≤ N, and the diagonal matrix

H_k = diag( h_k^1, ..., h_k^N, h_k^1, ..., h_k^N, ..., h_k^1, ..., h_k^N ),

in which the block diag(h_k^1, ..., h_k^N) is repeated N times along the diagonal.
Following Xiong and Mamon [43], we define certain scalar processes of interest involving the
HOHMC y^w_k. These are as follows:

(i) A_k^{tsr} = ∑_{l=2}^{k} ⟨y^w_{l−2}, e_r⟩ ⟨y^w_{l−1}, e_s⟩ ⟨y^w_l, e_t⟩,   (4.20)

which refers to the number of jumps from state (e_r, e_s) to state e_t up to time k, where
2 ≤ l ≤ k and r, s, t = 1, ..., N;

(ii) B_k^t = ∑_{l=2}^{k} ⟨y^w_{l−1}, e_t⟩ = B_{k−1}^t + ⟨y^w_{k−1}, e_t⟩,   (4.21)

which gives the occupation time up to time k, or the length of time that y^w_k has spent in state
e_t, where 2 ≤ l ≤ k and t = 1, ..., N;

(iii) B_k^{ts} = ∑_{l=2}^{k} ⟨y^w_{l−1}, e_t⟩ ⟨y^w_{l−2}, e_s⟩,   (4.22)

which represents the occupation time up to time k, or the length of time that y^w_k has spent in
state (e_t, e_s), where 2 ≤ l ≤ k and s, t = 1, ..., N;

(iv) C_k^t(f) = ∑_{l=2}^{k} f(X_l) ⟨y^w_{l−1}, e_t⟩ = C_{k−1}^t(f) + f(X_k) ⟨y^w_{k−1}, e_t⟩,   (4.23)

which is an auxiliary process related to y^w_k for some function f up to time k in state e_t,
where 2 ≤ l ≤ k and t = 1, ..., N. Here, f takes the functional forms f(X) = X, f(X) = X², or
f(X) = X_{k−1} X_k.
Consequently, the conditional expectation of ϖ(y^w_{k+1}, y^w_k) in equation (4.18) can be written
recursively as

g_{k+1} = A H_{k+1} g_k.   (4.24)
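A minimal R sketch of one step of this recursion, with the toy matrix A from the earlier sketch and hypothetical observation densities on the diagonal of H, is given below.

    A <- matrix(c(0.9, 0.8, 0.0, 0.0,
                  0.0, 0.0, 0.3, 0.2,
                  0.1, 0.2, 0.0, 0.0,
                  0.0, 0.0, 0.7, 0.8), nrow = 4, byrow = TRUE)  # column-stochastic toy A
    H <- diag(c(1.2, 0.6, 1.2, 0.6))      # hypothetical observation densities per state pair
    g <- c(0.4, 0.1, 0.3, 0.2)            # current filtered probabilities
    g_new <- as.vector(A %*% H %*% g)
    g_new / sum(g_new)                    # renormalisation as in (4.18)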
Suppose U_k is any X_k-measurable process denoting any of the quantities in equations
(4.20)–(4.23). Write D^w_k[U_k] := E_P̃[Ψ^w_k U_k | X_k] and Û_k := E[U_k | X_k]. Here, Û_k is
regarded as the ‘best estimate' of U_k. Similar to the steps in establishing equation (4.18), the
conditional expectation of U_k given X_k can be obtained using calculations that are entirely
under P̃ by observing that

Û_k = E[U_k | X_k] = E_P̃[Ψ^w_k U_k | X_k] / E_P̃[Ψ^w_k | X_k] = D^w_k[U_k] / ⟨γ_k, 1⟩.   (4.25)
Evaluating the numerator of (4.25) (cf Mamon et al. [26]) for each quantity defined in equa-
tions (4.20)–(4.23) and taking advantage of the semi-martingale representation in (4.9), re-
cursive filtering equations are obtained that will provide self-calibrating estimates of the
HOHMM parameters. This will be elaborated further in Section 4.4.
Proposition 1: The filtering recursions for the respective quantities in equations (4.20)–(4.23)
are

D^w_{k+1}[A^{tsr}_{k+1} ϖ(y^w_{k+1}, y^w_k)] = A H_{k+1} D^w_k[A^{tsr}_k ϖ(y^w_k, y^w_{k−1})] + ⋯,   (4.26)

D^w_{k+1}[B^t_{k+1} ϖ(y^w_{k+1}, y^w_k)] = A H_{k+1} D^w_k[B^t_{k+1} ϖ(y^w_k, y^w_{k−1})] + ⋯,   (4.27)

D^w_{k+1}[B^{ts}_{k+1} ϖ(y^w_{k+1}, y^w_k)] = A H_{k+1} D^w_k[B^{ts}_{k+1} ϖ(y^w_k, y^w_{k−1})] + ⋯,   (4.28)

D^w_{k+1}[C^t_{k+1}(f) ϖ(y^w_{k+1}, y^w_k)] = A H_{k+1} D^w_k[C^t_{k+1}(f) ϖ(y^w_k, y^w_{k−1})] + ⋯.   (4.29)

Proof: The proofs of (4.26)–(4.29) are similar to those given in Mamon et al. [26].
log-likelihood function given the information reflected in X_t. Model parameters are governed
by y^w_t, but suppose that the HOHMM and all model parameters remain unchanged over the
infinitesimal interval [s, t]. Then, the regular HOHMM-modulated OU process without jumps
has a normal distribution with mean X_s e^{−α(t−s)} + (1 − e^{−α(t−s)}) θ and variance
ξ²(1 − e^{−2α(t−s)})/(2α).
The pdf of X_t can be derived completely if the distribution of the jump component J_t is also
determined. As stated in Section 4.2.2, J_t relies on β and P. We recall that β ∼ N(µ_β, σ_β²) and
P has a constant jump intensity ς. Utilising the integral approximation
∫_s^t e^{−α(t−u)} dJ_u ≈ e^{−α(t−s)}(J_t − J_s) in Erlwein et al. [7], the density of the jump
component is

Φ_{J_{t−s}}(x) = e^{−ς(t−s)} ∑_{m=0}^{∞} ((ς(t−s))^m / m!) φ( x; µ_β e^{−α(t−s)} m, σ_β² e^{−2α(t−s)} m ).   (4.30)

By noting the stationarity of the compound Poisson process, equation (4.30) is also the density
function of the increment J_t − J_s.
In equation (4.4), X_t is the sum of a regular OU process and a jump term. As in [17], and
focusing first on the non-switching setting, we can utilise the convolution of the OU and J_t
densities to get the probability density of X_t given X_s, which turns out to be an expectation of
a normal density. That is,

Φ_{X_t | X_s}(x) = e^{−ς(t−s)} ∑_{m=0}^{∞} ((ς(t−s))^m / m!)
                  × φ( x; X_s e^{−α(t−s)} + (1 − e^{−α(t−s)}) θ + µ_β m e^{−α(t−s)},
                       ξ²(1 − e^{−2α(t−s)})/(2α) + σ_β² m e^{−2α(t−s)} )
               = E_{P_{Δt}}[ φ( x; X_s e^{−α(t−s)} + (1 − e^{−α(t−s)}) θ + µ_β P_{Δt} e^{−α(t−s)},
                       ξ²(1 − e^{−2α(t−s)})/(2α) + σ_β² P_{Δt} e^{−2α(t−s)} ) ],   (4.31)
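The Poisson-mixture density above can be evaluated numerically by truncating the sum; the short R sketch below does this with purely illustrative parameter values.

    dens_X <- function(x, Xs, dt, alpha, theta, xi, zeta, mu_b, sd_b, m_max = 30) {
      m  <- 0:m_max
      w  <- dpois(m, zeta * dt)                      # Poisson weights
      mu <- Xs * exp(-alpha * dt) + (1 - exp(-alpha * dt)) * theta + mu_b * m * exp(-alpha * dt)
      v  <- xi^2 * (1 - exp(-2 * alpha * dt)) / (2 * alpha) + sd_b^2 * m * exp(-2 * alpha * dt)
      sum(w * dnorm(x, mean = mu, sd = sqrt(v)))
    }
    dens_X(0.2, Xs = 0.1, dt = 1, alpha = 0.6, theta = -0.15,
           xi = 0.6, zeta = 0.05, mu_b = 1, sd_b = 0.3)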
Going back to the implementation of the EM algorithm under our HOHMM setting, let P_Δt be
q and assume (for tractability) that P_t is independent of the other parameters in the model.
From (4.10) and (4.31), the discrete-time process X_{k+1} under the HOHMM framework is set
to follow a mixture of normal distributions with

µ_X(y^w_k) = ϑ(y^w_k) + κ(y^w_k) X_k + µ_{β(y^w_k)} q κ(y^w_k)

and

σ_X²(y^w_k) = ϱ²(y^w_k) + σ²_{β(y^w_k)} q κ²(y^w_k).
We now compute the maximum likelihood estimates (MLEs) of the model parameters via the
EM algorithm. Define E^w = {κ_t, ϑ_t, ϱ_t, µ_{β_t}, σ_{β_t}, p_tsr, 1 ≤ t, s, r ≤ N} as the set of
HOHMM-based parameters. Starting with a set E^w_0 of initial parameters, the recursive
parameter updates yield an updated set Ê^w of MLEs, where Ê^w ∈ argmax_{E^w} L(E^w) and
L(E^w) = E_{E^w_0}[ dP^{E^w}/dP^{E^w_0} | X_k ]. Following Xiong and Mamon's arguments [43], the
estimation of the matrix T of transition probabilities can also be facilitated by a change of
measure from P^{E^w_0} to P^{E^w}, and the entries are updated automatically through the filtering
processes. It should be noted that y^w_k is still an HOHMC under both P^{E^w} and P^{Ê^w}, but
each measure has the corresponding transition matrix T = (p_tsr) or T̂ = (p̂_tsr). To estimate
the transition probabilities in succession, i.e., by substituting T with T̂, we utilise the
likelihood function in combination with the EM algorithm, with the convention
p̂_tsr / p_tsr = 1 whenever p_tsr = 0 and p̂_tsr = 0. The estimation of the parameters of the
observation process in equation (4.10) is accomplished by blending the EM algorithm and the
recursive filters in (4.26)–(4.29). The resulting outcomes for p̂_tsr and the remaining
parameters are given as follows.
κ̂_t = [ Ĉ_k^t(X_k X_{k+1}) − ϑ_t Ĉ_k^t(X_k) + µ_{β_t} q Ĉ_k^t(X_{k+1}) − ϑ_t q µ_{β_t} B̂_k^t ]
      / [ Ĉ_k^t(X_k²) + 2 µ_{β_t} q Ĉ_k^t(X_k) + µ_{β_t}² q² B̂_k^t ],   (4.35)

µ̂_{β_t} = [ Ĉ_k^t(X_{k+1}) − ϑ_t B̂_k^t − κ_t Ĉ_k^t(X_k) ] / ( κ_t q B̂_k^t )
        = [ D^w_k(C_k^t(X_{k+1})) − ϑ_t D^w_k(B_k^t) − κ_t D^w_k(C_k^t(X_k)) ] / ( κ_t q D^w_k(B_k^t) ),   (4.36)

σ̂²_{β_t} = [ Ĉ_k^t(X_{k+1}²) + κ_t² Ĉ_k^t(X_k²) + B̂_k^t ( ϑ_t² + (κ_t µ_{β_t} q)² + 2 ϑ_t κ_t µ_{β_t} q − ϱ_t² )
            + Ĉ_k^t(X_k)(2 µ_{β_t} q κ_t² + 2 ϑ_t κ_t) − (2 ϑ_t + 2 κ_t µ_{β_t} q) Ĉ_k^t(X_{k+1})
            − 2 κ_t Ĉ_k^t(X_{k+1} X_k) ] / ( B̂_k^t κ_t² q ),   (4.37)

p̂_tsr = Â_k^{tsr} / B̂_k^{sr} = D^w_k(A_k^{tsr}) / D^w_k(B_k^{sr}),   for all t, s, r.   (4.38)
For ease of numerical implementation, we assume q = 1 in Section 4.5. The filtering
recursions in Proposition 1 yield new estimates for κ_t, ϑ_t, ϱ_t², µ_{β_t}, σ_{β_t}², and p_tsr,
1 ≤ t, s, r ≤ N, whenever a new sequence of electricity spot prices becomes available up to
time k. The development of these recursive filters thereby implies that the parameter estimates
under our HOHMM setting are self-calibrating. The results in Propositions 1 and 2 constitute
further research progress relative to those given in Erlwein et al. [7] and Xiong and Mamon
[43] in the following respects. Firstly, the self-calibrating filtering algorithms for electricity
spot prices in a regular HMM setting [7] are extended to a general HOHMM case. Secondly,
we include a compound Poisson process to depict the spikes in the price dynamics; such an
inclusion was not considered in the HOHMM setting of [43]. Thirdly, Erlwein et al. [7] went
through the route of first estimating the entire drift component before being able to compute
an estimate of the mean-reverting level κ. In our case, we deal directly with the MLE of κ by
providing an explicit solution as a function of the filtering recursions. The required sequence
of computations in recovering the model parameters is thereby clarified.
From Figure 4.1, we observe prices displaying strong mean reversion, high periodicity, and
numerous spikes. Despite variations and jumps, the DESP series exhibits a long-run
mean-reverting level and cyclical patterns. For instance, there are more extreme values and
changes in temperature during the winter and summer seasons, whilst off-peak periods occur
in spring and autumn. The seasonality aspect mainly originates from market supply and
demand, which depend on electricity-generation capacity, human activities, and weather
conditions. To execute the recursive filtering equations on the de-seasonalised part of the
model, we perform data fitting on the component D_t using the statistical software R's built-in
regression functions; this then removes the discernible seasonal pattern following equation
(4.1). A ‘step' function is further applied in selecting a suitable number of explanatory
variables in terms of the adjusted R² and the Akaike information criterion (AIC). The
descriptive statistics, presented in Table 4.1, guide the parameters' initialisation in the
implementation procedure. Table 4.2 summarises the estimated parameters for D_t.
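A minimal R sketch of this seasonal fit is given below; the harmonic regressors and the simulated log-price series are hypothetical, and ‘step’ performs the AIC-based pruning mentioned above.

    set.seed(1)
    n <- 1460; t <- 1:n
    logS <- 3 + 0.2 * sin(2 * pi * t / 365) + rnorm(n, sd = 0.3)     # toy log-prices
    dat  <- data.frame(logS = logS,
                       s1 = sin(2 * pi * t / 365), c1 = cos(2 * pi * t / 365),
                       s2 = sin(4 * pi * t / 365), c2 = cos(4 * pi * t / 365),
                       trend = t)
    sel <- step(lm(logS ~ ., data = dat), trace = 0)                 # AIC-based selection
    summary(sel)$adj.r.squared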
Table 4.1: Descriptive statistics for daily electricity spot price (DESP)
Figure 4.1 displays the plots of the fitted seasonality-component function and the actual DESP
data. A zoom-in view of the data evolution from 01 Jan 2013 – 31 Dec 2014 is presented in
Figure 4.2.
[Figure 4.1: fitted seasonal component versus actual daily electricity spot prices (CAD/MWh), 01 January 2011 – 31 December 2014]
Even though the seasonal component D_t has a crucial role in DESP modelling, the spot-price
behaviour is still hugely influenced by stochasticity; refer to Figures 4.1 and 4.2.
Electricity spot prices are gathered at a daily frequency, and so we assign Δt = 1. We process
the observations in 73 batches with 20 data points in each batch. In this sense, the parameters
are updated roughly every 3 weeks. Other filtering-window sizes can be explored too;
however, within the data set that we analyse, our experimentation produces similar outcomes,
telling us that different window sizes have a negligible effect. We note that a 20-point data
length for the HOHMM filtering procedure is fairly sufficient and not numerically onerous in
processing the continual flow of new information, including information arising from abrupt
changes in the DESP dynamics due to extreme weather, supply outages, and excess demand,
amongst others. In general, practitioners in other fields applying online HOHMM-based
filtering methods have the freedom to choose a data-window size befitting their circumstances,
such as computational resources and the frequency of data generation.
Figure 4.2: Zoom-in view of the fitted seasonal component versus observed prices
HOHMM filtering calls for the verification of memory presence, and determination of ini-
tial values in connection with parameter estimation. To detect memory in our data, we
evaluate the fractional-differencing parameter d in Granger and Joyeux’s autoregressive
fractionally integrated moving average (ARFIMA) methodology [16]. When 0 < d < 0.5,
there is a finite long memory in the data; and d = 0 signifies short memory. In computing d,
one may employ the Geweke-Porter-Hudak estimator, approximated MLE, and smoothed
periodogram approach; estimated values for d can easily be returned by utilising the R
functions ‘fdGPH', ‘fdML', and ‘fdSperio', respectively. In our case, we employ ‘fdSperio',
proposed by Reisen [30], since, unlike the two former algorithms, it has no restriction and is
applicable to a non-stationary process. Our de-seasonalised data set gives d̂ = 0.20.
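A minimal R sketch of this memory check is shown below; it relies on the ‘fracdiff’ package, and the simulated series is only a stand-in with a known d of 0.2.

    # install.packages("fracdiff")   # if not already installed
    library(fracdiff)
    set.seed(1)
    x <- as.numeric(fracdiff.sim(1460, d = 0.2)$series)   # toy long-memory series
    fdSperio(x)$d                                         # smoothed-periodogram estimate of d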
We wish to obtain optimal estimates for the parameter set E^w = {κ, ϑ, ϱ, µ_β, σ_β, p}; for
emphasis, these parameters are driven by a discrete-time, finite-state HOHMC. This goal is
accomplished by initialising estimates and subsequently updating the parameter set.
Benchmarks for starting values are obtained by treating X_t as a single-regime process. This
implies that the transition probability matrix T is the identity. From (4.10), the likelihood
function of X_{k+1} is

L(X_{k+1}; κ, ϑ, ϱ, µ_β, σ_β) = ∏_{k=1}^{m} ( 1 / √( 2π(ϱ² + σ_β² q κ²) ) )
                               × exp( −(X_{k+1} − ϑ − κ X_k − µ_β q κ)² / ( 2(ϱ² + σ_β² q κ²) ) ).   (4.39)

Equivalently, we seek the maximisers of the sum of log-likelihoods,

∑_{k=1}^{m} log L(X_{k+1}; κ, ϑ, ϱ, µ_β, σ_β).   (4.40)

The R function ‘optim' is applied to equation (4.40), producing κ̂ = 0.6394, ϑ̂ = −0.1515,
ϱ̂ = 0.6443, µ̂_β = −0.4945, and σ̂_β = 0.0103.
Propositions 1 and 2, in conjunction with the initial parameter estimates obtained in Section
4.5.2.1, are put into use for the estimation of the proposed model interfacing the HOHMM,
OU and jump attributes. Our dynamic parameter estimation proceeds by first evaluating the
filtering recursions (4.26)–(4.29) and then employing their numerical outcomes to provide
optimal estimates through equations (4.33)–(4.38); the processing of one batch of data values
constitutes one algorithm step or pass. The final value in an algorithm step is used as the
initial value for the filtering recursions in the succeeding algorithm step. Once the optimal
estimates of the parameters in equation (4.10) are obtained through our self-calibrating
process after a certain number of algorithm steps, we can back out, via equations (4.11)–(4.13),
the optimal parameter estimates of the proposed model with the original specifications in
equation (4.5).
The filtering algorithms are implemented on the process Xt under the 1-,2-, and 3-state set-
tings driven by both HOHMC and HMC. Estimated parameters evolve as shown in Figures
4.4-4.7. The parameter evolutions under the 2-state HOHMM (Figures 4.4-4.5) are differ-
ent from those under the 3-state HOHMM (Figures 4.6-4.7). Nonetheless, there is gradual
convergence to certain values after approximately 55 passes for each sequence of param-
eter estimates. Whilst the parameters’ initial values might affect the convergence speed,
as long as they do not deviate substantially from the benchmark values under the 1-state
setting in section 4.5.2.1, the evolution of estimates for each parameter will eventually ap-
proach a specific value. Although not plotted here, parameter estimates under the HMMs
exhibit similar realisations. It is worth mentioning that with HOHMM, the movements of
the parameter estimates under the 1- and 2-state HOHMM settings exhibit analogous dy-
namic patterns and converge to similar optimal values; whilst the evolution of parameter
estimates under 3-state HOHMM setting is distinct. It indicates that a 2-state HOHMM
might be adequate to capture the dynamics of DESP and reflect the economic and market
information. This is further supported by the analysis of model performance in subsection
4.5.3.
We quantify the variability of the parameter estimates by considering the variance of the
estimators using the Fisher information I(E^w), given by

I(E^w) = −E_{E^w}[ ∂²/∂E_w² log L(X; E^w) ].

This provides a bound on the asymptotic variance of the MLEs. The ML estimator is
consistent and has an asymptotically normal sampling distribution; see [36]. We utilise the
limiting distribution of the MLE, Ê^w, to obtain the 95% confidence interval. For a generic
Figure 4.4: Evolution of parameter estimates for α, θ, and ξ under a 2-state HOHMM
Figure 4.5: Evolution of parameter estimates for µβ , and σβ under a 2-state HOHMM
Figure 4.6: Evolution of parameter estimates for α, θ, and ξ under a 3-state HOHMM
Figure 4.7: Evolution of parameter estimates for µβ , and σβ under a 3-state HOHMM
parameter, the Fisher information of each estimator is

I(p̂_tsr) = Â_k^{tsr} / p̂_tsr²,   I(ϑ̂_t) = B̂_k^t / ϱ_t²,   I(µ̂_{β_t}) = κ_t q B̂_k^t / ϱ_t²,

I(κ̂_t) = [ Ĉ_k^t(X_k²) + 2 µ_{β_t} q Ĉ_k^t(X_k) + µ_{β_t}² q² B̂_k^t ] / ϱ_t²,

I(ϱ̂_t) = −B̂_k^t / ϱ_t² + (3/ϱ_t⁴) [ Ĉ_k^t(X_{k+1}²) + κ_t² Ĉ_k^t(X_k²)
         + B̂_k^t ( ϑ_t² + (κ_t µ_{β_t} q)² + 2 ϑ_t κ_t µ_{β_t} q − σ_{β_t}² κ_t² q )
         + Ĉ_k^t(X_k)(2 µ_{β_t} q κ_t² + 2 ϑ_t κ_t)
         − (2 ϑ_t + 2 κ_t µ_{β_t} q) Ĉ_k^t(X_{k+1}) − 2 κ_t Ĉ_k^t(X_{k+1} X_k) ],

I(σ̂_{β_t}) = −1/(ϱ_t² κ_t² q) + (3/(ϱ_t⁴ κ_t² q)) [ Ĉ_k^t(X_{k+1}²) + κ_t² Ĉ_k^t(X_k²)
         + B̂_k^t ( ϑ_t² + (κ_t µ_{β_t} q)² + 2 ϑ_t κ_t µ_{β_t} q − σ_{β_t}² κ_t² q )
         + Ĉ_k^t(X_k)(2 µ_{β_t} q κ_t² + 2 ϑ_t κ_t)
         − (2 ϑ_t + 2 κ_t µ_{β_t} q) Ĉ_k^t(X_{k+1}) − 2 κ_t Ĉ_k^t(X_{k+1} X_k) ].
Our recursive filtering is an adaptive approach with a mechanism for optimal estimates to be
produced iteratively. As the estimates obtained exhibit reasonably good convergence and
stability properties, the SEs become smaller as we go further down the algorithm steps.
Moreover, Table 4.3 evinces that we have robust parameter estimates, as implied by the
substantially narrow ranges of all SEs throughout the algorithm passes.
We shall generate and assess the one-step-ahead forecasts for X_t and the DESP under the
HOHMM-with-jump settings. To do this, we first compute the expected value of the
observation process at time k + 1:
HMM (bounds of the standard errors of the estimates)
Parameter   1-state: Lower / Upper      2-state: Lower / Upper      3-state: Lower / Upper
κ_i         1.0834e-6 / 0.3113          2.4645e-8 / 0.1574          4.45092e-7 / 0.3113
ϑ_i         8.8910e-7 / 0.2568          2.0532e-8 / 0.2127          3.2596e-7 / 0.2568
ϱ_i         9.4071e-8 / 0.0254          3.1251e-9 / 0.1377          4.1097e-8 / 0.0254
µ_βi        1.0480e-6 / 0.3162          2.2175e-9 / 0.4088          4.0512e-7 / 0.1253
σ_βi        6.7328e-8 / 0.0171          3.1251e-9 / 0.0346          2.6396e-8 / 0.0170
p_ji        6.3839e-7 / 0.2063          3.2075e-8 / 0.1785          6.3839e-7 / 1.6160

HOHMM (bounds of the standard errors of the estimates)
Parameter   1-state: Lower / Upper      2-state: Lower / Upper      3-state: Lower / Upper
κ_t         1.0834e-6 / 0.3113          7.4602e-7 / 0.3827          2.2600e-6 / 0.3858
ϑ_t         8.8910e-7 / 0.2568          6.3506e-7 / 0.2766          1.6974e-6 / 0.2676
ϱ_t         9.4071e-8 / 0.0254          1.0576e-7 / 0.4424          1.3078e-7 / 0.0221
µ_βt        1.0480e-6 / 0.3162          8.3156e-8 / 0.3357          2.0243e-6 / 0.3265
σ_βt        6.7328e-8 / 0.0171          7.4834e-8 / 0.0243          9.1683e-8 / 0.0150
p_tsr       6.3839e-7 / 0.2063          3.7525e-9 / 1.2956          4.3980e-9 / 0.8859
Table 4.3: Interval of standard errors for parameter estimates under the 1-, 2-, 3-state
HMMs and HOHMMs
X̂_{k+1} = E[X_{k+1} | X_k] = ⟨κ, ŷ^w_k⟩ X_k + ⟨ϑ, ŷ^w_k⟩ + q ⟨κ, ŷ^w_k⟩ ⟨µ_β, ŷ^w_k⟩.   (4.41)
Equation (4.41) brings about one-step ahead forecasts under the non-regime-switching
model, and 2-state and 3-state settings driven by HOHMMs. Figure 4.8 presents the graphs
showing the movements of the one-step ahead predictions for Xk and DESP under the 3-
state HOHMM setting. A magnified view of the predicted DESP is depicted in Figure 4.9
covering a one-year period. The forecasts follow closely the actual dynamics of DESP dur-
ing ‘normal’ periods. Models with regime-switching features outperform the 1-state model
during periods with jump occurrences, and HOHMMs are more accurate than the HMMs
in predicting spikes in DESP.
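A minimal R sketch of the one-step ahead forecast in equation (4.41) is given below; the vectors of filtered state probabilities and state-level parameter estimates, as well as all numerical values, are hypothetical.

```r
# Sketch of equation (4.41): one-step ahead forecast of X under a 3-state
# HOHMM-with-jump setting. 'y_hat' is a hypothetical filtered state-probability
# vector; 'kappa_hat', 'vartheta_hat', 'mu_beta_hat' and 'q' are hypothetical
# parameter estimates from the algorithm passes.
one_step_forecast <- function(X_k, y_hat, kappa_hat, vartheta_hat, mu_beta_hat, q) {
  kappa <- sum(kappa_hat * y_hat)      # <kappa_hat, y_hat>
  varth <- sum(vartheta_hat * y_hat)   # <vartheta_hat, y_hat>
  mu_b  <- sum(mu_beta_hat * y_hat)    # <mu_beta_hat, y_hat>
  kappa * X_k + varth + q * kappa * mu_b
}

# Illustrative call
one_step_forecast(X_k = 0.4,
                  y_hat = c(0.2, 0.5, 0.3),
                  kappa_hat = c(0.70, 0.60, 0.65),
                  vartheta_hat = c(-0.50, -0.40, -0.45),
                  mu_beta_hat = c(0.50, 0.60, 0.55),
                  q = 0.1)
```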
Visually, the DESP’s dynamics and spikes are captured very reasonably by our self-calibrating
algorithm and filtering processes under the regime-switching setting. But, more formally,
we wish to quantify the difference, in terms of accuracy and fitting performance, between
the HOHMMs and HMMs. To assess the goodness of fit, as a by product of the pre-
dictability performance, an error analysis following the criteria in Erlwein et al. [7] is
undertaken. We shall rely on the following error metrics: mean-squared error (MSE), root-
mean-squared error (RMSE), absolute mean error (MAE), relative absolute error (RAE),
and median absolute percent error (MdAPE). They are calculated as follows:
$$\mathrm{MSE} = \frac{\sum_{k=1}^{m}\left(\hat{X}_k - X_k\right)^2}{m}, \qquad \mathrm{RMSE} = \sqrt{\frac{\sum_{k=1}^{m}\left(\hat{X}_k - X_k\right)^2}{m}},$$
$$\mathrm{MAE} = \frac{\sum_{k=1}^{m}\left|\hat{X}_k - X_k\right|}{m}, \qquad \mathrm{RAE} = \frac{\sum_{k=1}^{m}\left|\hat{X}_k - X_k\right|}{\sum_{k=1}^{m}\left|\hat{X}_k - \bar{X}\right|},$$
$$\text{and} \qquad \mathrm{MdAPE} = \mathrm{Md}\left(\left|\frac{\hat{X}_k - X_k}{X_k}\right|\right),$$
Figure 4.8: One-step ahead forecasts for $X_k$ and $S_k$ under a 3-state HOHMM
Figure 4.9: Comparison of one-step ahead forecasts in 1-, 2-, 3-state HMMs and HOHMMs
where $m$ is the number of one-step ahead forecasts, $X_k$ and $\hat{X}_k$ denote the actual and predicted values, respectively, $\bar{X}$ is the sample mean of the actual values, and $\mathrm{Md}(\cdot)$ denotes the median.
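A minimal R sketch computing the metrics above is shown next; the vectors 'actual' and 'forecast' and the illustrative numbers are hypothetical.

```r
# Sketch of the error metrics defined above for a pair of observed and
# one-step ahead predicted series.
error_metrics <- function(actual, forecast) {
  e <- forecast - actual
  c(MSE   = mean(e^2),
    RMSE  = sqrt(mean(e^2)),
    MAE   = mean(abs(e)),
    RAE   = sum(abs(e)) / sum(abs(forecast - mean(actual))),
    MdAPE = median(abs(e / actual)))
}

# Illustrative call
error_metrics(actual = c(1.2, 0.8, 1.5, 1.1), forecast = c(1.1, 0.9, 1.4, 1.2))
```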
The results of our error analyses are presented in Table 4.4; they indicate that a non-regime-
switching model is less adequate than a regime-switching model in describing the dynamics
of our data. Under the HMM setting, the 3-state framework outperforms the 1- and 2-state
frameworks; whilst under the HOHMM setting, the 2-state model is the best in predicting
the stochastic component and DESP. Furthermore, the HOHMM beats the regular HMM
with respect to the four metrics; this finding agrees with the output revealed in Figure 4.9.
The 2-state HOHMM has the best overall prediction amongst all state settings. The 4-state
HMMs and HOHMMs were also tested, but no significant improvement was achieved. Additionally,
the computational cost of using HMMs and HOHMMs with state dimensions
higher than 4 outweighs the benefits. Results of the t-test for the RMSEs’ mean differ-
ences under various modelling set ups are displayed in Table 4.5. The adjusted p-values
in our pairwise comparison were computed via Bonferroni's method using the R function
'p.adjust'; this controls the family-wise error rate. The p-values for the
comparison of 2-state versus 3-state HOHMMs are large; so, we cannot reject the null
hypothesis of no RMSE-mean differences at a significance level of 5%. This agrees with
Figures 4.8 and 4.9, where regimes 2 and 3 behave similarly. Thus, introducing a third
state will generate minimal gain (if any). Error metrics for the 2-state and 3-state settings
132 Chapter 4. An HOHMM for electricity spot-price dynamics
Table 4.4: Error analysis of the HMM- and HOHMM-based models for DSP
Table 4.5: Bonferroni-corrected p-values for the t-test performed on the RMSEs involving
the DSP
in Table 4.4 show very close results. However, comparison tests demonstrate that the 1-
state and 2-state settings’ error metric values are statistically different, and the same can be
said concerning those for the 1-state and 3-state settings. Additionally, Figure 4.9 and Ta-
ble 4.4 lend support to such an outcome, which clearly suggests the merit of incorporating
regime-switching features in the model for our data.
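A minimal R sketch of the Bonferroni adjustment described above follows; the RMSE series for the three model settings are hypothetical illustrative vectors.

```r
# Hypothetical RMSE series collected over the algorithm passes for the
# 1-, 2- and 3-state settings (illustrative numbers only)
rmse_1state <- c(0.42, 0.45, 0.44, 0.47, 0.43, 0.46)
rmse_2state <- c(0.31, 0.30, 0.33, 0.29, 0.32, 0.30)
rmse_3state <- c(0.30, 0.31, 0.32, 0.28, 0.31, 0.29)

raw_p <- c("1 vs 2 states" = t.test(rmse_1state, rmse_2state)$p.value,
           "1 vs 3 states" = t.test(rmse_1state, rmse_3state)$p.value,
           "2 vs 3 states" = t.test(rmse_2state, rmse_3state)$p.value)

# Bonferroni correction, as done with 'p.adjust' in the text
p.adjust(raw_p, method = "bonferroni")
```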
We use an information selection criterion that emphasises the trade-off between bias and
variance for our HMM and HOHMM settings. Similar to Erlwein et al. [7], and Xi and
Mamon [38], we consider the Akaike Information Criterion (AIC) to provide a measure
that balances the relative goodness of fit and complexity of various model settings. The
AIC is given by
$$\mathrm{AIC} = 2l - 2\log L\left(X; E_w\right), \qquad (4.42)$$
where l is the number of model parameters to be estimated, and log L (X; Ew ) is the log-
likelihood function associated with the model. For the observation process Xk+1 , the corre-
sponding log-likelihood under the HOHMM setting is
$$\log L\left(X_{k+1}; \kappa, \vartheta, \varrho, \mu_\beta, \sigma^2_\beta\right) = \sum_{k=1}^{B}\sum_{t=1}^{N}\left\langle \mathbf{y}^w_k, e_t\right\rangle \times \left[-\frac{1}{2}\log\!\left(2\pi\left(\varrho^2(\mathbf{y}^w_k) + \sigma^2_\beta(\mathbf{y}^w_k)\, q\, \kappa^2(\mathbf{y}^w_k)\right)\right) - \frac{\left(X_{k+1} - \vartheta(\mathbf{y}^w_k) - \kappa(\mathbf{y}^w_k) X_k - \mu_\beta(\mathbf{y}^w_k)\, q\, \kappa(\mathbf{y}^w_k)\right)^2}{2\left(\varrho^2(\mathbf{y}^w_k) + \sigma^2_\beta(\mathbf{y}^w_k)\, q\, \kappa^2(\mathbf{y}^w_k)\right)}\right], \qquad (4.43)$$
where B is the number of observed values and N is the number of states. The number of
parameters to be estimated depends upon N, and this is summarised in Table 4.6. As ex-
pected, there is a significant increase in the number of parameters to be estimated as the
number of states grows in a model.
Given the form of the AIC in equation (4.42), the selection principle hinges on minimising
the AIC. From Table 4.7, the 1-state model has the highest AIC values. Clearly, the two-
regime HOHMM-OU-jump setting outperforms all other settings given our data as model
inputs.
The AIC metric lacks the asymptotic-consistency property as pointed out by Bozdogan
[4]. This puts into question the AIC's optimality as the best-model criterion (cf. [33]) when
one is confronted by a collection of models with different dimensions and numbers of pa-
rameters. The Bayesian information criterion (BIC) is an alternative metric, which admits
the data set’s sample size on top of other inputs included in the AIC. As comprehensively
discussed in Kuha [21], if the two criteria agree on the best-model choice, robustness in
model selection is achieved. For this reason, we also adopt the BIC in evaluating the set of
candidate models to choose from.
The AICs in Table 4.7 come from a static estimation. We complement this with BIC values
generated by our dynamic parameter estimation method. The BIC is computed as
$$\mathrm{BIC} = \log L\left(X; E_w\right) - \frac{l}{2}\log B. \qquad (4.44)$$
By setting B as the number of observations in each algorithm pass when employing equa-
tion (4.43), a series of BIC values are obtained as the data set is processed in its entirety.
Given the model choice-metric form in equation (4.44), the underlying principle of selec-
tion is to maximise the BIC function. Figure 4.10 depicts the evolution of the calculated
BIC values for the 1-, 2-, and 3-state models.
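A minimal R sketch of the per-pass BIC evaluation under the maximisation convention of equation (4.44) is given below; the log-likelihood, parameter count and observation count are hypothetical illustrative inputs.

```r
# Sketch of the BIC per algorithm pass: 'loglik' is the log-likelihood of
# equation (4.43) evaluated at the current estimates, 'l' the number of
# parameters, and 'B' the number of observations in the pass.
bic_pass <- function(loglik, l, B) loglik - (l / 2) * log(B)

# Illustrative call for one pass
bic_pass(loglik = -52.3, l = 18, B = 20)
```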
For the 1-state model, we get higher BIC values for almost all of the periods spanned by
the algorithm passes in the whole data set. We note, nonetheless, that for periods where
jumps occur, the BIC values markedly drop under the 1-state model. The drastic decline of
the BIC during DESP’s spike occurrences insinuates that this criterion is unable to sustain
robustness with one-state modelling. In contrast, the BIC values produced by the regime-
switching models have more stable patterns. Furthermore, the 2-state HOHMM even pro-
duces BIC values higher than those from the 1-state HOHMM/HMM for algorithm steps
Figure 4.10: Evolution of BIC values under the 1-, 2-, 3-state HMM- and HOHMM-based
models
that encompass periods of jump events. This tells us that the 1-state model is not able to
capture the market well in situations where jumps could be present. On the other hand, the
2-state HOHMM can describe satisfactorily the dynamics of data with jump characteristics.
This is also supported by the results from our error and AIC-based analyses. Undoubtedly,
it is worthwhile to embed regime-switching capabilities, and in the case of our data set, the
2-state HOHMM setting is regarded as the most appropriate choice.
In this research, we first identify the best-fitting model for our DESP data. Then, optimal
estimates using our filtering-based calibration are fed into the chosen model for EFSP.
Table 4.8: Optimal parameter estimates of the 2-state HOHMM used in the EFSP evaluation

Regime    κ̂        ϑ̂         ϱ̂        µ̂_β      σ̂_β             α̂        θ̂         ξ̂
State 1   0.7169   -0.4722   0.1281   0.5312   1.6664 × 10⁻⁴   0.3328   -1.6679   0.1499
State 2   0.6388   -0.5643   0.6473   0.6457   1.5808 × 10⁻⁴   0.4482   -1.5622   0.7965
Let G(t, T) denote the expected value of the spot price for delivery at future time T, and
assume that the current time is t. With EFSP as the underlying variable, the theoretical
forward price F(t, T) of an electricity contract is computed from G(t, T) together with
λ(t, T), the price-of-risk process. In our OU process with jumps, λ(t, T) has
two contributing parts: (i) the price of risk due to switching of regimes and (ii) the premium
due to the market price of risk. Under the real-world probability measure P, the EFSP as a
function of HOHMMs is calculated as
$$
\begin{aligned}
G(t, T) &= E\left[S_T \mid X_t\right]\\
&= D(T)\exp\!\left(X_t e^{-\alpha(\mathbf{y}^w_t)(T-t)} + \left(1 - e^{-\alpha(\mathbf{y}^w_t)(T-t)}\right)\theta(\mathbf{y}^w_t) + \frac{\xi^2(\mathbf{y}^w_t)\left(1 - e^{-2\alpha(\mathbf{y}^w_t)(T-t)}\right)}{2\alpha(\mathbf{y}^w_t)}\right)\\
&\qquad\times E\left[\exp\!\left(\int_t^T e^{-\alpha(\mathbf{y}^w_t)(T-u)}\,\mathrm{d}J_u\right)\Bigg|\, X_t\right]\\
&= D(T)\exp\!\left(X_t e^{-\alpha(\mathbf{y}^w_t)(T-t)} + \left(1 - e^{-\alpha(\mathbf{y}^w_t)(T-t)}\right)\theta(\mathbf{y}^w_t) + \frac{\xi^2(\mathbf{y}^w_t)\left(1 - e^{-2\alpha(\mathbf{y}^w_t)(T-t)}\right)}{2\alpha(\mathbf{y}^w_t)}\right)\\
&\qquad\times \exp\!\left(\int_t^T \exp\!\left(\mu_\beta(\mathbf{y}^w_t)\, e^{-\alpha(\mathbf{y}^w_t)(T-u)} + \sigma^2_\beta(\mathbf{y}^w_t)\, e^{-2\alpha(\mathbf{y}^w_t)(T-u)}\right)\varsigma\,\mathrm{d}u - \varsigma(T - t)\right).
\end{aligned} \qquad (4.46)
$$
Derivation details of (4.46) are similar to those given in Benth et al. [2]. The optimal pa-
rameter estimates in the evaluation of the EFSP in equation (4.46) are shown in Table 4.8.
Figure 4.11: Expected future spot price on delivery (EFSP) under a 2-state HOHMM
It is reasonable to assume a constant Markov chain when pricing the EFSP over a short delivery
period; for a longer maturity of T > 30 days, the EFSP has to be computed numerically
with a dynamic Markov chain involved, as recommended in Erlwein et al. [7].
Nonetheless, the reasonableness of the "constant Markov-chain" assumption, even for contracts
with a short delivery horizon, remains disputable. In our implementation, we therefore simulate
dynamic states through an HOHMC to obtain estimated values of the EFSP for a general delivery period.
Figure 4.11 shows the simulated EFSP with time t mapping the last 30 days of our dataset
for maturities T = 1, . . . , 30 days. Salient features of the data are replicated by our pro-
posed model such as the short-term fluctuations, seasonal patterns, and spikes of electricity
prices. Our pricing application is of practical significance, as forward-contract valuation is
immediate once an estimate of λ(t, T) is available. Determining the appropriate λ(t, T) is
beyond the scope of this chapter, but we certainly recognise its critical impact on the pricing
of electricity contracts. This warrants a separate investigation that could utilise our current
results, with the exposition here serving as the necessary groundwork.
4.6 Conclusion
A new framework, possessing regime-switching and data memory-capturing mechanisms,
is proposed for the modelling and forecasting of electricity spot prices. The spot price
is modelled as an exponential of an OU process, and augmented by a compound Poisson
term to deal with the mean-reverting, stochastic and spiked components. The exponential
part of the model is scaled by a suitable deterministic combination of trigonometric and
linear functions to account for the cycles and other types of attributes of certain price de-
terminants. The parameters in the exponential component are modulated by a higher-order
hidden Markov chain in discrete time, which drives the random switching between differ-
ent economic regimes.
The work of Erlwein et al. [7] appears to be the only one providing online HMM filtering
algorithms for the analysis of electricity spot prices. An improvement of their approach was accomplished
in our model development. Empirical implementation based on our extended HOHMM fil-
ters confirmed that the predominant features of seasonality, mean reversion and extremely
large spikes in electricity prices are all captured quite well. The various insights from this
study reinforce the findings of Erlwein et al. [7]. We further showed that HOHMMs offer a
much better fit than those achieved by the usual HMMs. The 2-state HOHMM outperforms
other model settings for our data in accordance with an information-criterion evaluation.
This study showcased two key research contributions: (i) new filters that support self-
updating parameter estimation of the HOHMM-OU-jump model thereby enriching the lat-
est research of Xiong and Mamon [43] with the inclusion of spikes; and (ii) extension of
the Erlwein et al.’s model [7] by incorporating HOHMMs for electricity price modelling
geared towards derivative valuation and risk measurement. A direct and natural extension
of this work is further empirical testing of the modelling framework and estimation method put
forward here, applied to various contracts (more sophisticated than forwards) in the elec-
tricity market for investment and hedging. The HOHMM with lag order 2, which we fixed,
is the focus of this research; certainly, the implementation of the HOHMM filtering with
a higher lag order, and statistical inference for the estimation of the optimal lag are all
worthwhile future research endeavours.
References
[2] F. Benth, A. Cartea, R. Kiesel, Pricing forward contracts in power markets by the cer-
tainty equivalence principle: Explaining the sign of the market risk premium, Journal
of Banking and Finance, 32(10) (2008) 2006–2021.
[4] H. Bozdogan, Model selection and Akaike's information criterion: The general theory
and its analytical extensions, Psychometrika, 52(3) (1987) 345–370.
[5] CCNMatthews Newswire, AESO 10-year plan identifies potential $3.5 billion trans-
mission investment, Toronto, Marketwired L.P. (2007) 1.
[6] S. Deng, Pricing electricity derivatives under alternative stochastic spot price models,
Proceedings of the 33rd Annual Hawaii International Conference on System Sciences,
(2000) 10. doi:10.1109/HICSS.2000.926755.
[8] C. Erlwein, F. Benth, R. Mamon, HMM filtering and parameter estimation of an elec-
tricity spot price model. Energy Economics, 32(5) (2010) 1034–1043.
[9] R. Elliott, H. Yang, Forward and backward equations for an adjoint process, In:
Stochastic Processes, (eds.: S. Cambanis, J. Ghosh, R. Karandikar, P. Sen), Springer,
New York, NY, (1992) 61–69
[10] R. Elliott, L. Aggoun, J. Moore, Hidden Markov Models: Estimation and Control,
Springer, New York, 1995.
[12] V. Fanelli, L. Maddalena, S. Musti, Modelling electricity futures prices using seasonal
path-dependent volatility, Applied Energy, 173 (2016), 92–102.
[13] S. Fleten, J. Lemming, Constructing forward price curves in electricity markets, En-
ergy Economics, 25(5) (2003) 409–424.
[14] H. Geman, A. Roncoroni, Understanding the fine structure of electricity prices, Jour-
nal of Business, 79(3) (2006) 1225–1261.
[16] C. Granger, R. Joyeux, An introduction to long memory time series models and frac-
tional differencing, Journal of Time Series Analysis, 1 (1980) 49–64.
[18] R. Huisman, R. Mahieu, Regime jumps in electricity prices. Energy Economics, 25(5)
(2003) 425–434.
[19] S. Islyaev, P. Date, Electricity futures price models: Calibration and forecasting. Eu-
ropean Journal of Operational Research, 247(1) (2015) 144.
[21] J. Kuha, AIC and BIC: Comparisons of Assumptions and Performance, Sociological
Methods and Research, 33(2)(2004),188–229.
[22] L. Lee, F. Jean, High-order hidden Markov model for piecewise linear processes and
applications to speech recognition, Journal of the Acoustical Society of America,
140(2) (2016) EL204–EL210.
[23] J. Lucia, E. Schwartz, Electricity prices and power derivatives: Evidence from the
Nordic power exchange. Review of Derivatives Research, 5(1) (2002) 5–50.
[25] R. Mamon, R. Elliott, Hidden Markov Models in Finance: Further Developments and
Applications, International Series in Operations Research and Management Science,
209, Springer, New York, 2014.
[26] R. Mamon, C. Erlwein, B. Gopaluni, Adaptive signal processing of asset price dy-
namics with predictability analysis, Information Sciences, 178 (1) (2008) 203–219.
[29] R. Merton, Option pricing when underlying stock returns are discontinuous. Journal
of Financial Economics, 3(1) (1976) 125–144.
[31] E. Schwartz, The stochastic behavior of commodity prices: Implications for valuation
and hedging, Journal of Finance, 52(3)(1997) 923–973.
[33] G. Schwarz, Estimating the dimension of a model, Annals of Statistics, 6(1978) 461–
464.
[35] T. Siu, W. Ching, E. Fung, M. Ng, X. Li, A high-order Markov-switching model for
risk measurement, Computers and Mathematics with Applications, 58 (1) (2009) 1–
10.
[36] A. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge and
New York, 1998.
[37] O. Wu, T. Liu, B. Huang, F. Forbes, Predicting electricity pool prices using hidden
Markov models, Ifac-Papersonline, 28(8) (2015), 343–348.
[38] X. Xi, R. Mamon, Parameter estimation of an asset price model driven by a weak
hidden Markov chain, Economic Modelling, 28(1) (2011) 36–46.
[39] X. Xi, R. Mamon, Yield curve modelling using a multivariate higher-order HMM,
In: State-Space Models and Applications in Economics and Finance (eds.: Y. Zeng,
and Y. Wu), Springer Series in Statistics and Econometrics for Finance 1, (2013)
185–203.
[40] X. Xi, R. Mamon, Capturing the regime-switching and memory properties of interest
rates, Computational Economics, 44(3) (2014) 307–337.
[41] X. Xi, R. Mamon, Parameter estimation in a WHMM setting with independent and
volatility components, In: Hidden Markov Models in Finance: Volume II (Further
Developments and Applications) (eds.: R. Mamon, R. Elliott), Springer (2014) 227–240.
[45] L. Yousef, Derivation and Empirical Testing of Alternative Pricing Models in Al-
berta’s Electricity Market, ProQuest Dissertations Publishing, 2002.
[46] W. Yu, G. Sheblé, Modeling electricity markets with hidden markov model, Electric
Power Systems Research, 76(6)(2006) 445–451.
[47] F. Ziel, R. Steinert, S. Husmann, Efficient modeling and forecasting of electricity spot
prices, Energy Economics, 47(2015), 98–111.
Chapter 5

Modelling and forecasting salmon futures-price curves
5.1 Introduction
Commodity futures markets provide a mechanism for investors to hedge a price risk or pos-
sibly profit from price changes at the futures contract’s maturity. These markets, in essence,
offer social and economic benefits through the efficient allocation of resources and making
some form of insurance accessible to businesses. The analysis of commodity futures typi-
cally covers the energy and agriculture sectors with the intertwined objectives of modelling
price formation and risk management. The literature on futures contracts dealing with the
needs of the aquaculture and fisheries industries remains rather scanty, owing to the chal-
lenges in establishing efficient markets in these two sectors.
We note that frozen shrimp was the very first commodity from the seafood industry to be
traded in the Chicago Mercantile Exchange back in the 1960s. Unfortunately, it was a
thin market and only lasted for 3 years; see [19]. Three decades later, the Minneapolis
Grain Exchange introduced the first exchange-traded shrimp futures contract in 1994 to
help market participants hedge their risk exposure to shrimp prices. Alas, the market was
also terminated in 2000 due to low trading volumes and lack of interest, as per the account
of Quagrainie and Engle [24].
The aquaculture-based futures markets in North America were short-lived and ultimately
failed. In contrast, the Fish Pool ASA, located in Norway, has been successful
144
since its launch in 2005. Furthermore, it has a strong connection with the rapidly expand-
ing farmed salmon industry over the course of several decades. Primarily owned by Oslo
Børs ASA, the Fish Pool has become a global exchange serving a venue to mitigate price
risk and an instrument to reckon in sustaining the fish and seafood markets. It operates as
a regulated marketplace for trading financial derivatives with the salmon price as the un-
derlying variable. Fish Pool is a very active exchange in the trading of salmon derivatives.
As reported in [23], the volume of salmon traded in the futures exchange has increased
dramatically from approximately 3,600 tonnes in 2006 to 90,445 tonnes in 2016 with a
record cash value of 5.8 billion Norwegian Krone (NOK).
The Fish Pool plays a critical role in the pricing and management of fresh farmed salmon.
The creation and trading of salmon forward and futures contracts bring financial and eco-
nomic improvements in terms of increasing the producers’ profitability, enhancing market
efficiency, and facilitating risk hedging. It is thereby vital to model the dynamics of the
process that underlies the value of the futures contract. In this chapter, we put forward
a regime-switching model that is capable of accurately describing the joint dynamics of
salmon futures prices. Our modelling is geared towards the valuation of other pertinent
financial derivatives on salmon, hedging against volatile salmon price swings, and optimising
harvesting and investment strategies of seafood resources.
A number of models for salmon prices are deterministic simply to keep the financial-
modelling framework analytically tractable; see for example, Cacho [3], and Guttorm-
sen [15]. In our case, we use stochastic processes to deal with the uncertainty of market
prices. Although Solibakke [27] built a stochastic volatility model for the Fish Pool market,
their framework only considered the time-varying dynamics of front-month contracts,
rather than the entire term structure of the futures prices. Ewald [11] derived
explicit formulae for prices of fish futures and call options under the assumption that the
stock level of salmon follows a stochastic logistic growth. The assumption in [11] is some-
what problematic because such governing dynamics in the pricing framework refer to wild
catches of fresh salmon from an open-access fishery source and not catches via fish farm-
ing. According to the UN’s Food and Agriculture Organisation (FAO) [18], the farmed-fish
industry has surpassed the wild-fish sector for a few decades now in terms of production
volume. Moreover, as a widely traded commodity, farmed salmon, not wild salmon, forms
the foundation of the Fish Pool's market structure.
Certain studies in the literature feature the construction of salmon futures pricing frame-
work starting with the spot-price evolution; then this is followed by estimating the spot-
price model using a Kalman-filtering method (e.g., Schwartz [26]). Ankamah-Yeboah et
al. [23] presented the price-formation development in the salmon/aquaculture futures mar-
ket via a risk-premium model, whereby spot prices lead to price discovery of futures con-
tracts. Asche et al. [1] studied the spot-forward relationship by treating futures price with
maturities up to six months as an unbiased estimator of spot prices. The popular two-factor
commodity model for spot prices, with a stochastic convenience yield explicitly included,
is widely applied to examine salmon futures prices and aquaculture-farming considera-
tions; see details in Ewald et al. [13] and [12]. Alternatively, research investigations would
also focus directly on futures prices when performing the analysis of the Fish Pool salmon
market.
Of particular interest about the Fish Pool is that the salmon’s spot prices and futures prices
are observed weekly and daily, respectively. Thus, spot prices missing on certain days
could be recovered from futures prices collected at a daily frequency. However, if the daily
futures-price modelling begins with a spot price description, generating spot prices with
daily frequency (from data with weekly frequency) is necessary, and this presents an added
difficulty as noted in [1]. Another striking characteristic of the Fish Pool is that, on every trading
day, closing futures prices corresponding to various maturities, varying from months to
years, are available. But then, as we move forward in time, the remaining time to maturity
of the contract also decreases; and futures prices corresponding to the remaining time to
maturity are available as well. This complication, in the context of multivariate modelling,
creates a heavy computational challenge when capturing the term structure of futures prices
at the Fish Pool whilst ensuring that every futures price in the entire data set is processed,
without overlap (i.e., taken as input only once), in the estimation of model parameters.
Such a complication identified above is certainly a gap in the current literature, which we
shall address in Section 5.4 of this chapter. The usual approach so far to circumvent this
issue in model estimation is simply to average actual maturities; see Ankamah-Yeboah et
al. [23], Ewald et al. [13] and Ewald et al. [12].
Our research is distinguished from the aforementioned works by the following contribu-
tions: (i) Instead of using the classical two-factor model, we construct a Markovian regime-
switching model to capture the memory and stochasticity of salmon futures prices. (ii) This
chapter is the first to pin down the modelling of the term structure of salmon futures prices
without assuming approximate maturities of the contracts in the estimation procedure. (iii)
We directly investigate the dynamic behaviour of salmon futures prices, thereby avoid-
ing the use of weekly spot prices. (iv) As an alternative to the Kalman filtering method,
we put forward a self-calibrating estimation method based on a regime-switching model
modulated by a higher-order Markov chain, which is capable of modelling empirical dis-
tributions of virtually all shapes.
Markov regime-switching models capture the dynamics of price evolution more accurately
and provide better forecasting performance as per the findings in Erlwein et al. [10], Date et
al. [5], and Xiong and Mamon [38]. In the context of a Markovian regime-switching mech-
anism, hidden Markov models (HMMs) have been widely utilised encompassing many ap-
plications in engineering, and the natural and social sciences; for examples of applications
concentrating in the fields of economics, insurance and finance, see Mamon and Elliott
[20, 21]. HMMs drive the model’s regime-switching attribute via Markov chains with hid-
den states, although the assumption of first-order state-transition dependency in HMMs
does not take advantage of information from the historical past that may be useful.
This gives rise to the concept of higher-order hidden Markov model (HOHMM), which
is a doubly stochastic process in that there is a stochastic model for the observation pro-
cess and for which the parameters are modulated by a higher-order hidden Markov chain
(HOHMC). An ultimate aim is to obtain the best estimate of the latent Markov chain’s cur-
rent or future state. In Xi and Mamon [32], it was shown that the HOHMM setting yields
better forecasting performance in modelling risky assets' log-returns. Xiong and Mamon
[37] found that HOHMM-based models capture the empirical characteristics
of Toronto's daily average temperature better than HMM-based models do.
Our investigation reveals that Date et al. [5] is by far the only paper that utilised the HMM-
filtering algorithms to model and forecast commodity futures prices. In [5], model cali-
bration uses heating-oil futures price data; results showed that Markov regime-switching
models outperformed a one-regime model. This work can be deemed as an extension and
updated version of [5], but with more advancements in various aspects of futures-price
modelling described as follows: (i) Model parameters in this chapter are modulated by an
HOHMM, which has a memory-capturing “configuration” that may enable the extraction of
possibly useful information from past data. (ii) HOHMM filtering algorithms, under a mul-
tivariate setting, are devised for estimation. (iii) The multi-step ahead forecasts are analysed
to evaluate the prediction performance of all proposed models under both the HMM and
HOHMM settings. (iv) Finally, and most importantly, we design a crafty moving-window
scheme for our data-filtering algorithm passes; this new data-processing scheme covers the
whole life cycle of all futures contracts under study, and ensures that no data point is missed
out or entered more than once into the filtering equations.
We start with the modelling of futures prices under a one-state set up, assuming that the
spot price process is lognormally distributed. A finite-state HOHMM is then embedded
into the framework, regulating the stochastic switching amongst regimes. Our best esti-
mate (in the sense of an expected value given a history of past and current information) of a
regime reflects a state or level generated by the interactions of competing market forces that
affect the futures prices. Filters for the HOHMMs are established by first transforming the
HOHMMs into HMMs, and then usual techniques are applied to obtain optimal parameter
estimates of the HOHMM-based models.
The remainder of this chapter is structured as follows. Section 5.2 presents the formula-
tion of the multi-dimensional model, the parameters of which are driven by a discrete-time
HOHMC. In Section 5.3, adaptive filters for the states of the HOHMC and relevant quan-
tities are derived through a change of reference probability technique. We then draw up
a self-calibrating parameter-estimation scheme; this takes into account the recursive filters
via the Expectation-Maximisation (EM) algorithm. Numerical application of our proposed
models is conducted on a data set of daily salmon future prices collected from the Fish
Pool in Section 5.4. Both the proposed models' goodness of fit and prediction performance are
assessed. Some concluding remarks are given in Section 5.5.
where α, θ and ξ are positive constants, and WtQ is a standard Brownian motion under Q.
5.2. Model setup 149
where α measures the speed of mean reversion moving to the long-run mean level of log
ξ2
price, µ = θ − 2α
, and ξ denotes the volatility of the process. Employing the Brownian-
motion and Itô-isometry properties, the conditional distribution of X under Q at time T ,
where T > t, is normal with respective mean and variance given by
EQ [XT | Ft ] = Xt e−α(T −t) + 1 − e−α(T −t) µ, (5.3)
h i
ξ2 1 − e−2α(T −t)
and VarQ [XT | Ft ] = , (5.4)
2α
where Ft is an WtQ -generated filtration.
From Manoliu and Tompaidis [17], the price $F_t$ at time $t$ of a futures contract with maturity $T$ is the expected price of the underlying commodity at time $T$ under $Q$. That is,
$$F_t = E^Q\left[P_T \mid \mathcal{F}_t\right] = E^Q\left[e^{X_T} \mid \mathcal{F}_t\right].$$
Using the results from equations (5.3) and (5.4), the logarithm of the expected value of the spot price is
$$G_t = \log F_t = E^Q\left[X_T \mid \mathcal{F}_t\right] + \frac{1}{2}\mathrm{Var}^Q\left[X_T \mid \mathcal{F}_t\right] = X_t e^{-\alpha(T-t)} + \left(1 - e^{-\alpha(T-t)}\right)\mu + \frac{\xi^2\left(1 - e^{-2\alpha(T-t)}\right)}{4\alpha}. \qquad (5.5)$$
As noted in Schwartz [26] and Weron [31], it is reasonable to assume that the dynamics of
salmon spot prices follow a mean-reverting stochastic process under an objective measure,
and to introduce the price of risk λt . We shall be working under the objective (or real-world
probability) measure P when implementing a filtering-based estimation using observed
market data in Sections 5.3 and 5.4. To facilitate the estimation, a change of measure
will be employed; this is independent of the concept of the risk-neutral measure involved
in valuation. The construction of λt , connecting Q and P, in pricing temperature-based
derivatives is elaborated in Xiong and Mamon [38]. Elliott et al. [8] proposed a modified
version of the Esscher transform that takes into account λt when implementing a discrete-
time regime-switching Gaussian model and its continuous-time version. Research findings
in [8] and [38] form the basis of our development of a regime-switching model for salmon
futures prices under the arbitrage-free assumption. In particular, the log-spot price Xt under
P evolves as
Salmon futures prices are assumed available at each time $t_g$, $g \in \mathbb{Z}^+$, with maturities $T_1, T_2, \ldots, T_g$. Let $F^h_{t_h}$ be the price of a futures contract at time $t_h$ with maturity $T_h$, for $h = 1, \ldots, g$. Using the Euler method and invoking the results in Date et al. [5], the $g$-dimensional discretisation of equation (5.7) is
$$G^h_{k+1} = G^h_k + \frac{1}{\alpha^h}\left(1 - e^{-\alpha^h \Delta t^h_{k+1}}\right)\xi^h e^{-\alpha^h\left(T_h - t^h_{k+1}\right)}\left(\lambda^h - \frac{\xi^h}{4}e^{-\alpha^h\left(T_h - t^h_{k+1}\right)}\left(1 + e^{-\alpha^h \Delta t^h_{k+1}}\right)\right) + \xi^h e^{-\alpha^h\left(T_h - t^h_{k+1}\right)}\sqrt{\frac{1 - e^{-2\alpha^h \Delta t^h_{k+1}}}{2\alpha^h}}\; z^h_{k+1}, \qquad (5.8)$$
where $\Delta t^h_{k+1} = t^h_{k+1} - t^h_k$ for $k \in \mathbb{Z}^+_0$, and $\{z^h_{k+1}\}$ are sequences of independent and identically
distributed (IID) $\mathcal{N}(0, 1)$ random variables.
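The R sketch below simulates one Euler step of the discretisation in equation (5.8) for a single maturity, following the reconstruction above; the parameter values and time measurements are illustrative assumptions.

```r
# Sketch of one Euler step of equation (5.8) for a single maturity T_h.
simulate_G_step <- function(G_k, alpha, xi, lambda, tau, dt) {
  # tau = T_h - t_{k+1} (time to maturity), dt = t_{k+1} - t_k
  decay <- exp(-alpha * tau)
  drift <- (1 - exp(-alpha * dt)) * xi * decay *
    (lambda - (xi / 4) * decay * (1 + exp(-alpha * dt))) / alpha
  vol <- xi * decay * sqrt((1 - exp(-2 * alpha * dt)) / (2 * alpha))
  G_k + drift + vol * rnorm(1)
}

# Illustrative call (annualised times, one trading day step)
set.seed(1)
simulate_G_step(G_k = log(60), alpha = 0.99, xi = 0.19, lambda = 4.2,
                tau = 120 / 252, dt = 1 / 252)
```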
Remark 1: In the succeeding discussion, all vectors are denoted by bold lowercase let-
ters and all matrices are represented by bold capitalised letters.
To incorporate the impact of changes in market conditions and economic regimes on fu-
tures prices, we embed an HOHMM following the formulation in Xiong and Mamon [37];
moreover, the modelling framework is extended to a multi-dimensional set up. Let yk be a
discrete-time homogeneous Markov chain with a finite-state space in RN under a probabil-
ity space (Ω, F , P). We associate the state space with the canonical basis of RN , which is
$\{e_1, e_2, \ldots, e_N\}$, where $e_i := (0, \ldots, 0, 1, 0, \ldots, 0)^\top \in \mathbb{R}^N$ with 1 in the $i$th position; also, $\top$
stands for the transpose of a vector and N is the state-space dimension.
with a regime-switching capability, the g-dimensional process in equation (5.8) will have
parameters that are governed by yk so that
$$G^h_{k+1} = G^h_k + \frac{1}{\alpha^h(\mathbf{y}_k)}\left(1 - e^{-\alpha^h(\mathbf{y}_k)\Delta t^h_{k+1}}\right)\xi^h(\mathbf{y}_k)\, e^{-\alpha^h(\mathbf{y}_k)\left(T_h - t^h_{k+1}\right)}\left(\lambda^h(\mathbf{y}_k) - \frac{\xi^h(\mathbf{y}_k)}{4}e^{-\alpha^h(\mathbf{y}_k)\left(T_h - t^h_{k+1}\right)}\left(1 + e^{-\alpha^h(\mathbf{y}_k)\Delta t^h_{k+1}}\right)\right) + \xi^h(\mathbf{y}_k)\, e^{-\alpha^h(\mathbf{y}_k)\left(T_h - t^h_{k+1}\right)}\sqrt{\frac{1 - e^{-2\alpha^h(\mathbf{y}_k)\Delta t^h_{k+1}}}{2\alpha^h(\mathbf{y}_k)}}\; z^h_{k+1}. \qquad (5.9)$$
Here, $p_{dcb} := P\left(\mathbf{y}^w_{k+1} = e_d \mid \mathbf{y}^w_k = e_c,\, \mathbf{y}^w_{k-1} = e_b\right)$ with $k \ge 1$ and $d, c, b \in \{1, 2, \ldots, N\}$. The
quantity pdcb is interpreted as the probability that the Markov chain will be in state d given
that currently it is in state c and was in state b immediately prior to being in the present
state. The observation process in equation (5.9) can be expressed as
$$G^h_{k+1} = G^h_k + \nu^h(\mathbf{y}^w_k) + \sigma^h(\mathbf{y}^w_k)\, z^h_{k+1}, \qquad (5.10)$$
where
$$\nu^h(\mathbf{y}^w_k) = \frac{1}{\alpha^h(\mathbf{y}^w_k)}\left(1 - e^{-\alpha^h(\mathbf{y}^w_k)\Delta t^h_{k+1}}\right)\xi^h(\mathbf{y}^w_k)\, e^{-\alpha^h(\mathbf{y}^w_k)\left(T_h - t^h_{k+1}\right)}\left(\lambda^h(\mathbf{y}^w_k) - \frac{\xi^h(\mathbf{y}^w_k)}{4}e^{-\alpha^h(\mathbf{y}^w_k)\left(T_h - t^h_{k+1}\right)}\left(1 + e^{-\alpha^h(\mathbf{y}^w_k)\Delta t^h_{k+1}}\right)\right), \qquad (5.11)$$
$$\sigma^h(\mathbf{y}^w_k) = \xi^h(\mathbf{y}^w_k)\, e^{-\alpha^h(\mathbf{y}^w_k)\left(T_h - t^h_{k+1}\right)}\sqrt{\frac{1 - e^{-2\alpha^h(\mathbf{y}^w_k)\Delta t^h_{k+1}}}{2\alpha^h(\mathbf{y}^w_k)}}. \qquad (5.12)$$
The stochastic basis (Ω, F , {Fk }, P) serves as the modelling background and supports the
stochastic processes considered in our framework. The global filtration Fk is defined as
Fk := Fkw ∨ Fkz , where Fkw and Fkz are filtrations generated by ywk and Wt (a P-Wiener
process), respectively.
It is important to note that ywk is latent rather than directly observable from the salmon
futures market. In particular, the state of the HOHMM is hidden in the noisy observation
process Ghk+1 , which is evolving under P. Following the ideas common in papers [32]-[37],
a transformation that converts ywk into a regular Markov chain is employed, after which the
usual HMM-filtering estimations apply. Consider the mapping $\eta(e_b, e_c) := e_{bc}$, for $1 \le b, c \le N$, where $e_{bc}$ is an $\mathbb{R}^{N^2}$ unit vector with 1 in its $((b - 1)N + c)$th position. The
semi-martingale representation of the new Markov chain $\eta(\mathbf{y}^w_{k+1}, \mathbf{y}^w_k)$ is then given by
$$\eta(\mathbf{y}^w_{k+1}, \mathbf{y}^w_k) = \mathbf{B}\,\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}) + \mathbf{w}_{k+1},$$
where $\{\mathbf{w}_{k+1}\}_{k \ge 1}$ is a sequence of martingale increments, and $\mathbf{B}$ is an $\mathbb{R}^{N^2 \times N^2}$ probability transition matrix
with entries $b_{ji} := p_{dcb}$ for $j = (d - 1)N + c$, $i = (c - 1)N + b$, and $b_{ji} = 0$ otherwise.
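A minimal R sketch of the mapping $\eta$ described above is given below, assuming states are labelled $1, \ldots, N$; the function name is hypothetical.

```r
# Sketch of eta(e_b, e_c): a unit vector of length N^2 with 1 in position
# (b - 1) * N + c, used to convert a pair of HOHMC states into a single
# state of a regular Markov chain.
eta <- function(b, c, N) {
  v <- rep(0, N^2)
  v[(b - 1) * N + c] <- 1
  v
}

eta(b = 2, c = 1, N = 3)  # 1 appears in position (2 - 1) * 3 + 1 = 4
```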
The ensuing calculations are facilitated by the IID assumption. So, filters are computed under a reference measure $\widetilde{P}$, and then they are related back to $P$ with the justification via Girsanov's theorem.
Remark 2: The reference measure $\widetilde{P}$ for filtering is not the same as the equivalent risk-neutral
measure $Q$ used in pricing. The purpose of having a $\widetilde{P}$ is to circumvent the direct
calculations under $P$, which necessitate hard semi-martingale computations.
Under our modelling framework, the suitable Radon-Nikodým derivative of $P$ with respect
to $\widetilde{P}$ is constructed as
$$\Psi_k := \left.\frac{dP}{d\widetilde{P}}\right|_{\mathcal{F}_k} = \prod_{h=1}^{g}\prod_{l=1}^{k}\varphi^h_l, \quad k \ge 1, \qquad \Psi_0 = 1,$$
$$\varphi^h_l = \frac{\phi\!\left(\sigma^h(\mathbf{y}^w_{l-1})^{-1}\left(G^h_l - G^h_{l-1} - \nu^h(\mathbf{y}^w_{l-1})\right)\right)}{\sigma^h(\mathbf{y}^w_{l-1})\,\phi\!\left(G^h_l\right)} = \frac{1}{\sigma^h(\mathbf{y}^w_{l-1})}\exp\!\left(\frac{1}{2}\left(G^h_l\right)^2 - \frac{1}{2}\,\sigma^h(\mathbf{y}^w_{l-1})^{-2}\left(G^h_l - G^h_{l-1} - \nu^h(\mathbf{y}^w_{l-1})\right)^2\right),$$
where φ is the probability density function of an N (0, 1) random variable, and Fk is the
filtration generated by the observation process Gk . Every parameter driving the multivari-
ate observation process in equation (5.10) is modulated by the same ywk . Even though the
correlation of salmon futures prices is not explicitly modelled, it is implicitly encapsulated
in the underlying higher-order Markov chain regulating to G0k s multivariate dynamics with
possibly dependent variates.
The general principle of obtaining optimal estimates of pertinent quantities under $P$ for a
multi-dimensional HOHMM setting is first to construct filters, which are conditional expectations
under the reference measure $\widetilde{P}$ involving functions of $\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1})$. Then, the filters
under $\widetilde{P}$ will be used to recover the model's parameter estimates under $P$.
By the Bayes’ theorem for conditional expectations, a filter of η(ywk , ywk−1 ) under P can be
computed as
EP [Ψk η(ywk , ywk−1 )|Fk ]
e
qk = E η(yk , yk−1 ) | Fk = .
w w
(5.16)
EPe[Ψk |Fk ]
Let $s_k := E^{\widetilde{P}}\left[\Psi_k\,\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}) \mid \mathcal{F}_k\right]$. Since $\sum_{c,b=1}^{N}\left\langle\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}), e_{cb}\right\rangle = \left\langle\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}), \mathbf{1}\right\rangle = 1$, where $\mathbf{1}$ denotes a vector of ones, we have
$$\sum_{c,b=1}^{N}\left\langle s_k, e_{cb}\right\rangle = \sum_{c,b=1}^{N}\left\langle E^{\widetilde{P}}\left[\Psi_k\,\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}) \mid \mathcal{F}_k\right], e_{cb}\right\rangle = \sum_{c,b=1}^{N} E^{\widetilde{P}}\left[\Psi_k\left\langle\eta(\mathbf{y}^w_k, \mathbf{y}^w_{k-1}), e_{cb}\right\rangle \mid \mathcal{F}_k\right] = E^{\widetilde{P}}\left[\Psi_k \mid \mathcal{F}_k\right]. \qquad (5.17)$$
Plugging the result from equation (5.17) into equation (5.16), together with the construction
of $s_k$, yields
$$q_k = \frac{s_k}{\sum_{c,b=1}^{N}\left\langle s_k, e_{cb}\right\rangle} = \frac{s_k}{\left\langle s_k, \mathbf{1}\right\rangle}. \qquad (5.18)$$
As with Xi and Mamon [32], we also define the following relevant quantities:
$$A^{tsr}_k = \sum_{l=2}^{k}\left\langle \mathbf{y}^w_{l-2}, e_r\right\rangle\left\langle \mathbf{y}^w_{l-1}, e_s\right\rangle\left\langle \mathbf{y}^w_l, e_t\right\rangle, \qquad (5.19)$$
$$B^t_k = \sum_{l=2}^{k}\left\langle \mathbf{y}^w_{l-1}, e_t\right\rangle = B^t_{k-1} + \left\langle \mathbf{y}^w_{k-1}, e_t\right\rangle, \qquad (5.20)$$
$$B^{ts}_k = \sum_{l=2}^{k}\left\langle \mathbf{y}^w_{l-1}, e_t\right\rangle\left\langle \mathbf{y}^w_{l-2}, e_s\right\rangle, \qquad (5.21)$$
$$C^t_k\!\left(f\!\left(G^h_k\right)\right) = \sum_{l=2}^{k} f\!\left(G^h_l\right)\left\langle \mathbf{y}^w_{l-1}, e_t\right\rangle = C^t_{k-1}\!\left(f\!\left(G^h_{k-1}\right)\right) + f\!\left(G^h_k\right)\left\langle \mathbf{y}^w_{k-1}, e_t\right\rangle, \qquad (5.22)$$
where $r, s, t = 1, \ldots, N$, $2 \le l \le k$, $1 \le h \le g$, and $f$ has the form $f\!\left(G^h\right) = G^h$
or $f\!\left(G^h\right) = \left(G^h\right)^2$. Equations (5.19), (5.20) and (5.21) refer, respectively, to the Markov
chain's number of jumps from state $(e_r, e_s)$ to $e_t$, amount of time spent in state $e_t$, and occupation
time in state $(e_t, e_s)$ up to time $k$. The auxiliary process $C^t_k(f)$ in equation (5.22)
is the level sum for the states $e_t$.
We define two N 2 ×N 2 matrices C and D (similar to Xiong and Mamon [37]) for deriving the
filters of η(ywk+1 , ywk ). Recursive filtering relations for ζk+1
w Ak+1
tsr η(yw , yw ) , ζ w
k+1 k k+1 Bk+1 η(yk+1 , yk ) ,
t w w
ζk+1
w Bk+1
ts η(yw , yw ) , and ζ w
k+1 k k+1 C t ( f )η(yw , yw ) are obtained with the aid of equations
k+1 k+1 k
(5.14) and (5.18).
where
$$c^i_{k+1} = \prod_{h=1}^{g}\frac{1}{\sigma^h_i}\exp\!\left(\frac{1}{2}\left(G^h_{k+1}\right)^2 - \frac{1}{2}\left(\sigma^h_i\right)^{-2}\left(G^h_{k+1} - G^h_k - \nu^h_i\right)^2\right).$$
Proof The derivations are analogous to the proofs provided in Xi and Mamon [32], or
Xiong and Mamon [37].
Remark 3: Expressions for our recursive filters (5.23)-(5.27) in matrix notation are more
compact and ‘neater’ than those presented in Erlwein and Mamon [10], engendering a
more efficient evaluation using software with intensive matrix-manipulation capability. Ad-
ditionally, these results are extensions of the filtering equations derived by Date et al. [5]
for commodity futures prices under a regular HMM framework.
Once the quantities in Proposition 5.3.1 are all determined, model parameter estimates are
immediate from Proposition 5.3.2.
5.4 Numerical application

The Fish Pool Index (FPI), which underlies the contracts considered here, is a weighted combination of the following indices (weights given in parentheses): Nasdaq Salmon Index (85%), Fish Pool European
Buyers Index (10%), and Statistics Norway customs statistics (5%). To avoid issues con-
cerning deliverable grades, the FPI makes use of the weighted average of the most traded
weight categories: 3–4 kg, 4–5 kg, and 5–6 kg with contributions of 30%, 40%, and 30%, re-
spectively, to the averaging procedure.
Spot prices are accessible at the Fish Pool on a weekly basis for immediate delivery. On
the other hand, forward prices are available on a daily basis, reflecting the expectations
of market participants for future trading. The interest rate is assumed deterministic rather
than stochastic, so that forward prices in the Fish Pool can be treated as equivalent to futures prices
in the numerical application. To the best of our knowledge, research studies in the literature on
pricing salmon futures transform daily futures prices into weekly or monthly fre-
quency. This is evident from the futures price modelling formulation of Ankamah-Yeboah
et al. [23], Asche et al. [1], and Ewald et al. [13], which contains weekly spot prices.
Remark 4: Our approach differs significantly from the current methodology on salmon
futures price modelling. Instead of averaging futures prices to generate a proxy for the
weekly spot prices as the common practice in various papers, we directly use daily futures
price observations for modelling and forecasting the dynamics of futures prices in the near
or medium-term horizons.
We customise a filtering method tailored to our proposed multivariate model; this is then
implemented on data concerning futures contracts with maturities up to 6 months. These
are short-term contracts that are more frequently traded than others in the salmon market;
see Ankamah-Yeboah et al. [23] and Ewald et al. [13]. We consider a data set of daily
log-return series of futures prices collected by the Fish Pool; this data set comprises 1515
data points. The 12 maturity dates are denoted by T h , where h = 1, . . . , 12, and they
are set to be the last business day in the months of January 2016, February 2016 , . . . ,
December 2016. When the filtering procedure is carried out on the multivariate data, a
moving window goes over time $t_h$ until the entire trajectory of futures price curves is
exhaustively processed. Futures, with maturities from 1 to 6 months, are traded on any
business date between the starting th and ending th as shown in Table 5.1.
Table 5.1: Illustrating the data periods of futures contracts (maturities of 1–6 months) cov-
ered by the moving window in the filtering procedure
To further elucidate the relevance of the information in Table 5.1, consider futures contracts
with expiry date $T_1$ (29 Jan 2016) and refer to Table 5.2.
Table 5.2: Futures contracts with maturity up to 6 months and expiration on 29 Jan 2016
If a market participant enters into a futures contract on any business day between $t^1_1$ (03 Aug
2015) and $t^1_{21}$ (31 Aug 2015), then he holds a 6-month futures contract with maturity
date $T_1$; in this case, the contract could have between 151 and 180 days until expiration.
A similar reasoning can be made when holding 5-month, . . . , 1-month contracts until
the expiration $T_1$ is reached. Our algorithm, with the moving window going through time
points $t_h$, is applied to futures contracts with maturity dates $T_2, T_3, \ldots, T_{12}$. This proposed
data-reorganisation scheme ensures that for times to maturity $T_h - t^h_k$, $1 \le k \le 126$ ($k$
expressed as a number of trading days), we have 12 data points corresponding to each $t_k$.
More importantly, this scheme creates a multivariate time series without missing any data
point or entering one twice.
Remark 5: A novelty of this work arises from realistically treating the remaining time to
maturity T h − tkh to be varying. This gives the benefit of using all available raw information
without any additional data transformation in the process of modelling and parameter es-
timation.
Table 5.3 presents the descriptive statistics of our data set. Low volatility as well as skew-
ness and kurtosis of low magnitude are observed for the log-futures price data.
Appropriate initial parameter values for our self-calibrating estimation can be obtained via
the least-squares method suggested in Erlwein and Mamon [10] or the likelihood maximisation
mentioned in Date et al. [5]. In our case, we adopt the latter method. Considering $G_t$ as a
one-state process, the maximiser of the associated log-likelihood function, given a series of
observations, is given by
$$\operatorname*{argmax}_{\nu,\,\sigma}\; \log L\left(G_{k+1}; \nu, \sigma\right) = \operatorname*{argmax}_{\nu,\,\sigma}\;\sum_{k=1}^{m}\left[\log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{\left(G_{k+1} - G_k - \nu\right)^2}{2\sigma^2}\right], \qquad (5.31)$$
where m = 120 in this numerical application. For a g-dimensional setting, the log-likelihood
to be maximised must simultaneously take into account all futures prices spanning matu-
rities T h , 1 ≤ h ≤ g, where g = 12. The aim is to find maximisers νh and σh for equation
(5.10). Given that $\{z^h_{k+1}\}$ is a sequence of IID $\mathcal{N}(0, 1)$ random variables, we must solve the optimisation problem
$$\operatorname*{argmax}_{\boldsymbol{\nu},\,\boldsymbol{\sigma}}\; \log L\left(\mathbf{G}_{k+1}; \boldsymbol{\nu}, \boldsymbol{\sigma}\right) = \operatorname*{argmax}_{\boldsymbol{\nu},\,\boldsymbol{\sigma}}\;\sum_{h=1}^{g}\sum_{k=1}^{m}\left[\log\frac{1}{\sqrt{2\pi}\,\sigma^h} - \frac{\left(G^h_{k+1} - G^h_k - \nu^h\right)^2}{2\left(\sigma^h\right)^2}\right]. \qquad (5.32)$$
Invoking Date et al. [5], it is reasonable to assume that the speed of mean reversion is fixed
and independent of the Markov chain when modelling the log-futures prices of a commod-
ity. In fact, we validated that the estimates of α in our model remains relatively stable using
prices of the short-term futures. A fixed α is justified and provides some simplification
in the computation. We emphasise that both λ and ξ are still governed by yw , and their
estimated values are required to update ν(yw ) and σ(yw ) in equation (5.10). We address
problem (5.32) using both R functions 'nlm' and 'optim', generating initial parameter estimates
$\hat{\alpha} = 0.9864$, $\hat{\lambda} = 4.2257$, and $\hat{\xi} = 0.1855$; these estimates are used as benchmarks in
selecting starting values for the multi-regime HOHMMs.
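A simplified R sketch of the likelihood maximisation in (5.31)–(5.32) using 'optim' is given below; it optimises over $(\nu, \sigma)$ for a single maturity rather than over $(\alpha, \lambda, \xi)$ as done in the chapter, and the data values are illustrative.

```r
# Sketch of the one-state Gaussian log-likelihood of the increments of G
# for a single maturity, maximised over (nu, sigma) via 'optim'.
neg_loglik <- function(par, G) {
  nu <- par[1]
  sigma <- exp(par[2])                 # exp() keeps sigma positive
  incr <- diff(G)
  -sum(dnorm(incr, mean = nu, sd = sigma, log = TRUE))
}

G <- log(c(60.1, 60.4, 60.2, 60.8, 61.0, 60.7))   # illustrative log-futures prices
fit <- optim(par = c(0, log(0.01)), fn = neg_loglik, G = G)
c(nu = fit$par[1], sigma = exp(fit$par[2]))
```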
Each variate in our multivariate HOHMM has the same number of regimes and every variate's
switching is driven by the same HOHMC's transition matrix P. The time increment is
∆t = 1 trading day in the processing of our data, which is divided into 24 batches. Each
batch, for one algorithm pass, has 5 vectors (i.e., 5 time points $t^h_k$ for all $T_h$, $h = 1, 2, \ldots, 12$), giving a
window size that covers an entire week of trading. One week of multivariate data is deemed
sufficient to accumulate new information that could impact the futures market, such as substantial
changes in supply and demand, climatic conditions (e.g., sea-surface temperature,
water currents, etc), economic and political events, amongst other market factors. Thus,
our algorithm formulation updates quantities that are functions of the HOHMM every trading
week through the online filters (5.23)-(5.27).
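A small R sketch of the weekly batching described above follows, assuming the multivariate log-futures prices are stored in a hypothetical 120 × 12 matrix 'G_mat' (rows are trading days, columns are maturities).

```r
# Sketch of splitting 120 trading days of 12-dimensional data into 24
# weekly batches of 5 days each, one batch per algorithm pass.
set.seed(2)
G_mat <- matrix(rnorm(120 * 12, mean = log(60), sd = 0.01), nrow = 120, ncol = 12)
batches <- split(seq_len(nrow(G_mat)), rep(1:24, each = 5))
length(batches)              # 24 algorithm passes
dim(G_mat[batches[[1]], ])   # 5 x 12 block processed in the first pass
```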
The filtered estimates are then fed into equations (5.28)-(5.30) to get new parameter esti-
mates. The estimates from the most recent algorithm pass serve as initial parameter values
for the next algorithm pass via the recursive filtering equations. We experimented with various
combinations of different initial parameter values and other filtering windows. It is
found that, for the majority of the time during the filtering process, convergence of parameter
estimates could be achieved without any iteration failures. When this is true, window sizes
and starting values only impact the convergence speed, but they do not produce substantially
different results.
Figures 5.1 and 5.2 depict the movement of the entries for the transition probability matrix
P under the 2-state and 3-state settings, respectively. The jumps in the evolution of proba-
bilities reflect possible market-state changes and price fluctuations. Under both 2-state and
3-state settings, noticeable changes occur from the 11th to 15th algorithm pass; and after
approximately 19 algorithm passes, stability of transition probability evolution is attained.
Evidence of regime switching is also detected in the evolution of other parameter estimates
for the process Gt in equation (5.10).
When estimates of ν and σ are generated and with the the value of α computed from solving
the aforementioned optimisation in (5.32), the evolution of the estimates for λ and ξ can
be obtained through equations (5.11) and (5.12). Figures 5.3 and 5.4 show the evolution of
Figure 5.1: Evolution of transition probabilities under a 2-state HOHMM for futures prices
with expiry T h .
parameters, under both 2- and 3-state HOHMM set ups, using prices from contracts with
expiration on the last maturity date T 12 . The small magnitude of volatilities ξ and σ is
consistent with the descriptive statistics of the data portrayed in Table 5.3. Figures 5.1 and
5.2 illustrate that the behaviour of parameter estimates is relatively stable after a period of
dramatic switches. Also, the general downward pattern of parameter dynamics is a shared
characteristic of Figures 5.3 and 5.4.
For a comprehensive evaluation of model performance, we also implement our filtering al-
gorithm on 2- and 3-state HMM frameworks, which are two special cases of an HOHMM
with lag order 1. It is a common post-diagnostic check to compare the forecasting perfor-
mance of a proposed model with other benchmarked models, such as the random walk and
ARCH-type models; see Hardy [16]. However, these typical benchmarks are incompatible
with the HMM and HOHMM frameworks. This is because we directly model multivariate
futures prices and incorporate memory property of the market into the setting, whereas ran-
dom walk and ARCH models deal with univariate spot prices. We also rely on Mamon et
al. [22] whereby it was found that ARCH and GARCH models are unable to beat the reg-
ular HMMs’ performance with respect to capturing short- and medium-term predictability
of a data series. To find the ‘best-performing’ model that captures the main features of the
Figure 5.2: Evolution of transition probabilities under a 3-state HOHMM for prices of
futures with expiry T h .
Figure 5.3: Evolution of parameter estimates ξ, λ, ν, and σ under a 2-state HOHMM for
prices of futures with expiry 30 Dec 2016.
Figure 5.4: Evolution of parameter estimates ξ, λ, ν, and σ under a 3-state HOHMM for
prices of futures with expiry 30 Dec 2016.
salmon futures prices, we evaluate the prediction of Gt under the 1-, 2-, and 3-state HMMs
and HOHMMs. The one-step ahead forecasts in Date et al. [5] are extended; additionally,
the d-step ahead predictions for all proposed models are assessed.
Proposition 5.4.1 Given information up to time $k$, the 'best' estimate of the $d$-step ahead
forecast of the $g$-dimensional observation process $G^h_{k+1} = \log F^h_{k+1}$ is
$$E\left[F^h_{k+d} \mid \mathcal{F}_k\right] = F^h_k\prod_{l=1}^{d}\sum_{j,i=1}^{N}\left\langle \mathbf{B}\,q_{l-1},\, e_{ji}\right\rangle \exp\!\left(\nu^h_j + \frac{\left(\sigma^h_j\right)^2}{2}\right). \qquad (5.33)$$
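An R sketch of the $d$-step ahead forecast in equation (5.33) is given below; the $N^2 \times N^2$ transition matrix 'B_mat', the filtered probability vector 'q_filt', and the state-level parameters 'nu' and 'sigma' for a given maturity are hypothetical illustrative inputs.

```r
# Sketch of equation (5.33): d-step ahead forecast of a futures price.
d_step_forecast <- function(F_k, d, B_mat, q_filt, nu, sigma) {
  N <- length(nu)
  # exp(nu_j + sigma_j^2 / 2) occupies positions (j - 1) * N + i, i = 1, ..., N
  growth <- rep(exp(nu + sigma^2 / 2), each = N)
  q <- q_filt
  out <- F_k
  for (l in seq_len(d)) {
    q <- as.vector(B_mat %*% q)      # evolve the chain one step via B
    out <- out * sum(q * growth)
  }
  out
}

# Illustrative 2-state example with a 4 x 4 column-stochastic B_mat
B_mat <- matrix(0.25, nrow = 4, ncol = 4)
d_step_forecast(F_k = 60, d = 5, B_mat = B_mat,
                q_filt = c(0.4, 0.1, 0.3, 0.2),
                nu = c(0.001, -0.002), sigma = c(0.01, 0.02))
```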
Figure 5.5 depicts the one-step ahead prediction, taking d = 1 in equation (5.33), for salmon
futures prices under the 3-state HOHMM setting with maturity dates of 28 Oct 2016, 29
Nov 2016 and 29 Dec 2016. We can observe that the one-step ahead forecasts are quite
close to the actual market data. Although not shown here, similar patterns can be found
for the one-step ahead forecasts under the HMM setting as well. The very short-term forecasts
from our proposed models follow the actual prices at the Fish Pool closely. The
trends and dynamics of salmon futures prices, from visual inspection, are captured well by
our filtering algorithms and estimation procedure.
A formal way of quantifying forecast quality is through an error analysis, i.e., examining
the goodness of fit for all the proposed HMMs and HOHMMs. Following the criteria in
Erlwein et al. [7] and Date et al. [5, 4], we evaluate the root-mean-squared error (RMSE),
absolute-mean error (AME), relative-absolute error (RAE), and mean-absolute-percent er-
ror (MAPE) of the proposed and competing multi-dimensional models. We extend further
the assessment of the error metrics to the case of d-step ahead forecasts.
The results of error analyses for d = 1, 2, . . . , 5, i.e., spanning the entire next week’s trading
days, are presented in Table 5.4. For all d, the 3-state model outperforms other HOHMM-
state settings. Except for the 1-step ahead forecasting case, where the 2-state HMM yields
better fit than those obtained from the 1- and 3-state HMMs, the HOHMM with 3 states is
adjudged better than any HMM set up in terms of goodness of fit. Compared to the HMM
settings, the HOHMM settings produce smaller errors in one-step ahead predictions, whilst
no pronounced difference is detected as the forecasting horizons become longer. The 4-
state HMM and HOHMM settings, which were examined as well, effect only negligible
improvements. Therefore, there is no practical motivation to pursue an HOHMM with a
5.4. Numerical application 167
Table 5.4: Error analysis of HMM- and HOHMM-based models under 1-, 2-, and 3-state
settings for salmon futures prices
Table 5.5: Bonferroni-corrected p-values for the paired t-test performed on the RMSEs
involving salmon futures prices
number of states greater than 3. In fact, it is also prohibitive to consider a large number of
states since the number of transition-probability entries becomes unwieldy even with the
addition of just one regime; see Figures 5.1 and 5.2. Notwithstanding the small fitting errors seen
in general for the 1-, 2- and 3-state HMM and HOHMM settings, the HMMs and HOHMMs
with a regime-switching feature clearly outperform the special-case 1-state model across
all forecasting metrics. This indicates the indisputable benefit of incorporating regime-
switching and memory-capturing capabilities into models. Overall, the 3-state HOHMM is
the best modelling framework in accurately describing the dynamics of our salmon futures-
price data set.
The statistical significance of the error-mean differences in each pairwise setting of our
proposed models is evaluated by a t-test based on a 95% confidence level. The adjusted
p-values are computed with Bonferroni's approach to control the family-wise error rate.
Table 5.5 displays the estimated Bonferroni-corrected p-values for the three pairs of model
settings.
For the 2-state HMM versus 3-state HMM and the 2-state HOHMM versus 3-state HOHMM
models, the p-values are larger than 0.05. Therefore, the null hypothesis of no significant
difference cannot be rejected. This tells us that the 2-state and 3-state HMM settings (also
2-state and 3-state HOHMM settings) have similar capability in capturing the dynamics
of the observation process although Table 5.4 shows the 3-state is slightly better than the
2-state setting. For the 1-state versus 2-state and 1-state versus 3-state setting under HMM
(also, 1-state versus 2-state and 1-state versus 3-state settings under HOHMM), the p-values
are appreciably smaller than 0.01. So, the RMSE differences between the HMM-based
switching and no-switching models (also, between the HOHMM-based switching and no-
switching models) are statistically significant.
Table 5.6: Number of estimated parameters under HMM and HOHMM settings for salmon
futures prices
The AIC emphasises the penalty for increasing the number of parameters. The AICc is
the AIC with a correction for sample sizes that are small relative to the number of model
parameters, obtained by increasing the penalty for model complexity. The BIC strengthens
the AIC and AICc by selecting the 'best' model from a set of candidate models with a
different penalty for the addition of parameters. The AIC, AICc and BIC are computed as
$$\mathrm{AIC} = 2l - 2\log L, \qquad (5.34)$$
$$\mathrm{AICc} = \mathrm{AIC} + \frac{2l(l+1)}{B - l - 1}, \qquad (5.35)$$
$$\mathrm{BIC} = l\log B - 2\log L, \qquad (5.36)$$
where $l$ is the number of model parameters to be estimated (see Table 5.6) and $\log L$ is the corresponding log-likelihood,
with B being the number of observations in each algorithm pass. The model that gives the
smallest information-criterion assessment value obtained using equations (5.34)– (5.36)
is preferred. We compute the values of AIC, AICc and BIC for the 1-, 2- and 3-state
HMMs and HOHMMs after each algorithm step. The evolution of the calculated values of
these information-criterion metrics is plotted in Figure 5.6. We observe that the the 1-state
170 Chapter 5. Modelling and forecasting salmon futures-prices curves
model generates the largest values and the most volatile patterns throughout the entire data
period. Even though the 3-state HOHMM produces higher values than those from other
regime-switching models at the earlier part of the algorithm pass, it generally has stable
patterns and smaller AIC/AICc/BIC values thereafter. These findings fortify the results of
the above-mentioned error analyses, selecting the 3-state HOHMM as the best model for
the data set studied in this chapter.
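The sketch below illustrates how the three criteria in equations (5.34)–(5.36) could be evaluated after each algorithm pass, given the maximised log-likelihood, the number of estimated parameters p, and the number of observations B; the function name and the numerical values are illustrative assumptions.

    import math

    def information_criteria(log_likelihood, p, B):
        """Return (AIC, AICc, BIC) for a fitted model.

        log_likelihood : maximised log-likelihood of the model
        p              : number of estimated parameters
        B              : number of observations in the algorithm pass
        """
        aic = 2 * p - 2 * log_likelihood
        aicc = aic + (2 * p * (p + 1)) / (B - p - 1)  # small-sample correction
        bic = p * math.log(B) - 2 * log_likelihood
        return aic, aicc, bic

    # Hypothetical values for one pass of a 3-state HOHMM.
    print(information_criteria(log_likelihood=5210.0, p=45, B=260))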
Probing the models' forecasting performance also entails running the models on the data set
covering the out-of-sample period, i.e., the data period not used in model estimation. Empirical
evidence from in-sample performance is commonly less reliable than that generated from analysing
forecasts over the out-of-sample period, which better reflects the information available in 'real
time'; see White [30], and Stock and Watson [29]. As the d-step ahead prediction is
forward-looking and does not overlap with the period encompassing the data used for the filtering
and estimation of parameters, our approach is consistent with out-of-sample forecasting.
This section on numerical demonstration culminates with an appraisal of all model settings
on their one-step ahead prediction ability using price data of futures with maturities after
30 Dec 2016. Specifically, this out-of-sample forecasting exercise focuses on futures with
maturities up to 6 months that expire on the last business day of Jan 2017, Feb 2017, . . .,
Jun 2017. Figure 5.7 exhibits the comparison between the actual and predicted futures
prices under the 3-state HOHMM. It shows that the movement of the one-step ahead predictions
resembles that of the actual data. Such a result is also in agreement with what Figure 5.5
reveals. The results of the out-of-sample error analysis under the 1-, 2- and 3-state HMMs
and HOHMMs are also reported in Table 5.7. The 2- and 3-state HMMs and HOHMMs
incontestably outperform the 1-state model with respect to all forecasting metrics whilst
the 3-state HOHMM has the smallest forecast errors. Summing up the pertinent
evidence from error analyses, information-criteria evaluations, and out-of-sample predic-
tions, it is ascertained that there is merit in putting forward the regime-switching settings
for the modelling and forecasting of Fish Pool’s futures prices. The 3-state HOHMM turns
out to be the best-fitting model for the chosen sample data within the intent of our empirical
analysis.
Table 5.7: Out-of-sample error analysis of HMM- and HOHMM-based models under 1-, 2- and 3-state settings for futures prices
5.5 Conclusion
In this work, we put forward a multi-dimensional HOHMM-based approach to examining the evolution
of salmon futures prices. It is assumed that the parameters of the arbitrage-free log-futures
price model are driven by a second-order hidden Markov chain in discrete time. The usual approach
to solving for the MLEs is to utilise classic Kalman filtering under a univariate setting, which
is common in the price-discovery research of the aquaculture literature. In light of the
correlated multivariate nature of the salmon futures-price data, however, we developed suitable
self-calibrating filtering algorithms, given in matrix form, for optimal parameter estimation and
short-term price prediction. This work widens the collection of available quantitative
techniques, further enabling the use of financial technologies in the management of price risk
and volatility in the fisheries and aquaculture industries. Our results could be extended in a
number of directions, such as applying the proposed framework to develop dynamic hedging
strategies and to explore the optimisation of portfolios with direct exposure to fish and seafood
prices.
Figure 5.5: One-step ahead predictions for futures prices (in NOK), comparing actual and predicted prices, with expiries on 28 Oct 2016, 29 Nov 2016, and 29 Dec 2016.
Figure 5.6: AIC, AICc and BIC (plotted against algorithm steps) for the 1-state HMM/HOHMM and the 2- and 3-state HMM- and HOHMM-based models for salmon futures prices.
Figure 5.7: One-step ahead, out-of-sample predictions under the 3-state HOHMM for futures prices (in NOK), comparing actual and predicted prices, with expiries on 31 Jan 2017, 28 Feb 2017, and 31 Mar 2017.
References
[1] F. Asche, B. Misund, A. Oglend, The spot-forward relationship in the Atlantic salmon
market, Aquaculture Economics and Management, 20(2)(2016), 222–234.
[4] P. Date, L. Jalen, R. Mamon, A partially linearised sigma point filter for latent state
estimation in nonlinear time series models, Journal of Computational and Applied
Mathematics, 233(2010) 2675–2682.
[5] P. Date, R. Mamon, A. Tenyakov, Filtering and forecasting commodity futures prices
under an HMM framework, Energy Economics, 40(2013) 1001–1013.
[6] R. Elliott, L. Aggoun, J. Moore, Hidden Markov Models: Estimation and Control,
Springer, New York (1995).
[7] C. Erlwein, F. Benth, R. Mamon, HMM filtering and parameter estimation of an elec-
tricity spot price model, Energy Economics, 32 (5) (2010) 1034–1043.
[8] R. J. Elliott, T. K. Siu, A. Badescu, Bond valuation under a discrete-time regime-
switching term-structure model and its continuous-time extension, Managerial Fi-
nance, 37(11)(2011) 1025–1047.
[10] C. Erlwein, R. Mamon, An online estimation scheme for a Hull-White model with
HMM-driven parameters, Statistical Methods and Applications, 18(1)(2009) 87–107.
[11] C. Ewald, Derivatives on nonstorable renewable resources: Fish futures and options,
not so fishy after all, Natural Resource Modeling, 26(2) (2013), 215–236.
[13] C. Ewald, R. Ouyang, T. K. Siu, The market for salmon futures: An empirical anal-
ysis of the Fish Pool using the Schwartz multi-factor model. Quantitative Finance,
16(12)(2016), 1823–1842.
[15] A. Guttormsen, Faustman in the sea: Optimal rotation in aquaculture. Marine Re-
source Economics, 23(4)(2008), 401–410.
[17] M. Manoliu, S. Tompaidis, Energy futures prices: Term structure models with
Kalman filter estimation, Applied Mathematical Finance, 9(1)(2002) 21–43.
[18] Market competition between farmed and wild fish: A literature survey, Food and
Agriculture Organization of the United Nations. http://www.fao.org/3/a-i5700e.pdf (accessed Oct
2017).
[21] R. Mamon, R. Elliott, Hidden Markov Models in Finance: Further Developments and
Applications, International Series in Operations Research and Management Science,
104, Springer, New York, 2014.
[22] R. Mamon, C. Erlwein, R. Gopaluni, Adaptive signal processing of asset price dy-
namics with predictability analysis. Information Sciences, 178(2008), 203–219.
[23] Press release, Fish Pool turned over 90000 tons of salmon last year, 2017,
http://ilaks.no/fish-pool-snudde-over-90-000-tonn-laks-ifjor/ (accessed Oct 2017).
[24] K. Quagrainie, C. Engle, A latent class model for analysing preferences for catfish,
Aquaculture Economics and Management, 10(1)(2006), 1–14.
[25] S. Ross, Hedging long run commitments: Exercises in incomplete market pricing,
Economic Notes: Economic Review of Banca Monte dei Paschi di Siena, 26(2)(1997)
385–420.
[26] E. Schwartz, The stochastic behavior of commodity prices: Implications for valu-
ation and hedging, Journal of Finance, 52(3)(1997) 923–973.
[27] P. Solibakke, Scientific stochastic volatility models for the salmon forward market:
Forecasting (un-)conditional moments. Aquaculture Economics and Management,
16(3)(2012), 222–249.
[30] H. White, A reality check for data snooping. Econometrica, 68(5)(2000), 1097–1126.
[31] R. Weron, Modeling and Forecasting Electricity Loads and Prices: A Statistical Ap-
proach. Hoboken, NJ; Chichester, England: John Wiley and Sons. (2006).
[32] X. Xi, R. Mamon, Parameter estimation of an asset price model driven by a weak
hidden Markov chain, Economic Modelling 28(1) (2011) 36–46.
[33] X. Xi, R. Mamon, Yield curve modelling using a multivariate higher-order HMM. In
Zeng, Y. and Wu, S. (eds), State-Space Models and Applications in Economics and
Finance. New York, Springer (2013) 185–203.
[35] X. Xi, R. Mamon, Parameter estimation in a WHMM setting with independent and
volatility components. In Mamon, R. and Elliott, R. (eds), Hidden Markov Models
in Finance: Volume II (Further Developments and Applications). New York, Springer
(2014) 227–240.
[36] X. Xi, R. Mamon, Capturing the regime-switching and memory properties of interest
rates, Computational Economics, 44(3) (2014), 307–337.
Conclusion
The higher-order hidden Markov process modulates the model parameters in discrete time to capture
the random shifts amongst different economic regimes resulting from the interaction of various
factors. We derived, via some 'idealised' reference probability measure, the recursive filters
for the state of the Markov chain and for auxiliary quantities of the observation process. The EM
parameter estimates are expressed in terms of these adaptive filters, which produces a
self-calibrating modelling methodology. The performance of the four proposed models was assessed
and benchmarked against standard models in the literature and in current practice, comparing
various statistical metrics that quantify each model's goodness of fit (i.e., minimised
forecasting errors) and the balance between the model's maximised likelihood and its inherent
complexity.
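As a schematic illustration of such a self-calibrating procedure (not the thesis's actual implementation), the loop below alternates between running recursive filters on a new batch of observations and updating the parameters from the filtered quantities; run_filters and em_update are hypothetical placeholders standing in for the model-specific filtering recursions and EM formulae.

    def self_calibrating_estimation(batches, initial_params, run_filters, em_update):
        """Process data in batches, re-estimating parameters after each filter pass.

        batches        : iterable of observation batches (processed once, in order)
        initial_params : starting parameter set for the first pass
        run_filters    : function(batch, params) -> filtered quantities
        em_update      : function(filtered quantities) -> updated parameter set
        """
        params = initial_params
        history = []
        for batch in batches:
            filtered = run_filters(batch, params)   # recursive filtering step
            params = em_update(filtered)            # filter-based EM re-estimation
            history.append(params)
        return params, history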
6.1 Summary of research contributions
This research work offers further developments and valuable insights into both the theoretical
and practical aspects of regime-switching models, covering recent financial innovations in the
commodity markets. As a recapitulation, the accomplishments of this research began with the
modelling in Chapter 2 of the DATs under a discrete-time HMM; an online parameter-estimation
scheme and the valuation of a temperature-based option with a parameter sensitivity analysis were
also presented. The formulation of a temperature model with a discrete-time HOHMC governing the
model parameters was introduced in Chapter 3. Recursive filtering equations for various
quantities of the HOHMM were derived with the aid of the probability-measure-change technique,
and we underscored a numerical implementation on a 4-year data set of Toronto DATs. In Chapter 4,
we created a modelling set-up combining OU and jump processes, both modulated by an HOHMC, in the
depiction of electricity spot-price movements. The model was applied to a deseasonalised series
of AESO-collected data. Chapter 5 advanced the use of a multi-dimensional HOHMM in the modelling
and very short-term prediction of futures prices in the aquaculture industry. An empirical
illustration was carried out on salmon futures prices in the Fish Pool market for the purpose of
model validation.
Overall, the significant research contributions of this thesis are: (i) the generalisation of the
regular HMM-OU framework to a modelling set-up that can extract information from the data
observed beyond a unit time lag, hence incorporating the flexibility to capture the data's memory
property; (ii) the development of dynamic model-calibration procedures, through HOHMM filtering,
that address the need for financial technologies for automation in the context of today's
artificial-intelligence-driven world; (iii) the valuation and risk measurement of
temperature-linked derivatives under a regime-switching approach; (iv) the enrichment of the
HOHMM-OU model with a compound Poisson process to accurately chart electricity spot-price
dynamics; and (v) the design of multi-dimensional HOHMM filters and predictors in the analysis of
salmon futures prices, whereby the dynamic estimation processes every available raw data point
exactly once, and no observed values are discarded or transformed into proxy values that would
introduce additional approximation error and uncertainty into the model results.
6.2 Further research directions
• The construction of our HOHMM filtering algorithms is aligned with the discrete-time nature of
the observed time series. As an alternative, we have yet to see the development of HOHMMs and
their filters in continuous time. For example, recent regime-switching models with
continuous-time HMM filters were formulated in [1]. Although this type of filter requires
discretisation and integral approximations, there are instances in which the filter derivations
are more straightforward because they reduce to standard continuous-time stochastic-calculus
computations. Thus, there is merit in considering continuous-time filtering for HOHMMs, which is
still an inchoate area in stochastic modelling.
• A natural direction in modelling the DATs under the HOHMM framework is the
pricing and dynamic hedging of other weather-linked products. For instance, the
References

[2] L. Jalen, R. Mamon, Parameter estimation in a WHMM setting with independent and
volatility components. In Mamon, R. and Elliott, R. (eds), Hidden Markov Models
in Finance: Volume II (Further Developments and Applications). New York, Springer
(2014) 241–261.
Appendix A
Derivations of the model's optimal parameter estimates in Chapter 2

To obtain the optimal estimate of $\delta$, consider the change of measure
$$\frac{d\widehat{P}_{\upsilon^*}}{dP_{\upsilon^*}}\bigg|_{\mathcal{X}_k} = \Psi^{\delta}_k = \prod_{l=1}^{k}\varphi^{\delta}_l,$$
where
$$\varphi^{\delta}_l = \frac{\exp\Big(-\frac{1}{2\sigma^2(\mathbf{y}_{l-1})}\big(X_l - \widehat{\delta}(\mathbf{y}_{l-1})X_{l-1} - \eta(\mathbf{y}_{l-1})\big)^2\Big)}{\exp\Big(-\frac{1}{2\sigma^2(\mathbf{y}_{l-1})}\big(X_l - \delta(\mathbf{y}_{l-1})X_{l-1} - \eta(\mathbf{y}_{l-1})\big)^2\Big)}.$$
Therefore, the log-likelihood for $\Psi^{\delta}_k$ is
$$\log\Psi^{\delta}_k = \sum_{l=1}^{k}\bigg(-\frac{\widehat{\delta}^{\,2}(\mathbf{y}_{l-1})X_{l-1}^2 - 2X_l\widehat{\delta}(\mathbf{y}_{l-1})X_{l-1} + 2\eta(\mathbf{y}_{l-1})\widehat{\delta}(\mathbf{y}_{l-1})X_{l-1}}{2\sigma^2(\mathbf{y}_{l-1})} + R\big(\delta(\mathbf{y}_{l-1})\big)\bigg)$$
$$= \sum_{l=1}^{k}\sum_{i=1}^{N}\langle\mathbf{y}_{l-1},\mathbf{e}_i\rangle\bigg(-\frac{\widehat{\delta}_i^{\,2}X_{l-1}^2 - 2X_l\widehat{\delta}_iX_{l-1} + 2\eta_i\widehat{\delta}_iX_{l-1}}{2\sigma_i^2} + R(\delta_i)\bigg).$$
Taking the conditional expectation given $\mathcal{X}_k$ and expressing the result through the filtered quantities, we obtain
$$L(\widehat{\delta}) = \sum_{i=1}^{N}-\frac{1}{2\sigma_i^2}\Big(\widehat{\delta}_i^{\,2}\,\widehat{T}^{\,i}_k(X_{k-1}^2) - 2\widehat{\delta}_i\,\widehat{T}^{\,i}_k(X_{k-1},X_k) + 2\eta_i\widehat{\delta}_i\,\widehat{T}^{\,i}_k(X_{k-1})\Big) + R(\delta_i).$$
We differentiate $L(\widehat{\delta}_i)$ with respect to $\widehat{\delta}_i$ and set the result to zero, giving
$$\widehat{\delta}_i = \frac{\widehat{T}^{\,i}_k(X_{k-1},X_k) - \eta_i\,\widehat{T}^{\,i}_k(X_{k-1})}{\widehat{T}^{\,i}_k(X_{k-1}^2)}.$$
Similarly, for the estimation of $\eta$, the corresponding log-likelihood is
$$\log\Psi^{\eta}_k = \sum_{l=1}^{k}\bigg(-\frac{\widehat{\eta}^{\,2}(\mathbf{y}_{l-1}) - 2X_l\widehat{\eta}(\mathbf{y}_{l-1}) + 2\widehat{\eta}(\mathbf{y}_{l-1})\delta(\mathbf{y}_{l-1})X_{l-1}}{2\sigma^2(\mathbf{y}_{l-1})} + R\big(\eta(\mathbf{y}_{l-1})\big)\bigg).$$
Invoking equations (2.19) and (2.20) and then taking the expectation of the log-likelihood involving $\mathcal{X}_k$, we obtain $L(\widehat{\eta}) = \mathrm{E}\big[\log\Psi^{\eta}_k \,\big|\, \mathcal{X}_k\big]$ with
$$\mathrm{E}\big[\log\Psi^{\eta}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{i=1}^{N}\langle\mathbf{y}_{l-1},\mathbf{e}_i\rangle\bigg(-\frac{\widehat{\eta}_i^{\,2} - 2X_l\widehat{\eta}_i + 2\widehat{\eta}_i\delta_iX_{l-1}}{2\sigma_i^2}\bigg) + R(\eta_i)\,\bigg|\,\mathcal{X}_k\bigg].$$
Differentiating with respect to $\widehat{\eta}_i$ and setting the derivative to zero yield
$$\widehat{\eta}_i = \frac{\widehat{T}^{\,i}_k(X_{k}) - \delta_i\,\widehat{T}^{\,i}_k(X_{k-1})}{\widehat{O}^{\,i}_k}.$$
For the estimation of $\sigma^2$, consider
$$\frac{d\widehat{P}_{\upsilon^*}}{dP_{\upsilon^*}}\bigg|_{\mathcal{X}_k} = \Psi^{\sigma^2}_k = \prod_{l=1}^{k}\varphi^{\sigma}_l,
\qquad
\varphi^{\sigma}_l = \frac{\sigma(\mathbf{y}_{l-1})\exp\Big(-\frac{1}{2\widehat{\sigma}^2(\mathbf{y}_{l-1})}\big(X_l - \delta(\mathbf{y}_{l-1})X_{l-1} - \eta(\mathbf{y}_{l-1})\big)^2\Big)}{\widehat{\sigma}(\mathbf{y}_{l-1})\exp\Big(-\frac{1}{2\sigma^2(\mathbf{y}_{l-1})}\big(X_l - \delta(\mathbf{y}_{l-1})X_{l-1} - \eta(\mathbf{y}_{l-1})\big)^2\Big)}.$$
The log-likelihood of $\Psi^{\sigma^2}_k$ is calculated as
$$\log\Psi^{\sigma^2}_k = \sum_{l=1}^{k}\bigg(\log\frac{1}{\widehat{\sigma}(\mathbf{y}_{l-1})} - \frac{1}{2\widehat{\sigma}^2(\mathbf{y}_{l-1})}\big(X_l - \delta(\mathbf{y}_{l-1})X_{l-1} - \eta(\mathbf{y}_{l-1})\big)^2 + R\big(\sigma(\mathbf{y}_{l-1})\big)\bigg),$$
where $R(\sigma^2)$ does not contain $\widehat{\sigma}$. From equations (2.19)–(2.20), we have the expectation of the log-likelihood as a function of $\mathcal{X}_k$, denoted by $L(\widehat{\sigma}^2)$, which is given by
$$\mathrm{E}\big[\log\Psi^{\sigma^2}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{i=1}^{N}\langle\mathbf{y}_{l-1},\mathbf{e}_i\rangle\bigg(\log\frac{1}{\widehat{\sigma}_i} - \frac{1}{2\widehat{\sigma}_i^2}\big(X_l - \delta_iX_{l-1} - \eta_i\big)^2\bigg) + R(\sigma_i)\,\bigg|\,\mathcal{X}_k\bigg].$$
Differentiating $L(\widehat{\sigma}^2)$ with respect to $\widehat{\sigma}^2$ and equating the result to 0 yield
$$\widehat{\sigma}_i^2 = \frac{\widehat{T}^{\,i}_k(X_k^2) + \delta_i^2\,\widehat{T}^{\,i}_k(X_{k-1}^2) + \eta_i^2\,\widehat{O}^{\,i}_k + 2\eta_i\delta_i\,\widehat{T}^{\,i}_k(X_{k-1}) - 2\delta_i\,\widehat{T}^{\,i}_k(X_{k-1},X_k) - 2\eta_i\,\widehat{T}^{\,i}_k(X_k)}{\widehat{O}^{\,i}_k}.$$
Finally, for the transition probabilities, the expected log-likelihood has the form $\sum_{j,i=1}^{n}\widehat{J}^{\,ji}_k\log\widehat{\pi}_{ji} + R(\pi_{ji})$, where $R(\pi_{ji})$ is free of $\widehat{\pi}_{ji}$. Noting $\sum_{j=1}^{n}\widehat{\pi}_{ji} = 1$ and introducing a Lagrange multiplier $\varsigma$, we maximise
$$L\big(\widehat{\pi}_{ji},\varsigma\big) = \sum_{j,i=1}^{n}\widehat{J}^{\,ji}_k\log\widehat{\pi}_{ji} + \varsigma\bigg(\sum_{j=1}^{n}\widehat{\pi}_{ji} - 1\bigg) + R(\pi_{ji}).$$
Differentiation of $L(\widehat{\pi}_{ji},\varsigma)$ with respect to $\widehat{\pi}_{ji}$ and $\varsigma$, and then setting each partial derivative to 0, results in
$$\frac{1}{\widehat{\pi}_{ji}}\,\widehat{J}^{\,ji}_k + \varsigma = 0.$$
Since $\sum_{j=1}^{n}\widehat{\pi}_{ji} = 1$ and $\sum_{j=1}^{n}\widehat{J}^{\,ji}_k = \widehat{O}^{\,i}_k$, we have $\sum_{j=1}^{n}\widehat{\pi}_{ji} = \sum_{j=1}^{n}\frac{\widehat{J}^{\,ji}_k}{-\varsigma} = \frac{\widehat{O}^{\,i}_k}{-\varsigma} = 1$. Therefore, the optimal estimate of $\pi_{ji}$ is given by
$$\widehat{\pi}_{ji} = \frac{\widehat{J}^{\,ji}_k}{\widehat{O}^{\,i}_k}.$$
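As a small numerical illustration of this final step (using made-up filtered values, not estimates from the thesis), the update simply normalises the filtered jump counts out of each state by the corresponding filtered occupation time:

    import numpy as np

    def transition_estimates(J_hat, O_hat):
        """pi_hat[j, i] = J_hat[j, i] / O_hat[i]: filtered jumps i -> j over occupation of i."""
        return J_hat / O_hat[np.newaxis, :]

    # Hypothetical filtered quantities for a 2-state chain.
    J_hat = np.array([[60.0, 10.0],
                      [15.0, 90.0]])   # J_hat[j, i]: transitions from state i to state j
    O_hat = np.array([75.0, 100.0])    # occupation times of states 1 and 2
    print(transition_estimates(J_hat, O_hat))  # each column sums to 1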
Appendix B
Proofs of propositions in Chapter 2

Proof. The HDD futures price is calculated as the expected value of the HDD over the contract period, given the information at time $t$, under the risk-neutral measure $Q$. By the Fubini–Tonelli theorem, expectation and integration can be interchanged. We first give the proof for the case $0 \le t \le \tau_1 < \tau_2$:
$$F_H(t,\tau_1,\tau_2) = \mathrm{E}^Q\big[H\,\big|\,\mathcal{F}_t\big]
= \mathrm{E}^Q\bigg[\int_{\tau_1}^{\tau_2}\max(T_{\mathrm{base}} - T_v, 0)\,dv\,\bigg|\,\mathcal{F}_t\bigg]
= \int_{\tau_1}^{\tau_2}\mathrm{E}^Q\big[\max(T_{\mathrm{base}} - T_v, 0)\,\big|\,\mathcal{F}_t\big]\,dv.$$
To solve for the futures price, it is necessary to compute the value of $\mathrm{E}^Q\big[\max(T_{\mathrm{base}} - T, 0)\,\big|\,\mathcal{F}_t\big]$. We suppose $\alpha(\mathbf{y}_t)$ is deterministic so that $\int_s^t e^{\alpha u}\,dB_u$ in equation (2.7) is a Wiener process (cf. Benth and Šaltytė-Benth [2]). The integral $\int_s^t e^{\alpha(\mathbf{y}_u)u}\,dB^Q_u$ in equation (2.42), when evaluated using the optimal parameter estimates under the HMM setting, is then a Wiener process that incorporates deterministic regime switching, with the filtration extended to time $t$. Under the normality assumption, $T_{\mathrm{base}} - T_v$ is normally distributed with mean $\mathrm{E}^Q\big[T_{\mathrm{base}} - T_v\,\big|\,\mathcal{F}_t\big] = M(t,v,X_t)$ and variance $\mathrm{Var}\big[T_{\mathrm{base}} - T_v\,\big|\,\mathcal{F}_t\big] = A^2(t,v)$. Write $M := M(t,v,X_t)$ and $A^2 := A^2(t,v)$. Let $T^* = \dfrac{T_{\mathrm{base}} - T - M}{A}$, which has a standard normal distribution. Then
$$\mathrm{E}^Q\big[\max(T_{\mathrm{base}} - T, 0)\,\big|\,\mathcal{F}_t\big]
= \int_{-\infty}^{T_{\mathrm{base}}}(T_{\mathrm{base}} - T)\,\frac{1}{\sqrt{2\pi}\,A}\,e^{-\frac{(T - (T_{\mathrm{base}} - M))^2}{2A^2}}\,dT$$
$$= \int_{-\frac{M}{A}}^{\infty}(M + AT^*)\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(T^*)^2}{2}}\,dT^*
= M\int_{-\frac{M}{A}}^{\infty}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(T^*)^2}{2}}\,dT^* + A\int_{-\frac{M}{A}}^{\infty}T^*\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(T^*)^2}{2}}\,dT^*$$
$$= M\bigg(1 - \Phi\Big(-\frac{M}{A}\Big)\bigg) + A\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{M}{A}\right)^2}
= M\,\Phi\Big(\frac{M}{A}\Big) + A\,\phi\Big(\frac{M}{A}\Big).$$
Therefore, in terms of $M(t,v,X_t)$, $A^2(t,v)$ and $D(x) = x\Phi(x) + \phi(x)$, the HDD futures price for the period $0 \le t \le \tau_1 < \tau_2$ is given by
$$F_H(t,\tau_1,\tau_2) = \int_{\tau_1}^{\tau_2}\bigg(M(t,v,X_t)\,\Phi\Big(\frac{M(t,v,X_t)}{A(t,v)}\Big) + A(t,v)\,\phi\Big(\frac{M(t,v,X_t)}{A(t,v)}\Big)\bigg)dv$$
$$= \int_{\tau_1}^{\tau_2}A(t,v)\bigg(\frac{M(t,v,X_t)}{A(t,v)}\,\Phi\Big(\frac{M(t,v,X_t)}{A(t,v)}\Big) + \phi\Big(\frac{M(t,v,X_t)}{A(t,v)}\Big)\bigg)dv
= \int_{\tau_1}^{\tau_2}A(t,v)\,D\bigg(\frac{M(t,v,X_t)}{A(t,v)}\bigg)dv.$$
For the scenario $0 \le \tau_1 \le t < \tau_2$, based on the result from the first case and the fact that if $Z$ is $\mathcal{F}_t$-measurable then $\mathrm{E}^Q(Z\,|\,\mathcal{F}_t) = Z$, we have
$$F_H(t,\tau_1,\tau_2) = \mathrm{E}^Q\bigg[\int_{\tau_1}^{\tau_2}\max(T_{\mathrm{base}} - T_v, 0)\,dv\,\bigg|\,\mathcal{F}_t\bigg]
= \mathrm{E}^Q\bigg[\int_{\tau_1}^{t}\max(T_{\mathrm{base}} - T_v, 0)\,dv + \int_{t}^{\tau_2}\max(T_{\mathrm{base}} - T_v, 0)\,dv\,\bigg|\,\mathcal{F}_t\bigg]$$
$$= \int_{\tau_1}^{t}\max(T_{\mathrm{base}} - T_u, 0)\,du + \mathrm{E}^Q\bigg[\int_{t}^{\tau_2}\max(T_{\mathrm{base}} - T_v, 0)\,dv\,\bigg|\,\mathcal{F}_t\bigg]
= \int_{\tau_1}^{t}\max(T_{\mathrm{base}} - T_u, 0)\,du + \int_{t}^{\tau_2}A(t,v)\,D\bigg(\frac{M(t,v,X_t)}{A(t,v)}\bigg)dv.$$
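The closed form $\mathrm{E}[\max(T_{\mathrm{base}} - T, 0)\,|\,\mathcal{F}_t] = M\Phi(M/A) + A\phi(M/A)$ can be checked numerically; the sketch below compares it against a Monte Carlo average for arbitrary illustrative values of $M$ and $A$ (it is a verification aid only, not part of the pricing routine).

    import numpy as np
    from scipy.stats import norm

    def hdd_expectation(M, A):
        """Closed form of E[max(Z, 0)] for Z ~ N(M, A^2)."""
        return M * norm.cdf(M / A) + A * norm.pdf(M / A)

    # Monte Carlo check with illustrative values of M and A.
    rng = np.random.default_rng(0)
    M, A = 3.2, 5.0
    samples = rng.normal(M, A, size=1_000_000)
    print(hdd_expectation(M, A), np.mean(np.maximum(samples, 0.0)))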
Appendix C

Derivations of the model's optimal parameter estimates in Chapter 3

For the estimation of $\kappa$, the log-likelihood associated with the relevant change of measure is
$$\log\Psi^{w\kappa}_k = \sum_{l=1}^{k}\bigg[-\frac{\widehat{\kappa}^{\,2}(\mathbf{y}^w_{l-1})X_{l-1}^2 - 2X_l\widehat{\kappa}(\mathbf{y}^w_{l-1})X_{l-1} + 2\vartheta(\mathbf{y}^w_{l-1})\widehat{\kappa}(\mathbf{y}^w_{l-1})X_{l-1}}{2\varrho^2(\mathbf{y}^w_{l-1})} + R\big(\kappa(\mathbf{y}^w_{l-1})\big)\bigg],$$
and its conditional expectation given $\mathcal{X}_k$ is
$$\mathrm{E}\big[\log\Psi^{w\kappa}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(-\frac{\widehat{\kappa}_t^{\,2}X_{l-1}^2 - 2X_l\widehat{\kappa}_tX_{l-1} + 2\vartheta_t\widehat{\kappa}_tX_{l-1}}{2\varrho_t^2}\bigg) + R(\kappa_t)\,\bigg|\,\mathcal{X}_k\bigg].$$
Differentiating $L(\widehat{\kappa})$ with respect to $\widehat{\kappa}$ and setting the resulting expression to 0, we get the optimal estimate of $\kappa$, given the observations, as
$$\widehat{\kappa}_t = \frac{\widehat{C}^{\,t}_k(X_{k-1},X_k) - \vartheta_t\,\widehat{C}^{\,t}_k(X_{k-1})}{\widehat{C}^{\,t}_k(X_{k-1}^2)}.$$
For the estimation of $\vartheta$, the relevant density ratio is
$$\varphi^{w\vartheta}_l = \frac{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \widehat{\vartheta}(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1}\big)^2\Big)}{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1}\big)^2\Big)}.$$
Thus,
$$\log\Psi^{w\vartheta}_k = \sum_{l=1}^{k}\bigg(-\frac{\widehat{\vartheta}^{\,2}(\mathbf{y}^w_{l-1}) - 2X_l\widehat{\vartheta}(\mathbf{y}^w_{l-1}) + 2\widehat{\vartheta}(\mathbf{y}^w_{l-1})\kappa(\mathbf{y}^w_{l-1})X_{l-1}}{2\varrho^2(\mathbf{y}^w_{l-1})} + R\big(\vartheta(\mathbf{y}^w_{l-1})\big)\bigg),$$
where $R(\vartheta)$ is a remainder not containing $\widehat{\vartheta}_t$. Applying equations (3.19) and (3.21), the expectation of the log-likelihood depending on $\mathcal{X}_k$ is $L(\widehat{\vartheta}) = \mathrm{E}\big[\log\Psi^{w\vartheta}_k \,\big|\, \mathcal{X}_k\big]$, where
$$\mathrm{E}\big[\log\Psi^{w\vartheta}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(-\frac{\widehat{\vartheta}_t^{\,2} - 2X_l\widehat{\vartheta}_t + 2\widehat{\vartheta}_t\kappa_tX_{l-1}}{2\varrho_t^2}\bigg) + R(\vartheta_t)\,\bigg|\,\mathcal{X}_k\bigg].$$
We differentiate $L(\widehat{\vartheta})$ with respect to $\widehat{\vartheta}$ and equate the result to 0. The optimal estimate $\widehat{\vartheta}$ may then be derived as
$$\widehat{\vartheta}_t = \frac{\widehat{C}^{\,t}_k(X_k) - \kappa_t\,\widehat{C}^{\,t}_k(X_{k-1})}{\widehat{B}^{\,t}_k}.$$
For the estimation of $\varrho$, define a new measure by setting
$$\frac{d\widehat{P}^{w}}{dP^{w}}\bigg|_{\mathcal{X}_k} = \Psi^{w\varrho}_k = \prod_{l=1}^{k}\varphi^{w\varrho}_l,
\qquad
\varphi^{w\varrho}_l = \frac{\varrho(\mathbf{y}^w_{l-1})\exp\Big(-\frac{1}{2\widehat{\varrho}^{\,2}(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1}\big)^2\Big)}{\widehat{\varrho}(\mathbf{y}^w_{l-1})\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1}\big)^2\Big)}.$$
So,
$$\log\Psi^{w\varrho}_k = \sum_{l=1}^{k}\bigg[\log\frac{1}{\widehat{\varrho}(\mathbf{y}^w_{l-1})} - \frac{1}{2\widehat{\varrho}^{\,2}(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1}\big)^2 + R\big(\varrho(\mathbf{y}^w_{l-1})\big)\bigg].$$
We differentiate $L(\widehat{\varrho})$ with respect to $\widehat{\varrho}$, ignoring the remainder, and equate the derivative to 0. The optimal estimate of $\varrho^2$ is given by
$$\widehat{\varrho}^{\,2}_t = \frac{\widehat{C}^{\,t}_k(X_k^2) + \widehat{B}^{\,t}_k\vartheta_t^2 + \kappa_t^2\,\widehat{C}^{\,t}_k(X_{k-1}^2) + 2\,\widehat{C}^{\,t}_k(X_{k-1})\vartheta_t\kappa_t}{\widehat{B}^{\,t}_k} - 2\,\frac{\widehat{C}^{\,t}_k(X_k)\vartheta_t + \widehat{C}^{\,t}_k(X_{k-1},X_k)\kappa_t}{\widehat{B}^{\,t}_k}.$$
For the entries of the transition probability tensor, the expected log-likelihood has the form $\sum_{t,s,r=1}^{N}\widehat{A}^{\,tsr}_k\log\widehat{h}_{tsr} + R(h_{tsr})$, where $R(h_{tsr})$ is a remainder independent of $\widehat{h}_{tsr}$. With the constraint $\sum_{t=1}^{N}\widehat{h}_{tsr} = 1$, we introduce the Lagrange multiplier $\rho$ and obtain the function to maximise as
$$L\big(\widehat{h}_{tsr},\rho\big) = \sum_{t,s,r=1}^{N}\widehat{A}^{\,tsr}_k\log\widehat{h}_{tsr} + \rho\bigg(\sum_{t=1}^{N}\widehat{h}_{tsr} - 1\bigg) + R(h_{tsr}).$$
Differentiating $L(\widehat{h}_{tsr},\rho)$ with respect to $\widehat{h}_{tsr}$ and $\rho$ and setting the derivatives to 0, we have
$$\frac{1}{\widehat{h}_{tsr}}\,\widehat{A}^{\,tsr}_k + \rho = 0.$$
Since $\sum_{t=1}^{N}\widehat{h}_{tsr} = 1$ and $\sum_{t=1}^{N}\widehat{A}^{\,tsr}_k = \widehat{B}^{\,sr}_k$, we get $\sum_{t=1}^{N}\widehat{h}_{tsr} = \sum_{t=1}^{N}\frac{\widehat{A}^{\,tsr}_k}{-\rho} = \frac{\widehat{B}^{\,sr}_k}{-\rho} = 1$. This means that the optimal estimate $\widehat{h}_{tsr}$ is given by
$$\widehat{h}_{tsr} = \frac{\widehat{A}^{\,tsr}_k}{\widehat{B}^{\,sr}_k}.$$
Appendix D
Derivations of the model's optimal parameter estimates in Chapter 4

For the estimation of $\kappa$, the relevant density ratio is
$$\varphi^{w\kappa}_l = \frac{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \widehat{\kappa}(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\widehat{\kappa}(\mathbf{y}^w_{l-1})\big)^2\Big)}{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2\Big)}.$$
Thus,
$$\log\Psi^{w\kappa}_k = \sum_{l=1}^{k}\bigg[-\frac{\widehat{\kappa}^{\,2}(\mathbf{y}^w_{l-1})\big(X_{l-1}^2 + 2\mu_{\beta(\mathbf{y}^w_{l-1})}qX_{l-1} + \mu_{\beta(\mathbf{y}^w_{l-1})}^2q^2\big) + 2\widehat{\kappa}(\mathbf{y}^w_{l-1})\big(X_{l-1} + \mu_{\beta(\mathbf{y}^w_{l-1})}q\big)\big(\vartheta(\mathbf{y}^w_{l-1}) - X_l\big)}{2\varrho^2(\mathbf{y}^w_{l-1})} + R\big(\kappa(\mathbf{y}^w_{l-1})\big)\bigg]$$
$$= \sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\Big(\widehat{\kappa}_t^{\,2}\big(X_{l-1}^2 + 2\mu_{\beta_t}qX_{l-1} + \mu_{\beta_t}^2q^2\big) + 2\widehat{\kappa}_t\big(X_{l-1} + \mu_{\beta_t}q\big)\big(\vartheta_t - X_l\big)\Big) + R(\kappa_t),$$
where $R(\kappa_t)$ is a remainder free of $\widehat{\kappa}_t$. With equations (4.21) and (4.23), we consider the expectation of the log-likelihood, i.e., $L(\widehat{\kappa}) = \mathrm{E}\big[\log\Psi^{w\kappa}_k \,\big|\, \mathcal{X}_k\big]$, where
$$\mathrm{E}\big[\log\Psi^{w\kappa}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\Big(\widehat{\kappa}_t^{\,2}\big(X_{l-1}^2 + 2\mu_{\beta_t}qX_{l-1} + \mu_{\beta_t}^2q^2\big) + 2\widehat{\kappa}_t\big(X_{l-1} + \mu_{\beta_t}q\big)\big(\vartheta_t - X_l\big)\Big) + R(\kappa_t)\,\bigg|\,\mathcal{X}_k\bigg].$$
By differentiating $L(\widehat{\kappa})$ with respect to $\widehat{\kappa}$ and setting the result to 0, we get the optimal estimate $\widehat{\kappa}$, which is
$$\widehat{\kappa}_t = \frac{\widehat{C}^{\,t}_k(X_{k-1},X_k) + \mu_{\beta_t}q\,\widehat{C}^{\,t}_k(X_k) - \vartheta_t\,\widehat{C}^{\,t}_k(X_{k-1}) - \vartheta_t\mu_{\beta_t}q\,\widehat{B}^{\,t}_k}{\widehat{C}^{\,t}_k(X_{k-1}^2) + 2\mu_{\beta_t}q\,\widehat{C}^{\,t}_k(X_{k-1}) + \mu_{\beta_t}^2q^2\,\widehat{B}^{\,t}_k}.$$
For the estimation of $\vartheta$, the density ratio is
$$\varphi^{w\vartheta}_l = \frac{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \widehat{\vartheta}(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2\Big)}{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2\Big)}.$$
So,
$$\log\Psi^{w\vartheta}_k = \sum_{l=1}^{k}\bigg[-\frac{\widehat{\vartheta}^{\,2}(\mathbf{y}^w_{l-1}) - 2X_l\widehat{\vartheta}(\mathbf{y}^w_{l-1}) + 2\kappa(\mathbf{y}^w_{l-1})\widehat{\vartheta}(\mathbf{y}^w_{l-1})\big(X_{l-1} + \mu_{\beta(\mathbf{y}^w_{l-1})}q\big)}{2\varrho^2(\mathbf{y}^w_{l-1})} + R\big(\vartheta(\mathbf{y}^w_{l-1})\big)\bigg]$$
$$= \sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\Big(\widehat{\vartheta}_t^{\,2} - 2X_l\widehat{\vartheta}_t + 2\kappa_t\widehat{\vartheta}_t\big(X_{l-1} + \mu_{\beta_t}q\big)\Big) + R(\vartheta_t),$$
where $R(\vartheta_t)$ is a remainder that does not contain $\widehat{\vartheta}_t$. On the basis of equations (4.21) and (4.23), the expectation of the log-likelihood depending on $\mathcal{X}_k$ is $L(\widehat{\vartheta}) = \mathrm{E}\big[\log\Psi^{w\vartheta}_k \,\big|\, \mathcal{X}_k\big]$, where
$$\mathrm{E}\big[\log\Psi^{w\vartheta}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\Big(\widehat{\vartheta}_t^{\,2} - 2X_l\widehat{\vartheta}_t + 2\kappa_t\widehat{\vartheta}_t\big(X_{l-1} + \mu_{\beta_t}q\big)\Big) + R(\vartheta_t)\,\bigg|\,\mathcal{X}_k\bigg].$$
Solving $\partial L(\widehat{\vartheta})/\partial\widehat{\vartheta} = 0$, we get the optimal estimate
$$\widehat{\vartheta}_t = \frac{\widehat{C}^{\,t}_k(X_k) - \kappa_t\,\widehat{C}^{\,t}_k(X_{k-1}) - \kappa_t\mu_{\beta_t}q\,\widehat{B}^{\,t}_k}{\widehat{B}^{\,t}_k}.$$
For the estimation of $\varrho$, define a new measure $\widehat{P}^{w}_E$, using equation (4.15), by setting
$$\frac{d\widehat{P}^{w}_E}{dP^{w}_E}\bigg|_{\mathcal{X}_k} = \Psi^{w\varrho}_k = \prod_{l=1}^{k}\varphi^{w\varrho}_l,$$
where
$$\varphi^{w\varrho}_l = \frac{\varrho(\mathbf{y}^w_{l-1})\exp\bigg(-\dfrac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\widehat{\varrho}^{\,2}(\mathbf{y}^w_{l-1}) + \sigma^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg)}{\widehat{\varrho}(\mathbf{y}^w_{l-1})\exp\bigg(-\dfrac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\varrho^2(\mathbf{y}^w_{l-1}) + \sigma^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg)}.$$
The corresponding log-likelihood is
$$\log\Psi^{w\varrho}_k = \sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(\log\frac{1}{\widehat{\varrho}(\mathbf{y}^w_{l-1})} - \frac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\widehat{\varrho}^{\,2}(\mathbf{y}^w_{l-1}) + \sigma^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg) + R(\varrho_t).$$
Equating to 0 the derivative (with respect to $\widehat{\varrho}$) of $L(\widehat{\varrho}) = \mathrm{E}\big[\log\Psi^{w\varrho}_k \,\big|\, \mathcal{X}_k\big]$, our optimal estimate of $\varrho^2$ is
$$\widehat{\varrho}^{\,2}_t = \frac{\widehat{C}^{\,t}_k(X_k^2) + \kappa_t^2\,\widehat{C}^{\,t}_k(X_{k-1}^2) + \widehat{B}^{\,t}_k\big(\vartheta_t^2 + \kappa_t^2\mu_{\beta_t}^2p + 2\vartheta_t\kappa_t\mu_{\beta_t}q - \sigma_{\beta_t}^2\kappa_t^2q\big)}{\widehat{B}^{\,t}_k}
+ \frac{\widehat{C}^{\,t}_k(X_{k-1})\big(2\mu_{\beta_t}q\kappa_t^2 + 2\vartheta_t\kappa_t\big)}{\widehat{B}^{\,t}_k}
- \frac{\big(2\vartheta_t + 2\kappa_t\mu_{\beta_t}p\big)\widehat{C}^{\,t}_k(X_k) + 2\kappa_t\,\widehat{C}^{\,t}_k(X_{k-1},X_k)}{\widehat{B}^{\,t}_k}.$$
For the estimation of $\mu_\beta$, the density ratio is
$$\varphi^{w\mu_\beta}_l = \frac{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \widehat{\mu}_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2\Big)}{\exp\Big(-\frac{1}{2\varrho^2(\mathbf{y}^w_{l-1})}\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2\Big)}.$$
Thus,
$$\log\Psi^{w\mu_\beta}_k = \sum_{l=1}^{k}-\frac{\widehat{\mu}_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\Big(\widehat{\mu}_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1}) - 2X_l + 2\big(\kappa(\mathbf{y}^w_{l-1})X_{l-1} + \vartheta(\mathbf{y}^w_{l-1})\big)\Big)}{2\varrho^2(\mathbf{y}^w_{l-1})} + R\big(\mu_{\beta(\mathbf{y}^w_{l-1})}\big)$$
$$= \sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\,\widehat{\mu}_{\beta_t}q\,\kappa_t\Big(\widehat{\mu}_{\beta_t}q\,\kappa_t - 2X_l + 2\big(\kappa_tX_{l-1} + \vartheta_t\big)\Big) + R(\mu_{\beta_t}),$$
where $R(\mu_{\beta_t})$ does not contain $\widehat{\mu}_{\beta_t}$. Again, by (4.21) and (4.23),
$$\mathrm{E}\big[\log\Psi^{w\mu_\beta}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}-\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\varrho_t^2}\,\widehat{\mu}_{\beta_t}q\,\kappa_t\Big(\widehat{\mu}_{\beta_t}q\,\kappa_t - 2X_l + 2\big(\kappa_tX_{l-1} + \vartheta_t\big)\Big)\,\bigg|\,\mathcal{X}_k\bigg] + R(\mu_{\beta_t}).$$
Differentiating with respect to $\widehat{\mu}_{\beta_t}$ and equating the result to 0 give
$$\widehat{\mu}_{\beta_t} = \frac{\widehat{C}^{\,t}_k(X_k) - \vartheta_t\,\widehat{B}^{\,t}_k - \kappa_t\,\widehat{C}^{\,t}_k(X_{k-1})}{\kappa_t\,q\,\widehat{B}^{\,t}_k}.$$
From equation (4.15), define a new measure $\widehat{P}^{w}_E$ by setting
$$\frac{d\widehat{P}^{w}_E}{dP^{w}_E}\bigg|_{\mathcal{X}_k} = \Psi^{w\sigma_\beta}_k = \prod_{l=1}^{k}\varphi^{w\sigma_\beta}_l,$$
where
$$\varphi^{w\sigma_\beta}_l = \frac{\sigma_{\beta(\mathbf{y}^w_{l-1})}\exp\bigg(-\dfrac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\varrho^2(\mathbf{y}^w_{l-1}) + \widehat{\sigma}^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg)}{\widehat{\sigma}_{\beta(\mathbf{y}^w_{l-1})}\exp\bigg(-\dfrac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\varrho^2(\mathbf{y}^w_{l-1}) + \sigma^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg)}.$$
Hence,
$$\log\Psi^{w\sigma_\beta}_k = \sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(\log\frac{1}{\widehat{\sigma}_{\beta(\mathbf{y}^w_{l-1})}} - \frac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\varrho^2(\mathbf{y}^w_{l-1}) + \widehat{\sigma}^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg) + R(\sigma_{\beta_t}),$$
and, by (4.21) and (4.23),
$$\mathrm{E}\big[\log\Psi^{w\sigma_\beta}_k \,\big|\, \mathcal{X}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(\log\frac{1}{\widehat{\sigma}_{\beta(\mathbf{y}^w_{l-1})}} - \frac{\big(X_l - \vartheta(\mathbf{y}^w_{l-1}) - \kappa(\mathbf{y}^w_{l-1})X_{l-1} - \mu_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa(\mathbf{y}^w_{l-1})\big)^2}{2\big(\varrho^2(\mathbf{y}^w_{l-1}) + \widehat{\sigma}^2_{\beta(\mathbf{y}^w_{l-1})}q\,\kappa^2(\mathbf{y}^w_{l-1})\big)}\bigg)\,\bigg|\,\mathcal{X}_k\bigg] + R(\sigma_{\beta_t}).$$
The optimal estimate of $\sigma^2_\beta$ is
$$\widehat{\sigma}^2_{\beta_t} = \frac{\widehat{C}^{\,t}_k(X_k^2) + \kappa_t^2\,\widehat{C}^{\,t}_k(X_{k-1}^2) + \widehat{B}^{\,t}_k\big(\vartheta_t^2 + (\kappa_t\mu_{\beta_t}q)^2\big)}{\widehat{B}^{\,t}_k\,\kappa_t^2\,q}
+ \frac{\widehat{B}^{\,t}_k\big(2\vartheta_t\kappa_t\mu_{\beta_t}q - \varrho_t^2\big) + \widehat{C}^{\,t}_k(X_{k-1})\big(2\mu_{\beta_t}q\kappa_t^2 + 2\vartheta_t\kappa_t\big)}{\widehat{B}^{\,t}_k\,\kappa_t^2\,q}
- \frac{\big(2\vartheta_t + 2\kappa_t\mu_{\beta_t}q\big)\widehat{C}^{\,t}_k(X_k) + 2\kappa_t\,\widehat{C}^{\,t}_k(X_{k-1},X_k)}{\widehat{B}^{\,t}_k\,\kappa_t^2\,q}.$$
For the transition probabilities, maximising the expected log-likelihood subject to $\sum_{t=1}^{N}\widehat{p}_{tsr} = 1$ with Lagrange multiplier $\rho$, we differentiate $L(\widehat{p}_{tsr},\rho)$ with respect to $\widehat{p}_{tsr}$ and $\rho$, and then set the resulting derivatives to 0, which gives
$$\frac{1}{\widehat{p}_{tsr}}\,\widehat{A}^{\,tsr}_k + \rho = 0.$$
Given that $\sum_{t=1}^{N}\widehat{p}_{tsr} = 1$ and $\sum_{t=1}^{N}\widehat{A}^{\,tsr}_k = \widehat{B}^{\,sr}_k$, we have $\sum_{t=1}^{N}\widehat{p}_{tsr} = \sum_{t=1}^{N}\frac{\widehat{A}^{\,tsr}_k}{-\rho} = \frac{\widehat{B}^{\,sr}_k}{-\rho} = 1$, and consequently the optimal estimate is
$$\widehat{p}_{tsr} = \frac{\widehat{A}^{\,tsr}_k}{\widehat{B}^{\,sr}_k}.$$
Appendix E
For the estimation of $\nu^h$, the relevant density ratio is
$$\varphi^{w\nu^h}_l = \frac{\exp\Big(-\frac{1}{2(\sigma^h)^2(\mathbf{y}^w_{l-1})}\big(G_l - \widehat{\nu}^h(\mathbf{y}^w_{l-1}) - G_{l-1}\big)^2\Big)}{\exp\Big(-\frac{1}{2(\sigma^h)^2(\mathbf{y}^w_{l-1})}\big(G_l - \nu^h(\mathbf{y}^w_{l-1}) - G_{l-1}\big)^2\Big)}.$$
Thus,
$$\log\Psi^{w\nu^h}_k = \sum_{l=1}^{k}\frac{\widehat{\nu}^h(\mathbf{y}^w_{l-1})\big(2G_l - 2G_{l-1} - \widehat{\nu}^h(\mathbf{y}^w_{l-1})\big)}{2\big(\sigma^h(\mathbf{y}^w_{l-1})\big)^2} + R\big(\nu^h(\mathbf{y}^w_{l-1})\big)$$
$$= \sum_{l=1}^{k}\sum_{t=1}^{N}\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\big(\sigma^h_t\big)^2}\,\widehat{\nu}^h_t\big(2G_l - 2G_{l-1} - \widehat{\nu}^h_t\big) + R\big(\nu^h_t\big),$$
where $R(\nu^h_t)$ is a $\widehat{\nu}^h_t$-free expression. With equations (5.20) and (5.22), we consider the expectation of the log-likelihood, i.e., $L(\widehat{\nu}^h) = \mathrm{E}\big[\log\Psi^{w\nu^h}_k \,\big|\, \mathcal{F}_k\big]$, where
$$\mathrm{E}\big[\log\Psi^{w\nu^h}_k \,\big|\, \mathcal{F}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}\frac{\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle}{2\big(\sigma^h_t\big)^2}\,\widehat{\nu}^h_t\big(2G_l - 2G_{l-1} - \widehat{\nu}^h_t\big)\,\bigg|\,\mathcal{F}_k\bigg] + R\big(\nu^h_t\big).$$
By differentiating $L(\widehat{\nu}^h)$ with respect to $\widehat{\nu}^h$ and setting the result to 0, we get
$$\widehat{\nu}^h_t = \frac{\widehat{C}^{\,t}_k(f^h)}{\widehat{B}^{\,t}_k} = \frac{\zeta^w_k\big(C^{\,t}_k(f^h)\,\eta(\mathbf{y}^w_k,\mathbf{y}^w_{k-1})\big)}{\zeta^w_k\big(B^{\,t}_k\,\eta(\mathbf{y}^w_k,\mathbf{y}^w_{k-1})\big)}.$$
Consequently,
$$\log\Psi^{w\sigma^h}_k = \sum_{l=1}^{k}\bigg(\log\frac{1}{\widehat{\sigma}^h(\mathbf{y}^w_{l-1})} - \frac{\big(G_l - \nu^h(\mathbf{y}^w_{l-1}) - G_{l-1}\big)^2}{2\big(\widehat{\sigma}^h(\mathbf{y}^w_{l-1})\big)^2}\bigg) + R\big(\sigma^h(\mathbf{y}^w_{l-1})\big)$$
$$= \sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(\log\frac{1}{\widehat{\sigma}^h(\mathbf{y}^w_{l-1})} - \frac{\big(G_l - \nu^h(\mathbf{y}^w_{l-1}) - G_{l-1}\big)^2}{2\big(\widehat{\sigma}^h(\mathbf{y}^w_{l-1})\big)^2}\bigg) + R\big(\sigma^h_t\big),$$
where $R(\sigma^h_t)$ is independent of $\widehat{\sigma}^h_t$. From equations (5.20) and (5.22), we have
$$\mathrm{E}\big[\log\Psi^{w\sigma^h}_k \,\big|\, \mathcal{F}_k\big] = \mathrm{E}\bigg[\sum_{l=1}^{k}\sum_{t=1}^{N}\langle\mathbf{y}^w_{l-1},\mathbf{e}_t\rangle\bigg(\log\frac{1}{\widehat{\sigma}^h(\mathbf{y}^w_{l-1})} - \frac{\big(G_l - \nu^h(\mathbf{y}^w_{l-1}) - G_{l-1}\big)^2}{2\big(\widehat{\sigma}^h(\mathbf{y}^w_{l-1})\big)^2}\bigg)\,\bigg|\,\mathcal{F}_k\bigg] + R\big(\sigma^h_t\big).$$
By equation (5.19),
$$\mathrm{E}\big[\log\widehat{\Lambda}_k \,\big|\, \mathcal{F}_k\big] = \mathrm{E}\bigg[\log\prod_{l=2}^{k}\prod_{t,s,r=1}^{N}\bigg(\frac{\widehat{p}_{tsr}}{p_{tsr}}\bigg)^{\langle\mathbf{y}^w_{l-2},\mathbf{e}_r\rangle\langle\mathbf{y}^w_{l-1},\mathbf{e}_s\rangle\langle\mathbf{y}^w_{l},\mathbf{e}_t\rangle}\,\bigg|\,\mathcal{F}_k\bigg]$$
$$= \mathrm{E}\bigg[\sum_{l=2}^{k}\sum_{t,s,r=1}^{N}\big(\log\widehat{p}_{tsr} - \log p_{tsr}\big)\langle\mathbf{y}^w_{l-2},\mathbf{e}_r\rangle\langle\mathbf{y}^w_{l-1},\mathbf{e}_s\rangle\langle\mathbf{y}^w_{l},\mathbf{e}_t\rangle\,\bigg|\,\mathcal{F}_k\bigg]
= \sum_{t,s,r=1}^{N}\log\widehat{p}_{tsr}\,\mathrm{E}\big[A^{tsr}_k\,\big|\,\mathcal{F}_k\big] + R(p_{tsr}).$$
Differentiating $L(\widehat{p}_{tsr},\varrho)$ with respect to $\widehat{p}_{tsr}$ and $\varrho$, and then setting the resulting derivatives to 0, we obtain
$$\frac{1}{\widehat{p}_{tsr}}\,\widehat{A}^{\,tsr}_k + \varrho = 0.$$
Noting that $\sum_{t=1}^{N}\widehat{p}_{tsr} = 1$ and $\sum_{t=1}^{N}\widehat{A}^{\,tsr}_k = \widehat{B}^{\,sr}_k$, we then have $\sum_{t=1}^{N}\widehat{p}_{tsr} = \sum_{t=1}^{N}\frac{\widehat{A}^{\,tsr}_k}{-\varrho} = \frac{\widehat{B}^{\,sr}_k}{-\varrho} = 1$. Hence,
$$\widehat{p}_{tsr} = \frac{\widehat{A}^{\,tsr}_k}{\widehat{B}^{\,sr}_k}.$$
Appendix F
Derivation of the d-step ahead forecasting formula in Chapter 5

From equation (5.14), the conditional expectation of the new Markov chain $\eta(\mathbf{y}^w_{k+1},\mathbf{y}^w_k)$, given the information up to time $k$, is
$$\mathrm{E}\big[\eta(\mathbf{y}^w_{k+1},\mathbf{y}^w_k)\,\big|\,\mathcal{F}_k\big] = \mathbf{B}\,\mathbf{q}_k.$$
As per equation (5.10), $\log\dfrac{F^h_{k+d}}{F^h_k}$ is characterised by a mixture of normal distributions having the density function $\sum_{j,i=1}^{N}\langle\mathbf{q}_{k+d-1},\mathbf{e}_{ji}\rangle\,\phi\big(G;\nu^h_j,\sigma^h_j\big)$. The 1-step ahead forecast for $F^h_{k+1}$ is then
$$\mathrm{E}\big[F^h_{k+1}\,\big|\,\mathcal{F}_k\big] = F^h_k\sum_{j,i=1}^{N}\langle\mathbf{q}_k,\mathbf{e}_{ji}\rangle\exp\bigg(\nu^h_j + \frac{\big(\sigma^h_j\big)^2}{2}\bigg). \qquad \text{(F.5)}$$
For $d = 2$, assume that $F^h_{k+1} = \mathrm{E}\big[F^h_{k+1}\,\big|\,\mathcal{F}_k\big]$. Thus,
$$\mathrm{E}\big[F^h_{k+2}\,\big|\,\mathcal{F}_k\big] = F^h_k\prod_{l=1}^{2}\sum_{j,i=1}^{N}\big\langle\mathbf{B}^{l-1}\mathbf{q}_k,\mathbf{e}_{ji}\big\rangle\exp\bigg(\nu^h_j + \frac{\big(\sigma^h_j\big)^2}{2}\bigg). \qquad \text{(F.6)}$$
The d-step ahead forecast for $F^h_{k+d}$ can be obtained straightforwardly through
$$\mathrm{E}\big[F^h_{k+d}\,\big|\,\mathcal{F}_k\big] = F^h_k\prod_{l=1}^{d}\sum_{j,i=1}^{N}\big\langle\mathbf{B}^{l-1}\mathbf{q}_k,\mathbf{e}_{ji}\big\rangle\exp\bigg(\nu^h_j + \frac{\big(\sigma^h_j\big)^2}{2}\bigg). \qquad \text{(F.7)}$$
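A sketch of the recursion in equation (F.7) is given below. It assumes, for illustration only, that the pair index (j, i) of the augmented chain is flattened as j*N + i so that the drift and volatility depend only on the first component; B_mat and q_k stand for the (assumed) N^2 x N^2 transition matrix and filtered probability vector of the augmented chain, and the numerical inputs are made up.

    import numpy as np

    def d_step_forecast(F_k, B_mat, q_k, nu, sigma, d):
        """d-step ahead futures-price forecast following the structure of (F.7).

        F_k       : current futures price
        B_mat     : N^2 x N^2 transition matrix of the augmented (pair) chain
        q_k       : filtered probability vector over pairs (j, i), flattened as j*N + i
        nu, sigma : per-state drift and volatility of the log-price increments
        """
        N = len(nu)
        growth = np.exp(nu + 0.5 * sigma ** 2)        # exp(nu_j + sigma_j^2 / 2)
        growth_pairs = np.repeat(growth, N)           # value for pair (j, i) depends on j only
        forecast = F_k
        q = np.asarray(q_k, dtype=float)
        for _ in range(d):
            forecast *= float(growth_pairs @ q)       # sum over pairs of <B^{l-1} q_k, e_ji> exp(...)
            q = B_mat @ q                             # advance the pair-chain distribution
        return forecast

    # Illustrative 2-state example with made-up parameters.
    N = 2
    nu = np.array([0.001, -0.002])
    sigma = np.array([0.01, 0.03])
    q_k = np.array([0.4, 0.1, 0.2, 0.3])              # pairs ordered (1,1), (1,2), (2,1), (2,2)
    B_mat = np.full((N * N, N * N), 0.25)             # toy transition matrix of the pair chain
    print(d_step_forecast(62.0, B_mat, q_k, nu, sigma, d=5))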
Curriculum Vitae
Publications: