
Science and Technology Indonesia

e-ISSN:2580-4391 p-ISSN:2580-4405
Vol. 5, No. 2, April 2020

Research Paper

A Hidden Markov Model for Forecasting Rainfall Data Availability at the Weather Station in West Sumatra

Rahmawati Ramadhan¹, Dodi Devianto¹*
¹Department of Mathematics, Andalas University, Limau Manis Campus, Padang 25163, Indonesia

*Corresponding author: [email protected]

Abstract
Indonesia is a maritime continent in Southeast Asia, lying between the Indian Ocean and the Pacific Ocean. This position strongly affects the level of rainfall in Indonesia, especially in West Sumatra. The availability of rainfall data can be described by a Markov chain whose states cannot be observed directly (hidden), which is called a Hidden Markov Model (HMM). The purposes of this research are to predict the hidden states of rainfall data availability, to find the best (optimal) state sequence through the decoding problem using the Viterbi algorithm, and to predict the probability of rainfall data availability in the future using the Baum-Welch algorithm within the Hidden Markov Model. This research uses secondary daily data on the availability of rainfall records at the Minangkabau Meteorological Station, the Padang Pariaman Climatology Station, and the Silaing Bawah Geophysics Station from January 2018 to July 2019. The prediction results show that the Hidden Markov Model can be used to predict the probability of rainfall data availability. The highest predicted availability of rainfall data for one day ahead is at the Padang Pariaman Climatology Station, with a probability of 0.36, followed by the Minangkabau Meteorological Station with 0.35 and the Silaing Bawah Geophysics Station with 0.29. The Viterbi algorithm further indicates that for the next one-day period the rainfall data at all three stations will most likely be available.
Keywords
Hidden Markov Model, Rainfall, Decoding Problem

Received: 12 February 2020, Accepted: 03 April 2020


https://doi.org/10.26554/sti.2020.5.2.34-40

1. INTRODUCTION

Rain occurs when water particles fall to the earth's surface through a series of hydrological processes, and rainfall is the depth of rainwater that accumulates on a flat surface within a certain period. The amount of precipitation in an area is influenced by several factors, including latitude, altitude, distance from the water source, wind, mountainous terrain, temperature differences, and the total land area.

Analyzing the future availability of rainfall data can be framed as a stochastic process, because it concerns the chance of future events that cannot be predicted directly from the rainfall data alone. The states of rainfall data availability are uncertain and subject to change, and some of the underlying circumstances are assumed to be unobservable; such a system can be modeled by a Hidden Markov Model (HMM).

An HMM is an extension of the Markov chain in which the state cannot be observed directly (it is hidden) but can only be inferred through a set of other observations. In an HMM there are three fundamental problems to be solved: the evaluation problem, the decoding problem, and the learning problem. Thyer and Kuczera (2003) discussed calibrating an HMM for rainfall data with a Bayesian approach, and Sansom (1998) did the same for breakpoint data. Research using inhomogeneous HMMs and nonparametric model reduction of rainfall events has also been carried out by Mehrotra and Sharma (2005); Robertson et al. (2004); Greene et al. (2011); Pineda and Willems (2016).

Furthermore, several HMM variants have been introduced by previous researchers: species abundance in a river was modeled with a negative binomial HMM by Spezia et al. (2014), and HMMs have been combined with Bayesian analysis in Xia and Tang (2019); Li et al. (2018); Bathaee and Sheikhzadeh (2016). In the time series setting, Antonucci et al. (2015) discussed robust classification of multivariate time series with imprecise HMMs, and Colombi and Giordano (2015) studied categorical multiple time series with HMMs. HMMs are also used in finance, to predict trends in time series as discussed in Zhang et al. (2019) and to predict the probability of exchange-rate changes as discussed in Ramadhan et al. (2020). Based on the research of Devianto et al. (2015b), models for financial data can also be built with the autoregressive fractionally integrated moving average (ARFIMA) approach. In addition, enumeration models have been developed with an exponential distribution characterization approach, as described in Devianto (2016); Devianto et al. (2015a,c).

According to Stoner and Economou (2019), the hidden Markov framework can be adapted to construct a compelling model for sub-daily rainfall that captures its essential characteristics well. Several homogeneous HMMs were developed to forecast droughts at short to medium term using the Standardized Precipitation Index (SPI), as discussed in Khadr (2016). A hidden Markov sequence was also used to represent the recurrence of mast years, as described in Tseng et al. (2020).

The availability of rainfall data can thus be formulated as an HMM with hidden states. In this study, the three fundamental problems are addressed: the evaluation problem, the decoding problem, and the learning problem. The results provide information about the future availability of rainfall data in particular areas of West Sumatra. Historical information about the availability and accuracy of rainfall data is very helpful in predicting climate change and irregularities. In addition, with an overview of future weather conditions, specific policies concerning water supply, plant performance, and yield can be designed for better preventive management of resources for the community.

2. EXPERIMENTAL SECTION

2.1 Materials
A stochastic process is a sequence of random events governed by the laws of probability, whose values change randomly over time. A stochastic process in which the properties of the future depend on the current conditions, given their characteristics in the past, is called a Markov chain (Ross, 1996). A stochastic process X_n(t) is a collection of random variables indexed by the observation time t ∈ T, and a stochastic process X_n is said to have the Markov property if

P(X_{n+1} = j | X_n = i, X_{n−1} = i_{n−1}, ..., X_0 = i_0) = P(X_{n+1} = j | X_n = i)    (1)

for every n = 0, 1, 2, ... and for every j, i, i_{n−1}, ..., i_1, i_0.

An HMM is a stochastic model in which the system is assumed to be a Markov process with hidden states. If X = (X_1, X_2, ...) is a Markov process and O = (O_1, O_2, ...) is a function of X, then X is a Hidden Markov Model that can be observed through O; in other words, O can be written as a function f of X. The parameter X represents the hidden state process, while O represents the observation space that can be observed. The elements of a Hidden Markov Model are:

1. The number of hidden states, represented by N, with state space S = (S_1, S_2, ..., S_N); the state at time t is denoted by X_t, t = 1, 2, ..., T.
2. The number of observation symbols for each state, represented by M, with symbol set v = (v_1, v_2, ..., v_M) and observation sequence O = (O_1, O_2, ..., O_T), where T is the length of the observation data.
3. The transition probability matrix A = [a_ij], where a_ij is the conditional probability of the state at time n + 1 given the state at time n, that is

   a_ij = P(X_{n+1} = j | X_n = i)    (2)

   for 1 ≤ i, j ≤ N.
4. The observation probability distribution at time t in state i, commonly known as the emission matrix,

   B = [b_ik]    (3)

   where

   b_ik = P(O_t = v_k | X_t = i)    (4)

   for 1 ≤ i ≤ N, 1 ≤ t ≤ T, and 1 ≤ k ≤ M.
5. The initial state distribution, represented by π(i), where

   π(i) = P(X_1 = i), 1 ≤ i ≤ N.    (5)

An HMM can therefore be written in the notation λ = (A, B, π), where A is the transition probability matrix, B is the observation probability matrix (emission matrix), and π is the initial state distribution.
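To make the notation concrete, the following minimal sketch collects the parameter triple λ = (A, B, π) in Python for a three-state, three-symbol model that mirrors the rainfall-availability setting used later in this paper; the state and symbol names follow the paper, but the numerical values are illustrative placeholders, not estimates from the data.

```python
import numpy as np

# A minimal container for the HMM parameters lambda = (A, B, pi).
# The labels follow the paper's setting; the numbers are placeholders.
states = ["available", "unavailable", "no measurement"]   # hidden states, N = 3
symbols = ["MM station", "PPC station", "SBG station"]    # observation symbols, M = 3

A = np.array([[0.7, 0.2, 0.1],       # A[i, j] = P(X_{t+1} = j | X_t = i)
              [0.5, 0.4, 0.1],
              [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1],       # B[i, k] = P(O_t = v_k | X_t = i)
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])       # pi[i] = P(X_1 = i)

# Sanity checks: every row of A and B, and pi itself, must sum to one.
assert np.allclose(A.sum(axis=1), 1) and np.allclose(B.sum(axis=1), 1)
assert np.isclose(pi.sum(), 1)
```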




Three fundamental problems can be solved within the HMM framework, namely:

a) Evaluation Problem
Calculating the probability of the observation sequence P(O | λ) requires the forward algorithm and the backward algorithm (Bain and Engelhardt, 1992). The steps of the forward algorithm are as follows:

i. Initialization step
In this step we determine the initial observation probability α_1(i), which ends in state i at time t = 1 given the first observation O_1:

α_1(i) = π(i) b_i(O_1)    (6)

for 1 ≤ i ≤ N.

ii. Induction step
In this step we determine the total observation probability α_{t+1}(j), which ends in state j at time t + 1, given the observation sequence O_1, O_2, ..., O_{t+1}:

α_{t+1}(j) = [ Σ_{i=1}^{N} α_t(i) a_ij ] b_j(O_{t+1})    (7)

for j = 1, 2, ..., N and t = 1, 2, ..., T − 1.

iii. Termination step
In this step we determine the total joint probability of the observations and the hidden states given the model, that is, the observation sequence probability P(O | λ):

P(O | λ) = Σ_{i=1}^{N} α_T(i)    (8)

Next, the observation probability is calculated with the backward algorithm β_t(i), with the following steps:

i. Initialization step
The initial backward probabilities are set equal to one, because state i is assumed to be the final state:

β_T(i) = 1    (9)

for 1 ≤ i ≤ N.

ii. Induction step
In this step we determine the total observation probabilities for t < T:

β_t(i) = Σ_{j=1}^{N} a_ij b_j(O_{t+1}) β_{t+1}(j)    (10)

for t = T − 1, T − 2, ..., 1 and i = 1, 2, ..., N.

iii. Termination step
In this step the joint probability of the observations and the hidden states given the model, that is, the observation sequence probability P(O | λ), is obtained as

P(O | λ) = Σ_{i=1}^{N} b_i(O_1) π(i) β_1(i)    (11)
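The evaluation problem can be sketched in a few lines of Python; the functions below implement the forward recursion of Eqs. (6)-(8) and the backward recursion of Eqs. (9)-(11), and both must return the same value of P(O | λ). The parameter values and the observation sequence are hypothetical.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward recursion: alpha_t(i) per Eqs. (6)-(8)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialization (6)
    for t in range(1, T):                             # induction (7)
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()                     # termination (8)

def backward(A, B, pi, obs):
    """Backward recursion: beta_t(i) per Eqs. (9)-(11)."""
    T, N = len(obs), len(pi)
    beta = np.ones((T, N))                            # initialization (9)
    for t in range(T - 2, -1, -1):                    # induction (10)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta, (pi * B[:, obs[0]] * beta[0]).sum()  # termination (11)

# Hypothetical parameters and observation sequence (indices into the symbol set).
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
obs = [0, 1, 2]                                       # e.g. MM, PPC, SBG

_, p_forward = forward(A, B, pi, obs)
_, p_backward = backward(A, B, pi, obs)
assert np.isclose(p_forward, p_backward)              # both give P(O | lambda)
print(p_forward)
```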
b) Decoding Problem
The decoding problem is to find the best (optimal) state sequence associated with the observation sequence O and a known model λ. This problem can be solved with the Viterbi algorithm. The steps of the Viterbi algorithm for determining the best state sequence are as follows:

i. Initialization step
In this step we determine the greatest probability over the first observation ending in state i at t = 1:

δ_1(i) = π(i) b_i(O_1)

φ_1(i) = 0,  1 ≤ i ≤ N    (12)

ii. Recursion step
In this step we determine the greatest probability over the first t observations ending in state j, for t ≥ 2:

δ_t(j) = max_{1≤i≤N} [δ_{t−1}(i) a_ij] b_j(O_t)

φ_t(j) = arg max_{1≤i≤N} [δ_{t−1}(i) a_ij]    (13)

for 2 ≤ t ≤ T and 1 ≤ j ≤ N.

iii. Termination step
In this step we determine the greatest probability over all T observations and the state in which it ends:

P* = max_{1≤i≤N} [δ_T(i)]

X*_T = arg max_{1≤i≤N} [δ_T(i)]    (14)

iv. Backtracking step
In this last step the best state sequence is traced back:

X*_t = φ_{t+1}(X*_{t+1}),  t = T − 1, T − 2, ..., 1    (15)
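A minimal sketch of the Viterbi algorithm of Eqs. (12)-(15) is given below; it returns the most probable hidden state path and its probability. The parameters are again hypothetical placeholders.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Viterbi algorithm per Eqs. (12)-(15): most probable hidden state path."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                          # initialization (12)
    for t in range(1, T):                                 # recursion (13)
        trans = delta[t - 1][:, None] * A                 # delta_{t-1}(i) * a_ij
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()                         # termination (14)
    for t in range(T - 2, -1, -1):                        # backtracking (15)
        path[t] = psi[t + 1][path[t + 1]]
    return path, delta[-1].max()

# Hypothetical parameters; states 0/1/2 = available / unavailable / no measurement.
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
best_path, best_prob = viterbi(A, B, pi, obs=[0, 1, 2])
print(best_path, best_prob)
```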
c) Learning Problem
The learning problem is to estimate the model that best explains a sequence of observations by adjusting the HMM parameters λ = (A, B, π) so that P(O | λ) becomes maximal. The Baum-Welch algorithm defines four variables: the forward variable, the backward variable, the variable ξ_t(i, j), and the variable γ_t(i). The forward and backward variables are used to calculate ξ_t(i, j) and γ_t(i), and the re-estimation formulas for the learning problem are

π̂(i) = γ_1(i),  1 ≤ i ≤ N

â_ij = Σ_{t=1}^{T−1} ξ_t(i, j) / Σ_{t=1}^{T−1} γ_t(i),  1 ≤ i ≤ N, 1 ≤ j ≤ N

b̂_ik = Σ_{t=1, O_t=v_k}^{T} γ_t(i) / Σ_{t=1}^{T} γ_t(i),  1 ≤ i ≤ N, 1 ≤ k ≤ M    (16)

where π̂(i) is the estimate of the initial state distribution, â_ij is the matrix of estimated transition probabilities, and b̂_ik is the estimated emission matrix.
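One Baum-Welch re-estimation step following Eq. (16) can be sketched as below: the forward and backward variables are computed first, then γ_t(i) and ξ_t(i, j), and finally the updated parameters. The starting values and the observation sequence are hypothetical; in practice the step is iterated until P(O | λ) stops increasing.

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One Baum-Welch re-estimation step per Eq. (16)."""
    T, N = len(obs), len(pi)
    # Forward and backward variables.
    alpha = np.zeros((T, N)); beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    p_obs = alpha[-1].sum()                              # P(O | lambda)

    # gamma_t(i) and xi_t(i, j).
    gamma = alpha * beta / p_obs
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / p_obs

    # Re-estimated parameters (Eq. 16).
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        mask = np.array(obs) == k
        B_new[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return A_new, B_new, pi_new, p_obs

# Hypothetical starting parameters and observation sequence.
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
A, B, pi, p = baum_welch_step(A, B, pi, obs=[0, 1, 2, 0, 0, 1])
```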




2.2 Methods
This study uses daily records of rainfall data availability from January 1, 2018 to July 31, 2019 at the Minangkabau Meteorological Station (MM station), the Padang Pariaman Climatology Station (PPC station), and the Silaing Bawah Geophysics Station (SBG station). The data were obtained from the website dataonline.bmkg.go.id, with 570 observations per station. The availability of the rainfall data is summarized in Figure 1.

Figure 1. Rainfall data availability

Based on Figure 1, the highest availability of rainfall data is at the MM station with 287 data, followed by the PPC station with 231 data and the SBG station with 219 data. The highest number of unavailable rainfall data during the study period is at the SBG station with 134 data, followed by the PPC station with 122 data and the MM station with 66 data. Each of the three stations has 7 data that were not measured.

This research uses the HMM to forecast the availability of rainfall data, predicting the probability of rainfall data availability in the next period as follows:
1. Take the rainfall data with a period of one day; the number of observed data covers a range of 570 days.
2. Determine the required transition probability matrix by using the following relative-frequency formula as the probability (a short sketch of this counting step is given after this list)

   P(A) = n(A) / n(S)    (17)

   where n(A) is the number of elements in A and n(S) is the total number of elements in S [22].
3. Determine the elements of the HMM.
4. Analyze the elements of the HMM: compute the probability of an observation sequence with the forward-backward algorithm, determine the sequence of hidden states with the Viterbi algorithm, and estimate the HMM parameters with the Baum-Welch algorithm.
5. Draw interpretations and conclusions from the results obtained.
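As a minimal sketch of step 2, the snippet below estimates a transition probability matrix by relative frequency, P(A) = n(A)/n(S), from a coded sequence of daily availability states; the state coding and the example sequence are hypothetical, not the stations' actual records.

```python
import numpy as np

# Hypothetical daily availability states: 0 = available, 1 = unavailable,
# 2 = no measurement.
states = np.array([0, 0, 1, 0, 2, 0, 0, 1, 1, 0])

n_states = 3
counts = np.zeros((n_states, n_states))
for today, tomorrow in zip(states[:-1], states[1:]):
    counts[today, tomorrow] += 1                    # n(A): transitions i -> j

row_totals = counts.sum(axis=1, keepdims=True)      # n(S): all transitions out of i
A = np.divide(counts, row_totals,
              out=np.zeros_like(counts), where=row_totals > 0)
print(A)    # each row sums to 1 (or stays 0 if state i never occurred)
```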
3. RESULT AND DISCUSSION

3.1 The Importance of Rainfall Data Availability Based on the Exponentially Weighted Moving Average (EWMA) Control Chart
In this study, the EWMA control chart is used to see whether there is extreme rainfall or out-of-control data, which makes the availability of rainfall data very important to observe. The rainfall data were taken from the weather stations in West Sumatra in 2018. The EWMA control chart of the rainfall data is shown in Figure 2.

Figure 2. Rainfall EWMA Control Charts

Based on Figure 2, there is extreme rainfall on observation days 88, 89, 90, 91, 102, 103, 107, 108, 144, 145, 146, 147, 148, 238, 239, 240, 241, 252, 271, 316, 336, 337, 338, 339, 340, 341, 364, 365, and 366. This means that there are 28 days with rainfall that is out of control or extreme. This condition indicates that rainfall changes from day to day, with periods of very high rainfall and the opposite. The availability of rainfall data in these circumstances is very important so that action can be taken to minimize losses related to changes in precipitation. Therefore, it is necessary to build a model of the availability of rainfall data at each weather station in West Sumatra using the HMM.
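For illustration, an EWMA statistic and its control limits can be computed as sketched below; the daily rainfall values, the smoothing constant, and the width of the control limits are assumptions made for the example, since the paper does not report the settings used for Figure 2.

```python
import numpy as np

# Illustrative EWMA control chart for daily rainfall (values are hypothetical).
rainfall = np.array([3.0, 0.0, 12.5, 0.0, 1.2, 45.0, 2.3, 0.0, 0.5, 60.1])

lam, L = 0.2, 3.0                      # assumed smoothing constant and limit width
mu, sigma = rainfall.mean(), rainfall.std(ddof=1)

z = np.zeros_like(rainfall)
z[0] = lam * rainfall[0] + (1 - lam) * mu      # z_t = lam*x_t + (1-lam)*z_{t-1}
for t in range(1, len(rainfall)):
    z[t] = lam * rainfall[t] + (1 - lam) * z[t - 1]

t_idx = np.arange(1, len(rainfall) + 1)
width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t_idx)))
out_of_control = np.where((z > mu + width) | (z < mu - width))[0] + 1
print(out_of_control)                  # 1-based indices of flagged (extreme) days
```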




3.2 Elements of the Hidden Markov Model
The elements that must be determined to solve the case of forecasting the availability of rainfall data with the HMM are as follows (a short numerical illustration using the estimated MM-station values is given after this list):

1. Let N be the number of hidden states, with state space S = (S_1, S_2, ..., S_N) and the state at time t denoted by X_t. In this case of rainfall data availability at the MM station, PPC station, and SBG station, the hidden states are available, unavailable, and no measurement, so N = 3; they can be written as s_1 = available, s_2 = unavailable, and s_3 = no measurement. For example, X_t = 1 states that the rainfall data are in the available state.

2. Let M be the number of observation symbols for each state, with observation sequence O = (O_1, O_2, ..., O_T) and symbol set v = (v_1, v_2, ..., v_M). In this study M = 3, with the MM station as v_1, the PPC station as v_2, and the SBG station as v_3.

3. Let

   A = [a_ij],  a_ij = P(X_{t+1} = j | X_t = i)

   where a_ij is the probability that the rainfall data availability is in state j on day t + 1 given that it is in state i on day t, which forms the probability matrix

   A = [a_ij] =
   ⎡ a_11  a_12  a_13 ⎤
   ⎢ a_21  a_22  a_23 ⎥
   ⎣ a_31  a_32  a_33 ⎦

   a) Transition probability matrix for the MM station data

   A = [a_ij] =
   ⎡ 0.79  0.19  0.02 ⎤
   ⎢ 0.58  0.41  0.01 ⎥
   ⎣ 0.46  0.18  0.36 ⎦

   b) Transition probability matrix for the PPC station data

   A = [a_ij] =
   ⎡ 0.74  0.25  0.01 ⎤
   ⎢ 0.46  0.52  0.02 ⎥
   ⎣ 0.42  0.25  0.33 ⎦

   c) Transition probability matrix for the SBG station data

   A = [a_ij] =
   ⎡ 0.64  0.35  0.01 ⎤
   ⎢ 0.47  0.50  0.03 ⎥
   ⎣ 0.33  0.42  0.25 ⎦

4. The emission matrix B = [b_ik] is the conditional probability of observation v_k given that the process is in state i. The emission matrix of the observations for the MM station, PPC station, and SBG station is

   B = [b_ik] =
   ⎡ 0.74  0.64  0.56 ⎤
   ⎢ 0.24  0.34  0.42 ⎥
   ⎣ 0.02  0.02  0.02 ⎦

5. Let π(i) be the initial state distribution; for the availability of rainfall data it is assumed that π(1) = P(available), π(2) = P(unavailable), and π(3) = P(no measurement). The initial distributions for the MM station, PPC station, and SBG station are as follows:

   a) Initial distribution for the MM station:
   π = ⎡ 0.70 ⎤
       ⎢ 0.27 ⎥
       ⎣ 0.03 ⎦

   b) Initial distribution for the PPC station:
   π = ⎡ 0.62 ⎤
       ⎢ 0.36 ⎥
       ⎣ 0.02 ⎦

   c) Initial distribution for the SBG station:
   π = ⎡ 0.64 ⎤
       ⎢ 0.34 ⎥
       ⎣ 0.02 ⎦
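One simple way to read these estimates is to push the initial distribution one step through the transition matrix, which gives the distribution over availability states one day ahead before any observation is taken into account. The sketch below does this for the MM-station values listed above; it is only an illustration of how the matrices combine, not the Baum-Welch forecast reported in Section 3.2.3.

```python
import numpy as np

# MM-station values from the matrices above: hidden states ordered as
# (available, unavailable, no measurement).
A_mm = np.array([[0.79, 0.19, 0.02],
                 [0.58, 0.41, 0.01],
                 [0.46, 0.18, 0.36]])
pi_mm = np.array([0.70, 0.27, 0.03])

one_day_ahead = pi_mm @ A_mm            # P(X_{t+1} = j) = sum_i pi_i * a_ij
print(one_day_ahead.round(3))           # approximately [0.723, 0.249, 0.028]
```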

3.2.1 Evaluation Problem with the Forward and Backward Algorithm
For the first HMM problem, the probability of the model λ = (A, B, π), represented by P(O | λ), i.e. the probability of the observation sequence O = (O_1, O_2, O_3), is calculated. This probability can be determined with the forward and backward algorithms:

P(O = MKG | λ) = Σ_{i=1}^{N} α_T(i) = α_T(1) + α_T(2) + α_T(3) = 0.091

P(O = MKG | λ) = Σ_{i=1}^{N} β_1(i) π(i) b_i(O_1) = β_1(1) π(1) b_1(O_1) + β_1(2) π(2) b_2(O_1) + β_1(3) π(3) b_3(O_1) = 0.091

The result obtained with the backward algorithm is consistent with the solution obtained with the forward algorithm, namely an observation sequence probability of 0.091.

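The agreement between the forward and backward computations can be checked mechanically, as sketched below using the MM-station transition matrix, the emission matrix, and the MM-station initial distribution reported in Section 3.2 together with the observation sequence (MM, PPC, SBG). The paper does not fully specify which inputs produced the value 0.091, so this sketch only illustrates the structure of the consistency check, not that particular number.

```python
import numpy as np

A = np.array([[0.79, 0.19, 0.02],      # MM-station transition matrix (Section 3.2)
              [0.58, 0.41, 0.01],
              [0.46, 0.18, 0.36]])
B = np.array([[0.74, 0.64, 0.56],      # emission matrix B = [b_ik] (Section 3.2)
              [0.24, 0.34, 0.42],
              [0.02, 0.02, 0.02]])
pi = np.array([0.70, 0.27, 0.03])      # MM-station initial distribution
obs = [0, 1, 2]                        # observation sequence MM, PPC, SBG ("MKG")

# Forward pass.
alpha = pi * B[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]
p_forward = alpha.sum()

# Backward pass.
beta = np.ones(len(pi))
for o in reversed(obs[1:]):
    beta = A @ (B[:, o] * beta)
p_backward = (pi * B[:, obs[0]] * beta).sum()

print(p_forward, p_backward)           # the two values must coincide
assert np.isclose(p_forward, p_backward)
```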


3.2.2 Decoding Problem with the Viterbi Algorithm
The decoding problem is to determine the optimal hidden state sequence, in this case available, unavailable, or no measurement, for the assumed observation sequence. Applying the three-step Viterbi algorithm described above yields

X*_1 = 1, X*_2 = 1, X*_3 = 1

This means that the most likely sequence of available, unavailable, or no measurement states for the rainfall data in August 2019 is predominantly available.

3.2.3 Learning Problem with the Baum-Welch Algorithm
To estimate the HMM parameters with the Baum-Welch algorithm, the variable ξ_t(i, j) is defined as the probability that the process is in state i at time t and in state j at time t + 1. The estimated initial distribution is

π̂ = ⎡ γ_1(1) ⎤   ⎡ 0.9765 ⎤
     ⎢ γ_1(2) ⎥ = ⎢ 0.1078 ⎥
     ⎣ γ_1(3) ⎦   ⎣ 7.201×10⁻⁴ ⎦

The value of γ_t(i) for t = 1 is the estimate of the initial probability. Once the condition P(O | λ̂) > P(O | λ) has been achieved, the estimated initial probability that the rainfall data availability process is in the available state is 0.9765, that it is in the unavailable state is 0.1078, and that there is no measurement is 7.201 × 10⁻⁴.

Meanwhile, the prediction of the transition matrix a_ij, written as â_ij, is the ratio between the expected number of transitions from state i to state j and the expected number of transitions out of state i:

â_ij =
⎡ Σ_t ξ_t(1,1)/Σ_t γ_t(1)   Σ_t ξ_t(1,2)/Σ_t γ_t(1)   Σ_t ξ_t(1,3)/Σ_t γ_t(1) ⎤
⎢ Σ_t ξ_t(2,1)/Σ_t γ_t(2)   Σ_t ξ_t(2,2)/Σ_t γ_t(2)   Σ_t ξ_t(2,3)/Σ_t γ_t(2) ⎥
⎣ Σ_t ξ_t(3,1)/Σ_t γ_t(3)   Σ_t ξ_t(3,2)/Σ_t γ_t(3)   Σ_t ξ_t(3,3)/Σ_t γ_t(3) ⎦

Substituting the accumulated values of ξ_t(i, j) and γ_t(i), with row denominators Σ_t γ_t(1) = 2.7670, Σ_t γ_t(2) = 0.4399, and Σ_t γ_t(3) = 2.273×10⁻³, gives

â_ij =
⎡ 0.85  0.14  0.01 ⎤
⎢ 0.69  0.30  0.01 ⎥
⎣ 0.51  0.14  0.35 ⎦

The matrix â_ij is an estimator for the transition matrix a_ij. It describes that, once P(O | λ̂) > P(O | λ) is reached, the probability of the availability of rainfall data changing from "available" to "available" is 0.85, from "available" to "unavailable" is 0.14, and from "available" to "no measurement" is 0.01. The probability of going from "unavailable" to "available" is 0.69, from "unavailable" to "unavailable" is 0.30, and from "unavailable" to "no measurement" is 0.01. Likewise, the probability of going from "no measurement" to "available" is 0.51, from "no measurement" to "unavailable" is 0.14, and from "no measurement" to "no measurement" is 0.35.

Similarly, the prediction of the emission matrix b_ik, denoted b̂_ik, is obtained by comparing the expected number of times the process is in state i and produces observation v_k with the expected number of times the process is in state i, so the emission matrix estimator is

b̂_ik =
⎡ Σ_{t:O_t=v_1} γ_t(1)/Σ_t γ_t(1)   Σ_{t:O_t=v_2} γ_t(1)/Σ_t γ_t(1)   Σ_{t:O_t=v_3} γ_t(1)/Σ_t γ_t(1) ⎤
⎢ Σ_{t:O_t=v_1} γ_t(2)/Σ_t γ_t(2)   Σ_{t:O_t=v_2} γ_t(2)/Σ_t γ_t(2)   Σ_{t:O_t=v_3} γ_t(2)/Σ_t γ_t(2) ⎥
⎣ Σ_{t:O_t=v_1} γ_t(3)/Σ_t γ_t(3)   Σ_{t:O_t=v_2} γ_t(3)/Σ_t γ_t(3)   Σ_{t:O_t=v_3} γ_t(3)/Σ_t γ_t(3) ⎦

=
⎡ 0.9765/2.7670              0.9961/2.7670              0.7944/2.7670 ⎤
⎢ 0.1078/0.4399              0.1298/0.4399              0.2023/0.4399 ⎥
⎣ 7.201×10⁻⁴/2.273×10⁻³      5.074×10⁻⁴/2.273×10⁻³      1.045×10⁻³/2.273×10⁻³ ⎦

which gives

b̂_ik =
⎡ 0.35  0.36  0.29 ⎤
⎢ 0.25  0.29  0.46 ⎥
⎣ 0.32  0.22  0.46 ⎦

The matrix b̂_ik is an estimator for the conditional observation probability matrix b_ik. It describes that, once P(O | λ̂) > P(O | λ) is reached, the probability of rainfall data being available for the one-day-ahead period is 0.35 for the MM station, 0.36 for the PPC station, and 0.29 for the SBG station. The probability of rainfall data being unavailable for the one-day-ahead period is 0.25 for the MM station, 0.29 for the PPC station, and 0.46 for the SBG station. The probability of no measurement for the one-day-ahead period is 0.32 for the MM station, 0.22 for the PPC station, and 0.46 for the SBG station. Based on these probabilities, the SBG station has the greatest probability of unavailable and unmeasured rainfall data. This also affects sectors that are directly related to weather and rainfall conditions; one of them is the agriculture sector, which requires weather information to estimate increases in output. Hence, it is necessary to improve human resources and to provide good instruments for measuring rainfall data.

4. CONCLUSIONS
A stochastic process in which the properties of future events depend on events in the present and the past, and in which some properties of the events are assumed to be unobservable, is called a Hidden Markov Model (HMM). With the learning problem solved by the Baum-Welch algorithm, the highest probability of rainfall data being available in the coming one-day period is 0.36 at the PPC station, followed by an availability probability of 0.35 at the MM station and 0.29 at the SBG station. For the decoding problem solved with the Viterbi algorithm, it can be concluded that for the one-day-ahead period the rainfall data at the MM station, PPC station, and SBG station will most likely be available. In the future, these rainfall availability probabilities can help the agriculture sector anticipate extreme climate change and can provide information and early warning to farming communities about drought or flooding. In addition, this information is also needed in disaster mitigation as a basis for determining the policy to be taken.




REFERENCES

Antonucci, A., R. De Rosa, A. Giusti, and F. Cuzzolin (2015). Robust classification of multivariate time series by imprecise hidden Markov models. International Journal of Approximate Reasoning, 56; 249–263
Bain, L. and M. Engelhardt (1992). Introduction to Probability and Mathematical Statistics. Duxbury Press: California
Bathaee, N. and H. Sheikhzadeh (2016). Non-parametric Bayesian inference for continuous density hidden Markov mixture model. Statistical Methodology, 33; 256–275
Colombi, R. and S. Giordano (2015). Multiple hidden Markov models for categorical time series. Journal of Multivariate Analysis, 140; 19–30
Devianto, D. (2016). The uniform continuity of characteristic function from convoluted exponential distribution with stabilizer constant. In AIP Conference Proceedings. AIP Publishing LLC
Devianto, D., Maiyastri, L. Oktasari, and M. Anas (2015a). Convolution of generated random variable from exponential distribution with stabilizer constant. Applied Mathematical Sciences, 9; 4781–4789
Devianto, D., M. Maiyastri, and S. Damayanti (2015b). Forecasting Long Memory Time Series for Stock Price with Autoregressive Fractionally Integrated Moving Average. International Journal of Applied Mathematics and Statistics, 53(5); 86–95
Devianto, D., L. Oktasari, and Maiyastri (2015c). Some properties of hypoexponential distribution with stabilizer constant. Applied Mathematical Sciences, 9; 7063–7070
Greene, A. M., A. W. Robertson, P. Smyth, and S. Triglia (2011). Downscaling projections of Indian monsoon rainfall using a non-homogeneous hidden Markov model. Quarterly Journal of the Royal Meteorological Society, 137(655); 347–359
Khadr, M. (2016). Forecasting of meteorological drought using Hidden Markov Model (case study: The upper Blue Nile river basin, Ethiopia). Ain Shams Engineering Journal, 7(1); 47–56
Li, X., V. Makis, H. Zuo, and J. Cai (2018). Optimal Bayesian control policy for gear shaft fault detection using hidden semi-Markov model. Computers & Industrial Engineering, 119; 21–35
Mehrotra, R. and A. Sharma (2005). A nonparametric nonhomogeneous hidden Markov model for downscaling of multisite daily rainfall occurrences. Journal of Geophysical Research: Atmospheres, 110(D16)
Pineda, L. E. and P. Willems (2016). Multisite downscaling of seasonal predictions to daily rainfall characteristics over Pacific–Andean river basins in Ecuador and Peru using a nonhomogeneous Hidden Markov model. Journal of Hydrometeorology, 17(2); 481–498
Ramadhan, R., D. Devianto, and M. Maiyastri (2020). Hidden Markov Model for Exchange Rate with EWMA Control Chart. In Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia. EAI
Robertson, A. W., S. Kirshner, and P. Smyth (2004). Downscaling of daily rainfall occurrence over northeast Brazil using a hidden Markov model. Journal of Climate, 17(22); 4407–4424
Ross, S. M. (1996). Stochastic Processes. John Wiley & Sons
Sansom, J. (1998). A hidden Markov model for rainfall using breakpoint data. Journal of Climate, 11(1); 42–53
Spezia, L., S. Cooksley, M. Brewer, D. Donnelly, and A. Tree (2014). Modelling species abundance in a river by Negative Binomial hidden Markov models. Computational Statistics & Data Analysis, 71; 599–614
Stoner, O. and T. Economou (2019). An Advanced Hidden Markov Model for Hourly Rainfall Time Series. arXiv preprint, arXiv:1906.03846
Thyer, M. and G. Kuczera (2003). A hidden Markov model for modelling long-term persistence in multi-site rainfall time series 1. Model calibration using a Bayesian approach. Journal of Hydrology, 275(1-2); 12–26
Tseng, Y.-T., S. Kawashima, S. Kobayashi, S. Takeuchi, and K. Nakamura (2020). Forecasting the seasonal pollen index by using a hidden Markov model combining meteorological and biological factors. Science of The Total Environment, 698; 134246
Xia, Y.-M. and N.-S. Tang (2019). Bayesian analysis for mixture of latent variable hidden Markov models with multivariate longitudinal data. Computational Statistics & Data Analysis, 132; 190–211
Zhang, M., X. Jiang, Z. Fang, Y. Zeng, and K. Xu (2019). High-order Hidden Markov Model for trend prediction in financial time series. Physica A: Statistical Mechanics and its Applications, 517; 1–12

