
Science and Technology Indonesia

e-ISSN:2580-4391 p-ISSN:2580-4405
Vol. 5, No. 2, April 2020

Research Paper

A Hidden Markov Model for Forecasting Rainfall Data Availability at the Weather Station in West Sumatra

Rahmawati Ramadhan¹, Dodi Devianto¹*
¹Department of Mathematics, Andalas University, Limau Manis Campus, Padang 25163, Indonesia

*Corresponding author: [email protected]

Abstract
Indonesia is a maritime continent in Southeast Asia, lying between the Indian Ocean and the Pacific Ocean. This position strongly affects the level of rainfall in Indonesia, especially in West Sumatra. The availability of rainfall data can be described by a Markov chain whose states cannot be observed directly (hidden), which is called a Hidden Markov Model (HMM). The purposes of this research are to predict the hidden states of rainfall data availability, to find the best (optimal) state sequence through the decoding problem using the Viterbi algorithm, and to predict the probability of rainfall data availability in the future using the Baum-Welch algorithm within the Hidden Markov Model. This research uses secondary daily data on the availability of rainfall records at the Minangkabau Meteorological Station, the Padang Pariaman Climatology Station, and the Silaing Bawah Geophysics Station from January 2018 to July 2019. The prediction results show that the Hidden Markov Model can be used to predict the probability of rainfall data availability. The highest predicted availability of rainfall data for one day ahead is at the Padang Pariaman Climatology Station, with a probability of 0.36, followed by the Minangkabau Meteorological Station with 0.35 and the Silaing Bawah Geophysics Station with 0.29. The Viterbi algorithm further indicates that for the next one-day period the rainfall data at all three stations will most likely be available.
Keywords
Hidden Markov Model, Rainfall, Decoding Problem

Received: 12 February 2020, Accepted: 03 April 2020


https://doi.org/10.26554/sti.2020.5.2.34-40

1. INTRODUCTION

Rain occurs when water particles fall to the earth's surface through a series of hydrological processes, and rainfall is the depth of rainwater that accumulates on a flat surface within a certain period. The amount of precipitation in an area is influenced by several factors, including latitude, altitude, distance from the water source, wind, mountainous terrain, temperature differences, and the total land area.

Analyzing the future availability of rainfall data can be framed as a stochastic process, because it concerns the chance of future events that cannot be predicted directly from the rainfall data alone. The states of rainfall data availability are uncertain and subject to change, and some of the underlying circumstances are assumed to be unobservable; such a system can be modeled by a Hidden Markov Model (HMM).

An HMM is an extension of the Markov chain in which the state cannot be observed directly (it is hidden) but can only be inferred through a set of other observations. In an HMM there are three fundamental problems to be solved: the evaluation problem, the decoding problem, and the learning problem. Thyer and Kuczera (2003) discussed calibrating an HMM for rainfall data with a Bayesian approach, and Sansom (1998) did the same for breakpoint data. Research using inhomogeneous HMMs and nonparametric model reduction of rainfall events has also been carried out by Mehrotra and Sharma (2005); Robertson et al. (2004); Greene et al. (2011); Pineda and Willems (2016).

Furthermore, several HMM variants have been introduced by previous researchers: species abundance in a river was modeled with a negative binomial HMM by Spezia et al. (2014), and HMMs have been combined with Bayesian analysis in Xia and Tang (2019); Li et al. (2018); Bathaee and Sheikhzadeh (2016). In the time series setting, Antonucci et al. (2015) discussed robust classification of multivariate time series with imprecise HMMs, and Colombi and Giordano (2015) studied categorical multiple time series with HMMs. HMMs are also used in finance, to predict trends in time series as discussed in Zhang et al. (2019) and to predict the probability of exchange-rate changes as discussed in Ramadhan et al. (2020). Based on the research of Devianto et al. (2015b), models for financial data can also be built with the autoregressive fractionally integrated moving average (ARFIMA) approach. In addition, enumeration models have been developed with an exponential distribution characterization approach, as described in Devianto (2016); Devianto et al. (2015a,c).

According to Stoner and Economou (2019), the hidden Markov framework can be adapted to construct a compelling model for sub-daily rainfall that captures its essential characteristics well. Several homogeneous HMMs were developed to forecast droughts at short to medium term using the Standardized Precipitation Index (SPI), as discussed in Khadr (2016). A hidden Markov sequence was also used to represent the recurrence of mast years, as described in Tseng et al. (2020).

The availability of rainfall data can thus be formulated as an HMM with hidden states. In this study, the three fundamental problems are addressed: the evaluation problem, the decoding problem, and the learning problem. The results provide information about the future availability of rainfall data in particular areas of West Sumatra. Historical information about the availability and accuracy of rainfall data is very helpful in predicting climate change and irregularities. In addition, with an overview of future weather conditions, specific policies concerning water supply, plant performance, and yield can be designed for better preventive management of resources for the community.

2. EXPERIMENTAL SECTION

2.1 Materials
A stochastic process is a sequence of random events governed by the laws of probability, whose values change randomly over time. A stochastic process in which the properties of the future depend on the current conditions, given their characteristics in the past, is called a Markov chain (Ross, 1996). A stochastic process X_n(t) is a collection of random variables indexed by the observation time t ∈ T, and a stochastic process X_n is said to have the Markov property if

P(X_{n+1} = j | X_n = i, X_{n−1} = i_{n−1}, ..., X_0 = i_0) = P(X_{n+1} = j | X_n = i)    (1)

for every n = 0, 1, 2, ... and for every j, i, i_{n−1}, ..., i_1, i_0.

An HMM is a stochastic model in which the system is assumed to be a Markov process with hidden states. If X = (X_1, X_2, ...) is a Markov process and O = (O_1, O_2, ...) is a function of X, then X is a Hidden Markov Model that can be observed through O; in other words, O can be written as a function f of X. The parameter X represents the hidden state process, while O represents the observation space that can be observed. The elements of a Hidden Markov Model are:

1. The number of hidden states, represented by N, with state space S = (S_1, S_2, ..., S_N); the state at time t is denoted by X_t, t = 1, 2, ..., T.
2. The number of observation symbols for each state, represented by M, with symbol set v = (v_1, v_2, ..., v_M) and observation sequence O = (O_1, O_2, ..., O_T), where T is the length of the observation data.
3. The transition probability matrix A = [a_ij], where a_ij is the conditional probability of the state at time n + 1 given the state at time n, that is

   a_ij = P(X_{n+1} = j | X_n = i)    (2)

   for 1 ≤ i, j ≤ N.
4. The observation probability distribution at time t in state i, commonly known as the emission matrix,

   B = [b_ik]    (3)

   where

   b_ik = P(O_t = v_k | X_t = i)    (4)

   for 1 ≤ i ≤ N, 1 ≤ t ≤ T, and 1 ≤ k ≤ M.
5. The initial state distribution, represented by π(i), where

   π(i) = P(X_1 = i), 1 ≤ i ≤ N.    (5)

An HMM can therefore be written in the notation λ = (A, B, π), where A is the transition probability matrix, B is the observation probability matrix (emission matrix), and π is the initial state distribution.
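To make the notation concrete, the following minimal sketch collects the parameter triple λ = (A, B, π) in Python for a three-state, three-symbol model that mirrors the rainfall-availability setting used later in this paper; the state and symbol names follow the paper, but the numerical values are illustrative placeholders, not estimates from the data.

```python
import numpy as np

# A minimal container for the HMM parameters lambda = (A, B, pi).
# The labels follow the paper's setting; the numbers are placeholders.
states = ["available", "unavailable", "no measurement"]   # hidden states, N = 3
symbols = ["MM station", "PPC station", "SBG station"]    # observation symbols, M = 3

A = np.array([[0.7, 0.2, 0.1],       # A[i, j] = P(X_{t+1} = j | X_t = i)
              [0.5, 0.4, 0.1],
              [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1],       # B[i, k] = P(O_t = v_k | X_t = i)
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])       # pi[i] = P(X_1 = i)

# Sanity checks: every row of A and B, and pi itself, must sum to one.
assert np.allclose(A.sum(axis=1), 1) and np.allclose(B.sum(axis=1), 1)
assert np.isclose(pi.sum(), 1)
```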




Three fundamental problems can be solved within the HMM framework, namely:

a) Evaluation Problem
Calculating the probability of the observation sequence P(O | λ) requires the forward algorithm and the backward algorithm (Bain and Engelhardt, 1992). The steps of the forward algorithm are as follows:

i. Initialization step
In this step we determine the initial observation probability α_1(i), which ends in state i at time t = 1 given the first observation O_1:

α_1(i) = π(i) b_i(O_1)    (6)

for 1 ≤ i ≤ N.

ii. Induction step
In this step we determine the total observation probability α_{t+1}(j), which ends in state j at time t + 1, given the observation sequence O_1, O_2, ..., O_{t+1}:

α_{t+1}(j) = [ Σ_{i=1}^{N} α_t(i) a_ij ] b_j(O_{t+1})    (7)

for j = 1, 2, ..., N and t = 1, 2, ..., T − 1.

iii. Termination step
In this step we determine the total joint probability of the observations and the hidden states given the model, that is, the observation sequence probability P(O | λ):

P(O | λ) = Σ_{i=1}^{N} α_T(i)    (8)

Next, the observation probability is calculated with the backward algorithm β_t(i), with the following steps:

i. Initialization step
The initial backward probabilities are set equal to one, because state i is assumed to be the final state:

β_T(i) = 1    (9)

for 1 ≤ i ≤ N.

ii. Induction step
In this step we determine the total observation probabilities for t < T:

β_t(i) = Σ_{j=1}^{N} a_ij b_j(O_{t+1}) β_{t+1}(j)    (10)

for t = T − 1, T − 2, ..., 1 and i = 1, 2, ..., N.

iii. Termination step
In this step the joint probability of the observations and the hidden states given the model, that is, the observation sequence probability P(O | λ), is obtained as

P(O | λ) = Σ_{i=1}^{N} b_i(O_1) π(i) β_1(i)    (11)
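The evaluation problem can be sketched in a few lines of Python; the functions below implement the forward recursion of Eqs. (6)-(8) and the backward recursion of Eqs. (9)-(11), and both must return the same value of P(O | λ). The parameter values and the observation sequence are hypothetical.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward recursion: alpha_t(i) per Eqs. (6)-(8)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialization (6)
    for t in range(1, T):                             # induction (7)
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()                     # termination (8)

def backward(A, B, pi, obs):
    """Backward recursion: beta_t(i) per Eqs. (9)-(11)."""
    T, N = len(obs), len(pi)
    beta = np.ones((T, N))                            # initialization (9)
    for t in range(T - 2, -1, -1):                    # induction (10)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta, (pi * B[:, obs[0]] * beta[0]).sum()  # termination (11)

# Hypothetical parameters and observation sequence (indices into the symbol set).
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
obs = [0, 1, 2]                                       # e.g. MM, PPC, SBG

_, p_forward = forward(A, B, pi, obs)
_, p_backward = backward(A, B, pi, obs)
assert np.isclose(p_forward, p_backward)              # both give P(O | lambda)
print(p_forward)
```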
b) Decoding Problem
The decoding problem is to find the best (optimal) state sequence associated with the observation sequence O and a known model λ. This problem can be solved with the Viterbi algorithm. The steps of the Viterbi algorithm for determining the best state sequence are as follows:

i. Initialization step
In this step we determine the greatest probability over the first observation ending in state i at t = 1:

δ_1(i) = π(i) b_i(O_1)

φ_1(i) = 0,  1 ≤ i ≤ N    (12)

ii. Recursion step
In this step we determine the greatest probability over the first t observations ending in state j, for t ≥ 2:

δ_t(j) = max_{1≤i≤N} [δ_{t−1}(i) a_ij] b_j(O_t)

φ_t(j) = arg max_{1≤i≤N} [δ_{t−1}(i) a_ij]    (13)

for 2 ≤ t ≤ T and 1 ≤ j ≤ N.

iii. Termination step
In this step we determine the greatest probability over all T observations and the state in which it ends:

P* = max_{1≤i≤N} [δ_T(i)]

X*_T = arg max_{1≤i≤N} [δ_T(i)]    (14)

iv. Backtracking step
In this last step the best state sequence is traced back:

X*_t = φ_{t+1}(X*_{t+1}),  t = T − 1, T − 2, ..., 1    (15)
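A minimal sketch of the Viterbi algorithm of Eqs. (12)-(15) is given below; it returns the most probable hidden state path and its probability. The parameters are again hypothetical placeholders.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Viterbi algorithm per Eqs. (12)-(15): most probable hidden state path."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                          # initialization (12)
    for t in range(1, T):                                 # recursion (13)
        trans = delta[t - 1][:, None] * A                 # delta_{t-1}(i) * a_ij
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()                         # termination (14)
    for t in range(T - 2, -1, -1):                        # backtracking (15)
        path[t] = psi[t + 1][path[t + 1]]
    return path, delta[-1].max()

# Hypothetical parameters; states 0/1/2 = available / unavailable / no measurement.
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
best_path, best_prob = viterbi(A, B, pi, obs=[0, 1, 2])
print(best_path, best_prob)
```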
c) Learning Problem
The learning problem is to estimate the model that best explains a sequence of observations by adjusting the HMM parameters λ = (A, B, π) so that P(O | λ) becomes maximal. The Baum-Welch algorithm defines four variables: the forward variable, the backward variable, the variable ξ_t(i, j), and the variable γ_t(i). The forward and backward variables are used to calculate ξ_t(i, j) and γ_t(i), and the re-estimation formulas for the learning problem are

π̂(i) = γ_1(i),  1 ≤ i ≤ N

â_ij = Σ_{t=1}^{T−1} ξ_t(i, j) / Σ_{t=1}^{T−1} γ_t(i),  1 ≤ i ≤ N, 1 ≤ j ≤ N

b̂_ik = Σ_{t=1, O_t=v_k}^{T} γ_t(i) / Σ_{t=1}^{T} γ_t(i),  1 ≤ i ≤ N, 1 ≤ k ≤ M    (16)

where π̂(i) is the estimate of the initial state distribution, â_ij is the matrix of estimated transition probabilities, and b̂_ik is the estimated emission matrix.
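One Baum-Welch re-estimation step following Eq. (16) can be sketched as below: the forward and backward variables are computed first, then γ_t(i) and ξ_t(i, j), and finally the updated parameters. The starting values and the observation sequence are hypothetical; in practice the step is iterated until P(O | λ) stops increasing.

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One Baum-Welch re-estimation step per Eq. (16)."""
    T, N = len(obs), len(pi)
    # Forward and backward variables.
    alpha = np.zeros((T, N)); beta = np.ones((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    p_obs = alpha[-1].sum()                              # P(O | lambda)

    # gamma_t(i) and xi_t(i, j).
    gamma = alpha * beta / p_obs
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / p_obs

    # Re-estimated parameters (Eq. 16).
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        mask = np.array(obs) == k
        B_new[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return A_new, B_new, pi_new, p_obs

# Hypothetical starting parameters and observation sequence.
A = np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3]])
B = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]])
pi = np.array([0.6, 0.3, 0.1])
A, B, pi, p = baum_welch_step(A, B, pi, obs=[0, 1, 2, 0, 0, 1])
```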




2.2 Methods
This study uses daily records of rainfall data availability from January 1, 2018 to July 31, 2019 at the Minangkabau Meteorological Station (MM station), the Padang Pariaman Climatology Station (PPC station), and the Silaing Bawah Geophysics Station (SBG station). The data were obtained from the website dataonline.bmkg.go.id, with 570 observations per station. The availability of the rainfall data is summarized in Figure 1.

Figure 1. Rainfall data availability

Based on Figure 1, the highest availability of rainfall data is at the MM station with 287 data, followed by the PPC station with 231 data and the SBG station with 219 data. The highest number of unavailable rainfall data during the study period is at the SBG station with 134 data, followed by the PPC station with 122 data and the MM station with 66 data. Each of the three stations has 7 data that were not measured.

This research uses the HMM to forecast the availability of rainfall data, predicting the probability of rainfall data availability in the next period as follows:
1. Take the rainfall data with a period of one day; the number of observed data covers a range of 570 days.
2. Determine the required transition probability matrix by using the following relative-frequency formula as the probability (a short sketch of this counting step is given after this list)

   P(A) = n(A) / n(S)    (17)

   where n(A) is the number of elements in A and n(S) is the total number of elements in S [22].
3. Determine the elements of the HMM.
4. Analyze the elements of the HMM: compute the probability of an observation sequence with the forward-backward algorithm, determine the sequence of hidden states with the Viterbi algorithm, and estimate the HMM parameters with the Baum-Welch algorithm.
5. Draw interpretations and conclusions from the results obtained.
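As a minimal sketch of step 2, the snippet below estimates a transition probability matrix by relative frequency, P(A) = n(A)/n(S), from a coded sequence of daily availability states; the state coding and the example sequence are hypothetical, not the stations' actual records.

```python
import numpy as np

# Hypothetical daily availability states: 0 = available, 1 = unavailable,
# 2 = no measurement.
states = np.array([0, 0, 1, 0, 2, 0, 0, 1, 1, 0])

n_states = 3
counts = np.zeros((n_states, n_states))
for today, tomorrow in zip(states[:-1], states[1:]):
    counts[today, tomorrow] += 1                    # n(A): transitions i -> j

row_totals = counts.sum(axis=1, keepdims=True)      # n(S): all transitions out of i
A = np.divide(counts, row_totals,
              out=np.zeros_like(counts), where=row_totals > 0)
print(A)    # each row sums to 1 (or stays 0 if state i never occurred)
```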
3. RESULT AND DISCUSSION

3.1 The Importance of Rainfall Data Availability Based on the Exponentially Weighted Moving Average (EWMA) Control Chart
In this study, the EWMA control chart is used to see whether there is extreme rainfall or out-of-control data, which makes the availability of rainfall data very important to observe. The rainfall data were taken from the weather stations in West Sumatra in 2018. The EWMA control chart of the rainfall data is shown in Figure 2.

Figure 2. Rainfall EWMA Control Charts

Based on Figure 2, there is extreme rainfall on observation days 88, 89, 90, 91, 102, 103, 107, 108, 144, 145, 146, 147, 148, 238, 239, 240, 241, 252, 271, 316, 336, 337, 338, 339, 340, 341, 364, 365, and 366. This means that there are 28 days with rainfall that is out of control or extreme. This condition indicates that rainfall changes from day to day, with periods of very high rainfall and the opposite. The availability of rainfall data in these circumstances is very important so that action can be taken to minimize losses related to changes in precipitation. Therefore, it is necessary to build a model of the availability of rainfall data at each weather station in West Sumatra using the HMM.
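For illustration, an EWMA statistic and its control limits can be computed as sketched below; the daily rainfall values, the smoothing constant, and the width of the control limits are assumptions made for the example, since the paper does not report the settings used for Figure 2.

```python
import numpy as np

# Illustrative EWMA control chart for daily rainfall (values are hypothetical).
rainfall = np.array([3.0, 0.0, 12.5, 0.0, 1.2, 45.0, 2.3, 0.0, 0.5, 60.1])

lam, L = 0.2, 3.0                      # assumed smoothing constant and limit width
mu, sigma = rainfall.mean(), rainfall.std(ddof=1)

z = np.zeros_like(rainfall)
z[0] = lam * rainfall[0] + (1 - lam) * mu      # z_t = lam*x_t + (1-lam)*z_{t-1}
for t in range(1, len(rainfall)):
    z[t] = lam * rainfall[t] + (1 - lam) * z[t - 1]

t_idx = np.arange(1, len(rainfall) + 1)
width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t_idx)))
out_of_control = np.where((z > mu + width) | (z < mu - width))[0] + 1
print(out_of_control)                  # 1-based indices of flagged (extreme) days
```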




3.2 Elements of the Hidden Markov Model
The elements that must be determined to solve the case of forecasting the availability of rainfall data with the HMM are as follows (a short numerical illustration using the estimated MM-station values is given after this list):

1. Let N be the number of hidden states, with state space S = (S_1, S_2, ..., S_N) and the state at time t denoted by X_t. In this case of rainfall data availability at the MM station, PPC station, and SBG station, the hidden states are available, unavailable, and no measurement, so N = 3; they can be written as s_1 = available, s_2 = unavailable, and s_3 = no measurement. For example, X_t = 1 states that the rainfall data are in the available state.

2. Let M be the number of observation symbols for each state, with observation sequence O = (O_1, O_2, ..., O_T) and symbol set v = (v_1, v_2, ..., v_M). In this study M = 3, with the MM station as v_1, the PPC station as v_2, and the SBG station as v_3.

3. Let

   A = [a_ij],  a_ij = P(X_{t+1} = j | X_t = i)

   where a_ij is the probability that the rainfall data availability is in state j on day t + 1 given that it is in state i on day t, which forms the probability matrix

   A = [a_ij] =
   ⎡ a_11  a_12  a_13 ⎤
   ⎢ a_21  a_22  a_23 ⎥
   ⎣ a_31  a_32  a_33 ⎦

   a) Transition probability matrix for the MM station data

   A = [a_ij] =
   ⎡ 0.79  0.19  0.02 ⎤
   ⎢ 0.58  0.41  0.01 ⎥
   ⎣ 0.46  0.18  0.36 ⎦

   b) Transition probability matrix for the PPC station data

   A = [a_ij] =
   ⎡ 0.74  0.25  0.01 ⎤
   ⎢ 0.46  0.52  0.02 ⎥
   ⎣ 0.42  0.25  0.33 ⎦

   c) Transition probability matrix for the SBG station data

   A = [a_ij] =
   ⎡ 0.64  0.35  0.01 ⎤
   ⎢ 0.47  0.50  0.03 ⎥
   ⎣ 0.33  0.42  0.25 ⎦

4. The emission matrix B = [b_ik] is the conditional probability of observation v_k given that the process is in state i. The emission matrix of the observations for the MM station, PPC station, and SBG station is

   B = [b_ik] =
   ⎡ 0.74  0.64  0.56 ⎤
   ⎢ 0.24  0.34  0.42 ⎥
   ⎣ 0.02  0.02  0.02 ⎦

5. Let π(i) be the initial state distribution; for the availability of rainfall data it is assumed that π(1) = P(available), π(2) = P(unavailable), and π(3) = P(no measurement). The initial distributions for the MM station, PPC station, and SBG station are as follows:

   a) Initial distribution for the MM station:
   π = ⎡ 0.70 ⎤
       ⎢ 0.27 ⎥
       ⎣ 0.03 ⎦

   b) Initial distribution for the PPC station:
   π = ⎡ 0.62 ⎤
       ⎢ 0.36 ⎥
       ⎣ 0.02 ⎦

   c) Initial distribution for the SBG station:
   π = ⎡ 0.64 ⎤
       ⎢ 0.34 ⎥
       ⎣ 0.02 ⎦
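One simple way to read these estimates is to push the initial distribution one step through the transition matrix, which gives the distribution over availability states one day ahead before any observation is taken into account. The sketch below does this for the MM-station values listed above; it is only an illustration of how the matrices combine, not the Baum-Welch forecast reported in Section 3.2.3.

```python
import numpy as np

# MM-station values from the matrices above: hidden states ordered as
# (available, unavailable, no measurement).
A_mm = np.array([[0.79, 0.19, 0.02],
                 [0.58, 0.41, 0.01],
                 [0.46, 0.18, 0.36]])
pi_mm = np.array([0.70, 0.27, 0.03])

one_day_ahead = pi_mm @ A_mm            # P(X_{t+1} = j) = sum_i pi_i * a_ij
print(one_day_ahead.round(3))           # approximately [0.723, 0.249, 0.028]
```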

3.2.1 Evaluation Problem with the Forward and Backward Algorithm
For the first HMM problem, the probability of the model λ = (A, B, π), represented by P(O | λ), i.e. the probability of the observation sequence O = (O_1, O_2, O_3), is calculated. This probability can be determined with the forward and backward algorithms:

P(O = MKG | λ) = Σ_{i=1}^{N} α_T(i) = α_T(1) + α_T(2) + α_T(3) = 0.091

P(O = MKG | λ) = Σ_{i=1}^{N} β_1(i) π(i) b_i(O_1) = β_1(1) π(1) b_1(O_1) + β_1(2) π(2) b_2(O_1) + β_1(3) π(3) b_3(O_1) = 0.091

The result obtained with the backward algorithm is consistent with the solution obtained with the forward algorithm, namely an observation sequence probability of 0.091.

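The agreement between the forward and backward computations can be checked mechanically, as sketched below using the MM-station transition matrix, the emission matrix, and the MM-station initial distribution reported in Section 3.2 together with the observation sequence (MM, PPC, SBG). The paper does not fully specify which inputs produced the value 0.091, so this sketch only illustrates the structure of the consistency check, not that particular number.

```python
import numpy as np

A = np.array([[0.79, 0.19, 0.02],      # MM-station transition matrix (Section 3.2)
              [0.58, 0.41, 0.01],
              [0.46, 0.18, 0.36]])
B = np.array([[0.74, 0.64, 0.56],      # emission matrix B = [b_ik] (Section 3.2)
              [0.24, 0.34, 0.42],
              [0.02, 0.02, 0.02]])
pi = np.array([0.70, 0.27, 0.03])      # MM-station initial distribution
obs = [0, 1, 2]                        # observation sequence MM, PPC, SBG ("MKG")

# Forward pass.
alpha = pi * B[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]
p_forward = alpha.sum()

# Backward pass.
beta = np.ones(len(pi))
for o in reversed(obs[1:]):
    beta = A @ (B[:, o] * beta)
p_backward = (pi * B[:, obs[0]] * beta).sum()

print(p_forward, p_backward)           # the two values must coincide
assert np.isclose(p_forward, p_backward)
```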


3.2.2 Decoding Problem with the Viterbi Algorithm
The decoding problem is to determine the optimal hidden state sequence, in this case available, unavailable, or no measurement, for the assumed observation sequence. Applying the three-step Viterbi algorithm described above yields

X*_1 = 1, X*_2 = 1, X*_3 = 1

This means that the most likely sequence of available, unavailable, or no measurement states for the rainfall data in August 2019 is predominantly available.

3.2.3 Learning Problem with the Baum-Welch Algorithm
To estimate the HMM parameters with the Baum-Welch algorithm, the variable ξ_t(i, j) is defined as the probability that the process is in state i at time t and in state j at time t + 1. The estimated initial distribution is

π̂ = ⎡ γ_1(1) ⎤   ⎡ 0.9765 ⎤
     ⎢ γ_1(2) ⎥ = ⎢ 0.1078 ⎥
     ⎣ γ_1(3) ⎦   ⎣ 7.201×10⁻⁴ ⎦

The value of γ_t(i) for t = 1 is the estimate of the initial probability. Once the condition P(O | λ̂) > P(O | λ) has been achieved, the estimated initial probability that the rainfall data availability process is in the available state is 0.9765, that it is in the unavailable state is 0.1078, and that there is no measurement is 7.201 × 10⁻⁴.

Meanwhile, the prediction of the transition matrix a_ij, written as â_ij, is the ratio between the expected number of transitions from state i to state j and the expected number of transitions out of state i:

â_ij =
⎡ Σ_t ξ_t(1,1)/Σ_t γ_t(1)   Σ_t ξ_t(1,2)/Σ_t γ_t(1)   Σ_t ξ_t(1,3)/Σ_t γ_t(1) ⎤
⎢ Σ_t ξ_t(2,1)/Σ_t γ_t(2)   Σ_t ξ_t(2,2)/Σ_t γ_t(2)   Σ_t ξ_t(2,3)/Σ_t γ_t(2) ⎥
⎣ Σ_t ξ_t(3,1)/Σ_t γ_t(3)   Σ_t ξ_t(3,2)/Σ_t γ_t(3)   Σ_t ξ_t(3,3)/Σ_t γ_t(3) ⎦

Substituting the accumulated values of ξ_t(i, j) and γ_t(i), with row denominators Σ_t γ_t(1) = 2.7670, Σ_t γ_t(2) = 0.4399, and Σ_t γ_t(3) = 2.273×10⁻³, gives

â_ij =
⎡ 0.85  0.14  0.01 ⎤
⎢ 0.69  0.30  0.01 ⎥
⎣ 0.51  0.14  0.35 ⎦

The matrix â_ij is an estimator for the transition matrix a_ij. It describes that, once P(O | λ̂) > P(O | λ) is reached, the probability of the availability of rainfall data changing from "available" to "available" is 0.85, from "available" to "unavailable" is 0.14, and from "available" to "no measurement" is 0.01. The probability of going from "unavailable" to "available" is 0.69, from "unavailable" to "unavailable" is 0.30, and from "unavailable" to "no measurement" is 0.01. Likewise, the probability of going from "no measurement" to "available" is 0.51, from "no measurement" to "unavailable" is 0.14, and from "no measurement" to "no measurement" is 0.35.

Similarly, the prediction of the emission matrix b_ik, denoted b̂_ik, is obtained by comparing the expected number of times the process is in state i and produces observation v_k with the expected number of times the process is in state i, so the emission matrix estimator is

b̂_ik =
⎡ Σ_{t:O_t=v_1} γ_t(1)/Σ_t γ_t(1)   Σ_{t:O_t=v_2} γ_t(1)/Σ_t γ_t(1)   Σ_{t:O_t=v_3} γ_t(1)/Σ_t γ_t(1) ⎤
⎢ Σ_{t:O_t=v_1} γ_t(2)/Σ_t γ_t(2)   Σ_{t:O_t=v_2} γ_t(2)/Σ_t γ_t(2)   Σ_{t:O_t=v_3} γ_t(2)/Σ_t γ_t(2) ⎥
⎣ Σ_{t:O_t=v_1} γ_t(3)/Σ_t γ_t(3)   Σ_{t:O_t=v_2} γ_t(3)/Σ_t γ_t(3)   Σ_{t:O_t=v_3} γ_t(3)/Σ_t γ_t(3) ⎦

=
⎡ 0.9765/2.7670              0.9961/2.7670              0.7944/2.7670 ⎤
⎢ 0.1078/0.4399              0.1298/0.4399              0.2023/0.4399 ⎥
⎣ 7.201×10⁻⁴/2.273×10⁻³      5.074×10⁻⁴/2.273×10⁻³      1.045×10⁻³/2.273×10⁻³ ⎦

which gives

b̂_ik =
⎡ 0.35  0.36  0.29 ⎤
⎢ 0.25  0.29  0.46 ⎥
⎣ 0.32  0.22  0.46 ⎦

The matrix b̂_ik is an estimator for the conditional observation probability matrix b_ik. It describes that, once P(O | λ̂) > P(O | λ) is reached, the probability of rainfall data being available for the one-day-ahead period is 0.35 for the MM station, 0.36 for the PPC station, and 0.29 for the SBG station. The probability of rainfall data being unavailable for the one-day-ahead period is 0.25 for the MM station, 0.29 for the PPC station, and 0.46 for the SBG station. The probability of no measurement for the one-day-ahead period is 0.32 for the MM station, 0.22 for the PPC station, and 0.46 for the SBG station. Based on these probabilities, the SBG station has the greatest probability of unavailable and unmeasured rainfall data. This also affects sectors that are directly related to weather and rainfall conditions; one of them is the agriculture sector, which requires weather information to estimate increases in output. Hence, it is necessary to improve human resources and to provide good instruments for measuring rainfall data.

4. CONCLUSIONS
A stochastic process in which the properties of future events depend on events in the present and the past, and in which some properties of the events are assumed to be unobservable, is called a Hidden Markov Model (HMM). With the learning problem solved by the Baum-Welch algorithm, the highest probability of rainfall data being available in the coming one-day period is 0.36 at the PPC station, followed by an availability probability of 0.35 at the MM station and 0.29 at the SBG station. For the decoding problem solved with the Viterbi algorithm, it can be concluded that for the one-day-ahead period the rainfall data at the MM station, PPC station, and SBG station will most likely be available. In the future, these rainfall availability probabilities can help the agriculture sector anticipate extreme climate change and can provide information and early warning to farming communities about drought or flooding. In addition, this information is also needed in disaster mitigation as a basis for determining the policy to be taken.




REFERENCES

Antonucci, A., R. De Rosa, A. Giusti, and F. Cuzzolin (2015). Robust classification of multivariate time series by imprecise hidden Markov models. International Journal of Approximate Reasoning, 56; 249–263
Bain, L. and M. Engelhardt (1992). Introduction to Probability and Mathematical Statistics. Duxbury Press: California
Bathaee, N. and H. Sheikhzadeh (2016). Non-parametric Bayesian inference for continuous density hidden Markov mixture model. Statistical Methodology, 33; 256–275
Colombi, R. and S. Giordano (2015). Multiple hidden Markov models for categorical time series. Journal of Multivariate Analysis, 140; 19–30
Devianto, D. (2016). The uniform continuity of characteristic function from convoluted exponential distribution with stabilizer constant. In AIP Conference Proceedings. AIP Publishing LLC
Devianto, D., Maiyastri, L. Oktasari, and M. Anas (2015a). Convolution of generated random variable from exponential distribution with stabilizer constant. Applied Mathematical Sciences, 9; 4781–4789
Devianto, D., M. Maiyastri, and S. Damayanti (2015b). Forecasting Long Memory Time Series for Stock Price with Autoregressive Fractionally Integrated Moving Average. International Journal of Applied Mathematics and Statistics, 53(5); 86–95
Devianto, D., L. Oktasari, and Maiyastri (2015c). Some properties of hypoexponential distribution with stabilizer constant. Applied Mathematical Sciences, 9; 7063–7070
Greene, A. M., A. W. Robertson, P. Smyth, and S. Triglia (2011). Downscaling projections of Indian monsoon rainfall using a non-homogeneous hidden Markov model. Quarterly Journal of the Royal Meteorological Society, 137(655); 347–359
Khadr, M. (2016). Forecasting of meteorological drought using Hidden Markov Model (case study: The upper Blue Nile river basin, Ethiopia). Ain Shams Engineering Journal, 7(1); 47–56
Li, X., V. Makis, H. Zuo, and J. Cai (2018). Optimal Bayesian control policy for gear shaft fault detection using hidden semi-Markov model. Computers & Industrial Engineering, 119; 21–35
Mehrotra, R. and A. Sharma (2005). A nonparametric nonhomogeneous hidden Markov model for downscaling of multisite daily rainfall occurrences. Journal of Geophysical Research: Atmospheres, 110(D16)
Pineda, L. E. and P. Willems (2016). Multisite downscaling of seasonal predictions to daily rainfall characteristics over Pacific–Andean river basins in Ecuador and Peru using a nonhomogeneous Hidden Markov model. Journal of Hydrometeorology, 17(2); 481–498
Ramadhan, R., D. Devianto, and M. Maiyastri (2020). Hidden Markov Model for Exchange Rate with EWMA Control Chart. In Proceedings of the 1st International Conference on Statistics and Analytics, ICSA 2019, 2-3 August 2019, Bogor, Indonesia. EAI
Robertson, A. W., S. Kirshner, and P. Smyth (2004). Downscaling of daily rainfall occurrence over northeast Brazil using a hidden Markov model. Journal of Climate, 17(22); 4407–4424
Ross, S. M. (1996). Stochastic Processes. John Wiley & Sons
Sansom, J. (1998). A hidden Markov model for rainfall using breakpoint data. Journal of Climate, 11(1); 42–53
Spezia, L., S. Cooksley, M. Brewer, D. Donnelly, and A. Tree (2014). Modelling species abundance in a river by Negative Binomial hidden Markov models. Computational Statistics & Data Analysis, 71; 599–614
Stoner, O. and T. Economou (2019). An Advanced Hidden Markov Model for Hourly Rainfall Time Series. arXiv preprint, arXiv:1906.03846
Thyer, M. and G. Kuczera (2003). A hidden Markov model for modelling long-term persistence in multi-site rainfall time series 1. Model calibration using a Bayesian approach. Journal of Hydrology, 275(1-2); 12–26
Tseng, Y.-T., S. Kawashima, S. Kobayashi, S. Takeuchi, and K. Nakamura (2020). Forecasting the seasonal pollen index by using a hidden Markov model combining meteorological and biological factors. Science of The Total Environment, 698; 134246
Xia, Y.-M. and N.-S. Tang (2019). Bayesian analysis for mixture of latent variable hidden Markov models with multivariate longitudinal data. Computational Statistics & Data Analysis, 132; 190–211
Zhang, M., X. Jiang, Z. Fang, Y. Zeng, and K. Xu (2019). High-order Hidden Markov Model for trend prediction in financial time series. Physica A: Statistical Mechanics and its Applications, 517; 1–12

