Journal of Geomatics Vol. 15, No. 2, October 2021
Flood Inundation Mapping and Depth Modelling using Machine Learning algorithms and
Microwave data
Gyan Prakash1*, Praveen Kumar Gupta2, G Venkata Rao1 and Deva Pratap1
1Department of Civil Engineering, National Institute of Technology Warangal, Telangana, 506004, India
2Space Applications Centre, ISRO, Ahmedabad, Gujarat, 380015, India
*Email: [email protected]
Abstract: Flooding is one of the most devastating natural hazards and significantly impacts human life and property.
During floods, monitoring and mapping flood extent is crucial for identifying flood-affected areas and assessing
damage. Space-based monitoring can provide a systematic, spatial, timely, and impartial way to monitor disastrous
floods. The study area is a part of the Kosi River in Bihar state. In this study, an independent, open-source tool was
developed to monitor flood extent and water depth using microwave remote sensing data (Sentinel-1). The tool consists
of a hybrid model and a floodwater depth analysis model. The hybrid model is fully automated and uses binarization
techniques together with two supervised Machine Learning (ML) algorithms, the Random Forest Classifier (RFC) and
K Nearest Neighbor (KNN). The floodwater depth analysis model (a PyQGIS standalone tool) was developed to calculate
floodwater depth from flood inundation maps and a Digital Elevation Model (DEM). The supervised classification
algorithms in the hybrid model were compared, and the performance of the KNN and RFC classifiers was found to be
close, but RFC took less time than KNN. The model results were compared and validated with the August 2017 flood
event results over the Darbhanga district. The fully automated model showed a deviation of 0.9% to 19% from the
published results over the Darbhanga district. The present study suggests that the RFC ML algorithm can classify SAR
data into flooded and non-flooded areas. The developed tool can be used to monitor floods in near real-time to issue
warnings to the public and support rescue operations.
Keywords: Flood Inundation, Floodwater Depth, Microwave Remote Sensing, Machine Learning Algorithms,
Supervised Classification
1. Introduction

Flood is one of the most devastating natural hazards and is caused by an excessive increase in surface runoff,
heavy rainfall, rise in the riverbed, cyclones, cloud bursts, etc. (Singh, 2015). Among the nations of the world,
India is one of the most flood-affected countries due to its unique geo-climatic conditions, precipitation patterns,
topographic features, population growth, urbanization, industrialization, etc. (Mohanty et al., 2020). According to
the National Flood Commission, out of a total geographical area of approximately 329 million hectares, about 40
million hectares are prone to floods (Sharma et al., 2016; Gangwar et al., 2013). Among all the river basins in India,
the Ganga and Brahmaputra river basins experience the highest number of floods (Mohanty et al. 2020).

To cope with the damage caused by floods, it is essential to have information about their intensities and extents.
Therefore, the preparation of flood inundation maps is the primary step for damage control and for assessing a flood
event (Matgen et al., 2007). Compared to in-situ measurement, remote sensing offers practical ways to observe and
monitor surface water dynamics at multiple spatial and temporal scales. Two types of remote sensing datasets are
generally available for monitoring surface water: optical and microwave remote sensing data. Optical data has been
widely used to monitor and map surface water bodies due to its high availability and suitable spatio-temporal
resolutions (Chang Huang et al., 2019). Although optical data is widely used for surface water body extraction, it
has several limitations. The most serious one is that optical data cannot penetrate clouds (Shen et al., 2019), which
are mainly prevalent during the monsoon season. Microwave sensors are an alternative source that overcomes the
drawbacks of optical sensors. Because they use long-wavelength radiation, microwave sensors can penetrate clouds and
vegetation cover. Microwave sensors are independent of solar radiation, and they can provide data in all weather
conditions (Chang Huang et al., 2018; Shen et al., 2019).

Several studies have been carried out for flood inundation mapping and damage assessment using microwave data
(Anusha and Bharathi 2020; de Groeve 2010; Gouweleeuw et al. 2011; D. C. Mason et al. 2012; Matgen et al. 2007;
Schumann and Moller 2015; Shen et al. 2019; Temimi et al. 2005; Tripathi et al. 2020, 2020). Matgen et al. (2007)
extracted the flood extent and the depth of floodwater using DEM and SAR data with the help of the HEC-RAS river flow
model and reported an RMSE of 41 cm for floodwater depth. Using image segmentation, David C. Mason et al. (2012)
extracted flood inundation maps in urban and rural areas with an accuracy of 89% and a false-positive rate of 6%. For
urban flood pixels using TerraSAR-X, 75% of pixels were accurately identified as water, with a false-positive rate of
24%. Tripathi et al. (2020) used the binarization method for the classification of MODIS and SAR data by selecting
threshold values. They reported that MODIS data showed an overestimation of 21% in the flood area compared with SAR
data. Anusha and Bharathi (2020) used SAR and optical data for flood mapping of the August
2017 flood in Uttar Pradesh with the help of thresholding and unsupervised classification methods.

ML algorithms such as support vector machine, random forest, K Nearest Neighbor (KNN), Decision Tree (DT), K-means,
and iso-data (ISO) clustering have been used in several studies to minimize human interference and the time taken for
flood mapping (Benoudjit and Guida 2019; Campolo et al. 1999; D Amitrano 2018; Elsafi 2014; Feng et al. 2015;
Schumann and Moller 2015; Shahabi et al. 2020; R. Sinha et al. 2008; Tehrany et al. 2014, 2015). Benoudjit and Guida
(2019) developed an algorithm for flood mapping using Sentinel-1 and Sentinel-2 data with the help of NDWI and a
supervised classifier. They reported an overall accuracy of 77% for rural and 74.7% for urban floods. Shahabi et al.
(2020) developed an ensemble model using KNN as a meta classifier and a weighted base classifier for flood inundation
mapping. From the above studies, it can be interpreted that hybrid/ensemble models result in higher accuracy than
individual models. To automate flood mapping tools with highly accurate results, hybrid models were developed in
several studies (Anusha and Bharathi 2020; Matgen et al. 2007; Tehrany et al. 2014; Twele et al. 2016).

The above studies were mostly done for inundation mapping. Thus, there is a need for a coupled model that can also
estimate the floodwater depth along with the inundation extent. The Floodwater Depth Estimation Tool (FwDET) was used
to estimate an approximate water depth of the flood plain (Cohen et al., 2018). In this study, a fully automated
coupled model approach for flood mapping and depth modeling was developed.

2. Study Area

North Bihar faces heavy damage due to floods in the Kosi river (Bhatt et al., 2010). During the last few years, the
Kosi River has changed its flow course by 150 km and caused damage to human lives and properties every year. For more
than five decades, flood control management has been working on this basin, but the river continues to bring harm
through its devastating floods every year (R. Sinha et al. 2008). The geomorphological properties of the Kosi River
have a significant role in these extensive floods. The Kosi flows through the slopes of the Himalayas in Tibet and the
southern slopes in Nepal. After that, it enters the Indian region (Kosi River).

In the Himalayan region alone, it has three tributaries: Arun, Tamur and Sun-Kosi. Three gauge/discharge stations
along the Kosi River, namely Barahkshetra, Birpur, and Baltara, were used by the Central Water Commission (CWC),
India. Of these, Barahkshetra and Birpur show higher peak discharges than Baltara for the same return period. The
annual average discharge of the Kosi was found to be 2236 m³/s, with the average monsoon discharge of 5156 m³/s being
more than four times the non-monsoon discharge of 1175 m³/s. This large difference makes the river vulnerable to
extensive flooding (R. Sinha et al. 2008). The average elevation in the study area is 49.81 m, and a maximum elevation
of 100 m was found in the SRTM DEM. Gole and Chitale (1966) report that the Kosi river is built by a large sediment
flux, which also plays a vital role in causing the westward shifting of the Kosi and extensive flooding. Thus, the
Kosi river changes its course frequently, with a 24 year frequency period, and causes a lot of damage in the northern
Bihar region (Bhatt et al. 2010). In August 2008, the Kosi River returned to an old course that it had followed 100
years ago, and this flood affected over 2.3 million people in the northern area of Bihar state (Singh et al. 2011). In
the present study, taking into account the damage caused by Kosi river floods, a part of the Kosi river basin lying in
Bihar, with an area of 14,861.535 km² and a river length of 112 km, was selected as the study area, as shown in
Figure 1.

3. Data and Methodology

3.1 Data and Pre-Processing
Active microwave remote sensing data in C band with dual polarization (VV and VH) from the Sentinel-1 satellite was
used. The SAR data over the Kosi river basin was acquired for the flood event days from the Alaska Satellite Facility
(ASF) as Ground Range Detected (GRD) products with a spatial resolution of 10 m and a temporal resolution of 10 to 12
days (Table 1).

Various studies have found that the VH polarization band is more useful in separating water from other land features
(Benoudjit and Guida 2019; Matgen et al. 2007; Tavus et al. 2019; Tripathi et al. 2020; Twele et al. 2016) based on
their backscatter values, which can be derived with the SNAP tool. All classification was done on the Sigma0_VH_db
band of the SAR datasets.

The SNAP tool was used for pre-processing of the SAR data, including radiometric calibration, speckle filtering, orbit
file application, and geometric correction, as shown in Figure 2.

Flooded pixels were identified using binarization techniques, as used in Tripathi et al. (2020), by applying threshold
values through a trial and error process. These threshold values can be estimated from the histogram shown in
Figure 3(b). Two peaks can be seen in this histogram. It can be interpreted that the high peak shows other land
features and the low peak shows water. The water mask band was created using the following expression in the "band
math" tool in SNAP, where σ0_VH is the VH backscatter and th is the threshold:

if σ0_VH < th then 255*(σ0_VH < th) else 0*(σ0_VH > th && σ0_VH != 0)    (1)

A similar equation was used for the VV band. Further, the water mask generated using the above equation was checked
and found to have 78% similarity with a published global water mask of the Kosi River basin. This published global
water mask (Pekel et al. 2016) shows a 50% probability of flood extent, as shown in Figure 4. Pre-processing of the
DEM data was performed using QGIS software.
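As a minimal sketch, assuming the Sigma0_VH_db band has been exported from SNAP to a NumPy array, the thresholding
rule of equation (1) could be reproduced as follows. The function name water_mask, the sample array, and the -20 dB
threshold are illustrative assumptions and are not taken from the study, which selected its thresholds by trial and
error from the histogram.

```python
import numpy as np


def water_mask(sigma0_vh_db, threshold_db):
    """Binarize a backscatter band: 255 where the value is below the threshold, else 0.

    Mirrors equation (1). Both branches of the 'else' clause evaluate to 0,
    so pixels equal to 0 (treated here as no-data) also end up as 0.
    """
    band = np.asarray(sigma0_vh_db, dtype=float)
    return np.where(band < threshold_db, 255, 0).astype(np.uint8)


# Hypothetical 2 x 2 tile of VH backscatter values in dB (0 stands for no-data).
tile = np.array([[-25.0, -15.0],
                 [-21.0, 0.0]])
mask = water_mask(tile, threshold_db=-20.0)
```

Because the mask is a plain 0/255 raster, the same array can be reused directly as the label layer for a supervised
classifier.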
Journal of Geomatics Vol. 15, No. 2, October 2021
Figure 4. Surface water mask (source: global surface water product)

3.2 Methodology
The main objective of this study is to develop an automated model with the potential to provide flood inundation
extent and water depth in near real-time. Automated flood mapping and water depth estimation were done in two stages.
In the first stage, flood inundation extent was estimated using a hybrid model, which was developed using machine
learning-based supervised classifiers (mainly RFC and KNN). In the second stage, flood extent maps were used along
with the DEM to estimate floodwater depth maps using the PyQGIS tool based on FwDET (Cohen et al. 2018). The methods
used in this study are shown as a flowchart in Figure 5 and are described below.

It was found that the pixel values of the water surface were very close to the values under the second, smaller peak
of the histogram. The threshold value for water in this study area varies from -19 dB to -22 dB. Using the "band math"
tool in SNAP, a water mask was created by applying equation (1) to the Sigma0_VH_db band image, as shown in
Figure 6(b). This water mask was updated using a published global surface water mask (Pekel et al. 2016). Further, the
water mask band was used as ground truth data for RFC and KNN supervised classification.

The water mask labels flooded areas as 255 and non-flooded areas as 0, as shown in Figure 6(b). The labeled
information is used as the training dataset for the machine learning models. N_estimator (an RFC parameter) was set to
100, whereas six neighbors were selected in the KNN algorithm. The model-estimated inundations are shown in Figure 8.
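The second stage, in which a flood extent map and a DEM are combined in the manner of FwDET, can be illustrated with
the sketch below. It is not the authors' PyQGIS tool: it assumes that each flooded cell takes the DEM elevation of its
nearest flood-boundary cell as the local water surface and that depth is this surface minus the DEM. The function
names and the 5 x 5 grid are hypothetical, and the brute-force nearest-neighbour search is only suitable for small
arrays.

```python
import numpy as np


def boundary_cells(flood):
    """Flooded cells with at least one non-flooded 4-neighbour."""
    flood = np.asarray(flood, dtype=bool)
    # Pad with True so the grid edge itself is not counted as a dry neighbour.
    padded = np.pad(flood, 1, constant_values=True)
    dry_neighbour = (
        ~padded[:-2, 1:-1] | ~padded[2:, 1:-1] | ~padded[1:-1, :-2] | ~padded[1:-1, 2:]
    )
    return flood & dry_neighbour


def flood_depth(flood, dem):
    """Depth raster: nearest boundary elevation minus DEM, clipped at zero."""
    flood = np.asarray(flood, dtype=bool)
    dem = np.asarray(dem, dtype=float)
    depth = np.zeros_like(dem)
    edge = boundary_cells(flood)
    if not edge.any():
        return depth
    edge_rc = np.argwhere(edge)   # (k, 2) row/col of boundary cells
    wet_rc = np.argwhere(flood)   # (n, 2) row/col of flooded cells
    # Squared distance from every flooded cell to every boundary cell.
    d2 = ((wet_rc[:, None, :] - edge_rc[None, :, :]) ** 2).sum(axis=2)
    nearest = edge_rc[d2.argmin(axis=1)]
    water_surface = dem[nearest[:, 0], nearest[:, 1]]
    cell_depth = water_surface - dem[wet_rc[:, 0], wet_rc[:, 1]]
    depth[wet_rc[:, 0], wet_rc[:, 1]] = np.clip(cell_depth, 0.0, None)
    return depth


# Hypothetical example: a 3 x 3 flooded patch whose centre lies 2 m below its rim.
flood = np.zeros((5, 5), dtype=bool)
flood[1:4, 1:4] = True
dem = np.full((5, 5), 12.0)
dem[1:4, 1:4] = 10.0
dem[2, 2] = 8.0
depth = flood_depth(flood, dem)
```

On real rasters, a distance-transform or grow-distance style operation would replace the pairwise distance matrix,
which grows with the product of flooded and boundary cell counts.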
surface interpolated within this boundary line using the grow distance tool in QGIS. The interpolated surface zones.
Floodwater depth was found by subtracting the DEM from the surface created within the flood extent, in raster format,
so that each pixel shows the floodwater depth at that location in meters. The structure of this depth estimation tool
is shown in Figure 7.

4. Result and validation

4.1 Hybrid model based inundation
The random forest model produced a flood inundation map showing 5432 km² as non-flooded area and 2517 km² as flooded
area. The KNN model estimated 5892 km² as non-flooded area and 2057 km² as flooded area. The two algorithms, KNN and
RFC, show nearly the same flood extent and achieved accuracies of 0.9719 and 0.9726, respectively, against the ground
truth data used in this study. The classification report shows that both classifiers perform well, with very little
difference in performance, but the KNN classifier took 28 hr, whereas the RFC algorithm took only 6 hr. The confusion
matrix, shown in Figure 10, was generated to assess the classification results.
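The comparison reported above (accuracy, confusion matrix, and run time for RFC and KNN) can be sketched with
scikit-learn as follows. This is an illustration under stated assumptions, not the study's pipeline: the backscatter
samples are synthetic, the -20 dB labelling threshold is hypothetical, and only the two hyperparameters named in the
text (100 trees, six neighbours) are taken from the paper. Timings on such a small synthetic sample say nothing about
the 28 hr and 6 hr figures.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic VH backscatter in dB (hypothetical): a water cluster and a land cluster.
water = rng.normal(-23.0, 1.0, 500)
land = rng.normal(-12.0, 2.0, 1500)
X = np.concatenate([water, land]).reshape(-1, 1)

# Labels follow the water-mask convention: 255 = flooded, 0 = non-flooded.
y = np.where(X[:, 0] < -20.0, 255, 0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=6),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        # Rows are true classes, columns are predicted classes, ordered [0, 255].
        "confusion": confusion_matrix(y_test, predicted, labels=[0, 255]),
        "seconds": time.perf_counter() - start,
    }
```

In a raster setting, each pixel (or a vector of band values per pixel) would form one row of X, and the predicted
labels would be reshaped back to the image grid to obtain the inundation map.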
Figure 8. Flood inundation map using RFC (left) and KNN (right) based hybrid model
4.2 Validation
The model results were further validated against the 2017 flood mapping of the Darbhanga district by Tripathi et al.
(2020). SAR data from 11/08/2017, 23/08/2017, 04/09/2017, and 16/09/2017 were used to map inundation using the
binarization technique. The flood inundation maps generated by Tripathi et al. (2020) and the hybrid model based
inundation from the present study were compared for validation. For August 23, 2017, heavy runoff was calculated
using the TRMM and IMD rainfall products reported in Tripathi et al. (2020).
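One simple way to express such a comparison as a number is the percentage deviation of the modelled flooded area from
the published one. The sketch below assumes this definition, which the text does not state explicitly, and the
function name and the areas used are hypothetical values rather than results from either study.

```python
def percent_deviation(model_area_km2, reference_area_km2):
    """Absolute deviation of the modelled area from the reference area, in percent."""
    if reference_area_km2 <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * abs(model_area_km2 - reference_area_km2) / reference_area_km2


# Hypothetical flooded areas for one acquisition date (km²).
deviation = percent_deviation(model_area_km2=110.0, reference_area_km2=100.0)
```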
India. Asian Journal of Earth Sciences, 4(1), 9–19.

Singh D.A. (2015). Floods in India: Cause and Control

Sinha R., G.V. Bapalu, L. K. Singh and B. Rath (2008). Flood risk analysis in the Kosi river basin, north Bihar using
multi-parametric approach of Analytical Hierarchy Process (AHP). Journal of the Indian Society of Remote Sensing,
36(4), 335–349.

Sivanpillai R., K. M. Jacobs, C. M. Mattilio and E.V. Piskorski (2020). Rapid flood inundation mapping by differencing
water indices from pre- and post-flood Landsat images. Frontiers of Earth Science, 15(1), 1–11.

Tavus B., S. Kocaman, H. Nefeslioğlu and C. Gökçeoğlu (2019). Flood Mapping Using Sentinel-1 SAR Data: A Case Study of
Ordu August 8 2018 Flood. International Journal of Environment and Geoinformatics, 6(3), 333–337.

Tehrany M. S., B. Pradhan and M. N. Jebur (2014). Flood susceptibility mapping using a novel ensemble
weights-of-evidence and support vector machine models in GIS. Journal of Hydrology, 512, 332–343.

Tehrany M. S., B. Pradhan, S. Mansor and N. Ahmad (2015). Flood susceptibility assessment using GIS-based support
vector machine model with different kernel types. Catena, 125, 91–101.

Temimi M., R. Leconte, F. Brissette and N. Chaouch (2005). Flood monitoring over the Mackenzie River Basin using
passive microwave data. Remote Sensing of Environment, 98(2–3), 344–355.

Tripathi G., A. C. Pandey, B. R. Parida and A. Kumar (2020). Flood Inundation Mapping and Impact Assessment Using
Multi-Temporal Optical and SAR Satellite Data: a Case Study of 2017 Flood in Darbhanga District, Bihar, India. Water
Resources Management, 34(6), 1871–1892.

Twele A., W. Cao, S. Plank and S. Martinis (2016). Sentinel-1-based flood mapping: a fully automated processing chain.
International Journal of Remote Sensing, 37(13), 2990–3004.