
Ocean Modelling 181 (2023) 102151

Contents lists available at ScienceDirect

Ocean Modelling
journal homepage: www.elsevier.com/locate/ocemod

A deep learning approach to predict significant wave height using long short-term memory

Felipe C. Minuzzi a,∗, Leandro Farina a,b

a Institute of Mathematics and Statistics, Federal University of Rio Grande do Sul (UFRGS), Av. Bento Goncalves 9500, PO Box 15080, Porto Alegre, RS, Brazil
b Center for the Study of Coastal and Oceanic Geology (CECO), Federal University of Rio Grande do Sul (UFRGS), Av. Bento Goncalves 9500, Building 43.125, Porto Alegre, RS, Brazil

ARTICLE INFO

Keywords:
Ocean waves
Deep learning
Long short-term memory
Significant wave height
Forecast

ABSTRACT

We present a new deep learning training framework for forecasting significant wave height in the Southwestern Atlantic Ocean. We use the long short-term memory (LSTM) algorithm, trained with the ERA5 dataset and also with buoy data. The forecasts are made for seven different locations on the Brazilian coast, where buoy data are available. We consider four different lead times, namely 6, 12, 18 and 24 h. Experiments are conducted using exclusively historical series at the selected locations. The influence of other variables as inputs for training is investigated. Results of the LSTM forecast show that a data-driven methodology can be used as a surrogate for computationally expensive physical models and also as an alternative to the reanalysis data. Accuracy of the forecasted significant wave height is close to 87% when compared to real buoy data.

1. Introduction

Almost all engineering applications concerning the ocean, from navigation to renewable energy, through offshore platforms, alerts of catastrophic events and geosciences research (Komen et al., 1996; Cavaleri et al., 2007; Ardhuin et al., 2019), benefit from an accurate description of the sea state, of which wave heights are probably the most important parameter.

It is no novelty that ocean waves can be simulated with mathematical–physical models, and several state-of-the-art programs are available for this purpose (see, e.g., WAVEWATCH III (The WAVEWATCH III Development Group (WW3DG), 2019) and SWAN (Booij et al., 1997)). However, powerful artificial intelligence algorithms are gaining visibility and popularity with the increasing number of publicly available computational libraries, making possible the analysis of large amounts of data for many applications based on historical information. As several reliable databases are available and continue to expand the time period for which they provide historical data, one important question has been posed: can data-driven models, with the help of artificial intelligence, act as a surrogate for physical models, with computational time and accuracy superior to the latter? This question has already been posed, and a compelling review can be found in Boukabara et al. (2019, 2022) and Boukabara and Hoffman (2022).

The use of artificial intelligence, especially in the context of artificial neural networks (ANNs), in wave modelling has already been investigated and studied. In pioneering research, Deo and Naidu (1998) used feed-forward networks with three different training approaches to forecast wave heights along the east coast of India. They used buoy observation data as input to the model, collected every three hours from May 1983 until August 1984, resulting in 16 months of historical data. Different lead times were analysed and the predictions yielded accurate results. From this point on, ANNs have been used in several works for wave predictions.

Short-term forecasts using ANNs trained with data from two sites offshore the Atlantic and Irish Sea coasts of Ireland are reported in Makarynskyy (2004), where hourly forecasts of significant wave height and zero-up-crossing wave period are made for 1–24 h time intervals. In another ANN work, Makarynskyy and collaborators (Makarynskyy et al., 2005) used buoy data from Portugal's west coast, also to forecast significant wave height and zero-up-crossing wave period for 3, 6, 12 and 24 h intervals. In one of the approaches, each parameter over every time interval is forecasted by a separate ANN, while in the second approach, only two ANNs are used to simulate the variables concurrently. Browne et al. gave a comprehensive explanation of using ANNs to estimate waves nearshore and compared results with the SWAN model; the ANNs outperformed the physical model in simulations for seventeen nearshore locations around Australia over a 7-month period.

Agrawal and Deo (2004) applied ANNs as an alternative to find interrelationships among certain characteristic wave parameters. Networks were trained for locations at the east coast of India and developed in order to estimate values of average zero-cross wave period,

∗ Corresponding author.
E-mail addresses: [email protected] (F.C. Minuzzi), [email protected] (L. Farina).

https://doi.org/10.1016/j.ocemod.2022.102151
Received 15 October 2021; Received in revised form 23 August 2022; Accepted 22 November 2022
Available online 2 December 2022
1463-5003/© 2022 Elsevier Ltd. All rights reserved.

peak-spectral period, maximum spectral energy density and maximum wave height from a given value of significant wave height, and also to evaluate the spectral width parameter from the spectral narrowness parameter. Krasnopolsky et al. (2002) proposed an alternative to the complex mathematical formulations involved in forecast systems by approximating solutions of exact physical models using ANNs. They consider the UNESCO equation of state of seawater (density of the seawater) and an approximation for the nonlinear wave–wave interaction. The nonlinear interactions in wind wave spectra are also investigated by Tolman et al. (2005) using ANNs, while Zamani et al. (2008) forecasted significant wave heights several hours ahead using buoy measurements in the Caspian Sea with models based on ANNs. Londhe and Panchang (2006) applied ANNs with the feed-forward back-propagation algorithm for four different lead times (6, 12, 18 and 24 h) to forecast significant wave heights at six different buoy locations in the Gulfs of Mexico, Alaska and Maine. Six network architectures were tested for each buoy location, and predictions were made for the period of 8 January until 31 December 2004 for all buoys except one, for which the predictions ended on 16 September of that year. An accuracy of 86% is obtained for the 6 h lead time and between 67% and 83% for 12 h. Beyond that point, as expected, the accuracy drops to 55%–71% for the 18 h lead time and to less than 63% for the 24 h lead time forecast. Despite the good results, a limitation highlighted when using ANNs in ocean wave predictions is the considerable under-prediction of the highest peaks, which can be explained by the dominance of smaller wave height records in the datasets used to train the network. In an attempt to overcome this drawback, data from 2004 (which had records of Hurricane Ivan) were added to the training phase, and a tendency to catch the higher peaks was observed.

More recently, Campos et al. (2020, 2017) developed a post-processing algorithm to improve ensemble averaging, as a replacement for the typical arithmetic ensemble mean, using neural networks trained with altimeter data. Similar techniques using nonlinear ensemble averaging, also based on neural networks, were studied in the Gulf of Mexico (Campos et al., 2019). James et al. (2018) used supervised machine learning algorithms to estimate ocean-wave conditions in Monterey Bay, California, to act as a surrogate for physical models, while O'Donncha et al. (2018) combined ensemble physical model simulations and machine learning to derive an integrated technique that accounts for modelling uncertainty and generates a forecast better than the best individual model prediction. Bento et al. (2021) used convolutional neural networks (CNNs) to forecast short-term wave power in different seasons of the year, with results that outperform conventional methods for horizons between two and six hours.

Not only have ANNs been used for data-driven ocean wave predictions; several other artificial intelligence techniques have as well, such as support vector machines (Browne et al., 2007), Bayesian optimization (Cornejo-Bueno et al., 2018), genetic programming (Nitsure et al., 2012; Gaur and Deo, 2008) and wavelets (Oh and Suh, 2018; Prahlada and Deka, 2015). Furthermore, deep learning, i.e., deep neural networks trained on large amounts of data, has already been shown to be a powerful technique for oceanographic predictions (Zheng et al., 2020; Choi et al., 2020).

The traditional architecture of ANNs has already been improved. If the network has at least one feedback loop, forming a cycled connection between layers, we obtain recurrent neural networks (RNNs), an architecture that aims to give better accuracy on predictions that need a memory along the network. Nevertheless, the use of RNNs in their standard configuration to account for contextual information is still limited, due to the effect known as the vanishing gradient problem (Hochreiter et al., 2001). As the information circles around the recurrent network in time, the influence of an input on the hidden layer, and consequently on the output, either decays or blows up exponentially. One attempt to solve this problem is the long short-term memory (LSTM) architecture, presented for the first time by Hochreiter and Schmidhuber (1997).

The long short-term memory (LSTM) recurrent neural network has the ability to learn long-term dependencies. This benefit makes it possible to create a neural network that can use information from a long past to build its predictions. The LSTM architecture presents a 'forget gate', a modification made to overcome the vanishing gradient problem. In a recent work, Pirhooshyaran and Snyder (2020) used LSTM together with a sequence-to-sequence neural network to forecast and hindcast ocean waves, as well as to reconstruct missing data. In addition, feature selection was employed based on nearby buoy data. Hu et al. used LSTM to predict ocean wave height and period under the near-idealized wave growth conditions of Lake Erie (Hu et al., 2021), obtaining an accuracy close to numerical models with reduced storm peak underestimation. Lou et al. (2021) proposed a framework for an automatic driving scheme combining wave height prediction using LSTM with ship driving, in order to adjust a ship's course and avoid areas with higher waves. Predictions of wave height, wave period, and wave direction on the US Atlantic coast using an ANN constructed with both convolutional LSTM and LSTM layers are conducted by Wei (2021). The ANN is used to forecast storm waves induced by winter storms and Hurricanes Isaias and Eta, with good accuracy, especially for short lead times.

In the context of ocean wave modelling, given the comprehensive amount of significant wave height time series documented and available, LSTM is a good alternative to improve predictions of this variable.

Thus, the present work aims to develop a new training methodology for LSTM neural networks to forecast significant wave height at several locations on Brazil's coast, for four different lead times. We consider only the historic wave data to feed the net since, as will be shown, additional features show no improvement in the final results. Our approach differs from the work in Pirhooshyaran and Snyder (2020) both in methodology and goals. We choose to perform forecasts with verification times within a time period of one month (or 744 time-steps), and the training phase of the LSTM considers four different lead times, namely 6, 12, 18 and 24 h. Moreover, we solely focus on LSTM predictions of significant wave height, using (i) only this variable, for uni-variate forecasts; (ii) closely correlated variables, such as peak wave period and 10 m wind speed; and (iii) the best four variables based on Pearson's correlation coefficient, for multi-variate forecasts. Training of the LSTM network is performed using both ERA5 reanalysis data, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) (Hersbach et al., 2018, 2020), and real buoy measurements. As our contribution, we present a new methodology to train an LSTM network that can be suitable for short-term forecasts of significant wave height, and potentially for other ocean wave variables as well, with a large decrease in computational time compared to traditional physical models and with improved accuracy. Moreover, the results using ERA5 as the training dataset do not aim to provide an operational forecast, because these data are usually not available on a daily basis; however, training the LSTM with the buoy dataset is an independent experiment and could be adapted for use as a possible operational forecast.

This paper is structured as follows: in Section 2, we show the mathematical and theoretical background of the machine learning algorithm used in this work, the LSTM. In Section 3, the data used are described, and in Section 4 the methodology of our framework is presented. Section 5 shows the prediction results based on LSTM and comparisons with ERA5 and buoy observations. Finally, the conclusions are presented in Section 6.

2. Artificial neural networks

Machine learning algorithms are capable of producing data-driven decisions. Supervised learning is a type of machine learning where both input and output data are given in the training phase. The algorithm learns the mapping function from the input to output variables, and after that, with a new set of data, the algorithm produces an output.
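The decay or blow-up of long-range influence mentioned above can be illustrated with a minimal sketch (ours, not from the paper): for a scalar linear recurrence, the gradient contribution of an input that is `steps` time-steps in the past scales as the recurrent weight raised to that power. The weight values below are purely illustrative.

```python
def backprop_factor(w_rec: float, steps: int) -> float:
    """Magnitude of the gradient contribution of an input `steps`
    time-steps in the past for the scalar recurrence h_t = w_rec * h_{t-1}:
    it is simply |w_rec| ** steps."""
    return abs(w_rec) ** steps

# A recurrent weight below 1 makes old inputs vanish...
vanish = [backprop_factor(0.9, t) for t in (1, 10, 50)]
# ...while a weight above 1 makes them blow up exponentially.
explode = [backprop_factor(1.1, t) for t in (1, 10, 50)]

print(vanish)   # decays towards 0
print(explode)  # grows without bound
```

This is exactly the behaviour the LSTM gates are designed to avoid.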


Fig. 1. LSTM memory block. The input, output and forget gates control the activation of the cell through the multiplicative units (black squares) from inside and outside. Note that the forget gate multiplies the previous state of the cell. Functions 𝑔 and ℎ are activation functions explained in the text.

Amongst machine learning techniques, artificial neural networks (ANNs) were developed motivated by the way in which the brain performs a particular task or function of interest (Haykin, 2009). From a mathematical standpoint, they can be considered as multiple nonlinear regression methods able to capture hidden complex nonlinear relationships between input and output variables (Peres et al., 2015). Several ANN architectures based on layers of neurons are possible. A feedforward network consists of a connection between an input layer (where the data are gathered), hidden layers and the output layer (which gives us our desired result). The term 'hidden' refers to the fact that this part of the network is not seen directly by either the input or the output layers (Haykin, 2009). We can consider a single-layer architecture, where no hidden layers are used. The benefit of hidden layers is to give the network a global perspective due to the extra set of synaptic connections and dimension of neural interaction (Churchland and Sejnowski, 1994).

Differently, recurrent neural networks store some information about the past time evolution of the system in a hidden state vector. This means that a neuron's output can be fed back as an input to all neurons of the net. A self-feedback network occurs when an output neuron feeds its own input, while a non-self-feedback network means otherwise, that is, the output of one neuron is used as input to all other neurons but itself. This attribute of RNNs allows a memory of previous inputs to persist in the network's internal state, and thereby influence the network output (Graves, 2008). RNNs can be derived from nonlinear first-order non-homogeneous ordinary differential equations; a deep and elucidating analysis can be found in Sherstinsky (2020). Nevertheless, the use of RNNs in their standard configuration to account for contextual information is still limited, due to the vanishing gradient problem (Hochreiter et al., 2001).

2.1. Long short-term memory (LSTM)

With the purpose of solving the vanishing gradient problem, the long short-term memory (LSTM) architecture incorporates non-linear, data-dependent controls into the RNN cell, so that the gradient of the loss function does not vanish (Sherstinsky, 2020). One difference between LSTMs and RNNs is that the summation in the hidden layer is replaced by a memory block, which has four neural networks connected and interacting together. This structure allows LSTMs to learn and remember information for a long time period, which is their default behaviour.

The architecture of LSTMs is built as follows: inside the memory block, there are one or more central cells that are self-looped into three multiplicative units called the input, output and forget gates. This difference of having more units controls the flow of information (Goodfellow et al., 2016), where the multiplicative input gate protects the memory block from receiving perturbations from irrelevant inputs, while the output gate protects other units from irrelevant information of the current block (Hochreiter and Schmidhuber, 1997). The units work as gates to avoid weight conflicts, i.e., the input gate decides when to keep or exclude information within the block, while the output gate decides when to access the block and prevent other blocks from being perturbed by it.

Fig. 1 shows an overview of the memory block inside an LSTM network. The three gates receive activations from both inside and outside (other memory blocks), controlling the activation of that cell by multiplications. The input and output gates multiply the input and output of the cell, while the forget gate multiplies the previous state of the cell. If 𝑛, 𝑚 and 𝑘 correspond to the number of inputs, outputs and cells in the hidden layer, respectively, the activation of the input gate b^t_\sigma at time 𝑡 is given by

b^t_\sigma = \phi\Big( \sum_{i=1}^{n} \omega_{i\sigma} x^t_i + \sum_{h=1}^{m} \omega_{h\sigma} b^{t-1}_h + \sum_{c=1}^{k} \omega_{c\sigma} s^{t-1}_c \Big),    (1)

where \sigma represents the input gate, 𝑥 the signal, \omega the weights that connect two units and s^t_c is the activation of cell 𝑐 at time 𝑡. Usually, the gate activation function \phi is the logistic sigmoid, so that the values are between 0 and 1 (Hochreiter and Schmidhuber, 1997; Graves, 2008; Goodfellow et al., 2016). The activation of the forget gate b^t_\tau at time 𝑡 is given by

b^t_\tau = \phi\Big( \sum_{i=1}^{n} \omega_{i\tau} x^t_i + \sum_{h=1}^{m} \omega_{h\tau} b^{t-1}_h + \sum_{c=1}^{k} \omega_{c\tau} s^{t-1}_c \Big),    (2)

while the activation of the output gate b^t_\gamma at time 𝑡 is

b^t_\gamma = \phi\Big( \sum_{i=1}^{n} \omega_{i\gamma} x^t_i + \sum_{h=1}^{m} \omega_{h\gamma} b^{t-1}_h + \sum_{c=1}^{k} \omega_{c\gamma} s^{t}_c \Big).    (3)

The activation of the cell s^t_c at time 𝑡 is

s^t_c = b^t_\tau s^{t-1}_c + b^t_\sigma \, g\Big( \sum_{i=1}^{n} \omega_{ic} x^t_i + \sum_{h=1}^{m} \omega_{hc} b^{t-1}_h \Big),    (4)
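For illustration, the gate computations in Eqs. (1)–(4) can be sketched in NumPy. This is a minimal sketch of a single memory block with one cell, assuming the logistic sigmoid for the gate activation \phi and tanh for 𝑔; the weights are random placeholders, not the trained network of this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, m, k = 3, 2, 1   # number of inputs, hidden outputs and cells

# Placeholder (untrained) weights for the input, forget and output
# gates and for the cell input, following Eqs. (1)-(4).
W = {name: (rng.normal(size=n), rng.normal(size=m), rng.normal(size=k))
     for name in ("input", "forget", "output", "cell")}

def lstm_step(x_t, b_prev, s_prev):
    """One forward step of a single LSTM memory block.

    x_t    : input signal at time t       (length n)
    b_prev : hidden outputs at time t-1   (length m)
    s_prev : cell state at time t-1       (length k)
    """
    def gate(name, s_state):
        wi, wh, wc = W[name]
        return sigmoid(wi @ x_t + wh @ b_prev + wc @ s_state)

    b_sigma = gate("input", s_prev)    # Eq. (1): input gate, uses s^{t-1}
    b_tau   = gate("forget", s_prev)   # Eq. (2): forget gate, uses s^{t-1}
    wi, wh, _ = W["cell"]
    g_val = np.tanh(wi @ x_t + wh @ b_prev)   # cell input with g = tanh
    s_t = b_tau * s_prev + b_sigma * g_val    # Eq. (4): new cell state
    b_gamma = gate("output", s_t)      # Eq. (3): output gate, uses s^t
    # The cell output of Eq. (5) would then be b_gamma * h(s_t).
    return b_gamma, s_t

b_out, s_t = lstm_step(rng.normal(size=n), np.zeros(m), np.zeros(k))
```

Note how the forget gate multiplies the previous cell state, so a gate activation near one preserves the stored information across many steps.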


Fig. 2. Region of analysis on the Brazilian coast with bathymetry. Red circles indicate the locations of the seven buoys studied in this work. The blue colour scale represents the bathymetry.

where 𝑔, the activation function, is usually a hyperbolic tangent or logistic sigmoid function. Finally, the cell output b^t_c at time 𝑡 is given by

b^t_c = b^t_\gamma \, h(s^t_c),    (5)

where ℎ can be the same function as 𝑔, or even the identity function. In the formulae above, \omega_{c\sigma}, \omega_{c\tau} and \omega_{c\gamma} indicate the weights from the cell to the input, forget and output gates, respectively. Note that if the input gate has an activation near zero, it will not open, and therefore the activation of the cell will not be overwritten by new inputs and will be available later in the sequence, i.e., the block stores information for a longer time than usual (Graves, 2008).

The calculations explained above are provided for the forward pass of an LSTM hidden layer, starting at 𝑡 = 1 and applying the equations iteratively to update 𝑡, until the length of the input sequence of data is reached. However, to complete the calculations, a backpropagation backward pass is necessary, starting at the length of the input and going back until 𝑡 = 1. The formulae of this phase can be found in Hochreiter and Schmidhuber (1997), Graves (2008) and Goodfellow et al. (2016).

3. Data and area of study

The LSTM methodology described in the previous section is used here to predict the significant wave height 𝐻𝑠, based on ERA5 reanalysis and observational data. Values of 𝐻𝑠 are gathered from ERA5 hourly from 1979 to the present (Hersbach et al., 2018, 2020). ERA5 is the fifth-generation ECMWF reanalysis for the global climate and weather for the past 4 to 7 decades. We chose the ERA5 reanalysis dataset since it is an approximation of the true sea state, provides comprehensive global data publicly available, allows any user to test the present method, and provides long-term data (in contrast to buoy data). In our experiments, the training dataset is selected until one month prior to the prediction period (see Table 1).

Seven buoy locations are considered for this study. All of them are located on the Brazilian coast, ranging from longitude 49°86′ W to 38°25′ W and latitudes 31°33′ S to 3°12′ S. These buoys belong to the National Program of Buoys (PNBOIA) of the Brazilian Navy, which aims to collect oceanographic and meteorological data of the Atlantic Ocean (Pereira et al., 2017; Navy, 2020). A detailed explanation of data quality can be found in Navy (2019). Fig. 2 shows the region of analysis, with red circles indicating the locations of the buoys, while Table 1 presents the longitudes and latitudes of the seven buoys. Predictions are made for verification times within one month, but the month chosen varies for each buoy, due to the lack of complete data in the databases. Table 1 also shows the period of predictions for each buoy location. Nowadays, only buoy 7 is still operating, while all others are under maintenance.

In this work, we use three metrics to analyse the accuracy of our results. The well-known mean absolute error (MAE) is given by

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \tilde{y}_i - y_i \right|,    (6)

where the tilde denotes the reference value and the non-tilde the predicted value (𝑛 is the number of observations). The relative error (RE) for each step is given by

\mathrm{RE} = 100 \, \frac{\left| \tilde{y}_i - y_i \right|}{\left| \tilde{y}_i \right|},    (7)

and the mean absolute percentage error (MAPE) by

\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \frac{\left| \tilde{y}_i - y_i \right|}{\left| \tilde{y}_i \right|}.    (8)
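The three metrics of Eqs. (6)–(8) can be written compactly in NumPy; a minimal sketch with illustrative values (not the paper's buoy data):

```python
import numpy as np

def mae(y_ref, y_pred):
    """Mean absolute error, Eq. (6); same unit as the data."""
    return np.mean(np.abs(y_ref - y_pred))

def relative_error(y_ref, y_pred):
    """Per-step relative error in percent, Eq. (7)."""
    return 100.0 * np.abs(y_ref - y_pred) / np.abs(y_ref)

def mape(y_ref, y_pred):
    """Mean absolute percentage error, Eq. (8)."""
    return np.mean(relative_error(y_ref, y_pred))

# Illustrative significant wave heights in metres (synthetic values).
hs_obs = np.array([1.0, 2.0, 4.0])
hs_pred = np.array([1.1, 2.2, 4.4])

print(mae(hs_obs, hs_pred))   # ≈ 0.23 m
print(mape(hs_obs, hs_pred))  # ≈ 10 %
```

Note that RE and MAPE are undefined where the reference value is zero, which is not an issue for significant wave height.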


Fig. 3. MAPE calculated for LSTM predictions using different starting dates, from January/1979 until February/2018. Simulations were performed for the buoy 1 location, with ERA5 as the training dataset.

Table 1
Geo-spatial latitude and longitude of the seven buoys used in this work, period of prediction, water depth and city of location in Brazil.

         Longitude   Latitude   Period of prediction   Water depth (m)   WMO     City/State location
Buoy 1   49°86′ W    31°33′ S   March/2018             200               31053   Rio Grande/RS
Buoy 2   47°15′ W    27°24′ S   March/2018             200               31231   Itajaí/SC
Buoy 3   42°44′ W    25°30′ S   January/2021           2164              31374   Santos/SP
Buoy 4   39°41′ W    19°55′ S   March/2017             200               31380   Vitória/ES
Buoy 5   37°56′ W    16°00′ S   March/2016             200               31260   Porto Seguro/BA
Buoy 6   34°33′ W    8°09′ S    March/2016             200               31229   Recife/PE
Buoy 7   38°25′ W    3°12′ S    March/2017             200               31229   Fortaleza/RN

Both MAPE and RE are given in percentages, while MAE is in the same unit as the data.

Since the training dataset time span can have an impact on the predictions, both in accuracy and processing time, we conducted an experiment with the data gathered from 1979 until 2018, finding that, for different lead times, the amount of data needed for better accuracy differs. We show in Fig. 3 the MAPE between the LSTM predictions and ERA5 reanalysis data, for starting dates of the training set from 01-01-1979 to 28-02-2018 (see the 𝑥-axis); i.e., the data range from 39 years to one week prior to the prediction period, which is the same for each starting date (defined in Table 1). Although we are comparing a forecast from LSTM with reanalysis from ERA5, this difference can be used for the purpose of selecting the right dataset size, since training of the network is performed using ERA5 data in this case. Simulations and the establishment of the training dataset to be used adopted exclusively buoy number one (Rio Grande). As we can see, the MAPE starts to grow exponentially with a training dataset of only one year, from February/2017 until the end of February/2018, which is expected behaviour, considering the algorithm's need for historical data. However, as the training set becomes larger than one year, we see that, although LSTM is developed to remember historical data, it is not the case that the bigger the dataset, the better the accuracy. In fact, for all lead times, there is an increase of MAPE for some datasets longer than 25 years. This could happen because of the quality of the data


Fig. 4. Schematic of the training methodology applied in this work, for lead time six (other lead times follow analogously). For each verification time 𝑡𝑖, where 𝑖 = 1, … , 𝑛, the training set consists of data up to six hours prior to 𝑡𝑖.

or how the variable is changing over time, and is an object of further investigation.

Table 2
Training set starting dates giving the best MAPE accuracy for the four different lead times considered in this work.

Lead   MAPE     Training start date
6      5.70%    01-01-2003
12     10.98%   01-01-2002
18     15.85%   01-01-2005
24     19.41%   01-01-1987

Thus, to reach the best results for a prediction, we consider the accuracy, the size of the training set and the global accuracy, measured by MAPE, MAE and RE. Based on this analysis, we set the starting date of the training sets differently for each lead time, as summarized in Table 2. Still, it is possible to reach good accuracy with one year of data; simulations will run fast and the predictions will be satisfactory, although with some loss of accuracy compared with the training dataset sizes chosen in this work. It is also important to mention that, with smaller datasets, the larger lead time predictions, especially eighteen and twenty-four hours, will fail to predict peaks or valleys, since the network will approach the average value of the period.

4. Methodology

The main goal of this work is to predict the significant wave height 𝐻𝑠 for the seven buoy locations presented in Table 1, hourly, for four lead times (6, 12, 18 and 24 h), using the LSTM neural network. As mentioned, the starting date of the training set is different for each lead time, and is defined as shown in Table 2, while the end of the database depends on the lead time; that is, for each verification time, a new training is performed with the training set consisting of data from the start date until the prediction time minus the lead time. With this, we guarantee that our training will have as much data as necessary for the best result, while respecting the lead time gap. Thus, in our methodology, we make a one-hour forecast, separated from the training dataset by the lead time hours, for each verification time. This methodology differs from the traditional train-predict machine learning framework, since we want to preserve the gap between the training and predicted steps. Fig. 4 shows a schematic of the training methodology applied.

We considered different strategies for the prediction: first, only the historical series of 𝐻𝑠 at the buoy location is used for training. We also investigate the influence of other variables in the training set, based both on physical aspects and on a correlation analysis. Thus, prediction of 𝐻𝑠 is also performed using other physical variables, but excluding 𝐻𝑠 as input. All the above simulations use ERA5 in the training dataset. Lastly, we perform simulations with real buoy observations to train the network. A summary of the simulations is presented in Table 3 and explained in more detail in Section 5.

In the training phase of the LSTM, a cross-validation scheme was implemented, where 80% of the data are selected for training and 20% for validation. This strategy is an excellent framework to avoid overfitting of a model, i.e., a model that yields good accuracy on the validation set (seen data) and a bad result on unseen data. Thus, in our methodology, the data were divided into the effective training set, the validation set (80/20 split) and the set of values to predict, called the test set, using standard terminology from machine learning algorithms.

To avoid excessive memory usage in the training phase, the full set of data is subdivided into smaller batches. Training is then performed for each batch, with the target value considered for that batch, respecting the lead time. To define the batch size, several simulations were performed, and an optimal value of twelve data points was obtained.

The Python library TensorFlow (Abadi et al., 2015), an end-to-end open source platform for machine learning, and its Keras API (Chollet et al., 2015) are used in this work to implement the LSTM algorithm. The model is compiled using the mean absolute error as the loss function, which is optimized by the Adam algorithm (Kingma and Ba, 2014). Adam optimization is a stochastic gradient descent method based on adaptive estimation of first-order and second-order moments. The LSTM network is built with three layers, one with 64 neurons, another with 48 and the last with 32. To tune the model, a hyper-parameter optimization was held, and these values yield the best results possible, considering the metric (MAPE) and the computational time. The code used in the simulations can be found in Minuzzi (2022).

5. Results

In this section, we present the results of the predictions for each buoy location and lead time. Simulations are performed using both GPU and CPU parallelization to improve performance. Early stopping is used for convergence of the solution, i.e., when the training achieves a desired accuracy based on MAE, it stops.

For each lead time, our algorithm took approximately 14 min for training the LSTM, and the prediction of a single time value took 2.62 × 10⁻⁶ s. The simulations were performed on a machine with an Intel Xeon processor with 20 cores, 128 GB of RAM, and a GeForce RTX 2080 Ti graphics card. The GPU is used to parallelize the training process.
of 𝐻𝑠 is also performed using other physical variables, but excluding process.
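The gap-preserving training split described above can be sketched in a few lines. This is a minimal illustration on a synthetic series, not the authors' code (which is available in Minuzzi (2022)); the function name, the window length of 24 h and the index conventions are assumptions:

```python
import numpy as np

def gap_preserving_split(series, t_pred, lead, n_lags=24):
    """Build a supervised set from an hourly Hs series so that every
    input window ends `lead` hours before its target, and the last
    usable target sits `lead` hours before the prediction time t_pred.
    Indices are positions in the hourly series; names are illustrative."""
    X, y = [], []
    last = t_pred - lead                    # last target usable for training
    for t in range(n_lags + lead, last + 1):
        X.append(series[t - lead - n_lags:t - lead])  # window ends `lead` h early
        y.append(series[t])
    return np.array(X), np.array(y)

hs = np.sin(np.linspace(0, 20, 500)) + 1.5  # synthetic hourly Hs stand-in
X, y = gap_preserving_split(hs, t_pred=480, lead=6)
print(X.shape, y.shape)
```

Retraining with such a split at every verification time is what keeps the lead-time gap between the last known step and the forecast step intact.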

6
F.C. Minuzzi and L. Farina Ocean Modelling 181 (2023) 102151

Table 3
Summary of simulations developed in this work.

Framework | Features | Target | Training set | Validation set | Test set
Only the historical series of 𝐻𝑠 | 𝐻𝑠 | 𝐻𝑠 | ERA5 | 20% of training | ERA5
Multivariate predictions of 𝐻𝑠 | 𝐻𝑠, 𝑇𝑝, 𝑈10 | 𝐻𝑠 | ERA5 | 20% of training | ERA5
Multivariate predictions of 𝐻𝑠 | Best correlated variables with 𝐻𝑠 | 𝐻𝑠 | ERA5 | 20% of training | ERA5
Predictions not considering 𝐻𝑠 as feature | 𝑇𝑝 and 𝑈10 | 𝐻𝑠 | ERA5 | 20% of training | ERA5
Predictions using buoy data in the training set | 𝐻𝑠 | 𝐻𝑠 | Buoy data | 20% of training | Buoy data

Table 4
MAPE and MAE metrics for each buoy location and lead time for 𝐻𝑠 predictions with LSTM.

Buoy | MAPE 6 h | MAPE 12 h | MAPE 18 h | MAPE 24 h | MAE 6 h | MAE 12 h | MAE 18 h | MAE 24 h
Buoy 1 | 6.12% | 11.9% | 18.82% | 24.15% | 0.13 m | 0.26 m | 0.39 m | 0.50 m
Buoy 2 | 4.88% | 9.36% | 12.98% | 14.69% | 0.08 m | 0.16 m | 0.23 m | 0.26 m
Buoy 3 | 4.28% | 9.06% | 11.83% | 13.26% | 0.08 m | 0.17 m | 0.23 m | 0.25 m
Buoy 4 | 4.67% | 8.87% | 11.34% | 12.27% | 0.06 m | 0.12 m | 0.15 m | 0.17 m
Buoy 5 | 3.15% | 6.27% | 6.64% | 8.25% | 0.04 m | 0.08 m | 0.08 m | 0.11 m
Buoy 6 | 2.19% | 3.7% | 4.97% | 5.98% | 0.03 m | 0.05 m | 0.07 m | 0.09 m
Buoy 7 | 2.75% | 4.54% | 4.71% | 6.2% | 0.04 m | 0.07 m | 0.08 m | 0.10 m
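The MAPE and MAE reported in Table 4 follow their standard definitions (the paper's formulas (6)–(8)); a generic sketch, not tied to the authors' implementation:

```python
import numpy as np

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def mae(y_true, y_pred):
    # Mean absolute error, in the units of the data (metres for Hs)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred))

obs  = np.array([1.0, 2.0, 4.0])
pred = np.array([1.1, 1.8, 4.4])
print(round(mape(obs, pred), 2), round(mae(obs, pred), 2))  # → 10.0 0.23
```

The "accuracy" quoted in Section 5.1 is then simply 100% minus the MAPE.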

5.1. Predictions using 𝐻𝑠 both as feature and target

In this section, we show the results of the LSTM forecast, using
ERA5 as the training dataset and only 𝐻𝑠 as feature and target. All
metrics here are calculated using ERA5 as the reference value in
formulas (6)–(8). Figs. 5–7 present the comparison between the
predictions obtained with LSTM, the ERA5 reanalysis (same data in the
four figures) and the observations, for buoy locations number two, five
and seven, respectively. As can be observed, lead time 6 yields a lower
MAPE, calculated between the LSTM prediction and the ERA5 reanalysis
data, in all the locations, which is expected since the gap between the
final step of training and the prediction step is smaller, so the memory
that the network needs to keep is smaller. For buoy location number
two, based on the MAPE, the LSTM results show an accuracy¹ of 95.12%
for lead time 6, 90.64% for lead time 12, 87.02% for lead time 18
and 85.31% for lead time 24. We can see that the pattern for lead
time 6 is very well described, with the forecast series closely following
the reanalysis values and the buoy data. Except for two times, i.e.,
2018-03-20 at 1:00:00 UTC and 2018-03-20 at 2:00:00 UTC, where the
relative error is above 30%, all errors are below 25% for lead time 6,
whilst the MAE is equal to a remarkable 0.086 m. Despite a small loss
of accuracy for lead times 12, 18 and 24, we still observe an accuracy
of more than 85% for all lead times in this location.

Predictions for buoy locations number five and seven show the best
accuracies of all the studied buoy locations, as we can see in Table 4.
An outstanding accuracy of 97.25% is obtained for lead time 6 in
buoy location number seven, while the 24 h lead time prediction,
expected to behave somewhat poorly, has an accuracy of 93.8%. The
predicted values follow closely the reanalysis data, and the reason
for this pattern could be explained by the fact that the historical
series of the reanalysis has a high consistency in its values, considering
the seasonality of the series, which facilitates the training process,
leading to a better optimization of the error in the network. The results
for buoy location five also have an accuracy above 90% for all lead
times, and the same quality of the historical series is observed.

Some disparity is seen between predictions and observed values by
the buoys at locations five and seven. As we can see in Figs. 6 and 7, the
buoy data have a higher variance from the mean, with values ranging
from low heights to higher heights, but not crossing 2 m. This leads
the LSTM (and also the reanalysis) to smooth the prediction around
a mean value. Nevertheless, the pattern of the observation data is well
described by the deep learning algorithm. This behaviour is not seen in
buoy location number two, and a possible reason for that is the
difference in water depths in the regions where the data are obtained.
Buoy locations number five and seven are in shallow water while buoy
location number two is in deeper water.

The predicted results for buoy locations number one, three, four
and six are shown in Figs. 8 and 9. Predictions for buoy locations
number one and three follow closely the data from reanalysis and
observations, while, despite the fact that reanalysis and predictions are
very similar, a little discrepancy is seen for buoy locations four and six
when compared to observations. This could also be due to the depth of
the water at these four buoy locations.

¹ Here, the accuracy is obtained by subtracting the MAPE from 100%.

5.2. Multivariate predictions of 𝐻𝑠

So far we have considered the historical series of only the significant
wave height as input to our model. However, we could consider more
than one variable as input to train the network, and also consider one
(or multiple) outputs as the prediction. In this sense, the LSTM will
capture how the data from other variables can improve the predicted
variable. We call the variables that will not be predicted, but are used
as input to the model, features, while the significant height, in this
work, will remain our target variable. In what follows, we decided to
conduct our experiments with buoy location number one, since this has
a lower global accuracy compared to the others, both for MAPE and
MAE. In our multivariate simulation, the architecture of the network
and the parameters used are the same as in the single variable
simulation. The difference here is that two more variables are
considered in the input: the peak wave period 𝑇𝑝 and the 10 m wind
speed 𝑈10. As output, we have only the prediction of the significant
wave height. Note, however, that 𝐻𝑠 is also used as input to the model.

We show in Fig. 10 the predicted values of significant wave height
for this multivariate version of our algorithm. Metrics are calculated


Fig. 5. Predictions (black) and reanalysis from ERA5 (red) comparison for buoy location number two. The blue dots show real observations obtained from the buoy.
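The lead-time effect discussed above depends on how much the LSTM's internal memory must retain. As a reminder of the mechanism, one step of a standard LSTM cell (Hochreiter and Schmidhuber, 1997) can be written in a few lines; this is a didactic sketch with random weights and an assumed gate ordering, not the trained network of this paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b, n):
    """One LSTM step. W: (4n, d) input weights, U: (4n, n) recurrent
    weights, b: (4n,) biases, stacked here in the (assumed) order
    [input gate, forget gate, cell candidate, output gate]."""
    z = W @ x + U @ h + b
    i = sigmoid(z[:n]);       f = sigmoid(z[n:2 * n])
    g = np.tanh(z[2 * n:3 * n]);  o = sigmoid(z[3 * n:])
    c_new = f * c + i * g            # memory cell: forget old, write new
    h_new = o * np.tanh(c_new)       # hidden state exposed to the next layer
    return h_new, c_new

rng = np.random.default_rng(1)
d, n = 3, 8                          # 3 input features, 8 hidden units
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h = c = np.zeros(n)
for x in rng.normal(size=(24, d)):   # unroll over a 24-step window
    h, c = lstm_step(x, h, c, W, U, b, n)
print(h.shape)
```

The cell state c is what carries information across the window; the longer the gap between the last input and the target, the more of the useful signal it must preserve.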

using ERA5 as the reference value. We can see that there is no
considerable improvement in the result or the accuracy of the
predictions using two more variables as input. In fact, a large error is
observed for lead times 12, 18 and 24, and the MAPE for these lead
times is similar, as is the trend in the graphs. We can conclude here
that the gap between the data used in the training and the effective
prediction, i.e., the lead times from 12 on, affects the results and
creates a pattern that is in fact the same in the predictions. This
behaviour, however, is not seen in lead time 6, since the gap between
the last training data and the prediction is smaller. Thus, based on the
accuracy found, we can state that the multivariate simulation has not
improved the results from the LSTM network, as already observed in
other works in the literature using non-recurrent neural networks
(Londhe and Panchang, 2006). Roughly speaking, in a model based
only on data, the best correlated features will not, in theory, provide
an improvement of the predictions, due to the fact that they do not
aggregate any new information. Features that are physically known to
affect the target variable (either linearly or non-linearly) may not
improve the accuracy of our model, and can even reduce it compared
to a single variable simulation. This could happen because, by adding
more features as input to the model, the network will try to see the
pattern between these values and thus try to correlate them, based on
what the historical series show. Therefore, we also


Fig. 6. Predictions (black) and reanalysis from ERA5 (red) comparison for buoy location number five. The grey dashed line shows real observations obtained from the buoy.

developed a multivariate analysis using the best correlated variables
from those available in the ERA5 database with 𝐻𝑠. We use the Pearson
correlation coefficient, where we multiply the deviations from the
mean for variable 1 times those for variable 2, and divide by the
product of the standard deviations. We have:

r_{\mathrm{correlation}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y}.  (9)

The correlation coefficient always lies between one (perfect positive
correlation) and minus one (perfect negative correlation); zero
indicates no correlation. We show in Fig. 11 the correlation between
twelve variables available in ERA5 and 𝐻𝑠. It is important to note that
this correlation is based only on data, and does not take into
consideration the physical relationship between the variables. As
maximum individual wave height, altimeter wave height and altimeter
corrected wave height are obviously identically correlated
(r_correlation = 1) and can bring no improvement to the training, we
choose the following four variables for our multivariate simulation:

1. Mean wave period based on the second moment for swell;
2. Mean wave period based on the first moment for swell;
3. Mean wave period based on the second moment for wind waves;
4. Mean wave period based on the first moment for wind waves.

We have executed the algorithm with the input training data
consisting of five variables: the significant wave height, which is our target
twelve variables available in ERA5 and 𝐻𝑠 . It is important to note that sisting of five variables: the significant wave height, which is our target


Fig. 7. Predictions (black) and reanalysis from ERA5 (red) comparison for buoy location number seven. The grey dashed line shows real observations obtained from the buoy.

variables, and the four enumerated above. The results are shown in
Fig. 12 and, as we can see, they do not significantly improve the
previous results in this subsection.

5.3. Predictions not considering 𝐻𝑠 as feature

In this section, we train our model using 𝐻𝑠 as the target variable,
while the peak wave period and the 10 m wind speed are used as
inputs. Note that this differs from the previous sections, where 𝐻𝑠 is
also considered as a feature. Thus, the weights in the network are
updated by optimizing the loss function using only those variables,
and aim to reproduce the target. Again, metrics are calculated using
ERA5 as the reference value.

With the same goal of improving the results at location number
one, we use the time series for this point in space. Fig. 13 shows the
results of this framework of training. The accuracy drops with this
strategy, although it is above 70% for all lead times. These results show
the obvious dependency of the physical variables (peak period and
wind) on the significant height, since the trend of the series is well
described for lead time 6. Naturally, the metrics indicate a deterioration
in accuracy for larger lead times. The trend in the series is somewhat
lost from lead time 24 on, as the peaks are not reproduced by the LSTM.


Fig. 8. Predictions (black) and reanalysis from ERA5 (red) comparison for buoy locations number one and three. The blue dots show real observations obtained from the buoys.

5.4. Predictions using real buoy data in the training set

We now present the predictions obtained with buoy measured data
as the training set. Only the significant wave height is considered in the
set of features and target. Two buoy locations, namely, numbers 1 and
2, are used for this analysis. Both locations have buoy data available
since April/2009, but there are several missing values and outliers,
due to buoy issues and maintenance. Buoy location number two has
more erroneous measurements compared with buoy location number
one, and therefore the k-Nearest Neighbours classification algorithm
was applied to filter these outliers, which allowed the removal of 1.6%
of the data for location number one and 14.5% for location number two.
A detailed statistics summary can be found in Bose et al. (2022).

Originally, the data were hourly, but with the outlier removal, gaps
in the hourly discretization appeared. Nevertheless, the predictions
were made using the same number of points as before, i.e., 751 steps.
The verification time period spans from 06-02-2019, at 08:00 GMT,
to 09-03-2019, at 17:00 GMT, for buoy location number one and
from 02-03-2019, at 09:00 GMT, to 30-05-2019, at 05:00 GMT,
for buoy location number two. The same architecture and methodology
for training of the LSTM network explained previously was used. The
metrics are now calculated between the predicted values from LSTM and
the buoy data. Results are presented in Figs. 14 and 15.

It can be observed that the accuracy of the results drops considerably
in the predictions using measured data for the algorithm training,
instead of ERA5 data. We obtained the MAPE for those two buoy


Fig. 9. Predictions (black) and reanalysis from ERA5 (red) comparison for buoy locations number four and six. The grey dashed line shows real observations obtained from the buoys.

locations. They are 12.96% (87.04% accuracy) for lead time 6, 19.79%
(80.21% accuracy) for lead time 12, 22.11% (77.89% accuracy) for
lead time 18 and 26.17% (73.83% accuracy) for lead time 24 for buoy
location number one, and a MAPE of 18.26% (81.74% accuracy) for
lead time 6, 25.73% (74.27% accuracy) for lead time 12, 27.94%
(72.06% accuracy) for lead time 18 and 29.57% (70.43% accuracy) for
lead time 24 for buoy location number two. The former has a better
global accuracy compared to the latter, due to the larger amount of
data that was filtered out from the latter, giving a worse pattern for
the historical time series. Since the removal excludes data points in
the historical data, both datasets have gaps in the time domain and
this information could muddle the network. Furthermore, buoy data,
as can be seen in the graphics in Figs. 14 and 15, have a higher variance
and noise in the time series.

Comparing the accuracy of our proposed methodology with the
accuracy of ERA5 data, against the real buoy data, we have a great
difference for buoy location number one and a similar result for buoy
location number two. Figs. 14 and 15 show the MAPE metric for ERA5
reanalysis compared to buoy data. Note that the ERA5 dataset is not a
forecast, differently from the LSTM predictions. For lead time 6 h, the
LSTM accuracy is 87.04% against 80.67% of ERA5, in location number one.
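Section 5.4 says only that a k-Nearest Neighbours approach filtered the buoy outliers; the paper does not give its details, so the following is a generic distance-based sketch (numpy only; k and the threshold are illustrative assumptions, not the authors' settings):

```python
import numpy as np

def knn_outlier_mask(values, k=5, z_thresh=3.0):
    """Flag points whose mean distance to their k nearest neighbours
    (in value space) is anomalously large. Generic sketch, not the
    authors' exact filter."""
    v = np.asarray(values, float)
    d = np.abs(v[:, None] - v[None, :])      # pairwise distances
    d.sort(axis=1)                           # ascending per row
    mean_knn = d[:, 1:k + 1].mean(axis=1)    # skip the zero self-distance
    z = (mean_knn - mean_knn.mean()) / mean_knn.std()
    return z < z_thresh                      # True = keep the point

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(1.5, 0.2, 100), [8.0]])  # one obvious outlier
mask = knn_outlier_mask(data)
print(mask.sum(), len(data))
```

Points removed this way leave gaps in the hourly series, which is exactly the situation described above for the buoy-trained experiments.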


Fig. 10. Predictions (black) for buoy location number one and reanalysis from ERA5 (red) comparison for the multivariate (𝑇𝑝 and 𝑈10) simulation. The grey dashed line shows real observations obtained from the buoy.
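The multivariate setup of Section 5.2 feeds 𝐻𝑠, 𝑇𝑝 and 𝑈10 together. A Keras-style LSTM expects inputs of shape (samples, time steps, features); a sketch with synthetic stand-in series (the 24-step window is an assumption, and the lead-time gap of the paper's methodology is omitted here for brevity):

```python
import numpy as np

# Three hourly series: Hs (target and feature), Tp, U10 -- synthetic stand-ins
T, n_lags = 200, 24
hs  = 1.5 + 0.5 * np.sin(np.linspace(0, 12, T))
tp  = 8.0 + np.cos(np.linspace(0, 12, T))
u10 = 5.0 + np.sin(np.linspace(0, 6, T))

feats = np.stack([hs, tp, u10], axis=-1)       # (T, 3): one column per feature

# Sliding windows -> LSTM input of shape (samples, time steps, features)
X = np.stack([feats[i:i + n_lags] for i in range(T - n_lags)])
y = hs[n_lags:]                                 # target: Hs one step after each window
print(X.shape, y.shape)                         # (176, 24, 3) (176,)
```

Dropping the first column of `feats` gives the Section 5.3 variant, where 𝐻𝑠 is the target but not a feature.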

For location number two, the difference is small, although still
superior for the LSTM: 81.74% against 80.59%. Considering that ERA5
data are more accurate than any forecast produced with the physical
model on which it is based, we infer that the LSTM, with the
methodology developed in this work, can improve the 6 h forecast
which uses a physical model.

6. Conclusion

We have presented a deep learning strategy based on long short-term
memory to forecast significant wave height at seven different
locations on Brazil's coast, for four lead times. A three layer
architecture is built and training is performed for datasets with
different sizes, depending on the lead time. Our purpose was to develop
a new training methodology for a machine learning technique that has
the ability to preserve the lead time between the last known step and
the prediction step. In this sense, the ERA5 dataset is an acceptable
resource to validate our strategy, and when using buoy data, we can
improve on what the reanalysis accomplishes, with lower computational
cost and resources. The LSTM results show that the accuracy, with
respect to the reanalysis data, depending on the location, can reach
almost 95% for very short-range forecasts. For larger lead times, the
accuracy naturally decreases, as the gap between the training set and
the prediction becomes larger. Nevertheless, acceptable results are
obtained in a computational time that is proportional to the size of the
training set. For simulations using only buoy data for training, the
accuracy of the LSTM model is 87%


Fig. 11. Pearson’s correlation coefficient between the best 12 ERA5 ocean wave variables and 𝐻𝑠 .

Fig. 12. Multi-variable prediction for buoy location number one using LSTM for the best correlated variables. Predictions are in black, reanalysis from ERA5 in red and buoy data
given in grey.


Fig. 13. Prediction using LSTM for buoy location number one considering only 𝑇𝑝 and 𝑈10 as the inputs. Predictions are in black, reanalysis from ERA5 in red and buoy data
given in grey.

for lead time of 6 h, and our predictions outperform the ERA5
reanalysis results when compared with the observational data, as can
be seen in Figs. 14 and 15, for this lead. This demonstrates the ability
of the methodology presented here to forecast significant wave height.

The explosion in the number of works and applications which use
machine and/or deep learning algorithms is remarkable. Public domain
code libraries and the increasing amount of available datasets (ERA5
has recently extended its historical data back to 1958) appear to
be an opportunity to improve present ocean wave forecasting. One
of the main advantages of machine or deep learning approaches is
the computational time needed for obtaining a result. Machine and
deep learning models are, nowadays, easy to implement and can be
used without a profound knowledge of the model's physics. Despite
the very promising advantages of using data-driven models, a better
understanding of a number of their operational aspects is necessary,
and a stronger mathematical foundation is needed. Neural networks,
although typically faster than physical models, can have their
processing time scale with the data size and the regions to be
modelled. A global prediction, for every grid point on the globe,
considering the 0.5° grid spacing of ERA5, might not be feasible with
the method of this paper. Another drawback is the fact that neural
networks are still partially ''black box'' models, i.e., we cannot control
the parameters


Fig. 14. Prediction of 𝐻𝑠 using LSTM for buoy location number one using measured data as feature. Predictions are in black and buoy data given in red.

that are learned within the training phase, making the results difficult
to analyse from a physical point of view. Nevertheless, the lack of
physical interpretation is a deficiency that can be addressed through
physics-informed neural networks.

Approaches different from the ones used in this work can (and
should) be tested. Our strategy was based both on the complexity of the
network and on computational time. In this sense, further studies are
necessary to contemplate this goal.

CRediT authorship contribution statement

Felipe C. Minuzzi: Investigation, Conceptualization, Methodology,
Software, Writing – original draft. Leandro Farina: Conceptualization,
Methodology, Writing – review & editing, Supervision.

Declaration of competing interest

The authors declare the following financial interests/personal
relationships which may be considered as potential competing
interests: Felipe Minuzzi reports financial support was provided by
Office of Naval Research. Leandro Farina reports financial support was
provided by Coordination of Higher Education Personnel Improvement.

Acknowledgements

This work has been funded by the Office of Naval Research Global,
United States, under the contract no. N629091812124. We acknowledge
also that this work has been partially funded by the Coordenação
de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) -
Finance Code 001 from the project ROAD-BESM – REGIONAL OCEANIC
AND ATMOSPHERIC DOWNSCALING/CAPES (88881.146048/2017-01).


Fig. 15. Prediction of 𝐻𝑠 using LSTM for buoy location number two using measured data as feature. Predictions are in black and buoy data given in red.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X., 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. URL https://www.tensorflow.org/. Software available from www.tensorflow.org.
Agrawal, J., Deo, M., 2004. Wave parameter estimation using neural networks. Mar. Struct. 17 (7), 536–550.
Ardhuin, F., Stopa, J.E., Chapron, B., Collard, F., Husson, R., Jensen, R.E., Johannessen, J., Mouche, A., Passaro, M., Quartly, G.D., et al., 2019. Observing sea states. Front. Mar. Sci. 6, 124.
Bento, P., Pombo, J., Calado, M.d.R., Mariano, S., 2021. Ocean wave power forecasting using convolutional neural networks. IET Renew. Power Gener. 15 (14), 3341–3353.
Booij, N., Holthuijsen, L., Ris, R., 1997. The SWAN wave model for shallow water. In: Coastal Engineering 1996. pp. 668–676.
Bose, N., Ramos, M., Correia, G., Saidelles, C., Farina, L., Parise, C., Nicolodi, J., 2022. Assessing wind datasets and boundary conditions for wave hindcasting in the southern Brazil nearshore. Comput. Geosci. 159, 104972.
Boukabara, S., Gallagher, F., Helms, D., Kalluri, S., Gerth, J., Swadley, S., Goldberg, M., Kleist, D., Iturbide-Sanchez, F., Yoe, J., 2022. Assessment of solution-agnostic observational needs for global numerical weather prediction (NWP).
Boukabara, S.-A., Hoffman, R.N., 2022. Optimizing observing systems using ASPEN: An analysis tool to assess the benefit and cost effectiveness of observations to Earth system applications. Bull. Am. Meteorol. Soc.
Boukabara, S.-A., Krasnopolsky, V., Stewart, J.Q., Maddy, E.S., Shahroudi, N., Hoffman, R.N., 2019. Leveraging modern artificial intelligence for remote sensing and NWP: Benefits and challenges. Bull. Am. Meteorol. Soc. 100 (12), ES473–ES491.
Browne, M., Castelle, B., Strauss, D., Tomlinson, R., Blumenstein, M., Lane, C., 2007. Near-shore swell estimation from a global wind-wave model: Spectral process, linear, and artificial neural network models. Coast. Eng. 54 (5), 445–460.
Campos, R.M., Krasnopolsky, V., Alves, J.-H., Penny, S., 2017. Improving NCEP's probabilistic wave height forecasts using neural networks: a pilot study using buoy data. NCEP Office Note 490, 23pp.
Campos, R.M., Krasnopolsky, V., Alves, J.-H.G., Penny, S.G., 2019. Nonlinear wave ensemble averaging in the Gulf of Mexico using neural networks. J. Atmos. Ocean. Technol. 36 (1), 113–127.
Campos, R.M., Krasnopolsky, V., Alves, J.-H., Penny, S.G., 2020. Improving NCEP's global-scale wave ensemble averages using neural networks. Ocean Model. 101617.
Cavaleri, L., Alves, J.-H., Ardhuin, F., Babanin, A., Banner, M., Belibassakis, K., Benoit, M., Donelan, M., Groeneweg, J., Herbers, T., et al., 2007. Wave modelling – the state of the art. Prog. Oceanogr. 75 (4), 603–674.
Choi, H., Park, M., Son, G., Jeong, J., Park, J., Mo, K., Kang, P., 2020. Real-time significant wave height estimation from raw ocean images based on 2D and 3D deep neural networks. Ocean Eng. 201, 107129.
Chollet, F., et al., 2015. Keras. URL https://github.com/fchollet/keras.
Churchland, P.S., Sejnowski, T.J., 1994. The Computational Brain. MIT Press.
Cornejo-Bueno, L., Garrido-Merchán, E.C., Hernández-Lobato, D., Salcedo-Sanz, S., 2018. Bayesian optimization of a hybrid system for robust ocean wave features prediction. Neurocomputing 275, 818–828.
Deo, M., Naidu, C.S., 1998. Real time wave forecasting using neural networks. Ocean Eng. 26 (3), 191–203.
Gaur, S., Deo, M., 2008. Real-time wave forecasting using genetic programming. Ocean Eng. 35 (11–12), 1166–1172.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning, vol. 1, no. 2. MIT Press, Cambridge.
Graves, A., 2008. Supervised Sequence Labelling with Recurrent Neural Networks (Ph.D. thesis).
Haykin, S.S., 2009. Neural Networks and Learning Machines. Prentice Hall, New York.
Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J.-N., 2018. ERA5 hourly data on single levels from 1979 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). http://dx.doi.org/10.24381/cds.adbb2d47, URL https://cds.climate.copernicus.eu/cdsapp/dataset/10.24381/cds.e2161bac, (Accessed 05 April 2021).
Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., et al., 2020. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146 (730), 1999–2049.
Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J., 2001. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kolen, J.F., Kremer, S.C. (Eds.), A Field Guide to Dynamical Recurrent Neural Networks. Wiley-IEEE Press.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Hu, H., van der Westhuysen, A.J., Chu, P., Fujisaki-Manome, A., 2021. Predicting Lake Erie wave heights and periods using XGBoost and LSTM. Ocean Model. 164, 101832.
James, S.C., Zhang, Y., O'Donncha, F., 2018. A machine learning framework to forecast wave conditions. Coast. Eng. 137, 1–10.
Kingma, D.P., Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Komen, G.J., Cavaleri, L., Donelan, M., Hasselmann, K., Hasselmann, S., Janssen, P., 1996. Dynamics and Modelling of Ocean Waves.
Krasnopolsky, V.M., Chalikov, D.V., Tolman, H.L., 2002. A neural network technique to improve computational efficiency of numerical oceanic models. Ocean Model. 4 (3–4), 363–383.
Londhe, S., Panchang, V., 2006. One-day wave forecasts based on artificial neural networks. J. Atmos. Ocean. Technol. 23 (11), 1593–1603.
Lou, R., Wang, W., Li, X., Zheng, Y., Lv, Z., 2021. Prediction of ocean wave height suitable for ship autopilot. IEEE Trans. Intell. Transp. Syst.
Makarynskyy, O., 2004. Improving wave predictions with artificial neural networks. Ocean Eng. 31 (5–6), 709–724.
Makarynskyy, O., Pires-Silva, A., Makarynska, D., Ventura-Soares, C., 2005. Artificial neural networks in wave predictions at the west coast of Portugal. Comput. Geosci. 31 (4), 415–424.
Minuzzi, F., 2022. LSTM-ocean: Ocean waves modelling with LSTM. URL https://github.com/felipeminuzzi/lstm-ocean.
Navy, B., 2019. Data quality control. URL https://www.marinha.mil.br/chm/sites/www.marinha.mil.br.chm/files/u1947/controle_de_qualidade_dos_dados.pdf.
Navy, B., 2020. PNBOIA - National program of buoys. URL https://www.marinha.mil.br/chm/dados-do-goos-brasil/pnboia, (Accessed 12 April 2021).
Nitsure, S., Londhe, S., Khare, K., 2012. Wave forecasts using wind information and genetic programming. Ocean Eng. 54, 61–69.
O'Donncha, F., Zhang, Y., Chen, B., James, S.C., 2018. An integrated framework that combines machine learning and numerical models to improve wave-condition forecasts. J. Mar. Syst. 186, 29–36.
Oh, J., Suh, K.-D., 2018. Real-time forecasting of wave heights using EOF–wavelet–neural network hybrid model. Ocean Eng. 150, 48–59.
Pereira, H.P.P., Violante-Carvalho, N., Nogueira, I.C.M., Babanin, A., Liu, Q., de Pinho, U.F., Nascimento, F., Parente, C.E., 2017. Wave observations from an array of directional buoys over the southern Brazilian coast. Ocean Dyn. 67 (12), 1577–1591.
Peres, D., Iuppa, C., Cavallaro, L., Cancelliere, A., Foti, E., 2015. Significant wave height record extension by neural networks and reanalysis wind data. Ocean Model. 94, 128–140.
Pirhooshyaran, M., Snyder, L.V., 2020. Forecasting, hindcasting and feature selection of ocean waves via recurrent and sequence-to-sequence networks. Ocean Eng. 207, 107424.
Prahlada, R., Deka, P.C., 2015. Forecasting of time series significant wave height using wavelet decomposed neural network. Aquatic Procedia 4, 540–547.
Sherstinsky, A., 2020. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D 404, 132306.
Tolman, H.L., Krasnopolsky, V.M., Chalikov, D.V., 2005. Neural network approximations for nonlinear interactions in wind wave spectra: direct mapping for wind seas in deep water. Ocean Model. 8 (3), 253–278.
WAVEWATCH III Development Group (WW3DG), 2019. User Manual and System Documentation of WAVEWATCH III version 6.07. Tech. Note 333, NOAA/NWS/NCEP/MMAB, College Park, MD, USA, p. 465.
Wei, Z., 2021. Forecasting wind waves in the US Atlantic Coast using an artificial neural network model: Towards an AI-based storm forecast system. Ocean Eng. 237, 109646.
Zamani, A., Solomatine, D., Azimian, A., Heemink, A., 2008. Learning from data for wind–wave forecasting. Ocean Eng. 35 (10), 953–962.
Zheng, G., Li, X., Zhang, R.-H., Liu, B., 2020. Purely satellite data–driven deep learning forecast of complicated tropical instability waves. Sci. Adv. 6 (29), eaba1482.
