Forecasting Tropical Cyclones With Cascaded Diffusion Models
César Quilodrán-Casas
Department of Earth Science and Engineering
Imperial College London
[email protected]
Abstract
As cyclones become more intense due to climate change, AI-based modelling offers a more
affordable and accessible alternative to traditional methods based on mathematical models.
This work leverages diffusion models to forecast cyclone trajectories and precipitation
patterns by integrating satellite imaging, remote sensing, and atmospheric data. It employs
a cascaded approach that incorporates forecasting, super-resolution, and precipitation
modelling, trained on a dataset of 51 cyclones from six major basins. Experiments
demonstrate that the final forecasts from the cascaded models remain accurate up to a
36-hour rollout, with SSIM and PSNR values exceeding 0.5 and 20 dB, respectively, for all
three tasks. This work also highlights the promising efficiency of AI methods such as
diffusion models for demanding applications such as cyclone forecasting: they remain
computationally affordable, making them well suited to highly vulnerable regions with
critical forecasting needs and financial limitations. Code is available at
https://github.com/nathzi1505/forecast-diffmodels.
1 Introduction
Climate change is a pressing global issue causing unprecedented changes in the Earth’s climate system,
resulting in altered precipitation patterns and a surge in extreme rainfall events with devastating
environmental consequences [1]. Rising global temperatures and changing atmospheric circulation
patterns are significant drivers of these extreme events [2], posing challenges for water resource
management, infrastructure planning, and disaster risk reduction [3]. Advanced machine learning
(ML) techniques have emerged as a promising solution for predicting and understanding extreme
rainfall behaviour under climate change [4]. These algorithms can analyse large datasets, capture
complex spatio-temporal relationships, and make precise predictions without the need for explicit
programming. Leveraging modern computing systems like GPUs and distributed architectures, ML
offers a revolutionary approach to meteorological modelling, replacing traditional supercomputer-
based simulations [5].
In recent times, diffusion models [6] have garnered substantial attention across various domains,
including weather forecasting, climate modelling, and image processing. Leinonen et al. [7] introduced
a latent diffusion model (LDM) for precipitation nowcasting, surpassing traditional methods and
deep generative models in accuracy and uncertainty quantification. Bassetti et al. [8] demonstrated
the efficiency of diffusion models, particularly DiffESM, in emulating Earth System Models (ESMs)
2 Data
2.1 Data Acquisition
1. Satellite Data: Infrared (IR) 10.8µm imagery for a total of 51 cyclones (above Category 2 on
the Saffir-Simpson Hurricane Wind Scale [11]) reported to have had major landfall impact
is extracted from six major basins, as shown in Table 1, over the period from January
2019 to March 2023.
Table 1: List of TC basins along with their satellite data providers and cyclone counts
2. Atmospheric Data: Hourly ERA5 [12] reanalysis data for four atmospheric variables as
shown in Table A.3 over the period from formation to dissipation is acquired from the
Copernicus Climate Data Store for each recorded cyclone.
Figure 1: Illustration of the cascaded arrangement involving three task-specific diffusion models
4. Dataloader Generation: Task-specific dataloaders are built on the metadata data
structure to facilitate model training and streamline data loading. First,
raw satellite and ERA5 image data are downsized to 64x64 (forecasting and precipitation
modelling) and 128x128 (super-resolution) and then divided into randomly sampled batches
of a specified batch size. In addition, min-max normalisation and data augmentation
such as rotate90 are applied at the dataloader level to aid model
training.
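The dataloader steps described above can be sketched as follows. This is a minimal NumPy illustration, not the repository's implementation: the function names, the block-averaging resampler, and the fixed random seed are all assumptions made for the example.

```python
import numpy as np

def min_max_normalise(img):
    """Scale an image to [0, 1]; a constant image maps to all zeros."""
    lo, hi = float(img.min()), float(img.max())
    if hi == lo:
        return np.zeros_like(img, dtype=np.float32)
    return ((img - lo) / (hi - lo)).astype(np.float32)

def downsize(img, size):
    """Downsize a 2D image to size x size by block averaging (assumed resampler)."""
    h, w = img.shape
    if h % size or w % size:
        raise ValueError("image sides must be multiples of the target size")
    fh, fw = h // size, w // size
    return img.reshape(size, fh, size, fw).mean(axis=(1, 3))

def rotate90_variants(img):
    """Return the image in all four 90-degree orientations."""
    return [np.rot90(img, k) for k in range(4)]

def make_batches(images, size, batch_size, seed=0):
    """Downsize, normalise and augment each image, then yield random batches."""
    rng = np.random.default_rng(seed)
    samples = []
    for img in images:
        small = min_max_normalise(downsize(img, size))
        samples.extend(rotate90_variants(small))
    order = rng.permutation(len(samples))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield np.stack([samples[i] for i in idx])
```

Calling `make_batches(images, 64, batch_size)` would correspond to the forecasting and precipitation modelling tasks, and `make_batches(images, 128, batch_size)` to the super-resolution task.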
3 Methodology
Taking inspiration from the Imagen paper [13] and its application in image generation conditioned on
text inputs [14], this study employs a cascaded arrangement, as depicted in Figure 1. This
arrangement uses three independently trained U-Net based diffusion models, each tailored
to a specific task, which improves the efficiency of cyclone forecast generation. Using
the 64x64 satellite IR 10.8µm image at time t, the forecast at time t + 1 is generated and passed downstream
to the super-resolution and precipitation modelling task models. The super-resolution
task model creates the 128x128 satellite IR 10.8µm version of the generated 64x64 forecast, while the
precipitation modelling task model generates the 64x64 total precipitation map corresponding to the
forecast. In all three tasks, the forecasted ERA5 data at t + 1 are used to condition the input.
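The data flow of one cascade step can be illustrated with a short sketch. The three models are passed in as plain callables, and the `dummy_*` functions below are hypothetical stand-ins used only to check shapes; they are not diffusion samplers and do not reflect the trained models.

```python
import numpy as np

def cascade_step(ir_t, era5_next, forecaster, super_resolver, precip_model):
    """Run one cascade step; every model is conditioned on ERA5 data at t + 1."""
    ir_next = forecaster(ir_t, era5_next)            # 64x64 IR forecast at t + 1
    ir_next_hi = super_resolver(ir_next, era5_next)  # 128x128 version of the forecast
    precip_next = precip_model(ir_next, era5_next)   # 64x64 total precipitation map
    return ir_next, ir_next_hi, precip_next

# Hypothetical stand-ins for the three trained models (shape checking only).
def dummy_forecaster(ir, era5):
    return ir.copy()                       # persistence: next frame equals current frame

def dummy_super_resolver(ir, era5):
    return np.kron(ir, np.ones((2, 2)))    # nearest-neighbour 2x upsampling

def dummy_precip_model(ir, era5):
    return np.zeros_like(ir)               # no rainfall anywhere

ir_t = np.full((64, 64), 0.5)
era5_next = np.zeros((4, 64, 64))          # assumed layout: (variables, height, width)
ir_next, ir_next_hi, precip_next = cascade_step(
    ir_t, era5_next, dummy_forecaster, dummy_super_resolver, dummy_precip_model
)
```

Only the forecasting model feeds its own output forward in time; the super-resolution and precipitation models consume each forecast but do not influence the next step.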
For each diffusion model, a U-Net structure similar to that of Imagen [14] is used, with additional refinements
including classifier-free guidance [15], dynamic thresholding (to keep the outputs within the
normalised range) and exponential moving average (EMA) weights. For data augmentation, techniques such
as rotate90 covering all four orientations and low-resolution noise injection (for the super-resolution
task) are also used and were found to contribute to better model outputs. To eliminate noise in the total
precipitation maps, a minimum filter is applied as post-processing.
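Two of the refinements above are simple enough to sketch in NumPy. The dynamic thresholding function follows the usual formulation for images normalised to [-1, 1]; the percentile value, the 3x3 window, and the edge padding of the minimum filter are assumptions, since the text does not state them.

```python
import numpy as np

def dynamic_threshold(x0, percentile=95.0):
    """Clip a predicted clean image to [-s, s] and rescale by s.

    s is the given percentile of |x0|, floored at 1 so that predictions
    already inside [-1, 1] are left unchanged.
    """
    s = max(float(np.percentile(np.abs(x0), percentile)), 1.0)
    return np.clip(x0, -s, s) / s

def minimum_filter(img, size=3):
    """Replace each pixel by the minimum over a size x size neighbourhood."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    # One shifted view of the padded image per offset inside the window.
    windows = [padded[i:i + h, j:j + w] for i in range(size) for j in range(size)]
    return np.min(np.stack(windows), axis=0)
```

A minimum filter removes isolated bright pixels, which suits sparse speckle in a precipitation map, at the cost of slightly shrinking genuine rain regions.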
To assess the three cascaded diffusion models, two evaluation
strategies are used. First, quantitative metrics (MAE, PSNR, SSIM and FID scores) are
used to assess the one-step performance of the best epoch over the test set. Second, a rollout
analysis is performed over the entire cyclone duration, using SSIM to evaluate forecasts
generated auto-regressively from an initial IR 10.8µm image with the aid of forecasted ERA5
data.
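A minimal sketch of both evaluation strategies is given below, assuming images normalised to [0, 1]. Note that `global_ssim` is a simplification that uses whole-image statistics, whereas standard SSIM averages over local windows, so its values will differ from those of a library implementation. FID is omitted because it requires a pretrained feature extractor. The `step` callable is a hypothetical one-step forecaster.

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(a - b)))

def psnr(a, b, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = float(np.mean((a - b) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(a, b, data_range=1.0):
    """SSIM from whole-image statistics (simplified, no sliding window)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)

def rollout_ssim(step, ir0, era5_seq, truth_seq):
    """Auto-regressive rollout: each forecast becomes the next input."""
    scores, ir = [], ir0
    for era5_next, truth in zip(era5_seq, truth_seq):
        ir = step(ir, era5_next)
        scores.append(global_ssim(ir, truth))
    return scores
```

The one-step metrics compare a single forecast against its ground truth, while `rollout_ssim` lets errors accumulate across steps, which is what exposes the usable forecast horizon.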
4 Results
Performance evaluation of the best-performing model checkpoint over four distinct metrics is shown
in Table 2. These results underscore the strong predictive capabilities of all three diffusion
models, which consistently surpass the thresholds of 20 dB and 0.5 for PSNR
and SSIM, respectively. Additionally, the MAE (measured over normalised images)
consistently stays below 0.25, while the FID scores remain below 1 for all three models.
Table 2: Task-wise performance metrics over the entire test set
For forecasts of all cyclones in the region-wise test set in Table 1 (such as the one for
Cyclone Mocha shown in Figure 2), closer examination of the SSIM charts generated over the
entire cyclone duration, displayed in Figure B.1, reveals a notable decline in the majority
of cyclone forecasts around the 36-hour mark. Although an absolute rollout limit is difficult
to identify with certainty, the consistent occurrence of sharp "dips" at approximately the
36-hour mark (approx. 15 minutes of compute on an RTX 2080Ti) suggests that 36 hours can be
considered a reliable horizon within which the generated forecast closely aligns with the
actual conditions.
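One way to locate such a dip programmatically is sketched below. The paper identifies the dips by inspecting the SSIM charts; the drop threshold `min_drop` and the function name here are assumptions introduced for illustration only.

```python
def first_sharp_dip(lead_hours, ssim_scores, min_drop=0.1):
    """Return the first lead time at which SSIM falls by more than min_drop
    relative to the previous step, or None if no such drop occurs."""
    if len(lead_hours) != len(ssim_scores):
        raise ValueError("lead_hours and ssim_scores must have the same length")
    for i in range(1, len(ssim_scores)):
        if ssim_scores[i - 1] - ssim_scores[i] > min_drop:
            return lead_hours[i]
    return None

# Illustrative values only, not results from the paper.
example_hours = [34, 35, 36, 37]
example_scores = [0.80, 0.78, 0.55, 0.50]
example_horizon = first_sharp_dip(example_hours, example_scores)
```

Applying the same threshold to every cyclone in the test set would give a per-cyclone horizon that can then be summarised across cyclones, rather than relying on visual inspection alone.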
Figure 2: Forecast at 31h (left) and 38h (right) of Cyclone Mocha over the North Indian Ocean
on 10th May 2023. The upper rows show the ground truth IR 10.8µm satellite image and total
precipitation, while the lower rows show the forecast generated at that particular timestep.
5 Conclusion
This work presents a novel cascaded diffusion model architecture for forecasting tropical cyclones,
supported by a custom-built data processing pipeline and trained on IR 10.8µm satellite imagery in addition to
ERA5 atmospheric reanalysis data. With forecast horizons of around 36 hours when integrated
with atmospheric data such as ERA5, these cost-effective
AI-based models, optimised for single GPUs, enable affordable, almost real-time, precise, and
photorealistic forecasts. This makes them particularly suitable for highly vulnerable regions that face
critical forecasting demands but are financially constrained.
References
[1] Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the
Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge,
United Kingdom and New York, NY, USA: Cambridge University Press; 2021.
[2] Donat MG, Lowry AL, Alexander LV, O’Gorman PA, Maher N. More extreme precipitation in
the world’s dry and wet regions. Nature Climate Change. 2016 May;6(5):508–513. Available
from: https://www.nature.com/articles/nclimate2941.
[3] Van Aalst MK. The impacts of climate change on the risk of natural disasters.
Disasters. 2006 Mar;30(1):5–18. Available
from: https://onlinelibrary.wiley.com/doi/10.1111/j.1467-9523.2006.00303.x.
[4] Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, et al. A Survey on Deep Learning:
Algorithms, Techniques, and Applications. ACM Computing Surveys. 2018 Sep;51(5):92:1-92:36.
Available from: https://doi.org/10.1145/3234150.
[5] Schalkwijk J, Jonker HJJ, Siebesma AP, Meijgaard EV. Weather Forecasting Using GPU-Based
Large-Eddy Simulations. Bulletin of the American Meteorological Society. 2015
May;96(5):715–723. Available from:
https://journals.ametsoc.org/view/journals/bams/96/5/bams-d-14-00114.1.xml.
[6] Ho J, Jain A, Abbeel P. Denoising Diffusion Probabilistic Models. 2020 Dec.
ArXiv:2006.11239 [cs, stat]. Available from: http://arxiv.org/abs/2006.11239.
[7] Leinonen J, Hamann U, Nerini D, Germann U, Franch G. Latent diffusion models for generative
precipitation nowcasting with accurate uncertainty quantification. 2023 Apr. ArXiv:2304.12891
[physics]. Available from: http://arxiv.org/abs/2304.12891.
[8] Bassetti S, Hutchinson B, Tebaldi C, Kravitz B. DiffESM: Conditional Emulation of Earth
System Models with Diffusion Models. 2023 Apr. ArXiv:2304.11699 [physics]. Available
from: http://arxiv.org/abs/2304.11699.
[9] Hatanaka Y, Glaser Y, Galgon G, Torri G, Sadowski P. Diffusion Models for High-Resolution
Solar Forecasts. 2023 Jan. ArXiv:2302.00170 [physics]. Available from:
http://arxiv.org/abs/2302.00170.
[10] Addison H, Kendon E, Ravuri S, Aitchison L, Watson PA. Machine learning emulation of
a local-scale UK climate model. 2022 Nov. ArXiv:2211.16116 [physics]. Available from:
http://arxiv.org/abs/2211.16116.
[11] A Dictionary of Earth Sciences. Oxford University Press; 2008. Available from:
https://www.oxfordreference.com/display/10.1093/acref/9780199211944.001.0001/acref-9780199211944.
[12] Lavers DA, Simmons A, Vamborg F, Rodwell MJ. An evaluation of ERA5 precipitation
for climate monitoring. Quarterly Journal of the Royal Meteorological Society. 2022
Oct;148(748):3152–3165. Available from:
https://onlinelibrary.wiley.com/doi/10.1002/qj.4351.
[13] Ho J, Saharia C, Chan W, Fleet DJ, Norouzi M, Salimans T. Cascaded Diffusion Models for
High Fidelity Image Generation. Journal of Machine Learning Research. 2022;23(47):1–33.
Available from: http://jmlr.org/papers/v23/21-0635.html.
[14] Wang P. Implementation of Imagen, Google’s Text-to-Image Neural Network, in Pytorch; 2023.
Available from: https://github.com/lucidrains/imagen-pytorch.
[15] Ho J, Salimans T. Classifier-Free Diffusion Guidance. 2022 Jul. ArXiv:2207.12598 [cs].
Available from: http://arxiv.org/abs/2207.12598.
Appendix A Additional Data Description
A.1 ERA5 Variables
Name                        Unit
10m u-component of wind     m s⁻¹
10m v-component of wind     m s⁻¹
Total cloud cover           -
Total precipitation         m
For the experiments performed, due to the historical nature of the cyclones, forecasted ERA5 refers to
the ERA5 reanalysis data generated by ECMWF. In deployment scenarios, due to ERA5 having a lag
of 5 days, actual forecasted atmospheric data (corresponding to the three variables) from forecasting
systems can be used. Therefore, to ensure consistency with common forecasting systems such as
GFS, these specific four variables are chosen.