of sources. The complexity of moisture physics requires many parameterizations in these models for processes like turbulent mixing, convection, subgrid clouds and microphysics [1], and such parameterizations can lead to large biases in NWP precipitation forecasts [11]. As a result, global precipitation forecasts generally achieve inadequate forecast skill [22] and, hence, there has been increasing interest in fully data-driven solutions, primarily using deep learning, in recent years.
Data-driven models can be orders of magnitude faster, with the potential to learn complex parameterizations between input and output function spaces directly from data, reducing model bias. With such models, major advances have been made in the area of precipitation “nowcasting”, where forecasts are made over limited spatial regions with lead times on the order of minutes to hours. Deep learning models trained directly on radar and/or satellite observations now outperform traditional methods for nowcasting [17, 10, 3]. However, until recently, there has been limited progress for models predicting precipitation at larger spatiotemporal scales (e.g., over the full globe up to days in advance), mainly due to computational limitations on resolution [16]. FourCastNet [14] is the first deep-learning-based global weather model running at ∼30 km scale; it outperforms IFS in terms of precipitation forecast skill up to ∼2-day lead times and is the current state of the art. However, despite using a dedicated network just for precipitation due to its unique challenges, FourCastNet predictions still lack fine-scale details (see Figure 1) and thus underestimate the extreme percentiles of the precipitation distribution.
In this work, we aim to overcome some of these limitations using generative models to advance the state of the art in deep-learning-based precipitation forecasts. In particular, our contributions are as follows: (i) we apply a state-of-the-art generative adversarial network [8] that integrates multi-scale semantic structure and style information, allowing us to synthesize physically realistic fine-scale precipitation features; (ii) we show that capturing fine-scale phenomena leads to improved predictions of extreme precipitation while also preserving forecast skill, attaining skill comparable to IFS at 1–2-day lead times.
2 Methods
2.1 Dataset
We replicate the data preparation pipeline of the original FourCastNet precipitation model [14], relying on the European Centre for Medium-Range Weather Forecasts (ECMWF) global reanalysis dataset ERA5 [6], which combines archived atmospheric observations with physical model outputs. Following FourCastNet’s time steps of length 6 hours, we model the 6-hour accumulated total precipitation TP (this also makes for easier comparisons against IFS, which archives TP forecasts in 6-hourly accumulations as well). A sample snapshot of TP, in log-normalized units for easy visualization of fine-scale details, is shown in the bottom-right inset of Figure 1. We use the years 1979–2015 and 2016–2017 as training and validation sets, respectively, and set aside 2018 as a test set. We refer the reader to the original FourCastNet paper [14] for further details on the dataset.
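To make the preprocessing concrete, the snippet below sketches the log normalization and year-based split described above. It is a minimal illustration rather than the FourCastNet pipeline: the log(1 + TP/ε) transform and the value of ε are common choices we assume here for visualization, not details taken from [14].

```python
import numpy as np

# Minimal sketch (not the FourCastNet pipeline): log-normalize 6-hourly
# accumulated total precipitation (TP) and split ERA5 years into
# train/val/test as described above. EPS is an assumed constant that keeps
# the logarithm finite where TP = 0.
EPS = 1e-5

def log_normalize_tp(tp: np.ndarray) -> np.ndarray:
    """Map non-negative TP accumulations to log space: log(1 + TP / EPS)."""
    return np.log1p(tp / EPS)

def split_years(years):
    """Year-based split used in this work: 1979-2015 train, 2016-2017 val, 2018 test."""
    train = [y for y in years if 1979 <= y <= 2015]
    val = [y for y in years if y in (2016, 2017)]
    test = [y for y in years if y == 2018]
    return train, val, test

if __name__ == "__main__":
    tp_sample = np.random.gamma(shape=0.5, scale=1e-3, size=(720, 1440))  # synthetic TP field
    print(log_normalize_tp(tp_sample).max())
    print(split_years(range(1979, 2019)))
```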
2.2 Model
Given the success of adversarial learning for high-resolution precipitation models in localized regions [17, 12, 15], we explore the utility of conditional generative adversarial networks (cGANs) for modeling TP over the entire globe, using prognostic atmospheric variables as conditional input for operational diagnosis. In particular, we adopt the TSIT architecture [8] for our task, due to its success in varied image tasks and its ability to flexibly condition intermediate features at a variety of scales in the generator. TSIT’s generator G employs symmetric downsampling and upsampling paths, fusing features from the downsampling path via feature-wise adaptive denormalization, and uses multi-scale patch-based discriminators [21] for adversarial training. We take a constant latitude embedding channel and the 20 atmospheric variables output by FourCastNet (corresponding to prognostic variables such as wind velocities, total column water vapor and temperature at different pressure levels) as input to the network, as shown in Figure 2. Randomness is injected at intermediate scales via elementwise additive noise, and we can thus generate “zero-shot” ensembles for probabilistic forecasting given a single input, which we explore in Appendix A. We refer the reader to [8] for additional details on the TSIT architecture, and list hyperparameters in Appendix A, together with the particulars of our setup.
In this work, we focus on three model versions for demonstrating the impact of the adversarial training: (i) FourCastNet: the precipitation baseline model from [14],