ONLINE MODEL ERROR CORRECTION WITH NEURAL NETWORKS
IN THE INCREMENTAL 4D-VAR FRAMEWORK
PREPRINT
Alban Farchi
CEREA, École des Ponts and EDF R&D
Île-de-France, France
alban.farchi@enpc.fr
Marcin Chrust
ECMWF
Shinfield Park
Reading, United Kingdom
Marc Bocquet
CEREA, École des Ponts and EDF R&D
Île-de-France, France
Patrick Laloyaux
ECMWF
Shinfield Park
Reading, United Kingdom
Massimo Bonavita
ECMWF
Shinfield Park
Reading, United Kingdom
October 26, 2022
ABSTRACT
Recent studies have demonstrated that it is possible to combine machine learning with data assimilation to reconstruct the dynamics of a physical model that is partially and imperfectly observed. Data assimilation is used to estimate the system state from the observations, while machine learning computes a surrogate model of the dynamical system based on those estimated states. The surrogate model can be defined as a hybrid combination in which a physical model based on prior knowledge is enhanced with a statistical model estimated by a neural network. The training of the neural network is typically done offline, once a large enough dataset of model state estimates is available. By contrast, with online approaches the surrogate model is improved each time a new system state estimate is computed. Online approaches naturally fit the sequential framework encountered in the geosciences, where new observations become available over time. In a recent methodology paper, we have developed a new weak-constraint 4D-Var formulation which can be used to train a neural network for online model error correction. In the present article, we develop a simplified version of that method, in the incremental 4D-Var framework adopted by most operational weather centres. The simplified method is implemented in the ECMWF Object-Oriented Prediction System, with the help of a newly developed Fortran neural network library, and tested with a two-layer two-dimensional quasi-geostrophic model. The results confirm that online learning is effective and yields a more accurate model error correction than offline learning. Finally, the simplified method is compatible with future applications to state-of-the-art models such as the ECMWF Integrated Forecasting System.
Keywords: data assimilation · machine learning · model error · surrogate model · neural networks · online learning
Plain language summary
We have recently proposed a general framework for combining data assimilation and machine learning techniques to train a neural network for online model error correction. In the present article, we develop a simplified version of this online training method, compatible with future applications to more realistic models. Using numerical illustrations, we show that the new method is effective and yields a more accurate model error correction than the usual offline learning approach. The results show the potential of tightly integrating data assimilation and machine learning, and pave the way towards an application to the Integrated Forecasting System used for operational numerical weather prediction at the European Centre for Medium-Range Weather Forecasts.
Key points
• Weak-constraint 4D-Var variants can be used to train neural networks for online model error correction.
• Online learning yields a more accurate model error correction than offline learning.
• The new, simplified method, developed in the incremental 4D-Var framework, can be easily applied in operational weather models.
1 Introduction: machine learning for model error correction
In the geosciences, data assimilation (DA) is used to increase the quality of forecasts by providing accurate initial conditions (Kalnay, 2003; Reich and Cotter, 2015; Law et al., 2015; Asch et al., 2016; Carrassi et al., 2018; Evensen et al., 2022). The initial conditions are obtained by combining all sources of information in a mathematically optimal way, in particular information from the dynamical model and information from sparse and noisy observations. There are two main classes of DA methods. In variational DA, the core of the method is to minimise a cost function, usually using gradient-based optimisation techniques, to estimate the system state. Examples include 3D- and 4D-Var. In statistical DA, the methods rely on sampled error statistics to perform sequential updates to the state estimate. The most popular examples are the ensemble Kalman filter (EnKF) and the particle filter.
Most of the time, DA methods are applied under the perfect model assumption: this is called strong-constraint DA. However, despite the significant efforts of modellers, geoscientific models remain affected by errors (Dee, 2005), for example due to unresolved small-scale processes. This is why there is a growing interest of the DA community in weak-constraint (WC) methods, i.e. DA methods relaxing the perfect model assumption (Trémolet, 2006). This has led, for example, to the iterative ensemble Kalman filter in the presence of additive noise (Sakov et al., 2018) in statistical DA, and to the forcing formulation of WC 4D-Var (Laloyaux et al., 2020a) in variational DA. In practice, the DA control vector has to be extended to include the model error in addition to the system state. The downside of this approach is the potentially significant increase in the problem's dimension, since the model trajectory is no longer uniquely described by the initial condition. By construction, WC 4D-Var is an online model error correction method, meaning that the model error is estimated during the assimilation process and is only valid for the states in the current assimilation window.
In parallel, following the renewed impetus of machine learning (ML) applications (LeCun et al., 2015; Goodfellow et al., 2016; Chollet, 2018), data-driven approaches are becoming more and more frequent in the geosciences. The goal of these approaches (e.g., Brunton et al., 2016; Hamilton et al., 2016; Lguensat et al., 2017; Pathak et al., 2018a; Dueben and Bauer, 2018; Fablet et al., 2018; Scher and Messori, 2019; Weyn et al., 2019; Arcomano et al., 2020, among many others) is to learn a surrogate of the dynamical model using supervised learning, i.e. by minimising a loss function which measures the discrepancy between the surrogate model predictions and an observation dataset. In order to take into account sparse and noisy observations, ML techniques can be combined with DA (Abarbanel et al., 2018; Bocquet et al., 2019; Brajard et al., 2020; Bocquet et al., 2020; Arcucci et al., 2021). The idea is to take the best of both worlds: DA techniques are used to estimate the state of the system from the observations, and ML techniques are used to estimate the surrogate model from the estimated states. In practice, the hybrid DA-ML methods can be used both for full model emulation and for model error correction (Rasp et al., 2018; Pathak et al., 2018b; Bolton and Zanna, 2019; Jia et al., 2019; Watson, 2019; Bonavita and Laloyaux, 2020; Brajard et al., 2021; Gagne et al., 2020; Wikner et al., 2020; Farchi et al., 2021a,b; Chen et al., 2022). In the first case, the surrogate model is entirely learned from observations, while in the second case, the surrogate model is hybrid: a physical, knowledge-based model is corrected by a statistical model, e.g. a neural network (NN), which is learned from observations. Even though from a technical point of view it can arguably be more difficult to implement, model error correction has many advantages over full model emulation: by leveraging the long history of numerical modelling, one can hope to end up with an easier learning problem (Watson, 2019; Farchi et al., 2021b).
Most of the current hybrid DA-ML methods use offline learning strategies: the surrogate model (or model error correction) is learned using a large dataset of observations (or analyses) and should be generalisable to other situations (i.e. outside the dataset). There are two main reasons for this choice. First, surrogate modelling requires a large amount of data to provide accurate results, certainly more than what is available in a single assimilation update with online learning. Second, by doing so, it is possible to use the full potential of the ML variational tools. Nevertheless, online learning has, on paper, several advantages over offline learning.
• Online learning fits the standard sequential DA approach in the geosciences. Each time a new batch of observations becomes available, the surrogate model parameters can be corrected.
• With online learning, the system state and the surrogate model parameters are jointly estimated, which is often not the case with offline learning. Joint estimation is in general more consistent, and hence potentially more accurate, than separate estimation.
• With offline learning, the training only starts once a sufficiently large dataset is available. With online learning, the training begins from the first batch of observations, which means that improvements can be expected before having a large dataset.
• With online learning, since the surrogate model is constantly updated, it can adapt to new (previously unseen) conditions. An example could be, in the case of model error correction, an update of the physical model to correct. Another example could be slowly-varying effects on the dynamics (e.g., seasonality).
Fundamentally, online learning is very similar to parameter estimation in DA: the goal is to estimate at the same time the system state and some parameters, in this case the surrogate model parameters. Several examples of online learning methods have recently emerged. Bocquet et al. (2021) and Malartic et al. (2022) have developed several variants of the EnKF to perform a joint estimation of the state and the parameters of a surrogate model which fully emulates the dynamics. Gottwald and Reich (2021) have used a very similar approach for the parameters of an echo state network used as a surrogate model. Finally, Farchi et al. (2021a) have derived a variant of WC 4D-Var to perform a joint estimation of the state and the parameters of a NN which corrects the tendencies of a physical model. In this article, we revisit the method of Farchi et al. (2021a). A new simplified method is derived, compatible with future applications to more realistic models. The method is implemented in the Object-Oriented Prediction System (OOPS) framework developed at the European Centre for Medium-Range Weather Forecasts (ECMWF), and tested using the two-layer quasi-geostrophic channel model developed in OOPS. To us, this is a final step before considering an application with the Integrated Forecasting System (IFS, Bonavita et al., 2017), since the IFS will soon rely on OOPS for its DA part.
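As a minimal illustration of the kind of hybrid surrogate model discussed above, consider the following Python sketch. The physical model, the one-hidden-layer network, and all names are hypothetical choices made for exposition only; they are not the OOPS or IFS implementation, which relies on a Fortran neural network library.

    # Illustrative sketch of a hybrid surrogate model: a physical model whose
    # forecast is enhanced by a small neural network correction. The physical
    # model, the network architecture, and all names are hypothetical.
    import numpy as np

    def physical_step(x, dt=0.1):
        # stand-in for one integration step of the knowledge-based model
        return x + dt * (-0.5 * x)

    class TinyNN:
        # one-hidden-layer network; its weights are the "parameters" that
        # online learning would estimate jointly with the system state
        def __init__(self, nx, nh, rng):
            self.W1 = 0.1 * rng.normal(size=(nh, nx))
            self.b1 = np.zeros(nh)
            self.W2 = 0.1 * rng.normal(size=(nx, nh))
            self.b2 = np.zeros(nx)

        def __call__(self, x):
            return self.W2 @ np.tanh(self.W1 @ x + self.b1) + self.b2

    def hybrid_step(x, nn):
        # corrected surrogate: physical forecast plus learned error correction
        return physical_step(x) + nn(x)

    rng = np.random.default_rng(0)
    nn = TinyNN(nx=4, nh=8, rng=rng)
    x = rng.normal(size=4)
    x_next = hybrid_step(x, nn)   # one step of the hybrid surrogate

In the online setting, the weights of TinyNN would be corrected each time a new batch of observations is assimilated, jointly with the system state.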
The article is organised as follows. Section 2 presents the methodology. The quasi-geostrophic (QG) model is described
in section 3. The experimental results are then presented in section 4 for offline learning, and in section 5 for online
learning. Finally, conclusions are given in section 6.
2 A simplified neural network variant of weak-constraint 4D-Var
2.1 Strong-constraint 4D-Var
Suppose that we follow the evolution of a system using a series of observations taken at discrete times. In the classical 4D-Var, the observations are gathered into time windows $(\mathbf{y}_0, \ldots, \mathbf{y}_L)$. The integer $L \geq 0$ is the window length, and $\mathbf{y}_k \in \mathbb{R}^{N_y}$, the $k$-th batch of observations, contains all the observations taken at time $t_k$, for $k = 0, \ldots, L$. For convenience, we assume that the time interval between consecutive observation batches $t_{k+1} - t_k = \Delta t$ is constant. This assumption is not fundamental; it just makes the presentation much easier. Within the window, the system state $\mathbf{x}_k \in \mathbb{R}^{N_x}$ at time $t_k$ is obtained by integrating the model in time from $t_0$ to $t_k$:

$$\mathbf{x}_k = \mathcal{M}_{k:0}(\mathbf{x}_0), \qquad (1)$$
where $\mathcal{M}_{k:l} : \mathbb{R}^{N_x} \to \mathbb{R}^{N_x}$ is the resolvent of the dynamical (or physical) model from $t_l$ to $t_k$. Moreover, the observations are related to the state by the observation operator $\mathcal{H}_k : \mathbb{R}^{N_x} \to \mathbb{R}^{N_y}$ via

$$\mathbf{y}_k = \mathcal{H}_k(\mathbf{x}_k) + \mathbf{v}_k, \qquad (2)$$

where $\mathbf{v}_k$ is the observation error at time $t_k$, which could be a random vector. Let us make the assumption that the observation errors are independent from each other.
The 4D-Var cost function is defined as the negative log-likelihood which, up to additive constant terms, reads:

$$J^{\mathrm{sc}}(\mathbf{x}_0) \triangleq -\ln p(\mathbf{x}_0 \mid \mathbf{y}_0, \ldots, \mathbf{y}_L), \qquad (3a)$$
$$= -\ln p(\mathbf{x}_0) - \ln p(\mathbf{y}_0, \ldots, \mathbf{y}_L \mid \mathbf{x}_0), \qquad (3b)$$
$$= -\ln p(\mathbf{x}_0) - \sum_{k=0}^{L} \ln p(\mathbf{y}_k \mid \mathbf{x}_0), \qquad (3c)$$
where conditional independence of the observation vectors on $\mathbf{x}_0$ was used. The background $p(\mathbf{x}_0)$ is Gaussian with mean $\mathbf{x}_0^{\mathrm{b}}$ and covariance matrix $\mathbf{B}$, and the observation errors $\mathbf{v}_k$ are also Gaussian distributed with mean $\mathbf{0}$ and covariance matrices $\mathbf{R}_k$, in such a way that $J^{\mathrm{sc}}$ becomes:
$$J^{\mathrm{sc}}(\mathbf{x}_0) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}_0^{\mathrm{b}} \right\|_{\mathbf{B}^{-1}}^{2} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k \circ \mathcal{M}_{k:0}(\mathbf{x}_0) \right\|_{\mathbf{R}_k^{-1}}^{2}, \qquad (4)$$

where we have dropped the constant terms and where the notation $\left\| \mathbf{v} \right\|_{\mathbf{M}}^{2}$ stands for the squared Mahalanobis norm $\mathbf{v}^{\top} \mathbf{M} \mathbf{v}$.
This formulation is called strong-constraint 4D-Var because it relies on the perfect model assumption, eq. (1). In practice, eq. (4) is minimised using scalable gradient-based optimisation methods to provide the analysis $\mathbf{x}_0^{\mathrm{a}}$. In cycled DA, the model is then used to propagate $\mathbf{x}_0^{\mathrm{a}}$ to the start of the next window, thus yielding a value for the background state $\mathbf{x}_0^{\mathrm{b}}$.
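To make eq. (4) concrete, here is a minimal Python sketch of strong-constraint 4D-Var on a toy problem. The model, observation operator, covariances, and window length are illustrative assumptions, not the configuration used in this article.

    # Toy strong-constraint 4D-Var, eq. (4): all ingredients (model, H_k, B,
    # R_k, window length) are illustrative, not the paper's setup.
    import numpy as np
    from scipy.optimize import minimize

    Nx, L = 3, 5                       # toy state size and window length
    rng = np.random.default_rng(0)

    def model_step(x):
        # stand-in for M_{k+1:k}; a simple contracting circular shift
        return 0.95 * np.roll(x, 1)

    def resolvent(x0, k):
        # M_{k:0}: integrate the model from t0 to tk, eq. (1)
        x = x0.copy()
        for _ in range(k):
            x = model_step(x)
        return x

    H = np.eye(Nx)                     # observation operator H_k (identity)
    B_inv = np.eye(Nx)                 # inverse background covariance B^{-1}
    R_inv = np.eye(Nx)                 # inverse observation covariances R_k^{-1}

    xb0 = rng.normal(size=Nx)          # background state x^b_0
    # synthetic observations y_k = H_k(x_k) + v_k, eq. (2)
    ys = [resolvent(xb0, k) + 0.1 * rng.normal(size=Nx) for k in range(L + 1)]

    def J_sc(x0):
        # background term plus sum of observation terms, eq. (4)
        d = x0 - xb0
        J = 0.5 * d @ B_inv @ d
        for k, yk in enumerate(ys):
            r = yk - H @ resolvent(x0, k)
            J += 0.5 * r @ R_inv @ r
        return J

    xa0 = minimize(J_sc, xb0).x        # gradient-based analysis x^a_0

Note that scipy falls back to finite-difference gradients here; an operational system would instead rely on adjoint-based gradients.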
2.2 Weak-constraint 4D-Var
Recognising that the model is not perfect, we can replace the strong constraint, eq. (1), by the more general model evolution

$$\mathbf{x}_{k+1} = \mathcal{M}_{k+1:k}(\mathbf{x}_k) + \mathbf{w}_k, \qquad (5)$$

where $\mathbf{w}_k \in \mathbb{R}^{N_x}$ is the model error from $t_k$ to $t_{k+1}$, potentially random. Let us make the assumption that the model errors are independent from each other and independent from the background errors. This implies that the model evolution satisfies the Markov property.
The updated cost function now depends on all states inside the window and, up to additive constant terms, reads:

$$J^{\mathrm{wc}}(\mathbf{x}_0, \ldots, \mathbf{x}_L) \triangleq -\ln p(\mathbf{x}_0, \ldots, \mathbf{x}_L \mid \mathbf{y}_0, \ldots, \mathbf{y}_L), \qquad (6a)$$
$$= -\ln p(\mathbf{x}_0, \ldots, \mathbf{x}_L) - \ln p(\mathbf{y}_0, \ldots, \mathbf{y}_L \mid \mathbf{x}_0, \ldots, \mathbf{x}_L), \qquad (6b)$$
$$= -\ln p(\mathbf{x}_0) - \sum_{k=0}^{L-1} \ln p(\mathbf{x}_{k+1} \mid \mathbf{x}_k) - \sum_{k=0}^{L} \ln p(\mathbf{y}_k \mid \mathbf{x}_k). \qquad (6c)$$
With the Gaussian assumptions of section 2.1 and the additional hypothesis that the model errors $\mathbf{w}_k$ also follow a Gaussian distribution with mean $\mathbf{w}_k^{\mathrm{b}}$ and covariance matrices $\mathbf{Q}_k$, $J^{\mathrm{wc}}$ becomes
$$J^{\mathrm{wc}}(\mathbf{x}_0, \ldots, \mathbf{x}_L) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}_0^{\mathrm{b}} \right\|_{\mathbf{B}^{-1}}^{2} + \frac{1}{2} \sum_{k=0}^{L-1} \left\| \mathbf{x}_{k+1} - \mathcal{M}_{k+1:k}(\mathbf{x}_k) - \mathbf{w}_k^{\mathrm{b}} \right\|_{\mathbf{Q}_k^{-1}}^{2} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k(\mathbf{x}_k) \right\|_{\mathbf{R}_k^{-1}}^{2}, \qquad (7)$$
where we have once again dropped the constant terms. This formulation is called weak-constraint 4D-Var (Trémolet, 2006) because it relaxes the perfect model assumption, eq. (1), which means that the analysis $(\mathbf{x}_0^{\mathrm{a}}, \ldots, \mathbf{x}_L^{\mathrm{a}})$ is no longer a trajectory of the model. However, this comes at a price: the dimension of the problem has increased from $N_x$ to $(L+1)\,N_x$.
This dimensionality increase can be mitigated by making additional assumptions. For example, one can assume that the model error is constant throughout the window, i.e.

$$\mathbf{w}_0 = \ldots = \mathbf{w}_{L-1} \triangleq \mathbf{w}, \qquad (8a)$$
$$\mathbf{w}_0^{\mathrm{b}} = \ldots = \mathbf{w}_{L-1}^{\mathrm{b}} \triangleq \mathbf{w}^{\mathrm{b}}, \qquad (8b)$$
$$\mathbf{Q}_0 = \ldots = \mathbf{Q}_{L-1} \triangleq L\,\mathbf{Q}. \qquad (8c)$$
In this case, the trajectory $(\mathbf{x}_0, \ldots, \mathbf{x}_L)$ is fully determined by $(\mathbf{w}, \mathbf{x}_0)$:

$$\mathbf{x}_{k+1} = \mathcal{M}_{k+1:k}(\mathbf{x}_k) + \mathbf{w} = \mathcal{M}_{k+1:k}\left( \mathcal{M}_{k:k-1}(\mathbf{x}_{k-1}) + \mathbf{w} \right) + \mathbf{w} = \ldots \triangleq \mathcal{M}_{k+1:0}^{\mathrm{wc}}(\mathbf{w}, \mathbf{x}_0), \qquad (9)$$

with $\mathbf{x} \mapsto \mathcal{M}_{k+1:0}^{\mathrm{wc}}(\mathbf{w}, \mathbf{x})$ being the resolvent of the $\mathbf{w}$-debiased model from $t_0$ to $t_{k+1}$. The Gaussian cost function $J^{\mathrm{wc}}$, eq. (7), can hence be written
$$J^{\mathrm{wc}}(\mathbf{w}, \mathbf{x}_0) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}_0^{\mathrm{b}} \right\|_{\mathbf{B}^{-1}}^{2} + \frac{1}{2} \left\| \mathbf{w} - \mathbf{w}^{\mathrm{b}} \right\|_{\mathbf{Q}^{-1}}^{2} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k \circ \mathcal{M}_{k:0}^{\mathrm{wc}}(\mathbf{w}, \mathbf{x}_0) \right\|_{\mathbf{R}_k^{-1}}^{2}. \qquad (10)$$
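Continuing the toy sketch of section 2.1, the simplified cost function eq. (10) can be minimised over the augmented control vector $(\mathbf{w}, \mathbf{x}_0)$; again, all numerical choices below are illustrative assumptions rather than the configuration used in this article.

    # Toy weak-constraint 4D-Var with constant model error, eqs. (8)-(10),
    # reusing model_step, H, B_inv, R_inv, xb0, ys, Nx, np and minimize from
    # the sketch in section 2.1; the control vector is now (w, x0).
    def resolvent_wc(w, x0, k):
        # M^{wc}_{k:0}: resolvent of the w-debiased model, eq. (9)
        x = x0.copy()
        for _ in range(k):
            x = model_step(x) + w
        return x

    Q_inv = np.eye(Nx)                 # inverse model-error covariance Q^{-1}
    wb = np.zeros(Nx)                  # background model error w^b

    def J_wc(z):
        # background, model-error, and observation terms of eq. (10)
        w, x0 = z[:Nx], z[Nx:]
        d, e = x0 - xb0, w - wb
        J = 0.5 * d @ B_inv @ d + 0.5 * e @ Q_inv @ e
        for k, yk in enumerate(ys):
            r = yk - H @ resolvent_wc(w, x0, k)
            J += 0.5 * r @ R_inv @ r
        return J

    za = minimize(J_wc, np.concatenate([wb, xb0])).x
    wa, xa0_wc = za[:Nx], za[Nx:]      # analysed model error and initial state

With this formulation the dimension of the control vector is only doubled with respect to strong-constraint 4D-Var, from $N_x$ to $2N_x$, regardless of the window length $L$.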