
ONLINE MODEL ERROR CORRECTION WITH NEURAL NETWORKS

IN THE INCREMENTAL 4D-VAR FRAMEWORK

PREPRINT

Alban Farchi

CEREA, École des Ponts and EDF R&D

Île–de–France, France

alban.farchi@enpc.fr

Marcin Chrust

ECMWF

Shinfield Park

Reading, United Kingdom

Marc Bocquet

CEREA, École des Ponts and EDF R&D

Île–de–France, France

Patrick Laloyaux

ECMWF

Shinfield Park

Reading, United Kingdom

Massimo Bonavita

ECMWF

Shinfield Park

Reading, United Kingdom

October 26, 2022

ABSTRACT

Recent studies have demonstrated that it is possible to combine machine learning with data assimilation to reconstruct the dynamics of a physical model partially and imperfectly observed. Data assimilation is used to estimate the system state from the observations, while machine learning computes a surrogate model of the dynamical system based on those estimated states. The surrogate model can be defined as a hybrid combination where a physical model based on prior knowledge is enhanced with a statistical model estimated by a neural network. The training of the neural network is typically done offline, once a large enough dataset of model state estimates is available. By contrast, with online approaches the surrogate model is improved each time a new system state estimate is computed. Online approaches naturally fit the sequential framework encountered in geosciences where new observations become available with time. In a recent methodology paper, we have developed a new weak-constraint 4D-Var formulation which can be used to train a neural network for online model error correction. In the present article, we develop a simplified version of that method, in the incremental 4D-Var framework adopted by most operational weather centres. The simplified method is implemented in the ECMWF Object-Oriented Prediction System, with the help of a newly developed Fortran neural network library, and tested with a two-layer two-dimensional quasi-geostrophic model. The results confirm that online learning is effective and yields a more accurate model error correction than offline learning. Finally, the simplified method is compatible with future applications to state-of-the-art models such as the ECMWF Integrated Forecasting System.

Keywords: data assimilation · machine learning · model error · surrogate model · neural networks · online learning

Plain language summary

We have recently proposed a general framework for combining data assimilation and machine learning techniques to train a neural network for online model error correction. In the present article, we develop a simplified version of this online training method, compatible with future applications to more realistic models. Using numerical illustrations, we show that the new method is effective and yields a more accurate model error correction than the usual offline learning approach. The results show the potential of incorporating data assimilation and machine learning tightly, and pave the way towards an application to the Integrated Forecasting System used for operational numerical weather prediction at the European Centre for Medium-Range Weather Forecasts.

arXiv:2210.13817v1 [stat.ML] 25 Oct 2022

Key points

• Weak-constraint 4D-Var variants can be used to train neural networks for online model error correction.

• Online learning yields a more accurate model error correction than offline learning.

• The new, simplified method, developed in the incremental 4D-Var framework, can be easily applied in operational weather models.

1 Introduction: machine learning for model error correction

In the geosciences, data assimilation (DA) is used to increase the quality of forecasts by providing accurate initial conditions (Kalnay, 2003; Reich and Cotter, 2015; Law et al., 2015; Asch et al., 2016; Carrassi et al., 2018; Evensen et al., 2022). The initial conditions are obtained by combining all sources of information in a mathematically optimal way, in particular information from the dynamical model and information from sparse and noisy observations. There are two main classes of DA methods. In variational DA, the core of the methods is to minimise a cost function, usually using gradient-based optimisation techniques, to estimate the system state. Examples include 3D- and 4D-Var. In statistical DA, the methods rely on the sampled error statistics to perform sequential updates to the state estimation. The most popular examples are the ensemble Kalman filter (EnKF) and the particle filter.

Most of the time, DA methods are applied with the perfect model assumption: this is called strong-constraint DA. However, despite the significant effort made by the modellers, geoscientific models remain affected by errors (Dee, 2005), for example due to unresolved small-scale processes. This is why there is a growing interest of the DA community in weak-constraint (WC) methods, i.e. DA methods relaxing the perfect model assumption (Trémolet, 2006). This has led, for example, to the iterative ensemble Kalman filter in the presence of additive noise (Sakov et al., 2018) in statistical DA, and to the forcing formulation of WC 4D-Var (Laloyaux et al., 2020a) in variational DA. In practice, the DA control vector has to be extended to include the model error in addition to the system state. The downside of this approach is the potentially significant increase of the problem's dimension, since the model trajectory is no longer uniquely described by the initial condition. By construction, WC 4D-Var is an online model error correction method, meaning that the model error is estimated during the assimilation process and is only valid for the states in the current assimilation window.

In parallel, following the renewed impetus of machine learning (ML) applications (LeCun et al., 2015; Goodfellow et al., 2016; Chollet, 2018), data-driven approaches are more and more frequent in the geosciences. The goal of these approaches (e.g., Brunton et al., 2016; Hamilton et al., 2016; Lguensat et al., 2017; Pathak et al., 2018a; Dueben and Bauer, 2018; Fablet et al., 2018; Scher and Messori, 2019; Weyn et al., 2019; Arcomano et al., 2020, among many others) is to learn a surrogate of the dynamical model using supervised learning, i.e. by minimising a loss function which measures the discrepancy between the surrogate model predictions and an observation dataset. In order to take into account sparse and noisy observations, ML techniques can be combined with DA (Abarbanel et al., 2018; Bocquet et al., 2019; Brajard et al., 2020; Bocquet et al., 2020; Arcucci et al., 2021). The idea is to take the best of both worlds: DA techniques are used to estimate the state of the system from the observations, and ML techniques are used to estimate the surrogate model from the estimated state. In practice, the hybrid DA and ML methods can be used both for full model emulation and model error correction (Rasp et al., 2018; Pathak et al., 2018b; Bolton and Zanna, 2019; Jia et al., 2019; Watson, 2019; Bonavita and Laloyaux, 2020; Brajard et al., 2021; Gagne et al., 2020; Wikner et al., 2020; Farchi et al., 2021a,b; Chen et al., 2022). In the first case, the surrogate model is entirely learned from observations, while in the latter case, the surrogate model is hybrid: a physical, knowledge-based model is corrected by a statistical model, e.g. a neural network (NN), which is learned from observations. Even though it can arguably be more difficult to implement from a technical point of view, model error correction has many advantages over full model emulation: by leveraging the long history of numerical modelling, one can hope to end up with an easier learning problem (Watson, 2019; Farchi et al., 2021b).
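To make the notion of a hybrid surrogate model concrete, here is a minimal Python sketch, assuming an additive correction of the model state. The toy physical model, the network architecture and all function names are illustrative assumptions; they are not taken from the method or the library described in this article.

```python
import numpy as np

def physical_model(x, dt=0.05):
    """Hypothetical knowledge-based model: one Euler step of dx/dt = -x."""
    return x + dt * (-x)

def nn_correction(x, params):
    """Small one-hidden-layer network returning an additive correction (assumed form)."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ x + b1) + b2

def hybrid_model(x, params):
    """Hybrid surrogate: physical step plus learned statistical correction."""
    return physical_model(x) + nn_correction(x, params)

rng = np.random.default_rng(0)
n_state, n_hidden = 3, 8
# The output layer is initialised to zero, so the untrained hybrid model
# coincides with the physical model before any learning takes place.
params = (rng.normal(size=(n_hidden, n_state)), np.zeros(n_hidden),
          np.zeros((n_state, n_hidden)), np.zeros(n_state))
x = np.ones(n_state)
x_next = hybrid_model(x, params)
```

Starting from a zero correction is one possible design choice: the learning problem then only has to capture the residual error of the physical model, which is the sense in which model error correction can be easier than full emulation.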

Most of the current hybrid DA-ML methods use offline learning strategies: the surrogate model (or model error correction) is learned using a large dataset of observations (or analyses) and should be generalisable to other situations (i.e. outside the dataset). There are two main reasons for this choice. First, surrogate modelling requires a large amount of data to provide accurate results, certainly more than what is available in a single assimilation update with online learning. Second, by doing so, it is possible to use the full potential of the ML variational tools. Nevertheless, online learning has, on paper, several advantages over offline learning.

• Online learning fits the standard sequential DA approach in the geosciences. Each time a new batch of observations becomes available, the surrogate model parameters can be corrected.

• With online learning, the system state and the surrogate model parameters are jointly estimated, which is often not the case with offline learning. Joint estimation is in general more consistent, and hence potentially more accurate, than separate estimation.

• With offline learning, the training only starts once a sufficiently large dataset is available. With online learning, the training begins from the first batch of observations, which means that improvements can be expected before having a large dataset.

• With online learning, since the surrogate model is constantly updated, it can adapt to new (previously unseen) conditions. An example could be, in the case of model error correction, an update of the physical model to correct. Another example could be slowly-varying effects on the dynamics (e.g., seasonality).

Fundamentally, online learning is very similar to parameter estimation in DA: the goal is to estimate at the same time the system state and some parameters, in this case the surrogate model parameters. Several examples of online learning methods have recently emerged. Bocquet et al. (2021) and Malartic et al. (2022) have developed several variants of the EnKF to perform a joint estimation of the state and the parameters of a surrogate model which fully emulates the dynamics. Gottwald and Reich (2021) have used a very similar approach for the parameters of an echo state network used as a surrogate model. Finally, Farchi et al. (2021a) have derived a variant of WC 4D-Var to perform a joint estimation of the state and the parameters of a NN which corrects the tendencies of a physical model. In this article, we revisit the method of Farchi et al. (2021a). A new simplified method is derived, compatible with future applications to more realistic models. The method is implemented in the Object-Oriented Prediction System (OOPS) framework developed at the European Centre for Medium-Range Weather Forecasts (ECMWF), and tested using the two-layer quasi-geostrophic channel model developed in OOPS. To us, this is a final step before considering an application with the Integrated Forecasting System (IFS, Bonavita et al., 2017), since the IFS will soon rely on OOPS for its DA part.

The article is organised as follows. Section 2 presents the methodology. The quasi-geostrophic (QG) model is described in section 3. The experimental results are then presented in section 4 for offline learning, and in section 5 for online learning. Finally, conclusions are given in section 6.

2 A simplified neural network variant of weak-constraint 4D-Var

2.1 Strong-constraint 4D-Var

Suppose that we follow the evolution of a system using a series of observations taken at discrete times. In the classical 4D-Var, the observations are gathered into time windows $(\mathbf{y}_0, \ldots, \mathbf{y}_L)$. The integer $L \geq 0$ is the window length, and $\mathbf{y}_k \in \mathbb{R}^{N_y}$, the $k$-th batch of observations, contains all the observations taken at time $t_k$, for $k = 0, \ldots, L$. For convenience, we assume that the time interval between consecutive observation batches, $t_{k+1} - t_k = \Delta t$, is constant. This assumption is not fundamental; it just makes the presentation much easier. Within the window, the system state $\mathbf{x}_k \in \mathbb{R}^{N_x}$ at time $t_k$ is obtained by integrating the model in time from $t_0$ to $t_k$:

$$\mathbf{x}_k = \mathcal{M}_{k:0}(\mathbf{x}_0), \tag{1}$$

where $\mathcal{M}_{k:l} : \mathbb{R}^{N_x} \to \mathbb{R}^{N_x}$ is the resolvent of the dynamical (or physical) model from $t_l$ to $t_k$. Moreover, the observations are related to the state by the observation operator $\mathcal{H}_k : \mathbb{R}^{N_x} \to \mathbb{R}^{N_y}$ via

$$\mathbf{y}_k = \mathcal{H}_k(\mathbf{x}_k) + \mathbf{v}_k, \tag{2}$$

where $\mathbf{v}_k$ is the observation error at time $t_k$, which could be a random vector. Let us make the assumption that the observation errors are independent from each other.
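The setting of eqs. (1) and (2) can be illustrated with a small Python sketch. The two-variable rotation model, the operator observing only the first component, and the noise level are all assumptions chosen for illustration; they stand in for the resolvent $\mathcal{M}_{k:l}$ and the observation operator $\mathcal{H}_k$ and are unrelated to the QG model used later.

```python
import numpy as np

THETA = 0.1
# One-step toy model (a rotation), standing in for M_{k+1:k}.
A = np.array([[np.cos(THETA), -np.sin(THETA)],
              [np.sin(THETA),  np.cos(THETA)]])
# Toy observation operator H_k: only the first component is observed.
H = np.array([[1.0, 0.0]])

def resolvent(x, k, l=0):
    """Toy M_{k:l}: apply the one-step model (k - l) times, as in eq. (1)."""
    for _ in range(k - l):
        x = A @ x
    return x

def observe(x, rng, obs_std=0.1):
    """Eq. (2): y_k = H_k(x_k) + v_k with independent Gaussian errors v_k."""
    return H @ x + obs_std * rng.normal(size=H.shape[0])

rng = np.random.default_rng(42)
L = 5                                   # window length
x0 = np.array([1.0, 0.0])
states = [resolvent(x0, k) for k in range(L + 1)]
obs = [observe(xk, rng) for xk in states]   # window (y_0, ..., y_L)
```

Note that the window contains $L + 1$ observation batches, one for each of the times $t_0, \ldots, t_L$.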

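The 4D-Var cost function is defined as the negative log-likelihood:

\begin{align}
\mathcal{J}^{\mathrm{sc}}(\mathbf{x}_0) &\triangleq -\ln p(\mathbf{x}_0 \mid \mathbf{y}_0, \ldots, \mathbf{y}_L), \tag{3a} \\
&\propto -\ln p(\mathbf{x}_0) - \ln p(\mathbf{y}_0, \ldots, \mathbf{y}_L \mid \mathbf{x}_0), \tag{3b} \\
&\propto -\ln p(\mathbf{x}_0) - \sum_{k=0}^{L} \ln p(\mathbf{y}_k \mid \mathbf{x}_0), \tag{3c}
\end{align}

where conditional independence of the observation vectors on $\mathbf{x}_0$ was used. The background $p(\mathbf{x}_0)$ is Gaussian with mean $\mathbf{x}^{\mathrm{b}}_0$ and covariance matrix $\mathbf{B}$, and the observation errors $\mathbf{v}_k$ are also Gaussian distributed with mean $\mathbf{0}$ and covariance matrices $\mathbf{R}_k$, in such a way that $\mathcal{J}^{\mathrm{sc}}$ becomes:

$$\mathcal{J}^{\mathrm{sc}}(\mathbf{x}_0) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}^{\mathrm{b}}_0 \right\|^2_{\mathbf{B}^{-1}} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k \circ \mathcal{M}_{k:0}(\mathbf{x}_0) \right\|^2_{\mathbf{R}_k^{-1}}, \tag{4}$$

where we have dropped the constant terms and where the notation $\|\mathbf{v}\|^2_{\mathbf{M}}$ stands for the squared Mahalanobis norm $\mathbf{v}^\top \mathbf{M} \mathbf{v}$.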

This formulation is called strong-constraint 4D-Var because it relies on the perfect model assumption eq. (1). In practice, eq. (4) is minimised using scalable gradient-based optimisation methods to provide the analysis $\mathbf{x}^{\mathrm{a}}_0$. In cycled DA, the model is then used to propagate $\mathbf{x}^{\mathrm{a}}_0$ until the start of the next window, thus yielding a value for the background state $\mathbf{x}^{\mathrm{b}}_0$.
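As an illustration of eq. (4), the following Python sketch evaluates the strong-constraint cost for a linear toy model and computes its minimiser. Everything here is an assumption made for the example: the rotation model, the time-independent $\mathbf{R}$, and in particular the closed-form solve, which replaces the iterative gradient-based minimisation mentioned above and is only valid because the toy model and observation operator are linear.

```python
import numpy as np

def sc_cost(x0, xb, B_inv, R_inv, A, H, obs):
    """Strong-constraint cost, eq. (4), for a linear toy model x_{k+1} = A x_k."""
    dxb = x0 - xb
    cost = 0.5 * dxb @ B_inv @ dxb           # background term
    xk = x0.copy()
    for yk in obs:
        d = yk - H @ xk                      # y_k - H_k o M_{k:0}(x_0)
        cost += 0.5 * d @ R_inv @ d
        xk = A @ xk
    return cost

def sc_analysis(xb, B_inv, R_inv, A, H, obs):
    """Exact minimiser of the quadratic cost (linear model and H only)."""
    hess = B_inv.copy()
    rhs = B_inv @ xb
    G = H.copy()                             # G_k = H A^k
    for yk in obs:
        hess += G.T @ R_inv @ G
        rhs += G.T @ R_inv @ yk
        G = G @ A
    return np.linalg.solve(hess, rhs)

theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
H = np.array([[1.0, 0.0]])
B_inv, R_inv = np.eye(2), np.array([[100.0]])
x_true, xb = np.array([1.0, 0.0]), np.array([0.5, 0.5])

rng = np.random.default_rng(0)
obs, xk = [], x_true.copy()
for _ in range(6):                           # L = 5, hence six batches
    obs.append(H @ xk + 0.1 * rng.normal(size=1))
    xk = A @ xk

xa = sc_analysis(xb, B_inv, R_inv, A, H, obs)
J = lambda x: sc_cost(x, xb, B_inv, R_inv, A, H, obs)
```

In a cycled setting, the analysis `xa` would then be propagated with the model to the start of the next window to provide the next background state.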

2.2 Weak-constraint 4D-Var

Recognising that the model is not perfect, we can replace the strong constraint eq. (1) by the more general model evolution

$$\mathbf{x}_{k+1} = \mathcal{M}_{k+1:k}(\mathbf{x}_k) + \mathbf{w}_k, \tag{5}$$

where $\mathbf{w}_k \in \mathbb{R}^{N_x}$ is the model error from $t_k$ to $t_{k+1}$, potentially random. Let us make the assumption that the model errors are independent from each other and independent from the background errors. This implies that the model evolution satisfies the Markov property.

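The updated cost function now depends on all states inside the window:

\begin{align}
\mathcal{J}^{\mathrm{wc}}(\mathbf{x}_0, \ldots, \mathbf{x}_L) &\triangleq -\ln p(\mathbf{x}_0, \ldots, \mathbf{x}_L \mid \mathbf{y}_0, \ldots, \mathbf{y}_L), \tag{6a} \\
&\propto -\ln p(\mathbf{x}_0, \ldots, \mathbf{x}_L) - \ln p(\mathbf{y}_0, \ldots, \mathbf{y}_L \mid \mathbf{x}_0, \ldots, \mathbf{x}_L), \tag{6b} \\
&\propto -\ln p(\mathbf{x}_0) - \sum_{k=0}^{L-1} \ln p(\mathbf{x}_{k+1} \mid \mathbf{x}_k) - \sum_{k=0}^{L} \ln p(\mathbf{y}_k \mid \mathbf{x}_k). \tag{6c}
\end{align}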

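With the Gaussian assumptions of section 2.1 and the additional hypothesis that the model errors $\mathbf{w}_k$ also follow a Gaussian distribution with mean $\mathbf{w}^{\mathrm{b}}_k$ and covariance matrices $\mathbf{Q}_k$, $\mathcal{J}^{\mathrm{wc}}$ becomes

$$\mathcal{J}^{\mathrm{wc}}(\mathbf{x}_0, \ldots, \mathbf{x}_L) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}^{\mathrm{b}}_0 \right\|^2_{\mathbf{B}^{-1}} + \frac{1}{2} \sum_{k=0}^{L-1} \left\| \mathbf{x}_{k+1} - \mathcal{M}_{k+1:k}(\mathbf{x}_k) - \mathbf{w}^{\mathrm{b}}_k \right\|^2_{\mathbf{Q}_k^{-1}} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k(\mathbf{x}_k) \right\|^2_{\mathbf{R}_k^{-1}}, \tag{7}$$

where we have once again dropped the constant terms. This formulation is called weak-constraint 4D-Var (Trémolet, 2006) because it relaxes the perfect model assumption eq. (1), which means that the analysis $(\mathbf{x}^{\mathrm{a}}_0, \ldots, \mathbf{x}^{\mathrm{a}}_{L-1})$ is no longer a trajectory of the model. However, this comes at a price: the dimension of the problem has increased from $N_x$ to $L N_x$.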
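The three terms of eq. (7) can be written down directly, as in the following Python sketch. It assumes, for brevity, time-independent $\mathbf{Q}$ and $\mathbf{R}$ and a linear toy model; the function name and the test trajectory are hypothetical and only meant to show how the control variable becomes the whole trajectory.

```python
import numpy as np

def wc_cost(traj, xb, wb, B_inv, Q_inv, R_inv, model, H, obs):
    """Weak-constraint cost, eq. (7); traj has shape (L + 1, Nx), wb shape (L, Nx)."""
    dxb = traj[0] - xb
    cost = 0.5 * dxb @ B_inv @ dxb                      # background term
    for k in range(len(traj) - 1):
        dw = traj[k + 1] - model(traj[k]) - wb[k]       # model error departure
        cost += 0.5 * dw @ Q_inv @ dw
    for xk, yk in zip(traj, obs):
        d = yk - H @ xk                                 # observation departure
        cost += 0.5 * d @ R_inv @ d
    return cost

theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
model = lambda x: A @ x                                 # toy one-step model
H = np.array([[1.0, 0.0]])
B_inv, Q_inv, R_inv = np.eye(2), 10.0 * np.eye(2), np.array([[100.0]])

L = 4
xb = np.array([1.0, 0.0])
traj = np.empty((L + 1, 2))
traj[0] = xb
for k in range(L):
    traj[k + 1] = model(traj[k])                        # exact model trajectory
obs = [H @ xk for xk in traj]                           # noise-free observations
wb = np.zeros((L, 2))

perfect = wc_cost(traj, xb, wb, B_inv, Q_inv, R_inv, model, H, obs)
shifted = traj.copy()
shifted[2] += 0.1                                       # no longer a model trajectory
imperfect = wc_cost(shifted, xb, wb, B_inv, Q_inv, R_inv, model, H, obs)
```

Unlike the strong-constraint cost, the argument `traj` is free to depart from the model, and each departure is penalised through $\mathbf{Q}^{-1}$.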

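This dimensionality increase can be mitigated by making additional assumptions. For example, one can assume that the model error is constant throughout the window, i.e.

\begin{align}
\mathbf{w}_0 = \ldots = \mathbf{w}_{L-1} &\triangleq \mathbf{w}, \tag{8a} \\
\mathbf{w}^{\mathrm{b}}_0 = \ldots = \mathbf{w}^{\mathrm{b}}_{L-1} &\triangleq \mathbf{w}^{\mathrm{b}}, \tag{8b} \\
\mathbf{Q}_0 = \ldots = \mathbf{Q}_{L-1} &\triangleq L \mathbf{Q}. \tag{8c}
\end{align}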
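The factor $L$ in eq. (8c) can be understood through a short check, sketched here under the assumptions of eq. (8). By eq. (5), $\mathbf{x}_{k+1} - \mathcal{M}_{k+1:k}(\mathbf{x}_k) = \mathbf{w}$ for every $k$, so the model error term of eq. (7) reduces to a single term in $\mathbf{w}$:

```latex
\frac{1}{2} \sum_{k=0}^{L-1}
  \left\| \mathbf{w} - \mathbf{w}^{\mathrm{b}} \right\|^2_{(L\mathbf{Q})^{-1}}
= \frac{1}{2} \sum_{k=0}^{L-1} \frac{1}{L}
  \left\| \mathbf{w} - \mathbf{w}^{\mathrm{b}} \right\|^2_{\mathbf{Q}^{-1}}
= \frac{1}{2}
  \left\| \mathbf{w} - \mathbf{w}^{\mathrm{b}} \right\|^2_{\mathbf{Q}^{-1}} .
```

With this scaling, the $L$ identical penalties add up to one penalty with covariance $\mathbf{Q}$, independently of the window length.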

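In this case, the trajectory $(\mathbf{x}_0, \ldots, \mathbf{x}_L)$ is fully determined by $(\mathbf{w}, \mathbf{x}_0)$:

$$\mathbf{x}_{k+1} = \mathcal{M}_{k+1:k}(\mathbf{x}_k) + \mathbf{w} = \mathcal{M}_{k+1:k}\left(\mathcal{M}_{k:k-1}(\mathbf{x}_{k-1}) + \mathbf{w}\right) + \mathbf{w} = \ldots \triangleq \mathcal{M}^{\mathrm{wc}}_{k+1:0}(\mathbf{w}, \mathbf{x}_0), \tag{9}$$

with $\mathbf{x} \mapsto \mathcal{M}^{\mathrm{wc}}_{k+1:0}(\mathbf{w}, \mathbf{x})$ being the resolvent of the $\mathbf{w}$-debiased model from $t_0$ to $t_{k+1}$. The Gaussian cost function $\mathcal{J}^{\mathrm{wc}}$ eq. (7) can hence be written

$$\mathcal{J}^{\mathrm{wc}}(\mathbf{w}, \mathbf{x}_0) = \frac{1}{2} \left\| \mathbf{x}_0 - \mathbf{x}^{\mathrm{b}}_0 \right\|^2_{\mathbf{B}^{-1}} + \frac{1}{2} \left\| \mathbf{w} - \mathbf{w}^{\mathrm{b}} \right\|^2_{\mathbf{Q}^{-1}} + \frac{1}{2} \sum_{k=0}^{L} \left\| \mathbf{y}_k - \mathcal{H}_k \circ \mathcal{M}^{\mathrm{wc}}_{k:0}(\mathbf{w}, \mathbf{x}_0) \right\|^2_{\mathbf{R}_k^{-1}}. \tag{10}$$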
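The debiased resolvent of eq. (9) and the cost function of eq. (10) can be sketched in Python as follows. The linear toy model, the time-independent $\mathbf{R}$ and the function names are assumptions made for this example only.

```python
import numpy as np

def debiased_resolvent(w, x0, k, model):
    """Toy M^wc_{k:0}(w, x0), eq. (9): add the constant error w after every step."""
    x = np.array(x0, dtype=float)
    for _ in range(k):
        x = model(x) + w
    return x

def wc_cost_constant(w, x0, xb, wb, B_inv, Q_inv, R_inv, model, H, obs):
    """Cost of eq. (10), with control variables (w, x0)."""
    dxb, dw = x0 - xb, w - wb
    cost = 0.5 * dxb @ B_inv @ dxb + 0.5 * dw @ Q_inv @ dw
    xk = np.array(x0, dtype=float)
    for yk in obs:
        d = yk - H @ xk                  # y_k - H_k o M^wc_{k:0}(w, x0)
        cost += 0.5 * d @ R_inv @ d
        xk = model(xk) + w
    return cost

theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
model = lambda x: A @ x                  # toy (biased) physical model
H = np.array([[1.0, 0.0]])
B_inv, Q_inv, R_inv = np.eye(2), 10.0 * np.eye(2), np.array([[100.0]])

L = 5
x0 = np.array([1.0, 0.0])
w_true = np.array([0.02, -0.01])         # hypothetical constant model error
obs = [H @ debiased_resolvent(w_true, x0, k, model) for k in range(L + 1)]

args = (B_inv, Q_inv, R_inv, model, H, obs)
cost_at_truth = wc_cost_constant(w_true, x0, x0, w_true, *args)
cost_no_bias = wc_cost_constant(np.zeros(2), x0, x0, w_true, *args)
```

The control vector now has dimension $2 N_x$ whatever the window length, which is the dimensionality reduction that the constant model error assumption is meant to provide.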
