approach is most similar to the first one. However, our models are not pre-trained as is standard in
NLP. Despite this, we show our approach remains successful.
Climate Impact.
A major source of inaccuracies in weather and climate models arises from
‘unresolved’ processes (such as those relating to convection and clouds) [27, 28, 29, 30, 31, 32].
These occur at scales smaller than the resolution of the climate model but have key effects on the
overall climate. For example, most of the variability in how much global surface temperatures
increase after CO$_2$ concentrations double is due to the representation of clouds [27, 29, 33]. There
will always be processes too costly to be explicitly resolved by our current operational models.
The standard approach to dealing with these unresolved processes is to model their effects as a function of the resolved ones. This is known as ‘parameterization’, and there is a substantial body of ML work on it [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. We propose that by using all available high-resolution
data, better ML parameterization schemes and therefore better climate models can be created.
2 Methods
Our approach is a two-step process: first, we train our model on the high-resolution data, and second,
we fine-tune it on the low-resolution (target) data.
We denote the low-resolution data at time $t$ as $X_t \in \mathbb{R}^d$. The goal is to create a sequence model for the evolution of $X_t$ through time, whilst only tracking $X_t$. We denote the high-resolution data at time $t$ as $Y_t \in \mathbb{R}^{dm}$. In parameterization, $X_t$ is often a temporal and/or spatial averaging of $Y_t$. We wish to use $Y_t$ to learn a better model of $X_t$.
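For concreteness, the snippet below sketches one such relationship, a pure spatial average in which each low-resolution value summarises $m$ high-resolution values; the dimensions are made up for illustration and are not taken from the paper.

```python
import numpy as np

# Toy coarse-graining: each low-resolution cell is the spatial average of m
# high-resolution values. Dimensions here are illustrative only.
d, m = 8, 4
Y_t = np.random.randn(d * m)              # high-resolution state, R^{dm}
X_t = Y_t.reshape(d, m).mean(axis=1)      # low-resolution state, R^d
```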
A range of ML models for sequences may be used, but we suggest they should contain both shared
and task-specific layers.
We first model $Y_t$, training in the standard teacher-forcing way for ML sequence models. We use the framework of probability, and so train by maximising the log-likelihood of $Y_t$, $\log \Pr(y_1, y_2, \ldots, y_n)$. Informally, the likelihood measures how likely $Y_t$ is to be generated by our sequence model. Next, the weights of the shared layers are frozen and the weights of the task-specific layers are trained to model the low-resolution training data, $X_t$. Again, under the probability framework, this means maximising the log-likelihood of $X_t$, $\log \Pr(x_1, x_2, \ldots, x_n)$.
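A possible shape for this two-stage procedure is sketched below. The sketch assumes a model object exposing shared (recurrent) parameters, task-specific heads, and negative log-likelihood methods for each data stream; the function and method names are hypothetical and not taken from the paper's code.

```python
import torch

def pretrain_on_highres(model, highres_loader, epochs, lr=1e-3):
    """Stage 1: maximise log Pr(y_1, ..., y_n) with teacher forcing."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for y_seq, x_seq in highres_loader:      # paired high-/low-resolution sequences
            opt.zero_grad()
            model.nll_highres(y_seq, x_seq).backward()
            opt.step()

def finetune_on_lowres(model, lowres_loader, epochs, lr=1e-3):
    """Stage 2: freeze the shared layers, maximise log Pr(x_1, ..., x_n)."""
    for p in model.shared_parameters():          # e.g. the GRU-cell weights
        p.requires_grad = False
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    for _ in range(epochs):
        for x_seq in lowres_loader:              # low-resolution sequences only
            opt.zero_grad()
            model.nll_lowres(x_seq).backward()
            opt.step()
```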
2.1 RNN Model
We use a recurrent neural network (RNN) to demonstrate our approach, though the approach is not limited to RNNs. RNNs are well-suited to parameterization tasks [4, 5, 11, 14, 34] as they only track a summary representation of the system history, reducing simulation cost. This is unlike the Transformer [35], which requires a slice of the actual history of $X_t$.
For our RNN, the hidden state is shared and its evolution is described by
$$h_{t+1} = f_\theta(h_t, X_t)$$
where $h_t \in \mathbb{R}^H$ and $f_\theta$ is a GRU cell [36]. We model the low-resolution data as
$$X_{t+1} = X_t + g_\theta(h_{t+1}) + \sigma z_t \tag{1}$$
and the high-resolution as
$$Y_{t+1} = Y_t + j_\theta(h_{t+1}) + \rho w_t \tag{2}$$
where the functions $g_\theta$ and $j_\theta$ are represented by task-specific dense layers, $z_t \sim \mathcal{N}(0, I)$, and $w_t \sim \mathcal{N}(0, I)$. The learnable parameters are the neural network weights $\theta$ and the noise terms $\sigma \in \mathbb{R}^1$ and $\rho \in \mathbb{R}^1$. Further details are in Appendix A.
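A minimal PyTorch sketch of this model is given below. It assumes that $g_\theta$ and $j_\theta$ are single dense layers, that the noise scales are parameterised on the log scale for positivity, and that during the high-resolution stage the GRU is driven by the corresponding low-resolution input $X_t$; these are choices of this sketch rather than details confirmed by the text (the actual configuration is in Appendix A of the paper).

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class SharedRNN(nn.Module):
    """Sketch of the shared-GRU model with task-specific heads (Eqs. 1 and 2)."""

    def __init__(self, d, m, H):
        super().__init__()
        self.gru = nn.GRUCell(d, H)                     # shared f_theta
        self.g = nn.Linear(H, d)                        # low-resolution head g_theta
        self.j = nn.Linear(H, d * m)                    # high-resolution head j_theta
        self.log_sigma = nn.Parameter(torch.zeros(1))   # sigma = exp(log_sigma) > 0
        self.log_rho = nn.Parameter(torch.zeros(1))     # rho = exp(log_rho) > 0

    def shared_parameters(self):
        return self.gru.parameters()

    def nll_lowres(self, x_seq):
        """Negative log-likelihood of X_{1:n} under Eq. (1), teacher forcing."""
        batch, T, _ = x_seq.shape
        h = x_seq.new_zeros(batch, self.gru.hidden_size)
        nll = 0.0
        for t in range(T - 1):
            h = self.gru(x_seq[:, t], h)                # h_{t+1}
            mean = x_seq[:, t] + self.g(h)              # X_t + g_theta(h_{t+1})
            nll = nll - Normal(mean, self.log_sigma.exp()).log_prob(x_seq[:, t + 1]).sum()
        return nll / batch

    def nll_highres(self, y_seq, x_seq):
        """Negative log-likelihood of Y_{1:n} under Eq. (2); driving the GRU with
        the low-resolution X_t is an assumption of this sketch."""
        batch, T, _ = y_seq.shape
        h = y_seq.new_zeros(batch, self.gru.hidden_size)
        nll = 0.0
        for t in range(T - 1):
            h = self.gru(x_seq[:, t], h)                # h_{t+1}
            mean = y_seq[:, t] + self.j(h)              # Y_t + j_theta(h_{t+1})
            nll = nll - Normal(mean, self.log_rho.exp()).log_prob(y_seq[:, t + 1]).sum()
        return nll / batch
```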
2.2 Evaluation
We use hold-out log-likelihood to assess generalization to unseen data, a standard probabilistic approach in ML. The models were trained with 15 different random seed initializations to ensure that differences in the results were due to our approach rather than to a quirk of a particular random seed; these runs are used to generate 95% confidence intervals. Likelihood is neither easily interpretable nor the end-goal of operational climate models. Ultimately, we want to use weather and climate models to make forecasts, and it is common to measure forecast skill with error and spread [6, 37], so these are also reported for evaluation.
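To make the likelihood part of this evaluation concrete, one possible computation of the hold-out log-likelihood and a 95% confidence interval across seeds is sketched below; it assumes one trained model per seed, reuses the `nll_lowres` method from the earlier sketch, and uses a normal approximation over the seed-to-seed spread, none of which is specified by the paper.

```python
import numpy as np
import torch

def heldout_ll_with_ci(models, heldout_x):
    """Hold-out log-likelihood averaged over seeds, with a 95% confidence interval.

    `models` is a list of trained models (one per random seed) and `heldout_x`
    an unseen low-resolution sequence tensor; names are illustrative.
    """
    with torch.no_grad():
        lls = np.array([-m.nll_lowres(heldout_x).item() for m in models])
    mean = lls.mean()
    half_width = 1.96 * lls.std(ddof=1) / np.sqrt(len(lls))   # normal approximation
    return mean, (mean - half_width, mean + half_width)
```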