CLOINET: OCEAN STATE RECONSTRUCTIONS THROUGH REMOTE-SENSING, IN-SITU SPARSE OBSERVATIONS AND DEEP LEARNING

A PREPRINT

Eugenio Cutolo, IMEDEA (CSIC-UIB), Esporles, Spain, e.cutolo@imedea.uib-csic.es

Ananda Pascual, IMEDEA (CSIC-UIB), Esporles, Spain, ananda.pascual@imedea.uib-csic.es

Simon Ruiz, IMEDEA (CSIC-UIB), Esporles, Spain, simon.ruiz@imedea.uib-csic.es

Nikolaos Zarokanellos, SOCIB, Palma, Spain, nzarokanellos@socib.es

Ronan Fablet, IMT Atlantique, CNRS UMR Lab-STICC, Brest, France, ronan.fablet@imt-atlantique.fr

December 13, 2023

ABSTRACT

Combining remote-sensing data with in-situ observations to achieve a comprehensive 3D reconstruction of the ocean state presents significant challenges for traditional interpolation techniques. To address this, we developed the CLuster Optimal Interpolation Neural Network (CLOINet), which combines the robust mathematical framework of the Optimal Interpolation (OI) scheme with a self-supervised clustering approach. CLOINet efficiently segments remote sensing images into clusters to reveal non-local correlations, thereby enhancing fine-scale oceanic reconstructions. We trained our network using outputs from an Ocean General Circulation Model (OGCM), which also facilitated various testing scenarios. Our Observing System Simulation Experiments aimed to reconstruct deep salinity fields using Sea Surface Temperature (SST) or Sea Surface Height (SSH), alongside sparse in-situ salinity observations. The results showed a significant reduction in reconstruction error of up to 40% and the ability to resolve scales 50% smaller than baseline OI techniques. Remarkably, even though CLOINet was trained exclusively on simulated data, it accurately reconstructed an unseen SST field using only glider temperature observations and satellite chlorophyll concentration data. This demonstrates how deep learning networks like CLOINet can potentially lead the integration of modeling and observational efforts in developing an ocean digital twin.

1 Introduction

There is growing awareness of the role the ocean plays in many crucial aspects of human safety, health, and well-being, driven by the cumulative impacts of climate change, unsustainable exploitation of marine resources, pollution, and uncoordinated development [UNESCO, 2019, Pascual et al., 2021]. In response to these challenges, which UNESCO has encapsulated in 10 objectives for the Ocean Decade (2021-2030), the European Union is endeavoring to develop a digital twin of the ocean. The concept of digital twins involves creating a digital representation of real-world entities or processes, based on both real-time and historical observations, to depict the past and present and to model potential future scenarios.

For the ocean, and especially for climate change-related concerns, one major challenge is understanding the state and evolution of the ocean’s interior. Its stratification significantly influences large-scale integrated variables like ocean heat content, acidification, and oxygenation [Wang et al., 2018, Durack et al., 2014]. Moreover, numerous studies have highlighted the importance of resolving submesoscale dynamics to account for the majority of vertical ocean transport, which is vital for carbon export, fisheries, nutrient availability, and pollution displacement [Pascual et al., 2017]. These challenges underscore the need for high-resolution, three-dimensional representations of the ocean state. High-resolution numerical models and data assimilation techniques, which align model outputs with actual observations, are currently the most common solutions [Carrassi et al., 2018, Mourre et al., 2004].

arXiv:2210.10767v3 [physics.ao-ph] 12 Dec 2023

Operational simulations now assimilate near-real-time observations, including in-situ (ship-based observations, underwater gliders, and floats) and remote sensing data. Satellite observations provide frequent global snapshots of the sea surface: for instance, Sea Surface Temperature and Chlorophyll concentration images offer resolutions as fine as 1 km on a daily basis. In contrast, the current capabilities of remote altimeters are limited to a 200 km wavelength for the global ocean at mid-latitudes and about 130 km for the Mediterranean Sea [Ballarotta et al., 2019], though significant advancements are expected from the Surface Water and Ocean Topography (SWOT) mission, successfully launched in December 2022 [Morrow et al., 2019]. Notably, Sea Surface Height (SSH) data are unaffected by cloud cover. However, the uncertainties regarding the ocean interior remain significant due to the sparse distribution of in-situ observations in time and space [Siegelman et al., 2019]. As a result, while data-assimilating models adhere to physical balances, they still lack accuracy [Arcucci et al., 2021].

The ocean twin strategy proposes data-driven approaches as a complementary method for revealing the ocean state. In previous oceanographic studies, multivariate methods made it possible to build three-dimensional hydrographic fields from the extensive in-situ measurements collected during ocean campaigns [Cutolo et al., 2022, Gomis et al., 2001]. However, these methods are not easily scalable to a global observing system due to the sheer number of parameters involved, such as correlation lengths. Machine learning techniques offer a solution to these scalability issues, as the models are learned directly from the data. A key challenge for these techniques is the need for a substantial quantity of realistic training data. General circulation and process study models play a new role here, providing a cost-effective way to generate large datasets that adhere to ocean physics. Even datasets that only approximately match the true ocean state can be valuable, provided they encompass a wide range of scenarios. This last point is especially crucial in preventing deep networks from memorizing the input climatology rather than capturing the actual ocean dynamics. Such a focus ensures that the networks can understand and adapt to scenarios that deviate significantly from the average, rather than being confined to repetitive patterns. To generalize effectively beyond their training data, neural networks require careful design to preserve relevant input features across their layers. In this context, explainable AI aims to advance beyond the black-box applications typical in ocean remote sensing studies, promoting a deeper understanding of the model workings [Zhu et al., 2017].

Despite these difficulties, recent studies have demonstrated the potential of deep-learning methods for various dynamical system tasks. These range from idealized situations [Fablet et al., 2021] to realistic case studies, such as interpolating missing data in satellite-derived observations of sea surface dynamics [Barth et al., 2020, Manucharyan et al., 2021, Fablet et al., 2020]. With regard to reconstructing hydrographic profiles from satellite data, approaches range from proof-of-concept studies using self-organizing maps (SOMs) and neural networks [Charantonis et al., 2015, Gueye et al., 2014] to feed-forward or long short-term memory (LSTM) neural networks [Contractor and Roughan, 2021, Sammartino et al., 2020, Jiang et al., 2021, Fablet et al., 2021], while Pauthenet et al. [2022] rely instead on a multilayer perceptron. Even considering these past works, the interpolation of temperature and salinity profiles given some in-situ and sea surface information remains an open challenge.

In this study, we introduce an innovative modular neural network designed to seamlessly integrate remote-sensing images with in-situ observations for a complete 3D reconstruction of the ocean state. This integration is underpinned by the mathematical principles of the Optimal Interpolation (OI) scheme [Gandin, 1966]. Unlike traditional applications of OI, which typically use Euclidean distance to estimate the correlation between points, our approach involves computing distances within a specially designed latent space. A specific module within our neural network transforms all our input information into this latent space made of ’clusters’. Within these clusters, non-local correlations become more easily identifiable and can be effectively applied to enhance the correlation matrix. Like attention mechanisms in advanced neural models [Vaswani et al., 2017], which focus on key aspects in large datasets for tasks such as language processing or image recognition, our neural network module identifies crucial correlational patterns through the latent space of clusters.
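The contrast between Euclidean and latent-space distances in OI can be illustrated with a minimal sketch. It is not CLOINet's learned module: it assumes a Gaussian covariance model, a hand-made one-dimensional feature standing in for the learned clusters, and hypothetical names (`gaussian_cov`, `optimal_interpolation`, `noise_var`) chosen only for this example.

```python
import numpy as np

def gaussian_cov(a, b, length):
    """Gaussian covariance between point sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * length ** 2))

def optimal_interpolation(obs_pts, obs_val, grid_pts, length, noise_var=1e-2):
    """Classic OI analysis of anomalies: x_a = C_go (C_oo + R)^-1 y."""
    c_oo = gaussian_cov(obs_pts, obs_pts, length)    # obs-to-obs covariance
    c_go = gaussian_cov(grid_pts, obs_pts, length)   # grid-to-obs covariance
    r = noise_var * np.eye(len(obs_pts))             # assumed observation error
    weights = np.linalg.solve(c_oo + r, obs_val)
    return c_go @ weights

# Toy 1D domain with a front between grid points 2 and 3.
x = np.arange(6, dtype=float)[:, None]               # physical coordinates
# Hypothetical latent feature (a stand-in for cluster membership).
feature = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])[:, None]

obs_idx = np.array([0, 5])                           # one observation per side
obs_val = np.array([-1.0, 1.0])                      # observed anomalies

# Same OI equations, two different notions of distance.
euclid = optimal_interpolation(x[obs_idx], obs_val, x, length=2.0)
latent = optimal_interpolation(feature[obs_idx], obs_val, feature, length=0.3)

print("Euclidean OI:", np.round(euclid, 2))
print("Latent OI:   ", np.round(latent, 2))
```

In this toy setup the Euclidean analysis blends the two observations near the front, whereas the latent-space analysis gives each grid point a value close to the one observed in its own cluster, however far away that observation is in physical space.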

We favored a network structure composed of independent nested modules to facilitate the understanding and analysis of its internal information flow from the input data to the covariance structure. To the best of our knowledge, this is the first work in which neural networks achieve an optimal combination of remote-sensing and in-situ observations without previous knowledge of the study area’s climatology. This study is structured as follows: section 2 presents the main synthetic dataset used for training and testing, together with the real observations used for a preliminary use-case scenario. The details of the network architecture are given in section 3, while the results are shown in section 4.


2 Data

Neural networks need large amounts of data to be trained appropriately. A common choice in oceanography, where such a large quantity of actual observations is unavailable, is to rely on numerical models. In our case, we chose NATL60, a simulation based on the Nucleus for European Modelling of the Ocean, described in subsection 2.1. We used the fields of this model to simulate both remote-sensing and in-situ observations in a so-called Observing System Simulation Experiment (OSSE). In these experiments, the model output is sampled to replicate the different types of partial observations available. The advantage is that we can quickly check the improvements obtained, since the model output also provides the ground truth we aim to reconstruct. The danger of what is usually called "supervised learning", which only aims to minimize the discrepancy with the provided ground truth, is that the network weights memorize the "right answers", which in our context means the model climatology. We addressed this problem by including two self-supervised terms in our loss function, as we describe later, and by carefully selecting a highly varied training and test dataset, as presented in subsection 2.1.

Finally, we demonstrated the generalization capabilities of our network by testing it with actual multi-platform observations. In particular, we used the remote-sensing products of Sea Surface Temperature (SST) and Chlorophyll-a concentration (CHL) from CMEMS, together with temperature observations from gliders, as described in subsection 2.2.

Figure 1: Training area (A) and testing area (B) presented with the SWOT passages in the fast-sampling phase.

2.1 eNATL60-based OSSE

Our primary experiments utilized the eNATL60 configuration of the Nucleus for European Modelling of the Ocean (NEMO) model [Gurvan et al., 2022], featuring a 1/60° horizontal resolution and 300 vertical levels across the North Atlantic. This high-resolution configuration is essential for understanding ocean dynamics, particularly for surface oceanic motions down to 15 km, which aligns with SWOT observations [Ajayi et al., 2020]. We direct readers to this work for a detailed understanding of NATL60’s capabilities. Additionally, numerous studies have employed the non-extended version of NATL60 for resolving fine-scale dynamical processes [Metref et al., 2019, 2020, Fresnay et al., 2018, Amores et al., 2018].

For our training and testing data, we utilized daily averages of Sea Surface Temperature (SST) and Sea Surface Height (SSH), both individually and combined, from the eNATL60 simulation spanning an entire year. Alongside these, we gathered in-situ salinity observations at three specific depths: 5 m, 75 m, and 150 m. Our goal was then to reconstruct the 2D salinity fields at these depths. Our analysis focused predominantly on two of these depths, including 150 m, selected to assess the robustness of our model both within and beyond the mixed-layer depth. To ensure that the in-situ observations used to train and test our network mirrored real oceanographic conditions, we adopted two distinct sampling strategies: random and regular. This approach allowed us to evaluate the network’s performance in various realistic observational scenarios. The random strategy selects domain points based on a uniform distribution, while the regular strategy uses a homogeneous grid sampling with a fixed spacing of δx. By varying the number of randomly sampled points and δx, we conducted different experiments to observe metric variations.
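The two sampling strategies can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: the function names and the `n_points` argument are hypothetical, the random draw is taken without replacement, the spacing is expressed in grid cells rather than physical distance, and a random array stands in for a model salinity slice.

```python
import numpy as np

def random_sampling(shape, n_points, rng):
    """Pick n_points distinct grid cells uniformly at random (assumed: no repeats)."""
    flat = rng.choice(shape[0] * shape[1], size=n_points, replace=False)
    return np.unravel_index(flat, shape)

def regular_sampling(shape, spacing):
    """Pick grid cells on a homogeneous lattice with a fixed spacing (in cells)."""
    rows, cols = np.meshgrid(
        np.arange(0, shape[0], spacing),
        np.arange(0, shape[1], spacing),
        indexing="ij",
    )
    return rows.ravel(), cols.ravel()

def sample_field(field, rows, cols):
    """Return a copy of field that is NaN everywhere except at the sampled cells."""
    sparse = np.full(field.shape, np.nan)
    sparse[rows, cols] = field[rows, cols]
    return sparse

rng = np.random.default_rng(0)
truth = rng.normal(size=(40, 60))    # stand-in for a 2D model salinity field

# Pseudo-observations for the two strategies; the full field stays the ground truth.
obs_random = sample_field(truth, *random_sampling(truth.shape, 50, rng))
obs_regular = sample_field(truth, *regular_sampling(truth.shape, 10))

print(np.count_nonzero(~np.isnan(obs_random)), "random observations")
print(np.count_nonzero(~np.isnan(obs_regular)), "regular observations")
```

Because the sparse arrays keep the shape of the full field, a reconstruction built from them can be compared cell by cell against `truth`, which is the practical advantage of an OSSE setup.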

Our focus was on two marine areas: the subpolar northwest Atlantic for training, and the Western Mediterranean Sea for testing. Both regions are notable for SWOT passages during its rapid-sampling phase (see Figure 1). The Mediterranean region, in particular, is known for its dynamic oceanographic characteristics and has been extensively
