CLOINet Ocean State Reconstructions

CLOINET: OCEAN STATE RECONSTRUCTIONS THROUGH
REMOTE-SENSING, IN-SITU SPARSE OBSERVATIONS AND DEEP
LEARNING
A PREPRINT
Eugenio Cutolo
IMEDEA (CSIC-UIB),
Esporles, Spain,
e.cutolo@imedea.uib-csic.es
Ananda Pascual
IMEDEA (CSIC-UIB),
Esporles, Spain,
ananda.pascual@imedea.uib-csic.es
Simon Ruiz
IMEDEA (CSIC-UIB),
Esporles, Spain,
simon.ruiz@imedea.uib-csic.es
Nikolaos Zarokanellos
SOCIB,
Palma, Spain
nzarokanellos@socib.es
Ronan Fablet
IMT Atlantique,
CNRS UMR Lab-STICC,
Brest, France
ronan.fablet@imt-atlantique.fr
December 13, 2023
ABSTRACT
Combining remote-sensing data with in-situ observations to achieve a comprehensive 3D reconstruc-
tion of the ocean state presents significant challenges for traditional interpolation techniques. To
address this, we developed the CLuster Optimal Interpolation Neural Network (CLOINet), which
combines the robust mathematical framework of the Optimal Interpolation (OI) scheme with a self-
supervised clustering approach. CLOINet efficiently segments remote sensing images into clusters to
reveal non-local correlations, thereby enhancing fine-scale oceanic reconstructions. We trained our
network using outputs from an Ocean General Circulation Model (OGCM), which also facilitated
various testing scenarios. Our Observing System Simulation Experiments aimed to reconstruct deep
salinity fields using Sea Surface Temperature (SST) or Sea Surface Height (SSH), alongside sparse
in-situ salinity observations. The results showcased a significant reduction in reconstruction error, up
to 40%, and the ability to resolve scales 50% smaller compared to baseline OI techniques. Remarkably,
even though CLOINet was trained exclusively on simulated data, it accurately reconstructed an unseen
SST field using only glider temperature observations and satellite chlorophyll concentration data.
This demonstrates how deep learning networks like CLOINet can potentially lead the integration of
modeling and observational efforts in developing an ocean digital twin.
1 Introduction
Nowadays, there is an increased consciousness of the role played by the ocean in many crucial aspects of human
safety, health, and well-being due to the cumulative impacts of climate change, unsustainable exploitation of marine
resources, pollution, and uncoordinated development (UNESCO, 2019; Pascual et al. [2021]). In response to these
challenges, which UNESCO has encapsulated in 10 objectives for the Ocean Decade (2021-2030), the European
Union is endeavoring to develop a digital twin of the ocean. The concept of digital twins involves creating a digital
representation of real-world entities or processes, based on both real-time and historical observations, to depict the past
and present and to model potential future scenarios.
In the ocean case and especially to address climate change-related concerns, one major challenge is understanding the
state and evolution of the ocean’s interior. Its stratification significantly influences large-scale integrated variables like
ocean heat content, acidification, and oxygenation [Wang et al., 2018, Durack et al., 2014]. Moreover, numerous studies
arXiv:2210.10767v3 [physics.ao-ph] 12 Dec 2023
have highlighted the importance of resolving submesoscale dynamics to account for the majority of vertical ocean
transport, which is vital for carbon export, fisheries, nutrient availability, and pollution displacement [Pascual et al.,
2017]. These challenges underscore the need for high-resolution, three-dimensional representations of the ocean state.
High-resolution numerical models and data assimilation techniques, which align model outputs with actual observations,
are currently the most common solutions [Carrassi et al., 2018, Mourre et al., 2004].
Operational simulations now assimilate near-real-time observations, including in-situ (ship-based observations,
underwater gliders, and floats) and remote sensing data. Satellite observations provide frequent global snapshots of the
sea surface; for instance, Sea Surface Temperature and Chlorophyll concentration images offer resolutions as fine as 1
km on a daily basis. In contrast, the current capabilities of remote altimeters are limited to a 200 km wavelength for the
global ocean at mid-latitudes and about 130 km for the Mediterranean Sea [Ballarotta et al., 2019], though significant
advancements are upcoming with the Surface Water and Ocean Topography (SWOT) mission successfully launched in
December 2022 [Morrow et al., 2019]. Notably, Sea Surface Height (SSH) data are unaffected by cloud cover. However,
the uncertainties regarding the ocean interior remain significant due to the sparse distribution of in-situ observations in
time and space [Siegelman et al., 2019]. As a result, while data-assimilating models adhere to physical balances, they
still lack accuracy [Arcucci et al., 2021].
The ocean twin strategy proposes data-driven approaches as a complementary method for revealing the ocean state.
In previous oceanographic studies, multivariate methods made it possible to build three-dimensional hydrographic fields
from the extensive in-situ measurements collected during ocean campaigns [Cutolo et al., 2022, Gomis et al., 2001].
However, these methods are not easily scalable to a global observing system due to the sheer number of parameters
involved, such as correlation lengths. Machine learning techniques offer a solution to these scalability issues, as the
models are directly learned from the data. A key challenge for these techniques is the need for a substantial quantity of
realistic training data. General circulation and process study models play a new role here, providing a cost-effective way
to generate large datasets that adhere to ocean physics. Even datasets that only approximately match the true ocean state
can be valuable, provided they encompass a wide range of scenarios. This last point is especially crucial in preventing
the risk of deep networks memorizing the input climatology rather than capturing the actual ocean dynamics. Such a
focus ensures that the networks can understand and adapt to scenarios that significantly deviate from the average, rather
than being confined to repetitive patterns. To effectively generalize beyond their training data, neural networks require
careful design to preserve relevant input features across their layers. In this context, explainable AI aims to advance
beyond the black-box applications typical in ocean remote sensing studies, promoting a deeper understanding of the
model workings [Zhu et al., 2017].
Despite these difficulties, recent studies have demonstrated the potential of deep-learning methods for various dynamical
system tasks. These range from idealized situations [Fablet et al., 2021] to realistic case studies, such as interpolating
missing data in satellite-derived observations of sea surface dynamics [Barth et al., 2020, Manucharyan et al., 2021,
Fablet et al., 2020]. With regard to reconstructing hydrographic profiles from satellite data, there is a spectrum of
approaches: proof-of-concept studies using self-organizing maps (SOMs) and neural networks (Charantonis et al.
[2015], Gueye et al. [2014]), feed-forward or long short-term memory (LSTM) neural networks [Contractor and
Roughan, 2021, Sammartino et al., 2020, Jiang et al., 2021, Fablet et al., 2021], and the work of [Pauthenet et al., 2022],
which relies instead on a multilayer perceptron. Even considering these past works, the interpolation of temperature and
salinity profiles given some in-situ and sea surface information remains an open challenge.
In this study, we introduce an innovative modular neural network designed to seamlessly integrate remote-sensing
images with in-situ observations for a complete 3D reconstruction of the ocean state. This integration is underpinned by
the Optimal Interpolation (OI) scheme’s mathematical principles [Gandin, 1966]. Unlike traditional applications of OI,
which typically use Euclidean distance to estimate the correlation between points, our approach involves computing
distances within a specially designed latent space. A specific module within our neural network transforms all our input
information into this latent space made of 'clusters'. Within these clusters, non-local correlations become more easily
identifiable and can be effectively applied to enhance the correlation matrix. Like attention mechanisms in advanced
neural models [Vaswani et al., 2017], which focus on key aspects in large datasets for tasks such as language processing
or image recognition, our neural network module similarly identifies crucial correlational patterns through the latent
space of clusters.
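As a minimal sketch of the idea above, and not the CLOINet module itself, the following Python code applies the standard OI update (grid-to-observation covariance times the inverse of the observation covariance plus an error term) while measuring distances between feature vectors rather than between positions only. The Gaussian covariance model, the function names, the length scale, the noise variance, and the choice of "position plus surface value" as features are all illustrative assumptions that merely stand in for the learned latent space of clusters.

```python
import numpy as np


def gaussian_covariance(a, b, length_scale):
    """Gaussian covariance between feature sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def optimal_interpolation(grid_feats, obs_feats, obs_anomaly,
                          length_scale=1.0, noise_var=1e-2):
    """OI analysis of anomalies on the grid: C_go (C_oo + R)^-1 y."""
    c_go = gaussian_covariance(grid_feats, obs_feats, length_scale)
    c_oo = gaussian_covariance(obs_feats, obs_feats, length_scale)
    r = noise_var * np.eye(len(obs_anomaly))  # uncorrelated observation errors
    return c_go @ np.linalg.solve(c_oo + r, obs_anomaly)


# Toy setup (all values hypothetical): a 20 x 20 grid, a synthetic surface
# image, and 15 sparse "in-situ" samples of a field correlated with it.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:1:20j, 0:1:20j]
surface = np.sin(4 * xx) * np.cos(3 * yy)   # stand-in for an SST or SSH image
truth = 0.8 * surface                       # stand-in for a salinity anomaly
obs_idx = rng.choice(truth.size, size=15, replace=False)
obs_anomaly = truth.ravel()[obs_idx]

# Classical OI: covariance depends on Euclidean distance between positions.
positions = np.column_stack([xx.ravel(), yy.ravel()])
est_euclid = optimal_interpolation(positions, positions[obs_idx],
                                   obs_anomaly, length_scale=0.2)

# Feature-space OI: the surface value is appended to the position, so points
# that look alike in the image are treated as "close" even when far apart.
features = np.column_stack([positions, surface.ravel()])
est_latent = optimal_interpolation(features, features[obs_idx],
                                   obs_anomaly, length_scale=0.2)
```

Both estimates are flat vectors that can be reshaped to the grid with `est_latent.reshape(truth.shape)` and compared against `truth`. The only difference between the two calls is the feature matrix, which is the point of the sketch: the OI algebra stays unchanged and only the notion of distance is replaced.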
We favored a network structure composed of independent nested modules to facilitate the understanding and analysis
of its internal information flow from the input data to the covariance structure. To the best of our knowledge, this is the
first work in which neural networks achieve an optimal combination of remote-sensing and in-situ observations
without previous knowledge of the study area's climatology. This study is structured as follows: section 2 presents the
main synthetic dataset that we used for training and testing, together with some real observations for a preliminary
use-case scenario. The details of the network architecture are given in section 3, while the results are shown in section 4.
2 Data
Neural networks need large amounts of data to be trained appropriately. A common choice in oceanography, where such
a large quantity of actual observations is unavailable, is to rely on numerical models. In our case, we chose
NATL60, a simulation based on the Nucleus for European Modelling of the Ocean (NEMO). We used the fields of this
model to simulate both remote-sensing and in-situ observations in a so-called Observing System Simulation Experiment
(OSSE). In these experiments, the model output is sampled to replicate the different types of partial observations
available. The advantage is that we can quickly check the obtained improvements, since the model output also provides
the ground truth we aim to reconstruct. The danger of what is usually called "supervised learning", which only aims to
minimize the discrepancy with the provided ground truth, is that the network weights memorize the "right answers",
which in our context means the model climatology. We addressed this problem by including two self-supervised terms
in our loss function, as we describe later, and by carefully selecting a highly varying training and test dataset, as
presented in subsection 2.1.
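Because an OSSE provides the ground truth, any reconstruction can be scored directly against it. The short Python sketch below illustrates one plausible way to do so, using a root-mean-square error and the relative error reduction of a candidate method with respect to a baseline. The function names and the choice of RMSE are assumptions made for illustration; the metrics actually used in this study are those described in the results section.

```python
import numpy as np


def rmse(reconstruction, truth):
    """Root-mean-square error between a reconstructed field and the truth."""
    return float(np.sqrt(np.mean((reconstruction - truth) ** 2)))


def relative_error_reduction(candidate, baseline, truth):
    """Fraction by which the candidate lowers the baseline RMSE."""
    return 1.0 - rmse(candidate, truth) / rmse(baseline, truth)


# Hypothetical fields: the "truth" comes from the model output, the other
# two are reconstructions obtained from the simulated observations.
truth = np.zeros((4, 4))
baseline = np.full((4, 4), 1.0)
candidate = np.full((4, 4), 0.6)
reduction = relative_error_reduction(candidate, baseline, truth)
```

A positive value means the candidate reconstruction is closer to the model truth than the baseline, and a negative value means it is worse.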
Finally, we tested the generalization capabilities of our network with actual multi-platform observations. In
particular, we used the remote-sensing products of Sea Surface Temperature (SST) and Chlorophyll-a concentration
(CHL) from CMEMS, together with temperature observations from gliders, as described in subsection 2.2.
Figure 1: Training area (A) and testing area (B) presented with the SWOT passages in the fast-sampling phase.
2.1 eNATL60-based OSSE
Our primary experiments utilized the eNATL60 configuration of the Nucleus for European Modelling of the Ocean
(NEMO) model [Gurvan et al., 2022], featuring a 1/60° horizontal resolution and 300 vertical levels across the North
Atlantic. This high-resolution configuration is essential for understanding ocean dynamics, particularly for surface
oceanic motions down to 15 km, which aligns with SWOT observations (Ajayi et al. [2020]). We direct readers to
this work for a detailed understanding of NATL60’s capabilities. Additionally, numerous studies have employed the
non-extended version of NATL60 for resolving fine-scale dynamical processes [Metref et al., 2019, 2020, Fresnay
et al., 2018, Amores et al., 2018].
For our training and testing data, we utilized daily averages of Sea Surface Temperature (SST) and Sea Surface Height
(SSH), both individually and combined, from the eNATL60 simulation spanning an entire year. Alongside these, we
gathered in-situ salinity observations at three specific depths: 5 m, 75 m, and 150 m. Our focus was then to reconstruct
the 2D salinity fields at these depths. In particular, our analysis predominantly focused on the 5 m and 150 m depths,
selected to assess the robustness of our model both within and beyond the mixed-layer depth. To ensure that our
network’s training and testing in-situ observations mirrored real oceanographic conditions, we adopted two distinct
sampling strategies: random and regular. This approach allowed us to evaluate the network’s performance in various
realistic observational scenarios. The random strategy selects N domain points based on a uniform distribution, while
the regular strategy uses a homogeneous grid sampling with a fixed spacing of δx. By varying N and δx, we conducted
different experiments to observe metric variations.
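The two sampling strategies can be sketched in Python as boolean masks over a 2D model field. This is a minimal illustration under stated assumptions: the function names are hypothetical, the spacing is expressed in grid cells rather than physical distance, and the random strategy is assumed to draw points without replacement.

```python
import numpy as np


def random_sampling_mask(shape, n_points, seed=None):
    """Mark n_points grid cells drawn uniformly, without replacement."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(shape, dtype=bool)
    flat_idx = rng.choice(mask.size, size=n_points, replace=False)
    mask.flat[flat_idx] = True
    return mask


def regular_sampling_mask(shape, spacing):
    """Mark every `spacing`-th grid cell along both horizontal axes."""
    mask = np.zeros(shape, dtype=bool)
    mask[::spacing, ::spacing] = True
    return mask


# Hypothetical 2D salinity field standing in for one model depth level.
rng = np.random.default_rng(1)
salinity = 35.0 + 0.1 * rng.standard_normal((20, 30))

random_mask = random_sampling_mask(salinity.shape, n_points=50, seed=0)
regular_mask = regular_sampling_mask(salinity.shape, spacing=5)

# Simulated in-situ observations: known where sampled, NaN elsewhere.
random_obs = np.where(random_mask, salinity, np.nan)
regular_obs = np.where(regular_mask, salinity, np.nan)
```

Varying `n_points` and `spacing` then changes the observation density in the same way that N and δx do in the experiments described above.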
Our focus was on two marine areas: the subpolar northwest Atlantic for training, and the Western Mediterranean
Sea for testing. Both regions are notable for SWOT passages during its rapid-sampling phase (see Figure 1). The
Mediterranean region, in particular, is known for its dynamic oceanographic characteristics and has been extensively