ArgoSSM A State-space Model of Ocean Floats Under Ice_2


A PROBABILISTIC MODEL OF OCEAN FLOATS UNDER ICE

A PREPRINT

Derek Hansen∗

Department of Statistics

University of Michigan

dereklh@umich.edu

Drew Yarger

Department of Statistics

University of Michigan

dyarger@umich.edu

October 4, 2022

ABSTRACT

The Argo project deploys thousands of floats throughout the world’s oceans. Carried only by the current, these floats take measurements such as temperature and salinity at depths of up to two kilometers. These measurements are critical for scientific tasks such as modeling climate change, estimating temperature and salinity fields, and tracking the global hydrological cycle. In the Southern Ocean, Argo floats frequently drift under ice cover which prevents tracking via GPS. Managing this missing location data is an important scientific challenge for the Argo project. To predict the floats’ trajectories under ice and quantify their uncertainty, we introduce a probabilistic state-space model (SSM) called ArgoSSM. ArgoSSM infers the posterior distribution of a float’s position and velocity at each time based on all available data, which includes GPS measurements, ice cover, and potential vorticity. This inference is achieved via an efficient particle filtering scheme, which is effective despite the high signal-to-noise ratio in the GPS data. Compared to existing interpolation approaches in oceanography, ArgoSSM more accurately predicts held-out GPS measurements. Moreover, because uncertainty estimates are well-calibrated in the posterior distribution, ArgoSSM enables more robust and accurate temperature, salinity, and circulation estimates.

Keywords: Sequential Monte Carlo, Kalman filter, physical oceanography, sensor network, spatiotemporal modeling

1 Introduction

Reliable measurements of the ocean are essential for scientific tasks such as modeling climate change [Lyman and Johnson, 2014], estimating temperature and salinity fields [Chang et al., 2009], and tracking the global hydrological cycle [Hosoda et al., 2009]. However, measuring the vast reaches of the ocean is a challenging task. Historically, measurements were taken by sensors onboard ships, but this limited most collection to popular routes, far from a comprehensive survey of the ocean.

To address this issue, the Argo project [Argo, 2020] was started in 1998 as a multinational collaboration to collect data about the world’s oceans. Instead of using ships, the Argo project deploys floats that freely drift with the ocean’s currents at a depth of one kilometer. Every ten days, a float descends to two kilometers and then ascends to the surface, measuring temperature, salinity, and pressure at multiple depths along the way. At the surface, the float records its position via GPS and transmits data via the Iridium satellite system to Argo data centers for processing. Each iteration of this data collection process is called a profile.

The implementation of an ice avoidance program in Klatt et al. [2007] has enabled floats to explore previously inaccessible regions such as the Southern Ocean near Antarctica. In these regions, when ice cover is detected, an Argo float does not attempt to resurface to avoid damage. While the float continues to take profiles as normal, GPS tracking can be lost for more than six months. Because the float moves freely, GPS tracking is critical for pinpointing its whereabouts when data is collected. Figure 1 illustrates both the extent and seasonality of missing locations in the Southern Ocean. Profiles with missing locations make up 16% of the Argo dataset in the Southern Ocean [Chamberlain et al., 2018, Reeve et al., 2019]. Thus, simply removing the measurements without locations leads to both spatial and seasonal biases in estimation tasks [Gray and Riser, 2014, Reeve et al., 2016, 2019]. In turn, these biases can affect our understanding of systems in which the Southern Ocean plays an important role, such as the global climate [e.g. Gray et al., 2018].

∗Corresponding author.

arXiv:2210.00118v1 [stat.AP] 30 Sep 2022

ArgoSSM A PREPRINT

Figure 1: (Left) Recorded locations of 46 selected floats in the Weddell Gyre, a region of the Southern Ocean. Each dot represents a profile, a collection of temperature and salinity measurements. In this plot, missing locations of profiles are linearly interpolated between valid location measurements and do not reflect the floats’ true positions (or the positions’ uncertainties). (Right) Number of profiles collected by these floats in each month of the year. Nearly all profiles collected in the winter had missing locations.

The most common way to handle missing locations is to linearly interpolate between the nearest two observed locations [Wong et al., 2020]. Chamberlain et al. [2018] improved this technique by interpolating a path based on local estimates of potential vorticity (PV), which takes into account the effect of the Earth’s rotation on the ocean water column [see Section 7.7 of Talley et al., 2011, for more information]. In parallel, Chamberlain et al. [2018] showed how positioning data from sound sources could constrain the locations using a velocity-driven linear state-space model (SSM), but with fixed parameters and no PV component. While their underlying model is probabilistic, they only use the predicted mean and do not report any model-estimated uncertainty.

All of these previous approaches fail to offer well-calibrated estimates of uncertainty alongside their predictions. Chamberlain et al. [2018] estimated that location uncertainty can be as much as 116 km. However, this estimate came from the error in held-out data estimates and is not available for unseen data points actually under the ice. Moreover, these approaches do not offer a way to incorporate location uncertainty into downstream estimation tasks, leading to overconfident final estimates.

To solve both of these issues, we introduce a probabilistic state-space model of Argo float movements. Our model, ArgoSSM, combines available GPS measurements with local conservation of potential vorticity (PV) to more accurately model the floats’ positions and velocities. In addition, we use data on ice concentration in the Southern Ocean to further constrain the floats’ positions and directly characterize the missing data mechanism in our statistical model. Because ArgoSSM is a generative model, we infer a posterior distribution of the trajectory of each float as well as model parameters given all available information. This inference is achieved via an efficient particle filtering scheme that fully adapts to the high signal-to-noise ratio in the GPS data. In a case study of Argo floats in the Southern Ocean, ArgoSSM more accurately predicts held-out GPS measurements than previous interpolation methods. Moreover, ArgoSSM illuminates where location data is particularly sparse and uncertain, quantifying a source of uncertainty for downstream estimates (e.g., temperature, salinity, or ocean circulation) that would have been ignored with imputed location estimates. By providing a principled and flexible statistical framework to handle missing location measurements, ArgoSSM can easily be incorporated into future scientific study of the Southern Ocean.


2 A generative model of ocean ﬂoat movement

To motivate the probabilistic framework of ArgoSSM, we start with a simple model of how Argo floats move through the ocean. Profile data is collected at times $t_1 < t_2 < \dots < t_N$, approximately ten days apart. At each time $n$, let $X_n$ be the geographic position in latitude and longitude at time $t_n$. We take into account the elapsed time $\Delta t_n = t_n - t_{n-1}$ when updating the position from $X_{n-1}$. If the float’s position at $n-1$ was $X_{n-1}$, we might expect that $X_n$ will be close to $X_{n-1}$ with noise proportional to the time passed. This can be written explicitly as a two-dimensional random walk (RW) model:

$$X_n = X_{n-1} + \epsilon^X_n, \tag{1}$$

where $\epsilon^X_n$ follows a multivariate Gaussian distribution with zero mean and covariance $\Delta t_n \Sigma_X$.

For a given index $n$, the expected value of $X_n$ conditioned on $X_{n-1}$ and $X_{n+1}$ is a time-weighted average of $X_{n-1}$ and $X_{n+1}$. Thus, the RW model is a generative model of float movement where the optimal predictor for an unseen point is linear interpolation. Linear interpolation might work well for short gaps in time, but it breaks down for the large gaps in time that are seen in the Southern Ocean. This is because linear interpolation ignores local information such as momentum. If we know the current has carried the float from position $X_{n-1}$ to position $X_n$, that same current will likely carry the float further in the same direction. More specifically, each float has a velocity $V_n$ that indicates where it is headed next. Thus, we modify Equation 1 to take velocity into account:

$$X_n = X_{n-1} + \Delta t_n V_{n-1} + \epsilon^X_n. \tag{2}$$
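To make the contrast between Equations 1 and 2 concrete, here is a minimal Python sketch of the two mean predictions. It assumes positions are plain (latitude, longitude) tuples and that the same $\Sigma_X$ applies on both sides of the gap, which is what makes the RW conditional mean a time-weighted average. The function names and the numbers are illustrative and not taken from the paper.

```python
def rw_conditional_mean(x_prev, x_next, dt_n, dt_next):
    """E[X_n | X_{n-1}, X_{n+1}] under the random walk of Equation 1.

    dt_n is t_n - t_{n-1} and dt_next is t_{n+1} - t_n. The result is
    linear interpolation in time between the two known positions.
    """
    w = dt_n / (dt_n + dt_next)  # fraction of the gap elapsed at t_n
    return tuple(a + w * (b - a) for a, b in zip(x_prev, x_next))


def velocity_step_mean(x_prev, v_prev, dt_n):
    """Mean of X_n given X_{n-1} and V_{n-1} under Equation 2."""
    return tuple(x + dt_n * v for x, v in zip(x_prev, v_prev))


# Hypothetical positions in degrees and velocity in degrees per day.
x_prev = (-65.0, -40.0)
x_next = (-64.0, -38.0)
print(rw_conditional_mean(x_prev, x_next, dt_n=10.0, dt_next=30.0))
print(velocity_step_mean(x_prev, v_prev=(0.02, 0.05), dt_n=10.0))
```

The first function needs a later observation to say anything about direction, while the second carries the direction forward through the velocity state.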

The velocity $V_n$ also changes over time according to an auto-regressive (AR) model:

$$V_n = (1 - \alpha^{\Delta t_n}) v_0 + \alpha^{\Delta t_n} V_{n-1} + \epsilon^V_n, \tag{3}$$

where $\epsilon^V_n$ follows a multivariate Gaussian distribution with zero mean and covariance $\Delta t_n \Sigma_V$. Two parameters govern the velocity update: the long-run velocity of the float $v_0$ and the autoregressive term $\alpha \in [0, 1]$. The parameter $\alpha$ determines how quickly the velocity reverts to the long-run velocity $v_0$.
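The velocity update can be sketched in a few lines of Python. The sketch reads the coefficient in Equation 3 as $\alpha$ raised to the power $\Delta t_n$, an interpretation under which $\alpha = 1$ reduces to a random walk and $\alpha = 0$ returns the long-run velocity immediately. For brevity it also assumes an isotropic $\Sigma_V = \sigma_V^2 I$; the function names are hypothetical.

```python
import random


def ar_velocity_mean(v_prev, v0, alpha, dt):
    """Mean of V_n given V_{n-1}, reading the AR weight as alpha ** dt."""
    decay = alpha ** dt
    return tuple((1.0 - decay) * m + decay * v for m, v in zip(v0, v_prev))


def ar_velocity_step(v_prev, v0, alpha, dt, sigma_v, rng):
    """Draw V_n, assuming Sigma_V = sigma_v**2 * I (a simplification)."""
    mean = ar_velocity_mean(v_prev, v0, alpha, dt)
    sd = sigma_v * dt ** 0.5  # noise covariance is dt * Sigma_V
    return tuple(m + rng.gauss(0.0, sd) for m in mean)


rng = random.Random(0)
v = (0.05, -0.02)  # hypothetical starting velocity
for _ in range(3):
    v = ar_velocity_step(v, v0=(0.0, 0.0), alpha=0.9, dt=10.0,
                         sigma_v=0.005, rng=rng)
    print(v)
```

With a ten-day step, even $\alpha = 0.9$ leaves only about a third of the previous velocity in the mean, which is why the exponent on $\Delta t_n$ matters for irregularly spaced profiles.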

We refer to Equations 2 and 3 collectively as the AR model. While the AR model is more realistic than the RW model, it simplifies the true behavior of the floats. Notably, it ignores the float’s vertical movement as it rises or drops in the ocean and intra-day movement on the ocean surface. Thus, the velocity state-variable should be interpreted as the average direction of the float over several days rather than a local estimate of the instantaneous velocity. The AR model is similar to the state-space model introduced in Chamberlain et al. [2018], though with key differences. The main modeling difference is that Chamberlain et al. [2018] updates the velocity according to a random walk (corresponding to $\alpha = 1$ in Equation 3). In Section 5, we find for many floats that the inferred autoregressive parameter $\alpha$ is significantly less than $1$. Moreover, Chamberlain et al. [2018] fixes all parameter values, whereas we estimate them alongside the positions and velocities.

Information about the float’s position comes from GPS measurements $Y_n$ corresponding to each $X_n$:

$$Y_n = X_n + \epsilon^Y_n, \tag{4}$$

where $\epsilon^Y_n$ is the measurement error that follows a multivariate Gaussian with zero mean and covariance $\Sigma_Y$. With the Iridium satellite system, Argo GPS measurements are rated to be accurate to within eight meters [Wong et al., 2020], so the variability of the measurement error $\epsilon^Y_n$ in Equation 4 will typically be orders of magnitude lower than that of the transition error $\epsilon^X_n$ in Equation 2.
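The practical effect of such a precise measurement can be seen in a one-dimensional Gaussian update, sketched below. This is a textbook conjugate update applied to a single coordinate, not the paper's inference procedure. The units (meters), the 20 km prior spread, and the use of eight meters as a standard deviation are all assumptions made for illustration.

```python
def gaussian_update(prior_mean, prior_var, obs, obs_var):
    """Posterior mean and variance for one coordinate after a Gaussian observation."""
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = prior_var * obs_var / (prior_var + obs_var)
    return post_mean, post_var


# Hypothetical numbers: predicted position uncertain to about 20 km,
# GPS fix uncertain to about 8 m, fix lands 15 km from the prediction.
mean, var = gaussian_update(prior_mean=0.0, prior_var=20_000.0 ** 2,
                            obs=15_000.0, obs_var=8.0 ** 2)
print(mean, var ** 0.5)
```

Under these numbers the posterior essentially sits on the observation, so whenever a fix is available the state is pinned down almost exactly and nearly all remaining uncertainty comes from the stretches without GPS.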

2.1 Missing due to ice cover

While GPS measurements accurately pin down the floats’ locations, they may not always be available. To represent this availability, let $A_n$ be an indicator variable that equals one if GPS is available at time $t_n$ and zero otherwise. In the Southern Ocean, since the float only surfaces after three consecutive ice-free detections, $A_n$ is mostly determined by the ice-avoidance algorithm [Klatt et al., 2007], which depends on the concentration of ice in the area.

To model the availability indicator $A_n$, we first require an estimated probability of detecting ice. We have available daily ice concentration estimates from Fetterer et al. [2017], which uses remotely-sensed data from microwave instruments on satellites. Let $E(x, t)$ be the concentration of ice at position $x$ and time $t$. Accounting for imperfect ice detection due to limited resolution, the probability that the float detects ice at position $x$ and time $t$ is $\tilde{E}(x, t) = p_{TPR} E(x, t) + (1 - p_{TNR}) (1 - E(x, t))$, where $p_{TPR}$ is the “true positive rate” (correctly detecting ice) and $p_{TNR}$ is the “true negative rate” (correctly detecting no ice). The first term is a correct detection where ice is present and the second is a false detection where it is absent. We expect $p_{TPR}$ and $p_{TNR}$ to be close to $1$, but since detections are based on the temperature of the water, we expect to see more false positives than false negatives (i.e. $p_{TNR} < p_{TPR}$).
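As a small illustration, the detection probability can be computed as below. The sketch takes the probability to be the true positive rate weighted by the ice concentration plus the false positive rate ($1 - p_{TNR}$) weighted by the ice-free fraction, which is an assumption about how to read the formula above. The function name and the rate values are made up for the example.

```python
def ice_detection_prob(concentration, p_tpr, p_tnr):
    """Probability that the float reports ice given local ice concentration.

    concentration is E(x, t) in [0, 1]; p_tpr and p_tnr are the true
    positive and true negative rates of the detector.
    """
    if not 0.0 <= concentration <= 1.0:
        raise ValueError("ice concentration must lie in [0, 1]")
    return p_tpr * concentration + (1.0 - p_tnr) * (1.0 - concentration)


# Hypothetical rates with more false positives than false negatives.
for e in (0.0, 0.5, 1.0):
    print(e, ice_detection_prob(e, p_tpr=0.95, p_tnr=0.90))
```

In open water the result equals the false positive rate and under full ice cover it equals the true positive rate, so the value always stays in $[0, 1]$.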


[Diagram: states $S_n = 0$, $S_n = 1$, $S_n = 2$ (do not surface) and $S_n = 3$ (surface), with ice-free transitions labeled $1 - \tilde{E}_n$.]

Figure 2: Transition diagram of the state $S_n$ of the Argo ice detection algorithm. The transition to the next state is determined based on whether ice is detected or not. The probability of detecting ice, $\tilde{E}_n \equiv \tilde{E}(X_n, t_n)$, depends on both the geographic position $X_n$ and the time $t_n$. Surfacing requires at least three consecutive ice-free detections.

With the probability of detecting ice at a particular time and location, we model the state of the ice-avoidance algorithm as a Markov chain, illustrated in Figure 2. The state of the algorithm, denoted $S_n$, can take one of four possible values in $\{0, 1, 2, 3\}$. If ice is detected, the state $S_n$ resets to $0$. Otherwise, the state increments by one (i.e. $S_n = (S_{n-1} + 1) \wedge 3$, where $\wedge$ denotes the minimum). The probability of transitioning to $S_n = 0$ is equal to the probability of detecting ice $\tilde{E}(X_n, t_n)$. Likewise, the probability of maintaining the streak of ice-free detections is $1 - \tilde{E}(X_n, t_n)$. The ice-avoidance state $S_n$ determines whether the float surfaces, which directly impacts the availability of the GPS measurement $Y_n$. For $S_n \in \{0, 1, 2\}$ the float will not surface, so the measurement is missing ($A_n = 0$) with probability 1. If $S_n = 3$, the ice-avoidance algorithm will no longer prevent the float from surfacing. In this case, $P(A_n = 1 \mid S_n = 3) = 1 - p_{MAR}$, where $p_{MAR}$ is the probability that the GPS measurement is missing for reasons other than ice avoidance.
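The chain described above is small enough to write out directly. The following Python sketch encodes the reset-or-increment rule, the transition probabilities, and the availability probability, and simulates a short sequence. The function names and the example detection probabilities are hypothetical.

```python
import random


def next_state(s_prev, ice_detected):
    """Reset to 0 on an ice detection, otherwise count up and cap at 3."""
    return 0 if ice_detected else min(s_prev + 1, 3)


def transition_probs(s_prev, e_tilde):
    """Distribution of S_n given S_{n-1} and the detection probability."""
    return {0: e_tilde, min(s_prev + 1, 3): 1.0 - e_tilde}


def availability_prob(s, p_mar):
    """P(A_n = 1 | S_n): GPS is only possible once the state reaches 3."""
    return 1.0 - p_mar if s == 3 else 0.0


def simulate(e_tildes, p_mar, rng, s0=0):
    """Simulate (S_n, A_n) pairs for a sequence of detection probabilities."""
    s, path = s0, []
    for e_tilde in e_tildes:
        s = next_state(s, rng.random() < e_tilde)
        a = 1 if rng.random() < availability_prob(s, p_mar) else 0
        path.append((s, a))
    return path


# Hypothetical season: heavy ice, then progressively clearer water.
print(simulate([0.9, 0.9, 0.5, 0.1, 0.1, 0.1, 0.1], p_mar=0.05,
               rng=random.Random(0)))
```

Because a single detection resets the counter, even a moderate detection probability can keep the float submerged for many consecutive profiles.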

Even if not of direct interest, knowledge of the ice-avoidance state $S_n$ provides information about the float’s position. In particular, whenever $S_n$ equals $0$, there was most likely ice present at position $X_n$, so it is more likely the float is in a region with high concentration of ice. Similarly, if $S_n \in \{1, 2, 3\}$, then the float did not detect ice, so it is more likely the float is in a region with a low concentration of ice. This relationship is naturally captured in the joint posterior distribution of the three state variables ($X_n, V_n, S_n$) given the observed data. We discuss how this posterior distribution is estimated in Section 3.

2.2 Local conservation of potential vorticity

The motion of a freely-circulating object in the ocean conserves potential vorticity (PV) [Talley et al., 2011]. PV is a function of the depth of the ocean and the vorticity, or local spin, which itself is a function of latitude. Incorporating PV conservation is important for both generating more realistic float trajectories and improving predictive performance in periods without GPS measurements. To incorporate local conservation of PV into the state-space model, we frame it as a probabilistic constraint. From a first-order Taylor approximation and Equation 2, we have that the difference in PV between two positions $X_{n+1}$ and $X_n$ is approximately $\epsilon^{PV}_n = \nabla PV(X_n) \cdot \Delta t_n V_n$, where $\nabla PV(X_n)$ is the gradient of PV with respect to position. Since PV is a conserved quantity, $\epsilon^{PV}_n$ should be close to zero, but not exactly zero due to the first-order approximation and imperfect estimation of PV. To account for this, we suppose $\epsilon^{PV}_n$ follows a univariate Gaussian distribution with mean $0$ and standard deviation $\Delta t_n \sigma_{PV}$, with $\sigma_{PV}$ determining the relative strictness of PV conservation.
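A short Python sketch of this soft constraint follows: it computes the first-order PV change $\nabla PV(X_n) \cdot \Delta t_n V_n$ and scores it under a zero-mean Gaussian with standard deviation $\Delta t_n \sigma_{PV}$. The gradient, velocities, and $\sigma_{PV}$ used here are made-up values in arbitrary units, and the function names are hypothetical.

```python
import math


def pv_residual(grad_pv, v, dt):
    """First-order change in PV over one step: grad PV . (dt * V)."""
    return dt * (grad_pv[0] * v[0] + grad_pv[1] * v[1])


def pv_log_density(residual, dt, sigma_pv):
    """Log density of the residual under N(0, (dt * sigma_pv)**2)."""
    sd = dt * sigma_pv
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - 0.5 * (residual / sd) ** 2


grad = (1.0, 0.0)     # hypothetical PV gradient
along = (0.0, 0.3)    # velocity along a PV contour
across = (0.3, 0.0)   # velocity across PV contours
for v in (along, across):
    r = pv_residual(grad, v, dt=10.0)
    print(r, pv_log_density(r, dt=10.0, sigma_pv=0.1))
```

Motion along a PV contour leaves the residual at zero and is not penalized, while motion across contours is penalized more heavily as $\sigma_{PV}$ shrinks.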

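Because $\epsilon^{PV}_n$ is linear with respect to $V_n$ and Gaussian, it is an implicit measurement of $V_n$ that can be incorporated as a Bayesian update of the autoregressive velocity update from Equation 3:

$$V_n \mid X_n, V_{n-1} = B \left[ (1 - \alpha^{\Delta t_n}) v_0 + \alpha^{\Delta t_n} V_{n-1} \right] + B^{1/2} \epsilon^V_n, \tag{5}$$

where $B = \left( I + \frac{1}{\sigma_{PV}^2} (\Delta t_n \Sigma_V) \nabla PV(X_n) \nabla PV(X_n)^\top \right)^{-1}$ is a matrix that encompasses the effect of PV conservation.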

The form of $B$ follows from a conjugate Bayesian update with a Gaussian prior and likelihood [see Section 3.5 of Gelman et al., 2015]. Notably, $B$ shrinks the component of velocity in the direction of $\nabla PV(X_n)$ while leaving the other component unchanged. This discourages large changes in PV, leading to more realistic predictions of under ice float trajectories. In turn, this improves predictive performance during periods with no GPS measurements (Section 5).
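The shrinkage behaviour can be checked numerically. The sketch below builds the matrix $B = (I + \sigma_{PV}^{-2} (\Delta t_n \Sigma_V) \nabla PV \nabla PV^\top)^{-1}$ for the two-dimensional case using plain Python lists, then applies it to a velocity. The example takes $\Sigma_V$ to be isotropic and the gradient to point along the first axis, both assumptions chosen so the effect is easy to read off; the function name is hypothetical.

```python
def pv_shrinkage_matrix(grad_pv, sigma_v, dt, sigma_pv):
    """Return the 2x2 matrix B for a PV gradient and velocity covariance.

    grad_pv is a length-2 gradient, sigma_v a 2x2 covariance (nested
    lists), dt the elapsed time, sigma_pv the PV tolerance.
    """
    g1, g2 = grad_pv
    outer = [[g1 * g1, g1 * g2], [g2 * g1, g2 * g2]]
    a = [[dt * sigma_v[i][j] for j in range(2)] for i in range(2)]
    scale = 1.0 / sigma_pv ** 2
    # m = I + scale * (a @ outer)
    m = [[(1.0 if i == j else 0.0)
          + scale * sum(a[i][k] * outer[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


b = pv_shrinkage_matrix(grad_pv=(1.0, 0.0),
                        sigma_v=[[0.5, 0.0], [0.0, 0.5]],
                        dt=10.0, sigma_pv=1.0)
v = (0.6, 0.6)  # hypothetical velocity mean before the PV update
shrunk = tuple(b[i][0] * v[0] + b[i][1] * v[1] for i in range(2))
print(b)
print(shrunk)
```

In this configuration the component along the gradient is divided by $1 + \Delta t_n \sigma_V^2 \lVert \nabla PV \rVert^2 / \sigma_{PV}^2$ and the perpendicular component passes through untouched.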

The value of $\nabla PV(X_n)$ used in the update is estimated from known ocean depth and latitude. We use bathymetry exported from the Southern Ocean State Estimate (SOSE) [Verdy and Mazloff, 2017]. Potential vorticity is approximated by the expression $PV(X_n) = f(X_n)/h(X_n)$ where $h(X_n)$ is the ocean depth at $X_n$ and $f(X_n) =$
