ArgoSSM A State-space Model of Ocean Floats Under Ice_2

A PROBABILISTIC MODEL OF OCEAN FLOATS UNDER ICE
A PREPRINT
Derek Hansen
Department of Statistics
University of Michigan
dereklh@umich.edu
Drew Yarger
Department of Statistics
University of Michigan
dyarger@umich.edu
October 4, 2022
ABSTRACT
The Argo project deploys thousands of floats throughout the world’s oceans. Carried only by the
current, these floats take measurements such as temperature and salinity at depths of up to two
kilometers. These measurements are critical for scientific tasks such as modeling climate change,
estimating temperature and salinity fields, and tracking the global hydrological cycle. In the Southern
Ocean, Argo floats frequently drift under ice cover which prevents tracking via GPS. Managing this
missing location data is an important scientific challenge for the Argo project. To predict the floats’
trajectories under ice and quantify their uncertainty, we introduce a probabilistic state-space model
(SSM) called ArgoSSM. ArgoSSM infers the posterior distribution of a float’s position and velocity
at each time based on all available data, which includes GPS measurements, ice cover, and potential
vorticity. This inference is achieved via an efficient particle filtering scheme, which is effective despite
the high signal-to-noise ratio in the GPS data. Compared to existing interpolation approaches in
oceanography, ArgoSSM more accurately predicts held-out GPS measurements. Moreover, because
uncertainty estimates are well-calibrated in the posterior distribution, ArgoSSM enables more robust
and accurate temperature, salinity, and circulation estimates.
Keywords: Sequential Monte Carlo, Kalman filter, physical oceanography, sensor network, spatiotemporal modeling
1 Introduction
Reliable measurements of the ocean are essential for scientific tasks such as modeling climate change [Lyman and
Johnson, 2014], estimating temperature and salinity fields [Chang et al., 2009], and tracking the global hydrological
cycle [Hosoda et al., 2009]. However, measuring the vast reaches of the ocean is a challenging task. Historically,
measurements were taken by sensors onboard ships, but this limited most collection to popular routes, far from a
comprehensive survey of the ocean.
To address this issue, the Argo project [Argo, 2020] was started in 1998 as a multinational collaboration to collect
data about the world’s oceans. Instead of using ships, the Argo project deploys floats that freely drift with the ocean’s
currents at a depth of one kilometer. Every ten days, a float descends to two kilometers and then ascends to the surface,
measuring temperature, salinity, and pressure at multiple depths along the way. At the surface, the float records its
position via GPS and transmits data via the Iridium satellite system to Argo data centers for processing. Each iteration
of this data collection process is called a profile.
The implementation of an ice avoidance program in Klatt et al. [2007] has enabled floats to explore previously
inaccessible regions such as the Southern Ocean near Antarctica. In these regions, when ice cover is detected, an
Argo float does not attempt to resurface to avoid damage. While the float continues to take profiles as normal, GPS
tracking can be lost for more than six months. Because the float moves freely, GPS tracking is critical for pinpointing
its whereabouts when data is collected. Figure 1 illustrates both the extent and seasonality of missing locations in the
Southern Ocean. Profiles with missing locations make up 16% of the Argo dataset in the Southern Ocean [Chamberlain
et al., 2018, Reeve et al., 2019]. Thus, simply removing the measurements without locations leads to both spatial and
seasonal biases in estimation tasks [Gray and Riser, 2014, Reeve et al., 2016, 2019]. In turn, these biases can affect our
understanding of systems in which the Southern Ocean plays an important role, such as the global climate [e.g. Gray
et al., 2018].
Corresponding author.
arXiv:2210.00118v1 [stat.AP] 30 Sep 2022
ArgoSSM A PREPRINT
Figure 1: (Left) Recorded locations of 46 selected floats in the Weddell Gyre, a region of the Southern Ocean. Each dot
represents a profile, a collection of temperature and salinity measurements. In this plot, missing locations of profiles are
linearly interpolated between valid location measurements and do not reflect the floats’ true positions (or the positions’
uncertainties). (Right) Number of profiles collected by these floats in each month of the year. Nearly all profiles
collected in the winter had missing locations.
The most common way to handle missing locations is to linearly interpolate between the nearest two observed locations
[Wong et al., 2020]. Chamberlain et al. [2018] improved this technique by interpolating a path based on local estimates
of potential vorticity (PV), which takes into account the effect of the Earth’s rotation on the ocean water column [see
Section 7.7 of Talley et al., 2011, for more information]. In parallel, Chamberlain et al. [2018] showed how positioning
data from sound sources could constrain the locations using a velocity-driven linear state-space model (SSM), but with
fixed parameters and no PV component. While their underlying model is probabilistic, they only use the predicted mean
and do not report any model-estimated uncertainty.
All of these previous approaches fail to offer well-calibrated estimates of uncertainty alongside their predictions.
Chamberlain et al. [2018] estimated that location uncertainty can be as much as 116 km. However, this estimate came
from the error in held-out data estimates and is not available for unseen data points actually under the ice. Moreover,
these approaches do not offer a way to incorporate location uncertainty into downstream estimation tasks, leading to
overconfident final estimates.
To solve both of these issues, we introduce a probabilistic state-space model of Argo float movements. Our model,
ArgoSSM, combines available GPS measurements with local conservation of potential vorticity (PV) to more accurately
model the floats’ positions and velocities. In addition, we use data on ice concentration in the Southern Ocean to
further constrain the floats’ positions and directly characterize the missing data mechanism in our statistical model.
Because ArgoSSM is a generative model, we infer a posterior distribution of the trajectory of each float as well as model
parameters given all available information. This inference is achieved via an efficient particle filtering scheme that fully
adapts to the high signal-to-noise ratio in the GPS data. In a case study of Argo floats in the Southern Ocean, ArgoSSM
more accurately predicts held-out GPS measurements than previous interpolation methods. Moreover, ArgoSSM
illuminates where location data is particularly sparse and uncertain, quantifying a source of uncertainty for downstream
estimates (e.g., temperature, salinity, or ocean circulation) that would have been ignored with imputed location estimates.
By providing a principled and flexible statistical framework to handle missing location measurements, ArgoSSM can
easily be incorporated into future scientific study of the Southern Ocean.
2 A generative model of ocean float movement
To motivate the probabilistic framework of ArgoSSM, we start with a simple model of how Argo floats move through
the ocean. Profile data is collected at times $t_1 < t_2 < \dots < t_N$, approximately ten days apart. At each time $t_n$,
let $X_n$ be the geographic position in latitude and longitude. We take into account the elapsed time
$\Delta t_n = t_n - t_{n-1}$ when updating the position from $X_{n-1}$ to $X_n$. If the float’s position at $n-1$ was
$X_{n-1}$, we might expect that $X_n$ will be close to $X_{n-1}$ with noise proportional to the time passed. This can be
written explicitly as a two-dimensional random walk (RW) model:
$$X_n = X_{n-1} + \epsilon^X_n, \tag{1}$$
where $\epsilon^X_n$ follows a multivariate Gaussian distribution with zero mean and covariance $\Delta t_n \Sigma_X$.
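As a short worked consequence of Equation 1 (an illustration under the stated Gaussian assumptions, not part of the original derivation): if both neighbors $X_{n-1}$ and $X_{n+1}$ are known and the same $\Sigma_X$ governs both steps, standard Gaussian conditioning gives

```latex
\mathbb{E}[X_n \mid X_{n-1}, X_{n+1}]
  = \frac{\Delta t_{n+1}\, X_{n-1} + \Delta t_n\, X_{n+1}}{\Delta t_n + \Delta t_{n+1}},
\qquad
\operatorname{Cov}[X_n \mid X_{n-1}, X_{n+1}]
  = \frac{\Delta t_n\, \Delta t_{n+1}}{\Delta t_n + \Delta t_{n+1}}\, \Sigma_X .
```

With equal gaps the conditional mean is the midpoint of the two neighbors, and the conditional covariance grows with the length of the gap.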
For a given index $n$, the expected value of $X_n$ conditioned on $X_{n-1}$ and $X_{n+1}$ is a time-weighted average of
$X_{n-1}$ and $X_{n+1}$. Thus, the RW model is a generative model of float movement where the optimal predictor for an
unseen point is linear interpolation. Linear interpolation might work well for short gaps in time, but it breaks down for
the large gaps in time that are seen in the Southern Ocean. This is because linear interpolation ignores local information
such as momentum. If we know the current has carried the float from position $X_{n-1}$ to position $X_n$, that same
current will likely carry the float further in the same direction. More specifically, each float has a velocity $V_n$ that
indicates where it is headed next. Thus, we modify Equation 1 to take velocity into account:
$$X_n = X_{n-1} + \Delta t_n V_{n-1} + \epsilon^X_n. \tag{2}$$
The velocity $V_n$ also changes over time according to an auto-regressive (AR) model:
$$V_n = (1 - \alpha^{\Delta t_n}) v_0 + \alpha^{\Delta t_n} V_{n-1} + \epsilon^V_n, \tag{3}$$
where $\epsilon^V_n$ follows a multivariate Gaussian distribution with zero mean and covariance $\Delta t_n \Sigma_V$.
Two parameters govern the velocity update: the long-run velocity of the float $v_0$ and the autoregressive term
$\alpha \in [0, 1]$. The parameter $\alpha$ determines how quickly the velocity reverts to the long-run velocity $v_0$.
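The position and velocity updates in Equations 2 and 3 can be sketched as a forward simulation. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function name `simulate_ar_float`, the units, and all numerical values are hypothetical, and the exponent form $\alpha^{\Delta t_n}$ is assumed for the autoregressive weight.

```python
import numpy as np

def simulate_ar_float(x0, v_init, v0, alpha, sigma_x, sigma_v, dts, rng):
    """Simulate positions X_n and velocities V_n from Equations 2 and 3.

    x0, v_init       : starting position and velocity, shape (2,)
    v0               : long-run velocity, shape (2,)
    alpha            : autoregressive term in [0, 1]
    sigma_x, sigma_v : 2x2 covariance matrices (Sigma_X and Sigma_V)
    dts              : elapsed times Delta t_n between consecutive profiles
    """
    xs = [np.asarray(x0, dtype=float)]
    vs = [np.asarray(v_init, dtype=float)]
    v0 = np.asarray(v0, dtype=float)
    for dt in dts:
        x_prev, v_prev = xs[-1], vs[-1]
        # Equation 2: drift with the previous velocity, plus transition noise.
        eps_x = rng.multivariate_normal(np.zeros(2), dt * np.asarray(sigma_x))
        xs.append(x_prev + dt * v_prev + eps_x)
        # Equation 3: the velocity reverts toward v0 at a rate set by alpha.
        a = alpha ** dt
        eps_v = rng.multivariate_normal(np.zeros(2), dt * np.asarray(sigma_v))
        vs.append((1.0 - a) * v0 + a * v_prev + eps_v)
    return np.array(xs), np.array(vs)

# Hypothetical example: 20 profiles, ten days apart.
rng = np.random.default_rng(0)
positions, velocities = simulate_ar_float(
    x0=[0.0, 0.0], v_init=[0.05, 0.0], v0=[0.02, 0.01], alpha=0.95,
    sigma_x=1e-3 * np.eye(2), sigma_v=1e-5 * np.eye(2),
    dts=[10.0] * 20, rng=rng,
)
```

Setting `alpha=1.0` removes the pull toward `v0`, so the velocity follows a pure random walk.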
We refer to Equations 2 and 3 collectively as the AR model. While the AR model is more realistic than the RW model,
it simplifies the true behavior of the floats. Notably, it ignores the float’s vertical movement as it rises or drops in the
ocean and intra-day movement on the ocean surface. Thus, the velocity state-variable should be interpreted as the
average direction of the float over several days rather than a local estimate of the instantaneous velocity. The AR model
is similar to the state-space model introduced in Chamberlain et al. [2018], though with key differences. The main
modeling difference is that Chamberlain et al. [2018] update the velocity according to a random walk (corresponding to
$\alpha = 1$ in Equation 3). In Section 5, we find for many floats that the inferred autoregressive parameter $\alpha$ is
significantly less than 1. Moreover, Chamberlain et al. [2018] fix all parameter values, whereas we estimate them
alongside the positions and velocities.
Information about the float’s position comes from GPS measurements $Y_n$ corresponding to each $X_n$:
$$Y_n = X_n + \epsilon^Y_n, \tag{4}$$
where $\epsilon^Y_n$ is the measurement error that follows a multivariate Gaussian with zero mean and covariance
$\Sigma_Y$. With the Iridium satellite system, Argo GPS measurements are rated to be accurate to within eight meters
[Wong et al., 2020], so the variability of the measurement error $\epsilon^Y_n$ in Equation 4 will typically be orders of
magnitude lower than that of the transition error $\epsilon^X_n$ in Equation 2.
2.1 Missing due to ice cover
While GPS measurements accurately pin down the floats’ locations, they may not always be available. To represent this
availability, let $A_n$ be an indicator variable that equals one if GPS is available at time $t_n$ and zero otherwise. In
the Southern Ocean, since the float only surfaces after three consecutive ice-free detections, $A_n$ is mostly determined
by the ice-avoidance algorithm [Klatt et al., 2007], which depends on the concentration of ice in the area.
To model the availability indicator $A_n$, we first require an estimated probability of detecting ice. We have available
daily ice concentration estimates from Fetterer et al. [2017], which uses remotely-sensed data from microwave
instruments on satellites. Let $E(x, t)$ be the concentration of ice at position $x$ and time $t$. Accounting for imperfect
ice detection due to limited resolution, the probability that the float detects ice at position $x$ and time $t$ is
$\tilde{E}(x, t) = p_{TPR} E(x, t) + (1 - p_{TNR})(1 - E(x, t))$, where $p_{TPR}$ is the “true positive rate” (correctly
detecting ice) and $p_{TNR}$ is the “true negative rate” (correctly detecting no ice). We expect $p_{TPR}$ and $p_{TNR}$
to be close to 1, but since detections are based on the temperature of the water, we expect to see more false positives
than false negatives (i.e. $p_{TNR} < p_{TPR}$).
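The detection probability can be written as a small function. This sketch assumes that $E(x, t)$ is treated as the probability that ice is present and that the two detection outcomes are combined by the law of total probability; the function name and the example rates are hypothetical.

```python
def ice_detection_probability(ice_concentration, p_tpr, p_tnr):
    """Probability that the float detects ice, given concentration E(x, t) in [0, 1].

    p_tpr : true positive rate, P(detect ice | ice present)
    p_tnr : true negative rate, P(detect no ice | no ice present)
    """
    if not 0.0 <= ice_concentration <= 1.0:
        raise ValueError("ice_concentration must lie in [0, 1]")
    # True detections where ice is present, plus false alarms where it is not.
    return p_tpr * ice_concentration + (1.0 - p_tnr) * (1.0 - ice_concentration)

# Hypothetical rates with more false positives than false negatives.
p_half_cover = ice_detection_probability(0.5, p_tpr=0.95, p_tnr=0.90)
```

With these illustrative rates, open water (`ice_concentration=0.0`) still yields a detection probability of `1 - p_tnr`, which is how false positives enter the model.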
[Figure 2 diagram: states $S_n = 0$, $S_n = 1$, $S_n = 2$, $S_n = 3$; transitions labeled $1 - \tilde{E}_n$ and $\tilde{E}_n$; outcome labels "Do not surface" and "Surface".]
Figure 2: Transition diagram of the state $S_n$ of the Argo ice detection algorithm. The transition to the next state is
determined based on whether ice is detected or not. The probability of detecting ice, $\tilde{E}_n = \tilde{E}(X_n, t_n)$,
depends on both the geographic position $X_n$ and the time $t_n$. Surfacing requires at least three consecutive ice-free
detections.
With the probability of detecting ice at a particular time and location, we model the state of the ice-avoidance algorithm
as a Markov chain, illustrated in Figure 2. The state of the algorithm, denoted $S_n$, can take one of four possible
values in $\{0, 1, 2, 3\}$. If ice is detected, the state $S_n$ resets to 0. Otherwise, the state increments by one (i.e.
$S_n = \min(S_{n-1} + 1, 3)$). The probability of transitioning to $S_n = 0$ is equal to the probability of detecting ice
$\tilde{E}(X_n, t_n)$. Likewise, the probability of maintaining the streak of ice-free detections is
$1 - \tilde{E}(X_n, t_n)$. The ice-avoidance state $S_n$ determines whether the float surfaces, which directly impacts the
availability of the GPS measurement $A_n$. For $S_n \in \{0, 1, 2\}$, the float will not surface, so the measurement is
missing ($A_n = 0$) with probability 1. If $S_n = 3$, the ice-avoidance algorithm will no longer prevent the float from
surfacing. In this case, $P(A_n = 1 \mid S_n = 3) = 1 - p_{MAR}$, where $p_{MAR}$ is the probability that the GPS
measurement is missing for reasons other than ice avoidance.
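The state transitions and the resulting GPS availability described above can be sketched as two sampling functions. This is an illustrative Python sketch, not the authors' code: the names `step_ice_state` and `gps_available` are hypothetical, and the cap at 3 reflects the four-state chain in the text.

```python
import numpy as np

def step_ice_state(s_prev, p_detect_ice, rng):
    """Sample the next ice-avoidance state S_n in {0, 1, 2, 3}."""
    if rng.random() < p_detect_ice:
        return 0                   # ice detected: the ice-free streak resets
    return min(s_prev + 1, 3)      # ice-free: the streak grows, capped at 3

def gps_available(s, p_mar, rng):
    """Sample the availability indicator A_n given the state S_n."""
    if s < 3:
        return 0                   # fewer than three ice-free detections: no surfacing
    # The float may surface; GPS is still missing with probability p_mar.
    return int(rng.random() >= p_mar)

# Hypothetical run: ice-free conditions for five consecutive profiles.
rng = np.random.default_rng(0)
states = [0]
for _ in range(5):
    states.append(step_ice_state(states[-1], p_detect_ice=0.0, rng=rng))
```

In this run the state climbs from 0 to 3 and then stays at 3, so GPS only becomes available from the third ice-free profile onward.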
Even if not of direct interest, knowledge of the ice-avoidance state $S_n$ provides information about the float’s position.
In particular, whenever $S_n$ equals 0, there was most likely ice present at position $X_n$, so it is more likely the float
is in a region with high concentration of ice. Similarly, if $S_n \in \{1, 2, 3\}$, then the float did not detect ice, so it is
more likely the float is in a region with a low concentration of ice. This relationship is naturally captured in the joint
posterior distribution of the three state variables ($X_n, V_n, S_n$) given the observed data. We discuss how this
posterior distribution is estimated in Section 3.
2.2 Local conservation of potential vorticity
The motion of a freely-circulating object in the ocean conserves potential vorticity (PV) [Talley et al., 2011]. PV is a
function of the depth of the ocean and the vorticity, or local spin, which itself is a function of latitude. Incorporating PV
conservation is important for both generating more realistic float trajectories and improving predictive performance in
periods without GPS measurements. To incorporate local conservation of PV into the state-space model, we frame it as
a probabilistic constraint. From a first-order Taylor approximation and Equation 2, we have that the difference in PV
between two positions $X_{n+1}$ and $X_n$ is approximately $\Delta PV_n = \nabla PV(X_n) \cdot \Delta t_n V_n$, where
$\nabla PV(X_n)$ is the gradient of PV with respect to position. Since PV is a conserved quantity, $\Delta PV_n$ should be
close to zero, but not exactly zero due to the first-order approximation and imperfect estimation of PV. To account for
this, we suppose $\Delta PV_n$ follows a univariate Gaussian distribution with mean 0 and standard deviation
$\Delta t_n \sigma_{PV}$, with $\sigma_{PV}$ determining the relative strictness of PV conservation.
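Because $\Delta PV_n$ is linear with respect to $V_n$ and Gaussian, it is an implicit measurement of $V_n$ that can be
incorporated as a Bayesian update of the autoregressive velocity update from Equation 3:
$$V_n \mid X_n, V_{n-1} = B\left[(1 - \alpha^{\Delta t_n}) v_0 + \alpha^{\Delta t_n} V_{n-1}\right] + B^{1/2} \epsilon^V_n, \tag{5}$$
where
$$B = \left[I + \frac{1}{\sigma_{PV}^2} (\Delta t_n \Sigma_V) \nabla PV(X_n) \nabla PV(X_n)^{\top}\right]^{-1}$$
is a matrix that encompasses the effect of PV conservation.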
The form of $B$ follows from a conjugate Bayesian update with a Gaussian prior and likelihood [see Section 3.5 of
Gelman et al., 2015]. Notably, $B$ shrinks the component of velocity in the direction of $\nabla PV(X_n)$ while leaving the
other component unchanged. This discourages large changes in PV, leading to more realistic predictions of under-ice
float trajectories. In turn, this improves predictive performance during periods with no GPS measurements (Section 5).
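The shrinkage effect of $B$ can be checked numerically. The sketch below is a hypothetical Python illustration that takes $B = [I + (\Delta t_n \Sigma_V) \nabla PV \nabla PV^{\top} / \sigma_{PV}^2]^{-1}$ and applies it to the autoregressive mean from Equation 3; the function name and all input values are assumptions chosen so that the result is easy to verify by hand.

```python
import numpy as np

def pv_adjusted_velocity_mean(v_prev, v0, alpha, dt, sigma_v, grad_pv, sigma_pv):
    """Return the PV-adjusted mean of V_n and the matrix B."""
    grad = np.asarray(grad_pv, dtype=float).reshape(2, 1)
    prior_cov = dt * np.asarray(sigma_v, dtype=float)
    # B from the conjugate Gaussian update; equals the identity when the gradient is zero.
    b = np.linalg.inv(np.eye(2) + prior_cov @ grad @ grad.T / sigma_pv**2)
    # Autoregressive mean from Equation 3, before the PV adjustment.
    a = alpha ** dt
    prior_mean = (1.0 - a) * np.asarray(v0, dtype=float) + a * np.asarray(v_prev, dtype=float)
    return b @ prior_mean, b

# Hypothetical isotropic example with the PV gradient along the first axis.
mean, b = pv_adjusted_velocity_mean(
    v_prev=[2.0, 2.0], v0=[0.0, 0.0], alpha=1.0, dt=1.0,
    sigma_v=np.eye(2), grad_pv=[1.0, 0.0], sigma_pv=1.0,
)
```

In this isotropic example the velocity component along the gradient is halved while the perpendicular component is left as it was, matching the description of $B$ above. A smaller `sigma_pv` enforces PV conservation more strictly and shrinks the along-gradient component further.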
The value of $\nabla PV(X_n)$ used in the update is estimated from known ocean depth and latitude. We use bathymetry
exported from the Southern Ocean State Estimate (SOSE) [Verdy and Mazloff, 2017]. Potential vorticity is approximated
by the expression $PV(X_n) = f(X_n)/h(X_n)$, where $h(X_n)$ is the ocean depth at $X_n$ and $f(X_n) =$