
Contrastive Trajectory Similarity Learning with
Dual-Feature Attention
Yanchuan Chang1, Jianzhong Qi1,§, Yuxuan Liang2, Egemen Tanin1
1The University of Melbourne, 2National University of Singapore
{yanchuanc@student., jianzhong.qi@, etanin@}unimelb.edu.au, yuxliang@outlook.com
Abstract—Trajectory similarity measures act as query pred-
icates in trajectory databases, making them the key player in
determining the query results. They also have a heavy impact
on the query efficiency. An ideal measure should have the
capability to accurately evaluate the similarity between any two
trajectories in a very short amount of time. Towards this aim,
we propose a contrastive learning-based trajectory modeling
method named TrajCL. We present four trajectory augmentation
methods and a novel dual-feature self-attention-based trajectory
backbone encoder. The resultant model can jointly learn both
the spatial and the structural patterns of trajectories. Our model
does not involve any recurrent structures and thus has a high
efficiency. Besides, our pre-trained backbone encoder can be fine-
tuned towards other computationally expensive measures with
minimal supervision data. Experimental results show that TrajCL
is consistently and significantly more accurate than the state-of-
the-art trajectory similarity measures. After fine-tuning to serve as an estimator for heuristic measures, TrajCL can even outperform the state-of-the-art supervised method by up to 56% in accuracy for processing trajectory similarity queries.
Index Terms—Trajectory similarity, spatial databases, con-
trastive learning, transformer
I. INTRODUCTION
A trajectory is commonly represented as a sequence of
location points to describe the movement of an object, such
as a person or a vehicle. Measuring the similarity between
trajectories is a fundamental step in trajectory queries [1]–[6], since it is used as a query predicate that determines
query results and efficiency. Unlike numeric data and character
data, there are not many universally applicable comparison
criteria for trajectory data, and thus measuring similarity
between trajectories is an important area of research.
A series of trajectory similarity measures [7]–[13] have been
proposed, which can be classified into two categories: heuristic
measures and learned measures. Heuristic trajectory similar-
ity measures [7]–[10] mainly aim to find a point-oriented
matching between two trajectories based on hand-crafted rules.
For example, Hausdorff [9] leverages the Euclidean distances
between points on two trajectories to measure trajectory sim-
ilarity. Learned trajectory similarity measures [11]–[14], on
the other hand, utilize deep learning models to predict sim-
ilarity values by computing the distance between trajectory-
oriented embeddings (i.e., numeric vector representations of
trajectories). For example, t2vec [11] and E2DTC [14] adapt
recurrent neural networks (RNN) to encode trajectories into
§Corresponding author
embeddings, while TrjSR [12] uses convolutional neural networks
(CNN) to embed trajectories.
Fig. 1: Querying the 3NN trajectories: (a) Hausdorff (heuristic), (b) t2vec (learned), (c) TrajCL (ours). The query trajectory is in yellow with extra thick lines for easy viewing. The 3NN results are colored in red, green, and blue, respectively.
TABLE I: Trajectory similarity computation time

            Hausdorff   t2vec   TrajCL
Time (µs)        6.63    0.34     0.14
Measures in both categories above face the following
challenges. (1) Ineffectiveness: Trajectories with different
sampling rates or containing noise can degrade the effective-
ness of the existing measures. This is because the heuristic
measures using hand-crafted rules are prone to errors caused by low-
quality trajectories. The learned measures also suffer from
this problem, since they mostly adopt deep learning models
which are not originally designed for trajectory data and may
fail to capture long spatial correlations between trajectory
points and between similar trajectories. For example, Fig. 1
shows the 3-nearest neighbor query results on the Porto taxi
trajectory dataset [15]. The query results obtained using t2vec
(Fig. 1b) are far from the query trajectory. Those obtained
by Hausdorff are closer to the query trajectory (Fig. 1a),
but not as close as those obtained by our TrajCL method
(Fig. 1c), while Hausdorff suffers in efficiency (discussed
next). (2) Inefficiency: Existing heuristic measures compute
the distance between each pair of points on two trajectories.
They take at least quadratic time w.r.t. the number of
trajectory points, which is unacceptable in online systems,
especially when trajectories become longer. Although the
learned measures avoid pairwise point comparisons, they
are still limited in efficiency. As Table I shows, Hausdorff takes
6.63 microseconds to compute the similarity of two Porto taxi
trajectories. t2vec reduces the time by more than an order
of magnitude to 0.34 microseconds. However, its recurrent
structure cannot fully exploit the parallel power of GPUs.
Our TrajCL avoids this recurrent structure and further brings
the computation time down to 0.14 microseconds.
To address these issues, we propose TrajCL, a contrastive
learning-based trajectory similarity measure with a dual-
feature self-attention-based trajectory backbone encoder (Du-
alSTB). TrajCL first leverages our proposed trajectory aug-
mentation methods to generate diverse trajectory variants (i.e.,
so-called views) with different characteristics for each training
sample. Then, the proposed DualSTB encoder embeds the
augmented trajectories into trajectory embeddings, which can
capture the spatial distance correlation between the trajecto-
ries. After that, we compute the similarity of two trajectories
simply as the L1 distance between their embeddings.
Due to the lack of ground-truth for trajectory similarity,
we train our proposed DualSTB encoder by adopting self-
supervised contrastive learning [16], [17] that aims to max-
imize the agreement between the representations of positive
(i.e., similar) data pairs and minimize that of the negative (i.e.,
dissimilar) data pairs, where the positive and negative data
pairs are generated from input data via augmentation methods.
The idea of using contrastive learning for representation
learning is not new. We introduce it to trajectory embedding learning: our first technical contribution is four trajectory augmentation methods that generate the positive and
negative data pairs for contrastive learning over trajectories.
These methods include point shifting, point masking, trajec-
tory truncating, and trajectory simplification. The augmented
trajectories can be regarded as a set of low-quality variants of
the input trajectories with uncertainty. Such diverse trajectories
guide our model to learn the key patterns to differentiate
between similar and not-so-similar trajectory pairs.
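For concreteness, the sketch below illustrates the four augmentations on a trajectory stored as an (n, 2) numpy coordinate array. The distortion radius, mask ratio, truncation ratio, and simplification threshold are hypothetical defaults for illustration only; the actual methods and parameters are specified in Section IV-A.

```python
import numpy as np

def point_shift(T, radius=50.0):
    """Shift every point by uniform noise within a distortion radius
    (in the trajectory's coordinate units; the default is hypothetical)."""
    return T + np.random.uniform(-radius, radius, size=T.shape)

def point_mask(T, ratio=0.3):
    """Randomly drop a fraction of the points, preserving sequence order."""
    k = max(2, int(len(T) * (1 - ratio)))
    keep = np.sort(np.random.choice(len(T), size=k, replace=False))
    return T[keep]

def truncate(T, ratio=0.3):
    """Remove a random prefix and suffix totalling `ratio` of the points."""
    cut = int(len(T) * ratio)
    head = np.random.randint(0, cut + 1)
    return T[head: len(T) - (cut - head)]

def simplify(T, epsilon=100.0):
    """Douglas-Peucker: recursively drop points within `epsilon` of the chord."""
    if len(T) < 3:
        return T
    chord = T[-1] - T[0]
    rel = T[1:-1] - T[0]
    # Perpendicular distance of interior points to the start-end segment.
    d = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / (np.linalg.norm(chord) + 1e-9)
    i = int(np.argmax(d)) + 1
    if d[i - 1] <= epsilon:
        return np.stack([T[0], T[-1]])
    return np.concatenate([simplify(T[:i + 1], epsilon)[:-1], simplify(T[i:], epsilon)])
```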
Our second technical contribution is a dual-feature self-
attention-based trajectory backbone encoder (i.e., DualSTB)
that encodes both the structural and the spatial features of a trajectory into its learned embedding. The two types
of features together provide coarse-grained and fine-grained
location information of trajectories. To obtain a comprehensive
embedding based on the two types of features, we devise a
dual-feature multi-head self-attention module that first learns
the correlations between trajectory points based on each type
of features. Then, the module adaptively combines the two
types of correlations, and finally it forms the output embed-
dings. Such a module can capture the long-term dependency
between trajectory points, while its non-recurrent structure
enables model inference with high efficiency.
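As a rough illustration only (the actual module is specified in Section IV-C), the PyTorch sketch below computes one multi-head attention score map per feature type and fuses the two maps with a single learnable sigmoid gate; the gating scheme is our simplifying assumption, not necessarily the paper's exact adaptive combination.

```python
import torch
import torch.nn as nn

class DualFeatureAttentionSketch(nn.Module):
    """One attention score map per feature type, fused by a learnable gate."""

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.struct_qk = nn.Linear(d_model, 2 * d_model)   # structural-feature Q, K
        self.spatial_qk = nn.Linear(d_model, 2 * d_model)  # spatial-feature Q, K
        self.value = nn.Linear(d_model, d_model)
        self.gate = nn.Parameter(torch.zeros(1))           # fusion weight (assumed form)
        self.n_heads, self.d_head = n_heads, d_model // n_heads

    def _scores(self, x, qk):
        b, n, _ = x.shape
        q, k = qk(x).chunk(2, dim=-1)
        q = q.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        return q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (b, heads, n, n)

    def forward(self, T_struct, S_spatial):
        b, n, d = T_struct.shape
        # Point-to-point correlations under each feature type.
        a_struct = self._scores(T_struct, self.struct_qk)
        a_spatial = self._scores(S_spatial, self.spatial_qk)
        w = torch.sigmoid(self.gate)
        attn = torch.softmax(w * a_struct + (1 - w) * a_spatial, dim=-1)
        v = self.value(T_struct).view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        return (attn @ v).transpose(1, 2).reshape(b, n, d)
```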
After TrajCL is trained, it can be fine-tuned towards any
existing heuristic measure as a fast estimator with little training
effort, similar to the approximate learned measures [18]–[21].
To sum up, we make the following contributions:
1) We propose TrajCL, a contrastive learning-based trajectory
similarity measure that does not rely on any supervision
data during training. Our measure is robust to low-quality
trajectories and efficient in trajectory similarity compu-
tation. Besides, pre-trained TrajCL models can be used to
quickly approximate any existing heuristic trajectory similarity
measure with little training effort.
2) We design four trajectory augmentation methods for our
trajectory contrastive learning framework, to enhance the
robustness of TrajCL on measuring trajectory similarity.
3) We present a dual-feature self-attention-based trajec-
tory backbone encoder, which incorporates the structural
feature-based attention and the spatial feature-based at-
tention adaptively. It can capture more comprehensive
correlations between trajectory points compared with a
vanilla self-attention-based encoder.
4) We conduct extensive experiments on four trajectory
datasets. The results show that: (i) Compared with the
state-of-the-art learned trajectory similarity measures, Tra-
jCL improves the measuring accuracy by 3.22 times and
reduces the running time by more than 50%, on average.
(ii) When acting as a fast estimator of a heuristic measure,
TrajCL outperforms the state-of-the-art supervised method
by up to 56% in terms of the prediction accuracy.
II. RELATED WORK
Trajectory similarity measures. Existing studies on mea-
suring the similarity between two trajectories can be divided
into two categories: heuristic measures and learned measures.
Heuristic measures, in general, compare pairs of points
from two trajectories to find optimal point matches [7]–[10],
[22]–[24]. The (Euclidean) distances aggregated from the
matched points define the similarity of the two trajectories.
Such methods usually take O(n²) time given trajectories of n points each. For example, Hausdorff [9] computes the
maximum point-to-trajectory distance between two trajecto-
ries. Fréchet [10] resembles Hausdorff but requires the point
matches to strictly follow the sequential point order. EDR [7]
and EDwP [8] compute the edit distance between trajectories; EDwP further considers the real point distances and allows interpolated points to account for non-uniform
sampling frequencies. A few other studies [2]–[4] measure
similarity on spatial networks, which are less relevant and are
omitted.
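To make the quadratic cost concrete, the following is a minimal numpy sketch of the discrete Hausdorff measure described above; the n × m pairwise-distance matrix is exactly where the quadratic time goes.

```python
import numpy as np

def hausdorff(T1: np.ndarray, T2: np.ndarray) -> float:
    """Discrete Hausdorff distance between trajectories of shapes (n, 2) and (m, 2)."""
    # All pairwise Euclidean distances: entry (i, j) = ||T1[i] - T2[j]||.
    d = np.linalg.norm(T1[:, None, :] - T2[None, :, :], axis=-1)
    # Each point's distance to the nearest point of the other trajectory;
    # the Hausdorff distance is the largest such nearest-point distance.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```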
A few recent studies [18]–[21], [25]–[27] take a supervised
approach and train a deep learning model to approximate a
heuristic measure (e.g., Hausdorff). Once trained, the model
can predict trajectory similarity in time linear to the embed-
ding dimensionality. For example, NEUTRAJ [18] leverages
LSTMs [28] with a spatial memory module to capture the
correlation between trajectories. Traj2SimVec [19] accelerates
NEUTRAJ training with a sampling strategy, and it uses an
auxiliary loss to capture sub-trajectory similarity. T3S [20] uses
vanilla LSTMs and self-attention [29] to learn heuristic mea-
sures. TrajGAT [21] proposes a graph-based attention model
to capture the long-term dependency between trajectories.
Learned measures [11]–[14] do not require a given heuristic
measure to generate model training signals. These methods
still learn trajectory embeddings with deep learning, which
are expected to be more robust to low-quality (e.g., noisy or sparsely sampled) trajectories, since deep learning models are strong at capturing distinctive data features.
t2vec [11] uses an RNN-based sequence-to-sequence model
to learn trajectory embeddings and then the similarity. It uses
a spatial proximity-aware loss that helps encode the spatial
distance between trajectories. E2DTC [14] leverages t2vec
as the backbone encoder for trajectory clustering. It adds
two loss functions to capture the similarity between trajec-
tories from the same cluster. TrjSR [12] captures the spatial
pattern of trajectories by converting trajectories into images.
CSTRM [13] uses vanilla self-attention as its trajectory encoder
and proposes a multi-view hinge loss to capture both point-
level and trajectory-level similarities between trajectories. It
generates positive trajectory pairs using two augmentation
methods, i.e., point shifting and point masking, which are
empirically shown to be sub-optimal in Section V.
Our model is a learned trajectory similarity measure. It aims
to address the limitations of the existing learned measures in
effectiveness and efficiency as discussed in Section I.
Contrastive learning. Contrastive learning [16], [17], [30]–
[38] is a self-supervised learning technique. Its core idea is to
maximize the agreement between the learned representations
of similar objects (i.e., positive pairs) while minimizing that
between dissimilar objects (i.e., negative pairs). The positive
and the negative sample pairs are generated from an input
dataset, and no supervision (labeled) data is needed. Once
trained, the representation generation model (i.e., a backbone
encoder) can be connected to downstream models, to generate
object representations for downstream learning tasks (e.g.,
classification). A few studies introduce contrastive learning
into spatial problems, such as traffic flow prediction [39].
Self-attention models. Self-attention-based models [29],
[40]–[42] learn the correlation between every two elements
of an input sequence. Studies have adopted self-attention for
trajectory similarity measurement (i.e., T3S and CSTRM).
Unlike our model, both T3S and CSTRM adopt the vanilla
multi-head self-attention encoder [29], while we propose a
dual-feature self-attention-based encoder which can capture
trajectory features from two levels of granularity and thus
generate more robust embeddings.
III. SOLUTION OVERVIEW
We consider a trajectory T as a sequence of points recording discrete locations of the movement of some entity, denoted by T = [p_1, p_2, ..., p_{|T|}], where p_i is the i-th point on T, and |T| denotes the number of points on T. A point p_i is represented by its coordinates in a Euclidean space, i.e., p_i = (x_i, y_i).
Problem statement. Given a set of trajectories, we aim to learn a trajectory encoder F: T → h that maps a trajectory T to a d-dimensional embedding vector h ∈ R^d. The distance between the learned embeddings of two trajectories should be negatively correlated with the similarity between the two trajectories (we use the L1 distance in the experiments).
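Under this formulation, comparing two trajectories at query time reduces to a distance between two precomputed embeddings. A minimal PyTorch sketch (the function name is hypothetical):

```python
import torch

def embedding_similarity(h1: torch.Tensor, h2: torch.Tensor) -> torch.Tensor:
    """Similarity as the negative L1 distance between d-dimensional embeddings;
    a larger value means the trajectories are more similar."""
    return -(h1 - h2).abs().sum(dim=-1)
```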
Model overview. Fig. 2 shows an overview of our TrajCL
model. The model follows the dual-branch structure of a strong
contrastive learning framework, MoCo [16]. Our technical
contributions come in the design of the learning modules as
highlighted in red in Fig. 2, to be detailed in the next section.
An input trajectory T first goes through a trajectory augmentation module to generate two different trajectory views (i.e., variants) of T, denoted as T̃ and T̃′, respectively. We propose four different augmentation methods that emphasize different features of a trajectory (Section IV-A). The augmentation process is based on T directly, and hence no additional manual data labeling efforts are needed.
The generated views T̃ and T̃′ are fed into pointwise trajectory feature enrichment layers to generate pointwise features beyond just the coordinates, which reflect the key characteristics of T̃ and T̃′ (Section IV-B). We represent the enriched features by two types of embeddings, the structural feature embedding and the spatial feature embedding, for each point in T̃ (and T̃′). These embeddings encode pointwise structural and spatial features, and form a structural embedding matrix T (resp. T′) and a spatial embedding matrix S (resp. S′).
Then, we input (T, S) and (T′, S′) into trajectory backbone encoders F and F′ to obtain embeddings h and h′ for T̃ and T̃′, respectively (Section IV-C). Our backbone encoders are adapted from Transformer [29], and they encode structural and spatial features of trajectories into the embeddings.
Next, h and h′ go through two projection heads P and P′ (which are fully connected layers of the same structure) to be mapped into lower-dimensional vectors z and z′, respectively:

z = P(h) = (FC ∘ ReLU ∘ FC)(h)    (1)

Here, FC denotes a fully connected layer, ReLU denotes the ReLU activation function, and ∘ denotes function composition. We omit the equation for P′ as it is the same. Such projections have been shown to improve the embedding quality [17], [30].
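For illustration, Eq. (1) corresponds directly to a two-layer MLP; the sketch below assumes the hidden width equals the input dimension d, which is an assumption rather than the paper's stated setting:

```python
import torch.nn as nn

def projection_head(d: int, d_out: int) -> nn.Sequential:
    """P = FC ∘ ReLU ∘ FC: map a d-dim embedding h to a lower-dimensional z."""
    return nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d_out))
```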
Model training. Following previous contrastive learning
models, we use the InfoNCE [43] loss for model training.
We use z and z′ as a pair of positive samples, as they both come from variants of T and are supposed to be similar in the learned latent space. The embeddings (except z′) from projection head P′ that are in the current and recent past training batches are used as negative samples of z. The InfoNCE loss L maximizes the agreement between positive samples and minimizes that between negative samples:

L(T) = −log ( exp(sim(z, z′)/τ) / ( exp(sim(z, z′)/τ) + Σ_{j=1}^{|Q_neg|} exp(sim(z, z_j^−)/τ) ) )    (2)

Here, sim(·, ·) is the cosine similarity, z_j^− is the j-th negative sample in the queue Q_neg (introduced below), and τ is a temperature parameter that controls the contribution of the negative samples [44].
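As an illustrative batched rendering of Eq. (2) in PyTorch (the temperature default and function name are placeholders), using the fact that cross-entropy with the positive logit at index 0 equals the negative log term of Eq. (2):

```python
import torch
import torch.nn.functional as F

def info_nce(z, z_pos, queue_neg, tau=0.07):
    """InfoNCE of Eq. (2): z, z_pos have shape (B, d); queue_neg has shape (K, d)."""
    # Normalize so that dot products equal cosine similarities.
    z, z_pos, queue_neg = (F.normalize(t, dim=-1) for t in (z, z_pos, queue_neg))
    pos = (z * z_pos).sum(dim=-1, keepdim=True)  # sim(z, z'), shape (B, 1)
    neg = z @ queue_neg.t()                      # sim(z, z_j^-), shape (B, K)
    logits = torch.cat([pos, neg], dim=1) / tau
    # Cross-entropy with the positive at index 0 reproduces -log softmax of Eq. (2).
    labels = torch.zeros(len(z), dtype=torch.long, device=z.device)
    return F.cross_entropy(logits, labels)
```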
We use a queue Q_neg of a fixed size (an empirical parameter) to store negative samples. The queue includes the embeddings from P′ in recent batches, to enlarge the negative sample pool, since more negative samples help produce more robust embeddings [16], [17]. To reuse negative samples from recent batches, the parameters of F′ and P′ should change smoothly between batches. We follow the momentum update [16] procedure to satisfy this requirement:
Θ_F′ = m·Θ_F′ + (1 − m)·Θ_F;    Θ_P′ = m·Θ_P′ + (1 − m)·Θ_P    (3)

where m ∈ [0, 1) is the momentum coefficient.
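In code, Eq. (3) is a gradient-free exponential moving average from the online branch (F, P) into the momentum branch (F′, P′); the default value of m below is illustrative (MoCo uses values close to 1, e.g., 0.999):

```python
import torch

@torch.no_grad()
def momentum_update(online: torch.nn.Module, target: torch.nn.Module, m: float = 0.999):
    """Eq. (3): Theta' <- m * Theta' + (1 - m) * Theta, for both F'/F and P'/P."""
    for p_tgt, p_onl in zip(target.parameters(), online.parameters()):
        p_tgt.mul_(m).add_(p_onl, alpha=1 - m)
```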