Reconstruction Attack on Differential Private Trajectory Protection Mechanisms

2025-04-29 0 0 1.63MB 14 页 10玖币
侵权投诉
Reconstruction Aack on Dierential Private Trajectory
Protection Mechanisms
Erik Buchholz
University of New South Wales
Data61, Cyber Security CRC
Sydney, NSW, Australia
e.buchholz@unsw.edu.au
Alsharif Abuadbba
CSIRO’s Data61
Cyber Security CRC
Sydney, NSW, Australia
sharif.abuadbba@data61.csiro.au
Shuo Wang
CSIRO’s Data61
Cyber Security CRC
Sydney, NSW, Australia
shuo.wang@data61.csiro.au
Surya Nepal
CSIRO’s Data61
Cyber Security CRC
Sydney, NSW, Australia
surya.nepal@data61.csiro.au
Salil S. Kanhere
University of New South Wales
Cyber Security CRC
Sydney, NSW, Australia
salil.kanhere@unsw.edu.au
ABSTRACT
Location trajectories collected by smartphones and other devices
represent a valuable data source for applications such as location-
based services. Likewise, trajectories have the potential to reveal
sensitive information about individuals, e.g., religious beliefs or sex-
ual orientations. Accordingly, trajectory datasets require appropri-
ate sanitization. Due to their strong theoretical privacy guarantees,
dierential private publication mechanisms receive much attention.
However, the large amount of noise required to achieve dierential
privacy yields structural dierences, e.g., ship trajectories passing
over land. We propose a deep learning-based Reconstruction At-
tack on Protected Trajectories (RAoPT), that leverages the mentioned
dierences to partly reconstruct the original trajectory from a dif-
ferential private release. The evaluation shows that our RAoPT
model can reduce the Euclidean and Hausdor distances between
the released and original trajectories by over
68 %
on two real-world
datasets under protection with
𝜀
1. In this setting, the attack in-
creases the average Jaccard index of the trajectories’ convex hulls,
representing a user’s activity space, by over 180 %. Trained on the
GeoLife dataset, the model still reduces the Euclidean and Hausdor
distances by over
60 %
for T-Drive trajectories protected with a state-
of-the-art mechanism (
𝜀=
0
.
1). This work highlights shortcomings
of current trajectory publication mechanisms, and thus motivates
further research on privacy-preserving publication schemes.
CCS CONCEPTS
Security and privacy Data anonymization and sanitiza-
tion
;Privacy protections;
Computing methodologies
Neural
networks;
KEYWORDS
Trajectory Privacy, Dierential Privacy, Location Privacy, Deep
Learning
ACSAC ’22, December 5–9, 2022, Austin, TX, USA
©2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
This is the author’s version of the work. It is posted here for your personal use. Not
for redistribution. The denitive Version of Record was published in Annual Computer
Security Applications Conference (ACSAC ’22), December 5–9, 2022, Austin, TX, USA,
https://doi.org/10.1145/3564625.3564628.
ACM Reference Format:
Erik Buchholz, Alsharif Abuadbba, Shuo Wang, Surya Nepal, and Salil S.
Kanhere. 2022. Reconstruction Attack on Dierential Private Trajectory
Protection Mechanisms. In Annual Computer Security Applications Confer-
ence (ACSAC ’22), December 5–9, 2022, Austin, TX, USA. ACM, New York,
NY, USA, 14 pages. https://doi.org/10.1145/3564625.3564628
1 INTRODUCTION
Due to the omnipresence of smartphones and wearables in our daily
lives, a large amount of personal location data is collected every day.
The sequence of locations visited by an individual represents a tra-
jectory. This data is valuable for many services such as research [
4
],
market analysis [
58
], navigation [
19
,
56
], social gaming [
39
], and
most recently for contact tracing in the context of the COVID-19
pandemic [
26
,
43
,
45
]. However, signicant privacy concerns are
associated with the release of location information. The location
trajectory of an individual may reveal sensitive information, such as
religious, political, or sexual beliefs [
2
,
47
]. For instance, someone
with access to the trajectories of a taxi eet could determine which
drivers are practising Muslims based on the correlation of their
breaks and mandatory prayer times [
17
]. Moreover, De Montjoye et
al. showed that only four spatio-temporal points suce to uniquely
identify
95 %
of individuals [
10
]. These examples illustrate the need
of trajectories for appropriate protection before being released.
To hide the exact route of individual trajectories with the goal to
prevent pirate attacks [
23
], stalking [
18
] or other security threats,
multiple approaches that extend
𝑘
-anonymity [
2
,
3
,
18
,
38
,
51
,
57
]
and dierential privacy [
5
,
6
,
22
,
23
,
28
,
30
] to the trajectory domain
have been proposed [
47
]. However, existing approaches either sig-
nicantly reduce the utility of the released data or provide limited
privacy [
46
,
47
].
𝐾
-anonymity based approaches are susceptible
to attacks utilising background knowledge, and cannot provide
strong privacy guarantees [
6
,
7
,
18
,
24
,
33
]. Therefore, recent re-
search focused on publication mechanisms achieving dierential
privacy. Due to the high information content of location data, these
approaches signicantly degrade data utility to achieve privacy pro-
tection [
33
,
48
,
49
] because there is an inherent trade-o between
privacy and utility. The random distortion added by dierential
private protection mechanisms yields unrealistic trajectories that
provide limited utility and can easily be recognised [
48
] because
arXiv:2210.09375v1 [cs.CR] 17 Oct 2022
ACSAC ’22, December 5–9, 2022, Austin, TX, USA Erik Buchholz, Alsharif Abuadbba, Shuo Wang, Surya Nepal, and Salil S. Kanhere
they do not take geographical constraints into consideration [
37
].
For instance, protected trajectories of cars do not follow roads or
ship trajectories pass over land. To highlight the risk posed by
current publication mechanisms, we address the research question:
RQ1.
Can an adversary (partly) reconstruct trajectories from a
dierential private trajectory release?
We nd that the dierential private mechanisms, such as the Sam-
pling Distance and Direction (SDD) approach [
23
], yield trajectories
that are structurally distinguishable from unperturbed trajectories.
By exploiting these characteristics, we propose a novel Long Short-
Term Memory (LSTM)-based Reconstruction Attack on Protected
Trajectories (RAoPT) to address the dened research question
RQ1
.
The RAoPT model receives trajectories protected with a dierential
private publication mechanism as input and outputs reconstructed
trajectories which are closer to the original trajectories.
Our attack is evaluated on two real-world GPS datasets, the T-
Drive [
59
], and the GeoLife [
60
] dataset. We evaluate the reduction
of the Euclidean and Hausdor distances between the recovered
and original trajectories compared to the distances between the
protected and original trajectories. These metrics are commonly
used to measure the distance between trajectories [
22
,
23
,
28
,
33
,
49
,
52
,
52
]. We also measure the increase of the Jaccard index of the
trajectories’ convex hulls before and after reconstruction, as the
convex hull can represent a trajectory’s activity space, i.e., the area
in which a user is active [
27
]. A small physical distance to a victim
represents a security threat in various settings, e.g., stalkers can
follow or intercept a victim [18], or pirates can plan attacks [23].
For an adversary with knowledge about the used protection
method and access to ground truth trajectories for the training, our
RAoPT model can reduce both distances even for privacy settings
with very high privacy guarantees (
𝜀
0
.
1) by over
98 %
in case of
Laplace noise-based protection and by
68 %
to
84 %
considering a
state-of-the-art protection mechanism. In these settings, the Jaccard
index is increased by at least
180 %
. In the realistic scenario that the
adversary knows about the protection method (SDD with
𝜀=
0
.
1)
but no ground truth for training is accessible, the Euclidean distance
can still be reduced by approx.
40
to
61 %
, and the Hausdor distance
by approx.
62 %
to
67 %
when transferring from one dataset to
the other. Even an adversary without background knowledge can
achieve reductions of over 60 % in some settings.
Contributions. Our main contributions are as follows:
We propose the rst LSTM-based reconstruction attack on
dierential private protected trajectories.
The RAoPT model can reduce the Euclidean and Hausdor
distances by over
68 %
on T-Drive trajectories protected with
the Laplace mechanism or a state-of-the-art protection mech-
anism and 𝜀1, decreasing the provided level of privacy.
We show the real-world applicability of the attack via two
datasets, namely T-Drive [
59
] and GeoLife [
60
], with dier-
ent granularities and transport modes.
We open-source our RAoPT model
1
including the considered
protection mechanisms and pre-processing scripts.
1Our source code is available at: https://github.com/erik-buchholz/RAoPT
This article is organised as follows. We introduce the required
background knowledge in Section 2 and dene our problem state-
ment and threat model in Sections 2.3 and 2.4, respectively. In
Section 3, we discuss related work addressing attacks on protection
mechanisms, and approaches utilising deep learning for trajectory
protection. Then, we introduce our proposed RAoPT model in Sec-
tion 4. In Section 5, we provide implementation details and show
the results of our evaluations. Subsequently, we discuss our ndings
in Section 6. Finally, we conclude this paper in Section 7.
2 PRELIMINARIES
In Section 2.1, we rst dene the term location trajectory and provide
some general knowledge on trajectory protection mechanisms. The
SDD mechanism, which is used for the evaluation of our attack,
is explained in Section 2.2. In Section 2.3, we state our problem
statement, and in Section 2.4, we present our threat model. We refer
readers not familiar with dierential privacy to Appendix B.
2.1 Location Trajectory Protection
A location trajectory
𝑇
consists of a sequence of locations
𝑇=
(𝑡1, . . . , 𝑡𝑛)
. In the most basic case, each location consists of two
values
𝑡𝑖=(𝑥𝑖, 𝑦𝑖)
which can either represent the location within a
reference (coordinate) system or latitude and longitude. For exact
localisation, the altitude or elevation can be added to latitude and
longitude as a third coordinate. Many trajectory datasets record
atimestamp for each location [
59
,
60
]. Lastly, trajectories can be
enhanced by semantic information [55], such as Point of Interests
(POIs), i.e., the knowledge of whether a location within a trajectory
represents a restaurant, shop, or gym. While such additional infor-
mation can increase the utility of a dataset for analyses, semantic
information can be exploited for attacks such as Trajectory User
Linking (TUL) [
34
,
55
]. Semantic information can also be added to
a dataset retrospectively, e.g., by matching the location points of a
trajectory against semantic datasets, such as OpenStreetMap [9].
Protection mechanisms for the release of trajectory datasets tar-
get two dierent scenarios. In the rst scenario, a dataset of multiple
trajectories is released, each trajectory represents one record of
the dataset. Second, one trajectory represents a database and each
location can be considered as one record. Moreover, protection
mechanisms either rely on
𝑘
-anonymity and related privacy no-
tions, or on dierential privacy.
𝐾
-Anonymity follows the intuition
of hiding one user in a crowd of users. While this class of privacy
notion is very intuitive, it cannot provide strong privacy guarantees
[
6
,
7
,
18
,
24
,
33
]. Background knowledge can be exploited to derive
knowledge from the protected dataset even when a dataset pro-
vides
𝑘
-anonymity or a similar notion. Therefore, much research
has focused on dierential privacy (cf. Appendix B for details).
The dierential private SDD [
23
] mechanism is described in the
following section, as we use this mechanism for our evaluation.
2.2 Sampling Distance and Direction
The Sampling Distance and Direction (SDD) mechanism is one ap-
proach to publish location trajectories while providing dierential
privacy. We consider the SDD mechanism as one target for our
reconstruction attack because the mechanism can provide signi-
cantly better utility than the standard Laplace mechanism [
23
]. SDD
Reconstruction Aack on Dierential Private Trajectory Protection Mechanisms ACSAC ’22, December 5–9, 2022, Austin, TX, USA
is considered as baseline in recent literature on protection mecha-
nisms [
7
,
41
] and follows an intuitive approach. As motivation for
the approach, the authors refer to the publication of trajectories
from ships in the Singapore Straits because unprotected trajectories
could be utilised by pirates to plan and launch attacks [23].
The authors make the assumption that start and end locations
are not vulnerable and can be published without protection. Start-
ing from the rst location, the exponential mechanism [
35
] is used
to sample the distance and direction to the next location. The pa-
rameters of the exponential mechanism are chosen such that the
sampling achieves
𝜀
-dierential privacy. Moreover, the point de-
ned by the sampled values must lie within a certain distance to the
end point, to preserve utility. Otherwise, the sampling is repeated.
The evaluation shows that the SDD mechanism achieves signif-
icantly better utility than the Laplace mechanism, especially for
smaller values of
𝜀
(and hence, higher privacy levels) [
23
]. Hence,
we target this particular mechanism for the evaluation of our attack.
2.3 Problem Statement
The protection of ship trajectories with the goal to prevent pirate
attacks serves as motivation for the SDD mechanism [
23
]. Other
authors [
18
] use stalking and interception of individuals as motiva-
tion. For both threats, the attackers do not necessarily require the
exact coordinates. Instead, the pirates might successfully launch an
attack as long as they manage to get into line of sight of the target
ship, and for a stalker, an approximate route of the victim suces.
However, through observation of the output trajectories pro-
duced by protection mechanisms, we observe structural dierences
to the genuine input trajectories. Structural dierences include,
for instance, that protected trajectories exhibit zigzag patterns not
found in genuine ship trajectories or pass very close or over land,
which is impossible for real ships (compare Figure 3 and 4 in [
23
]).
The goal of this article is to highlight the risk posed by a Re-
construction Attack on Protected Trajectories (RAoPT) which ex-
ploits such characteristics. We dene the reconstruction attack
as a method to reduce the distance between the original and the
reconstructed trajectories (called OR-Distance) signicantly com-
pared to the distance between the protected and original trajectories
(OP-Distance). I.e., the locations of the reconstructed trajectory are
physically closer to the locations of the original trajectory than the
locations of the protected trajectories. Reducing the OR-Distance
signicantly, potentially to the level that the reconstructed and
original trajectory overlap, represents a serious security threat to
the users in the released dataset. For instance, in case of ships, this
would allow the planning of pirate attacks [
23
], or in case of individ-
uals, this information could be used by stalkers [
18
]. While a perfect
reconstruction of a protected trajectory is not realistic, a partial re-
construction yielding a close physical distance or even intersection,
represents a serious privacy breach and security threat.
2.4 Threat Model
For the evaluation of RAoPT, we assume dierent levels of back-
ground information known to the adversary executing the attack.
Adversary 1: Full Knowledge.
The strongest adversary with full
knowledge knows which mechanism with which parameters have
been used for the protection of the released trajectories. Moreover,
this adversary has access to unprotected trajectory data with the
same distribution as the target dataset for the training of the attack
model. This adversary model is the best case for an attack as the
model can be trained on data that has the same properties as the real
data. However, the assumptions are very strong and not realistic
in the real-world. We use this adversary model to examine the
inuence of dierent parameters on the reconstruction success.
Adversary 2: Partial Knowledge.
The adversary with partial
knowledge has no access to unprotected data with the same distribu-
tion but is aware of the used protection mechanism and parameters.
Hence, the adversary must train the attack model with data from a
dierent trajectory dataset, e.g., with a publicly available dataset.
Using another dataset during training might lower the attack per-
formance because the trajectories of the training set might not
possess the same unique features as the target dataset. The adver-
sary with partial knowledge is more realistic than adversary
1
, as
an adversary is unlikely to get access to unprotected data from
the same source as the target dataset. The assumption that the
protection mechanism is public knowledge is not unrealistic as
security through obscurity [
44
] is generally discouraged such that
the protection method might be published alongside the dataset.
Hence, this adversary targets a real-world scenario.
Adversary 3: No Knowledge.
In the worst-case, the adversary has
no background knowledge, i.e., they have neither access to a dataset
with the same distribution nor any information about the used
protection mechanism or its parameters. Hence, the attack model
has to be trained on data that might display dierent properties than
the dataset that shall be attacked. Accordingly, the reconstruction
success will be lower than for the previous two adversaries. We
provide an overview of related work in the following section.
3 RELATED WORK
In this section, we rst describe existing attacks on trajectory pro-
tection mechanisms, and then, summarise works applying deep
learning to the trajectory domain.
Existing Attacks.
Shao et al. [
52
] proposed an attack framework
called iTracker that recovers original trajectories from a dieren-
tial private release by exploiting the correlation between multiple
trajectories. Previous attacks only used the information of a single
trajectory and mostly relied on Markov models [
52
]. The iTracker
framework is based on a location sparsity matrix and uses two
approximation algorithms to converge towards the most probable
original trajectories. The evaluation of the approach shows that
iTracker is able to recover trajectories that are more similar to
the original ones than any of the trajectories recovered by related
work. To the best of our knowledge, this framework represents the
most eective attack on dierential private trajectory publication
mechanisms and highlights that dierential private mechanisms
can be attacked. While this framework represents the closest work
to our approach, iTracker is only evaluated on Laplace noise-based
mechanisms. As described in Section 2.2, the Laplace noise-based
mechanisms add substantially more distortion to a trajectory than
more advanced approaches such as the SDD mechanism. Hence,
recovery of Laplace noise-protected trajectories is less challeng-
ing than trajectories from more advanced protection schemes, as
conrmed by our evaluation in Section 5. Unfortunately, a direct
摘要:

ReconstructionAttackonDifferentialPrivateTrajectoryProtectionMechanismsErikBuchholzUniversityofNewSouthWalesData61,CyberSecurityCRCSydney,NSW,Australiae.buchholz@unsw.edu.auAlsharifAbuadbbaCSIRO’sData61CyberSecurityCRCSydney,NSW,Australiasharif.abuadbba@data61.csiro.auShuoWangCSIRO’sData61CyberSecur...

展开>> 收起<<
Reconstruction Attack on Differential Private Trajectory Protection Mechanisms.pdf

共14页,预览3页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:14 页 大小:1.63MB 格式:PDF 时间:2025-04-29

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 14
客服
关注