
introduced noises are almost certain to hurt similarity measurement
performance. In addition, secure multi-party computation [17], [18] and
homomorphic encryption [19] could potentially benefit GSL tasks, since
the input graphs can naturally be viewed as multiple parties involved in
the computation. However, existing realizations of these concepts [13]
typically require prohibitive computation and support only a few simple
operators. Due to these restrictions, it is difficult to handle complex
non-linear neural networks with the above algorithms. As a result,
dedicated privacy-preserving graph similarity learning models are needed,
with comprehensive consideration of the trade-offs among effectiveness,
efficiency, and privacy preservation.
To tackle the above challenges, the basic idea of our approach is to
deploy neural networks on user devices in a distributed computing
environment [20], [21], keeping most graph computations on the user side.
In this way, as long as each device can be viewed as a trustworthy
environment, only the representations communicated between devices are at
risk of privacy attacks. Although several neural GSL models can be
deployed in a distributed computing environment, we argue that their
privacy-preserving abilities still vary greatly, because the communicated
representations carry significantly different amounts of private user
information. Attackers can leverage the representations sent off the
devices to reconstruct graph structure or infer privacy-sensitive
properties. It is therefore necessary to analyze the privacy leakage level
of each model, so that we can design effective GSL models with strong
privacy preservation, whose communicated representations are hard to
attack.
To this end, we first introduce the concept of attackable representations
to qualitatively analyze the types of potential privacy attacks when a
neural GSL model is deployed in distributed computing environments. We
then propose a Privacy-Preserved neural Graph Matching network for graph
similarity learning, named PPGM. The key idea is to learn obfuscated graph
representations that are used for communication. First, since no node
representations are involved in communication, the proposed method
naturally prevents reconstruction attacks. Second, each obfuscated feature
is fused from representations of both input graphs, so properties of a
single graph are difficult to infer from the obfuscated features, which
alleviates property inference attacks. Specifically, our approach takes a
pair of graphs as input and learns preliminary node representations inside
each device. Based on shared context code vectors, multi-perspective graph
representations are learned and communicated as messages. Graph-node
matching is then performed on each device to generate obfuscated features.
Since an attacker can intercept only a graph representation (i.e., a
message) or an obfuscated feature from the representations communicated by
PPGM, the goal of privacy protection is achieved. Meanwhile, learning the
obfuscated features introduces comprehensive node-graph representation
interactions, which also makes the proposed model effective on graph
similarity learning tasks.
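As a rough illustration of this per-device pipeline, the following PyTorch-style sketch shows how preliminary node representations, context-code-based messages, and obfuscated features could be organized. It is an assumption-laden sketch, not the authors' implementation: the names (PPGMDeviceSketch, context_codes, num_perspectives) are hypothetical, and the GNN encoder is reduced to a single linear layer for brevity.

```python
import torch
import torch.nn as nn

class PPGMDeviceSketch(nn.Module):
    """Sketch of the per-device computation described above (hypothetical names)."""

    def __init__(self, in_dim, hidden_dim, num_perspectives):
        super().__init__()
        self.gnn = nn.Linear(in_dim, hidden_dim)  # stand-in for a full GNN encoder
        # Shared context code vectors: one "perspective" per code.
        self.context_codes = nn.Parameter(torch.randn(num_perspectives, hidden_dim))
        self.match = nn.Bilinear(hidden_dim, hidden_dim, hidden_dim)

    def encode_nodes(self, x, adj):
        # Preliminary node representations; these never leave the device.
        return torch.relu(self.gnn(adj @ x))               # (num_nodes, hidden_dim)

    def to_messages(self, h):
        # Multi-perspective graph representations: context codes attend over nodes.
        attn = torch.softmax(self.context_codes @ h.T, dim=-1)  # (P, num_nodes)
        return attn @ h                                          # (P, hidden_dim), sent off-device

    def obfuscate(self, h_local, messages_from_other):
        # Graph-node matching fuses local node states with the other graph's
        # messages, so each obfuscated feature mixes information from both graphs.
        sim = torch.softmax(h_local @ messages_from_other.T, dim=-1)   # (num_nodes, P)
        matched = sim @ messages_from_other                            # (num_nodes, hidden_dim)
        return self.match(h_local, matched).mean(dim=0)                # pooled obfuscated feature
```

In this sketch, only the outputs of to_messages and obfuscate would ever leave a device, while the node-level outputs of encode_nodes stay local, mirroring the two properties argued above.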
[Fig. 2 diagram: a graph with user privacy is processed by neural GSL models on the user device (trustworthy environment); attackable node/graph representations sent to the data center or other devices (untrustworthy environment) may be intercepted by an attacker.]
Fig. 2. Illustration of the privacy attacks on attackable representations when deploying neural GSL models in a distributed computing environment. In this work, we hypothesize that only the user devices are trustworthy environments, while any representations sent off the devices (i.e., into untrustworthy environments) may be intercepted by attackers.
To evaluate the proposed model PPGM, we conduct extensive experiments on
widely used benchmark datasets. In addition, we propose a quantitative
protocol to evaluate the privacy-preserving ability of neural GSL models:
shadow datasets extracted from the original benchmarks, together with
well-trained GSL models, are used to train supervised black-box attack
models. Experimental results demonstrate that the proposed approach is
both privacy-preserving and effective. In summary, the main contributions
are highlighted as follows:
• To the best of our knowledge, we are the first to emphasize privacy
concerns for neural graph similarity learning models. We introduce the
concept of attackable representations, a useful tool for systematically
analyzing the privacy attacks that neural GSL models may suffer from.
(Section II-C)
• We propose PPGM, a privacy-preserving neural graph similarity learning
model. With obfuscated features communicated between user devices, our
model is effective at similarity measurement while preventing
reconstruction attacks and alleviating graph property inference attacks.
(Section III)
• We propose a protocol to quantitatively evaluate the privacy-preserving
ability of neural GSL models by training supervised black-box attack
models on shadow datasets (sketched below). (Section IV-A)
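To make the evaluation protocol of the last contribution more concrete (details in Section IV-A), the following sketch outlines how shadow datasets and a supervised black-box attack model could be wired together. The hook communicated_representations and the helper private_label_fn are hypothetical names introduced here for illustration only.

```python
import torch
import torch.nn as nn

def build_shadow_dataset(gsl_model, shadow_graph_pairs, private_label_fn):
    """Collect (intercepted representation, private property) pairs.

    `private_label_fn` is a hypothetical hook returning the integer-coded
    property an attacker wants to infer (e.g., a binned graph statistic).
    """
    reps, labels = [], []
    with torch.no_grad():
        for g1, g2 in shadow_graph_pairs:
            # Whatever the model would send off-device is what the attacker sees.
            # `communicated_representations` is an assumed accessor, not a real API.
            intercepted = gsl_model.communicated_representations(g1, g2)
            reps.append(intercepted)                 # assumed to be a fixed-size vector
            labels.append(private_label_fn(g1))
    return torch.stack(reps), torch.tensor(labels, dtype=torch.long)

def train_attack_model(reps, labels, hidden_dim=64, epochs=50):
    """Supervised black-box attack: predict the private property from reps."""
    attack = nn.Sequential(
        nn.Linear(reps.size(-1), hidden_dim), nn.ReLU(),
        nn.Linear(hidden_dim, int(labels.max()) + 1),
    )
    opt = torch.optim.Adam(attack.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(attack(reps), labels)
        loss.backward()
        opt.step()
    return attack
```

Under this protocol, lower accuracy of the attack model on held-out shadow pairs indicates stronger privacy preservation of the evaluated GSL model.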
II. TASK
In this section, we first briefly introduce the problem formulation and
notation for graph similarity learning tasks. Then, we introduce the
concept of attackable representations, which is useful for qualitatively
analyzing the potential privacy attacks on neural GSL models. Finally, we
summarize three kinds of privacy attacks that neural GSL models may suffer
from when deployed in a distributed computing environment.
A. Graph Similarity Learning Tasks
Given a pair of graphs ⟨G1, G2⟩ and the corresponding
label y, the task of graph similarity learning is to predict y