Privacy-Preserved Neural Graph Similarity Learning
Yupeng Hou1,3, Wayne Xin Zhao1,3,4 B, Yaliang Li2, and Ji-Rong Wen1,3
1Gaoling School of Artificial Intelligence, Renmin University of China
2Alibaba Group
3Beijing Key Laboratory of Big Data Management and Analysis Methods
4Engineering Research Center of Next-Generation Intelligent Search and Recommendation, Ministry of Education
{houyupeng,jrwen}@ruc.edu.cn, batmanfly@gmail.com, yaliang.li@alibaba-inc.com
Abstract—To develop effective and efficient graph similarity
learning (GSL) models, a series of data-driven neural algorithms
have been proposed in recent years. Although GSL models
are frequently deployed in privacy-sensitive scenarios, the user
privacy protection of neural GSL models has not drawn much
attention. To comprehensively understand the privacy protection
issues, we first introduce the concept of attackable representation
to systematically characterize the privacy attacks that each
model can face. Inspired by the qualitative results, we propose
a novel Privacy-Preserving neural Graph Matching network
model, named PPGM, for graph similarity learning. To prevent
reconstruction attacks, the proposed model does not communicate
node-level representations between devices. Instead, we
learn multi-perspective graph representations based on learnable
context vectors. To alleviate attacks on graph properties,
obfuscated features that contain information from both graphs
are communicated. In this way, the private properties of each
graph are difficult to infer. By employing node-graph matching
techniques when computing the obfuscated features, PPGM
also remains effective at similarity measurement. To quantitatively
evaluate the privacy-preserving ability of neural GSL models, we
further propose an evaluation protocol via training supervised
black-box attack models. Extensive experiments on widely-used
benchmarks show the effectiveness and strong privacy-protection
ability of the proposed model PPGM. The code is available at:
https://github.com/RUCAIBox/PPGM.
Index Terms—graph similarity learning, privacy-preserving,
graph neural networks
I. INTRODUCTION
Graph similarity learning (GSL) is one of the most fun-
damental tasks in the literature of graph machine learning,
aiming to quantify the similarity of two given graphs [1].
Various methods have been proposed to improve the performance
of graph similarity learning, from early
algorithms based on graph edit distance (GED) [2] or maximum
common subgraph (MCS) [3] metrics to later graph
kernels [4]–[6]. However, these methods typically have
exponential time complexity, which largely limits their application
to real-world tasks. To further improve the performance and
efficiency of GSL models, a series of data-driven approximate
approaches based on graph neural networks (GNNs) have been
proposed recently [7]–[9]. These methods greatly broaden the
application scope of GSL to graphs of more realistic size.
B Corresponding author.

Fig. 1. Performance comparison of different neural graph similarity learning
models on privacy protection and graph-graph classification tasks on the FFmpeg
[50, 200] dataset (x-axis: graph property inference attack AUC (%); y-axis:
graph-graph classification AUC (%); compared models: SimGNN, GMN, GraphSim,
SGNN, MGMN, H2MN, and PPGM (ours)).

A basic setting assumed in most graph similarity learning methods is that
the entire graph data, i.e., nodes, edges, and the corresponding features,
is available from user devices. However, this assumption is usually
unrealistic in real scenarios. We observe that GSL models are frequently used in
privacy-sensitive scenarios, such as binary function similarity
search [8], healthcare data management [4], and user portrait
matching in recommender systems [10]. For example, when
using code-checking systems based on GSL algorithms, users
may upload their compiled programs (as binary function
graphs [8]) from their devices. These uploaded graphs contain
private information and should be carefully protected, as they are
usually at high risk of privacy attacks. External attackers may
intercept the uploaded graphs during communication (e.g.,
while uploading) or disguise themselves as a fake data center to
steal the privacy-sensitive graph data. As a result, it is
necessary to take privacy into account when developing
graph similarity learning methods.
However, it is challenging to build graph similarity learning
models that are both privacy-preserving and effective. Although
several privacy protection techniques have been proposed
for graph neural networks [11], [12], as well as general
multi-party computation approaches [13], these techniques
usually cannot be directly applied to graph similarity
learning tasks. For example, graph publishing [14], [15] and
local differential privacy [11], [16] are widely studied; they
directly modify the raw graph data or learned representations
and introduce randomness. However, such modifications make
it even harder to measure precise graph similarity scores.
The introduced noise is almost certain to hurt similarity
measurement performance. Besides, secure multiparty
computation [17], [18] and homomorphic encryption [19]
may potentially benefit GSL tasks, as the input graphs
can naturally be seen as multiple parties involved in
the computation. However, existing solutions that realize these
concepts [13] typically require extreme amounts of computation
and support only a few simple operators. Due to these restrictions,
it is difficult to handle complex non-linear neural
networks with the above-mentioned algorithms. As a result,
we need to design dedicated privacy-preserving graph similarity
learning models, comprehensively weighing the trade-offs among
effectiveness, efficiency, and privacy preservation.
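To make this concern concrete, the following minimal sketch (our illustration, not a method from this paper; the Laplace mechanism, the sensitivity, and the epsilon value are assumed) perturbs node embeddings in the style of local differential privacy and compares a naive mean-pooled cosine similarity before and after the perturbation:

```python
# Illustrative only: generic local-DP-style Laplace perturbation of node embeddings
# and its effect on a naive similarity score; all parameters are assumptions.
import numpy as np

def perturb(h: np.ndarray, epsilon: float, sensitivity: float = 1.0,
            seed: int = 0) -> np.ndarray:
    """Add coordinate-wise Laplace noise with scale sensitivity / epsilon."""
    rng = np.random.default_rng(seed)
    return h + rng.laplace(scale=sensitivity / epsilon, size=h.shape)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
h1, h2 = rng.normal(size=(30, 64)), rng.normal(size=(40, 64))  # two graphs' node embeddings
clean = cosine(h1.mean(axis=0), h2.mean(axis=0))
noisy = cosine(perturb(h1, epsilon=1.0).mean(axis=0),
               perturb(h2, epsilon=1.0, seed=2).mean(axis=0))
print(f"similarity clean={clean:.3f}  noisy={noisy:.3f}")  # the noise shifts the score
```

The stronger the privacy guarantee (i.e., the smaller epsilon), the larger the noise scale and the further the perturbed score drifts from the clean one, which is exactly the tension discussed above.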
To tackle the above challenges, the basic idea of our
approach is to deploy neural networks on user devices in
a distributed computing environment [20], [21]. We would
like to keep most graph computations on the user side. In
this way, as long as each device can be viewed as a trustworthy
environment, only the representations communicated
between devices are at risk of privacy attacks. Although several
neural GSL models can be deployed in a distributed computing
environment, we argue that their privacy-preserving abilities
still vary greatly. The major reason is that the representations
they communicate carry significantly different
amounts of private user information: attackers can leverage the
representations sent off the devices to reconstruct the graph structure
or to infer privacy-sensitive properties. As a result, it is necessary
to analyze the privacy leakage level of each model, so that we
can accordingly design highly privacy-preserved and effective
GSL models whose communicated representations are hard
to attack.
To this end, we first introduce the concept of attackable
representations to qualitatively analyze the types of potential
privacy attacks when a neural GSL model is deployed in dis-
tributed computing environments. We then propose a Privacy-
Preserved neural Graph Matching network for graph similarity
learning, named PPGM. The key point is to learn obfuscated
graph representations that are used for communication. First,
as no node representations are involved in communication, the
proposed method can naturally prevent reconstruction attacks.
Second, each obfuscated feature fuses representations
from both input graphs. In this way, the properties of
any single graph are difficult to infer from the obfuscated
features, alleviating property inference attacks. In detail,
our approach takes a pair of graphs as input and learns preliminary
node representations inside each device. Based on shared
context code vectors, multi-perspective graph representations
are learned and communicated as messages. Then graph-node
matching is performed on each device to generate obfuscated
features. Attackers thus have an equal chance of intercepting
a graph representation (i.e., a message) or an obfuscated feature
among the representations communicated by PPGM, which
achieves the goal of privacy protection. Meanwhile,
the learning of obfuscated features introduces comprehensive
node-graph representation interactions, which also make the
proposed model effective on graph similarity learning tasks.
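To make this communication pattern concrete, the following is a rough sketch written for illustration, not the released PPGM implementation: a one-round mean-aggregation encoder stands in for the paper's GNN, attention pooling against K learnable context vectors produces the multi-perspective graph messages, and a dot-product node-graph matching step produces the obfuscated features; every module name, shape, and the cosine readout are our assumptions.

```python
# Minimal sketch of a PPGM-style forward pass (illustrative, not the authors' code).
import torch
import torch.nn as nn

class DeviceEncoder(nn.Module):
    def __init__(self, in_dim: int, hid_dim: int, num_contexts: int = 4):
        super().__init__()
        self.node_mlp = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                      nn.Linear(hid_dim, hid_dim))
        # K learnable context vectors, assumed to be shared by both devices.
        self.contexts = nn.Parameter(torch.randn(num_contexts, hid_dim))

    def node_reprs(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Preliminary node representations; these never leave the device.
        h = self.node_mlp(x)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return h + adj @ h / deg                              # [N, hid_dim]

    def messages(self, h: torch.Tensor) -> torch.Tensor:
        # Multi-perspective graph representations: attention pooling of the
        # local node representations against each shared context vector.
        att = torch.softmax(h @ self.contexts.t(), dim=0)     # [N, K]
        return att.t() @ h                                    # [K, hid_dim], sent off-device

    def obfuscated(self, h: torch.Tensor, partner_msg: torch.Tensor) -> torch.Tensor:
        # Node-graph matching: local nodes attend over the partner's K graph-level
        # messages; the pooled result mixes information from both graphs.
        att = torch.softmax(h @ partner_msg.t(), dim=1)       # [N, K]
        matched = att @ partner_msg                           # [N, hid_dim]
        return (h * matched).mean(dim=0)                      # [hid_dim], sent off-device

def similarity(enc: DeviceEncoder, x1, adj1, x2, adj2) -> torch.Tensor:
    # Only graph-level messages and obfuscated features are ever communicated.
    h1, h2 = enc.node_reprs(x1, adj1), enc.node_reprs(x2, adj2)
    m1, m2 = enc.messages(h1), enc.messages(h2)                   # communicated
    o1, o2 = enc.obfuscated(h1, m2), enc.obfuscated(h2, m1)       # communicated
    return torch.cosine_similarity(o1, o2, dim=0)                 # hypothetical readout

if __name__ == "__main__":
    enc = DeviceEncoder(in_dim=8, hid_dim=16)
    x1, adj1 = torch.randn(5, 8), torch.eye(5)
    x2, adj2 = torch.randn(7, 8), torch.eye(7)
    print(similarity(enc, x1, adj1, x2, adj2).item())
```

Under this pattern the node-level tensors h1 and h2 never leave their devices, which matches the qualitative argument above; the actual PPGM components are specified in Section III and differ in detail.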
Fig. 2. Illustration of the privacy attacks on attackable representations while
deploying neural GSL models in a distributed computing environment (the diagram
shows a graph with user privacy and the neural GSL model inside a trustworthy
user device, with a data center, other devices, and an attacker in the
untrustworthy environment intercepting attackable node/graph representations).
In this work, we hypothesize that only the user devices are trustworthy environments,
while any representations sent off the devices (i.e., into untrustworthy
environments) may be intercepted by attackers.
To evaluate the proposed model PPGM, we conduct exten-
sive experiments on widely-used benchmark datasets. In ad-
dition, we propose a quantitative way to evaluate the privacy-
preserving ability of neural GSL models: shadow datasets
extracted from the original benchmarks, together with well-trained GSL
models, are leveraged to train supervised black-box attack
models. Experimental results demonstrate that the proposed
approach is privacy-preserving as well as effective. In sum-
mary, the main contributions are highlighted as follows:
• To the best of our knowledge, we are the first to emphasize
the privacy-preserving concerns for neural graph
similarity learning models. We introduce the concept
of attackable representations, which is a useful tool to
systematically analyze the privacy attacks that neural
GSL models may suffer from. (Section II-C)
• We propose a privacy-preserved neural graph similarity
learning model, PPGM. With obfuscated features communicated
between user devices, our model can be effective
in similarity measuring while preventing reconstruction
attacks and alleviating graph property inference attacks.
(Section III)
• We propose a protocol to quantitatively evaluate the
privacy-preserving ability of neural GSL models via
training supervised black-box attack models on shadow
datasets; a minimal illustrative sketch follows this list.
(Section IV-A)
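As a concrete illustration of the protocol mentioned in the last item, the sketch below trains a hypothetical attack model: a logistic-regression attacker (our stand-in; the paper's actual attack architectures may differ) is fit on representations assumed to be intercepted from a shadow dataset, labeled with a private binary graph property, and its AUC quantifies the leakage.

```python
# Illustrative supervised black-box property inference attack; the intercepted
# representations and private labels are random placeholders here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
intercepted_reprs = rng.normal(size=(1000, 64))   # stand-in for real intercepted features
private_labels = rng.integers(0, 2, size=1000)    # stand-in private graph property

X_tr, X_te, y_tr, y_te = train_test_split(
    intercepted_reprs, private_labels, test_size=0.3, random_state=0)

attacker = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, attacker.predict_proba(X_te)[:, 1])
print(f"property inference attack AUC: {auc:.3f}")  # AUC near 0.5 indicates little leakage
```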
II. TASK
In this section, we first briefly introduce the problem formu-
lation and notation for graph similarity learning tasks. Then,
we introduce the concept of attackable representations, which
is useful to qualitatively analyze the potential privacy attacks
for neural GSL models. Finally, we summarize three kinds of
privacy attacks that neural GSL models may suffer from when
deployed in a distributed computing environment.
A. Graph Similarity Learning Tasks
Given a pair of graphs ⟨G1, G2⟩ and the corresponding
label y, the task of graph similarity learning is to predict y.
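Stated generically (our paraphrase under common assumptions, not necessarily the exact formulation used later in the paper), y may be a continuous similarity score such as a normalized graph edit distance or a binary graph-graph classification label, and a parameterized model is trained to approximate it:

\hat{y} = f_\theta(G_1, G_2), \qquad \min_\theta \; \mathbb{E}_{\langle G_1, G_2, y\rangle \in \mathcal{D}} \big[\, \ell\big(f_\theta(G_1, G_2),\, y\big) \big],

where \mathcal{D} is the training set of labeled graph pairs and \ell is, e.g., a mean squared error for similarity-score regression or a cross-entropy for graph-graph classification.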