Heterogeneous Graph Neural Network for
Privacy-Preserving Recommendation
Yuecen Wei†‡, Xingcheng Fu§¶, Qingyun Sun§¶, Hao Peng§, Jia Wu‖, Jinyan Wang†‡ and Xianxian Li†‡
†Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, China
‡School of Computer Science and Engineering, Guangxi Normal University, Guilin, China
§Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China
¶School of Computer Science and Engineering, Beihang University, Beijing, China
‖School of Computing, Macquarie University, Sydney, Australia
Email: weiyc@stu.gxnu.edu.cn, {fuxc,sunqy,penghao}@act.buaa.edu.cn,
jia.wu@mq.edu.au, {wangjy612,lixx}@gxnu.edu.cn
Abstract—With advances in deep learning, social networks are increasingly modeled with heterogeneous graph neural networks (HGNNs). Compared with models trained on homogeneous data, HGNNs absorb many more aspects of information about individuals during training. As a result, the learned model covers more information, and in particular more sensitive information. However, privacy-preserving methods for homogeneous graphs only protect a single type of node attribute or relationship, so they cannot work effectively on heterogeneous graphs because of the added complexity. To address this issue, we propose a novel privacy-preserving method for heterogeneous graph neural networks based on a differential privacy mechanism, named HeteDP, which provides a double guarantee on graph features and topology. In particular, we first define a new attack scheme to reveal privacy leakage in heterogeneous graphs. We then design a two-stage pipeline framework, which includes a privacy-preserving feature encoder and a heterogeneous link reconstructor with gradient perturbation based on differential privacy, to tolerate data diversity and resist the attack. To better control the noise and promote model performance, we use a bi-level optimization pattern to allocate a suitable privacy budget to the two modules. Our experiments on four public benchmarks show that HeteDP resists heterogeneous graph privacy leakage while retaining strong model generalization.
Index Terms—privacy-preserving, recommendation, differential privacy, heterogeneous graph
I. INTRODUCTION
A heterogeneous graph is an information network that consists of multiple node types and multiple relation types [1]. Social networks are among the most complex of these networks and the closest to people's lives. Based on the interactions and inter-dependencies among users and items, a recommender system predicts the products a user will purchase while inferring the user's implicit preferences [2], [3]. Heterogeneous information networks (HINs) [4] are therefore widely used in recommender systems because of their rich heterogeneous data. For example, in movie recommendation, the entities include not only users and movies but also stores, and the relations include collections in addition to purchases [5]. To adapt to the non-Euclidean structure of HINs, existing works leverage high-level information [6]–[10] shared by other platforms (e.g., logging in with a third-party account) [11]–[14] or semantic-level information from multiple entities [1], [15], [16]. These works fuse social network data with other side information about users and items into a unified heterogeneous graph to improve model performance. However, while HINs boost recommendation capabilities, they also bring an additional risk of privacy leakage.
Fig. 1. An example of privacy risk when moving from a homogeneous graph to a heterogeneous graph. Change (1) represents general privacy-preserving measures for nodes on the homogeneous graph. Change (2) indicates that the former method protects heterogeneous graphs poorly because more node types are considered.
Corresponding author.
Graph neural networks (GNNs), a popular and powerful family of graph representation models [17]–[21], are widely used for heterogeneous graph learning and achieve remarkable results in applications such as recommender systems [22]–[25]. However, most existing works focus on improving the representational power of graphs and ignore the security of sensitive information in graph data. For user privacy, non-Euclidean data can expose relationships between users and sensitive information more directly [26], such as social relationships [13], behavioral trajectories [27]–[29], and medical records. While people benefit from the convenience of recommendation, their behavior data is recorded and many aspects of their information are learned and used, which brings a series of privacy leakage risks. In the real world, malicious parties can obtain individuals' sensitive characteristics from enriched recommendations [5], such as identification and phone numbers, addresses, and even social relationships. The privacy leakage risk of this heterogeneous information appears at both the feature level and the topology level.
Recently, to address privacy problems in graph data, some existing works focus on privacy leakage in graph-based models [30], [31]. Differential privacy [12], [32], an advanced privacy-preserving technology based on perturbing the data distribution, is widely used in deep learning because of its strict mathematical definition. However, there is a notable limitation: privacy-preserving methods designed for homogeneous graphs cannot solve the problems caused by heterogeneity. For example, different types of nodes may no longer be independent of each other in features and topology but instead have semantic dependencies. On the one hand, Fig. 1 illustrates inference and protection on homogeneous and heterogeneous graphs. The model predicts from neighbor relationships that user C will buy the item. Specifically, the inference is drawn from A's historical shopping record and from the fact that B is a neighbor of both A and C. Existing works protect the direct relationship between users by perturbing their links, which reduces the predicted probability. However, heterogeneous social networks contain different types of nodes and edges. When other node types are considered, the purchase can still be inferred, since existing homogeneous graph methods only account for the influence between nodes of the same type. On the other hand, we assume that a malicious attacker can compare the target network with another public network whose topology is similar. This background knowledge allows the attacker to obtain connections between arbitrary nodes regardless of node type and to analyze the semantics to understand the preferences of a particular user. Fig. 2 shows that user A is the only node in the subgraph with degree two that lies in one quadrilateral, and business B is the node with degree four that lies in two quadrilaterals. The attacker can use the topology of the heterogeneous graph to infer that A has purchased items from B. If we could change the links between nodes while keeping the topological properties of each node as much as possible, the edges in the graph would not be directly exposed. These attack examples show that traditional naive differential privacy, which is based on the i.i.d. assumption, is difficult to apply directly to non-i.i.d. heterogeneous graphs.
Fig. 2. An example of privacy risk from topology properties and topology protection. (a) The original graph structure. (b) The topological structure after perturbation.
arXiv:2210.00538v2 [cs.LG] 9 Oct 2022
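The structural attack described with Fig. 2 can be illustrated with a small sketch. It assumes the attacker's background knowledge is a per-node "fingerprint" made of the degree and the number of quadrilaterals (4-cycles) the node lies in; any node whose fingerprint is unique in the graph can be re-identified. The toy edge list and all function names below are hypothetical and are not taken from the paper or its figure.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def quadrilaterals(adj, v):
    """Count 4-cycles through v.

    Every 4-cycle v-a-w-b-v has exactly one node w opposite v, and a, b are
    two neighbours shared by v and w, so we sum C(shared, 2) over all w != v.
    """
    total = 0
    for w in adj:
        if w == v:
            continue
        shared = len(adj[v] & adj[w])
        total += shared * (shared - 1) // 2
    return total


def fingerprints(adj):
    """Structural fingerprint (degree, number of quadrilaterals) per node."""
    return {v: (len(adj[v]), quadrilaterals(adj, v)) for v in adj}


def uniquely_identifiable(adj):
    """Nodes whose fingerprint is shared with no other node."""
    fp = fingerprints(adj)
    counts = Counter(fp.values())
    return {v: f for v, f in fp.items() if counts[f] == 1}


# Hypothetical toy graph: user "A" and business "B" share one quadrilateral,
# "B" and "C" share another, and a few pendant nodes pad the other degrees.
EDGES = [
    ("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"),
    ("B", "z"), ("B", "w"), ("C", "z"), ("C", "w"),
    ("x", "p1"), ("y", "p2"), ("z", "p3"), ("w", "p4"), ("C", "p5"),
]

adj = build_adjacency(EDGES)
exposed = uniquely_identifiable(adj)
print(exposed)
```

In this toy graph only "A" and "B" have unique fingerprints, so an attacker who knows those two structural properties can locate both nodes without any feature information. Rewiring links while preserving each node's degree and cycle counts would leave the fingerprints unchanged but hide which concrete edges exist, which is the intuition behind the topology protection discussed above.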
Consequently, the core question is: “Can we put forward a heterogeneous graph neural network for privacy-preserving recommendation that adapts to the heterogeneity of graph data while resisting the ‘betrayal’ of graph topology and different neighbors?” In summary, privacy preservation on heterogeneous graphs faces three major challenges: (1) privacy is leaked through different types of higher-order neighbor information; (2) even if the topology among nodes of the same type is changed, the relationships between those nodes can still be revealed through high-level semantics; (3) privacy guarantees must be traded off against prediction quality.
To resolve the above problems, we propose a novel Heterogeneous Graph Neural Network Privacy-Preserving method based on Differential Privacy, named HeteDP¹. First, we define a novel privacy leakage scenario for heterogeneous graph recommendation and reveal the privacy leakage risks associated with graph heterogeneity. Specifically, we design two stages of DP strategies to guarantee the privacy of both graph features and topology. In the first stage, we propose a feature perturbation method based on a heterogeneous attention mechanism to encode the node representations. The sensitivity of the Gaussian noise added to the features is learned from the neighbor influence and relationship influence of nodes under different relational subgraphs. Then, we feed the perturbed node representations to a variational graph auto-encoder (VGAE) [33] for the heterogeneous graph to reconstruct the privacy-preserving topology. The reconstructor sets learnable gradient clipping hyperparameters as the noise sensitivity to clip and perturb the gradients. In addition, to solve the privacy budget allocation problem of global differential privacy, we design a bi-level optimization algorithm for HeteDP. We summarize our main contributions as follows:
• Aiming at the nature of heterogeneous information networks, we define a novel privacy leakage scenario and reveal privacy leakage risks for heterogeneous graph recommendation.
• We propose a novel unsupervised privacy-preserving learning framework, named Heterogeneous Graph Neural Network Privacy-Preserving with Differential Privacy (HeteDP). HeteDP is a two-stage pipeline framework that preserves the privacy of both the features and the topology of the heterogeneous graph.
• We design an adaptive privacy budget allocation scheme that uses bi-level optimization to balance the privacy and utility of HeteDP.
• Experiments demonstrate the adaptability and generalization performance of the model on multiple real-world datasets. We further analyze in detail the necessity of each part of HeteDP and the feasibility of the whole model.
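The two-stage idea summarized above can be sketched with standard differential privacy building blocks. This is a minimal illustration under stated assumptions, not the HeteDP implementation: the sensitivity and the clipping norm are fixed constants here (the paper learns them), the budget split is a fixed ratio (the paper allocates it by bi-level optimization), the noise scale uses the classical Gaussian mechanism bound, and every function name is hypothetical.

```python
import math

import numpy as np


def split_budget(epsilon_total, feature_ratio):
    """Split a total budget between the two stages (sequential composition).

    Assumption: a fixed ratio stands in for the learned allocation.
    """
    if not 0.0 < feature_ratio < 1.0:
        raise ValueError("feature_ratio must lie strictly between 0 and 1")
    eps_feature = epsilon_total * feature_ratio
    return eps_feature, epsilon_total - eps_feature


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def perturb_features(features, sensitivity, epsilon, delta, rng):
    """Stage 1 sketch: add calibrated Gaussian noise to node features."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return features + rng.normal(0.0, sigma, size=features.shape)


def clip_and_perturb(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Stage 2 sketch: DP-SGD-style gradient clipping and perturbation.

    Each row is one per-sample gradient. Rows are rescaled to L2 norm at most
    clip_norm, summed, perturbed with Gaussian noise of standard deviation
    noise_multiplier * clip_norm, and averaged.
    """
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_sample_grads)


rng = np.random.default_rng(0)
eps_feature, eps_topology = split_budget(epsilon_total=0.9, feature_ratio=0.5)

# Hypothetical node features: 5 nodes with 8-dimensional attributes.
features = rng.normal(size=(5, 8))
private_features = perturb_features(features, 1.0, eps_feature, 1e-5, rng)

# Hypothetical per-sample gradients from the link reconstructor.
grads = rng.normal(size=(4, 3))
private_grad = clip_and_perturb(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print(private_features.shape, private_grad.shape)
```

The clipping norm plays the role of the sensitivity for the gradient noise, which is why the text describes the learnable clipping hyperparameters as the noise sensitivity. A smaller share of the budget for one stage means a larger noise scale there, so the allocation directly controls the privacy and utility balance between features and topology.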
II. RELATED WORK
A. Heterogeneous Graph Neural Network
HGNNs [34]–[36] are a powerful representation learning method with outstanding generalization ability. Existing models can fully use intricate information in heterogeneous networks to learn more inner information and improve model
¹The source code is released at https://github.com/AixWinnie/HeteDP.