Self-supervised Heterogeneous Graph Pre-training
Based on Structural Clustering
Yaming Yang, Ziyu Guan, Zhe Wang, Wei Zhao
, Cai Xu, Weigang Lu, Jianbin Huang
School of Computer Science and Technology, Xidian University
{yym@, zyguan@, zwang@stu., ywzhao@mail., cxu@, wglu@stu., jbhuang@}xidian.edu.cn
Abstract
Recent self-supervised pre-training methods on Heterogeneous Information Net-
works (HINs) have shown promising competitiveness over traditional semi-
supervised Heterogeneous Graph Neural Networks (HGNNs). Unfortunately, their
performance heavily depends on careful customization of various strategies for gen-
erating high-quality positive examples and negative examples, which notably limits
their flexibility and generalization ability. In this work, we present SHGP, a novel
Self-supervised Heterogeneous Graph Pre-training approach, which does not need
to generate any positive examples or negative examples. It consists of two modules
that share the same attention-aggregation scheme. In each iteration, the Att-LPA
module produces pseudo-labels through structural clustering, which serve as the
self-supervision signals to guide the Att-HGNN module to learn object embeddings
and attention coefficients. The two modules can effectively utilize and enhance each
other, promoting the model to learn discriminative embeddings. Extensive experi-
ments on four real-world datasets demonstrate the superior effectiveness of SHGP
against state-of-the-art unsupervised baselines and even semi-supervised baselines.
We release our source code at: https://github.com/kepsail/SHGP.
1 Introduction
Over the past few years, various semi-supervised graph neural networks (GNNs) have been proposed
to learn graph embeddings. They have achieved remarkable success in many graph analytic tasks.
This success, however, comes at the cost of a heavy reliance on high-quality supervision labels. In
real-world scenarios, labels are usually expensive to acquire, and sometimes even impossible due to
privacy concerns.
To relieve the label scarcity issue in (semi-) supervised learning, and take full advantage of a large
amount of easily available unlabeled data, the self-supervised learning (SSL) paradigm has recently
drawn considerable research interest in the computer vision research community. It leverages the
supervision signal from the data itself to learn generalizable embeddings, which are then transferred
to various downstream tasks with only a few task-specific labels. One of the most common SSL
paradigms is contrastive learning, which learns representations by estimating and maximizing the
mutual information between the input and the output of a deep neural network encoder [8].
For graphs, some recent graph contrastive learning methods [29, 7, 41, 25, 20, 19, 43, 36, 40] have
shown promising competitiveness compared with semi-supervised GNNs. They usually require three
typical steps: (1) constructing positive examples (semantically correlated structural instances) by
strategies such as node dropping and edge perturbation, and negative examples (uncorrelated instances)
by strategies such as feature shuffling and mini-batch sampling; (2) encoding these examples through
graph encoders such as GCN [16]; (3) maximizing/minimizing the similarity between these positive/negative
Corresponding author
Preprint. Under review.
arXiv:2210.10462v2 [cs.LG] 12 Apr 2023
examples. Nevertheless, in the real world, graphs often contain multiple types of objects and multiple
types of relationships between them, which are called heterogeneous graphs, or heterogeneous
information networks (HINs) [27]. Due to the challenges caused by the heterogeneity, existing SSL
methods on homogeneous graphs cannot be straightforwardly applied to HINs. Very recently, several
works have made some efforts to conduct SSL on HINs [33, 23, 18, 15, 13, 14, 37, 10]. In comparison
with SSL methods on homogeneous graphs, the key difference is that they usually have different
example generation strategies, so as to capture the heterogeneous structural properties in HINs.
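The contrastive objective underlying step (3) above can be illustrated with a minimal InfoNCE-style sketch in NumPy. The embeddings here are random stand-ins, and nothing below is specific to any particular cited method; it only shows the mechanics of pulling a positive pair together while pushing negatives apart:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.5):
    """InfoNCE-style loss for a single anchor: high similarity to the
    positive and low similarity to the negatives yields a small loss."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(cos(anchor, positive) / tau)
    neg = sum(np.exp(cos(anchor, n) / tau) for n in negatives)
    return -np.log(pos / (pos + neg))

rng = np.random.default_rng(0)
z = rng.normal(size=8)                           # anchor embedding
z_pos = z + 0.1 * rng.normal(size=8)             # perturbed (positive) view
z_negs = [rng.normal(size=8) for _ in range(5)]  # unrelated objects
loss = info_nce(z, z_pos, z_negs)
```

The quality of `z_pos` and `z_negs` is exactly what the example generation strategies control, which is why they matter so much for these methods.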
The strategies of generating high-quality positive/negative examples are critical to the performance of
existing methods [34, 4, 41, 36]. Unfortunately, whether for homogeneous graphs or heterogeneous
]. Unfortunately, whether for homogeneous graphs or heterogeneous
graphs, the example generation strategies are dataset-specific, and may not be applicable to all
scenarios. This is because real-world graphs are abstractions of things from various domains, e.g.,
social networks, citation networks, etc. They usually have significantly different structural properties
and semantics. Previous works have systematically studied this and found that different strategies
are good at capturing different structural semantics. For example, study [41] observed that edge
perturbation benefits social networks but hurts some biochemical networks, and study [36] observed
that negative examples benefit sparser graphs. Consequently, in practice, the example generation
strategies have to be empirically constructed and investigated through either trial-and-error or rules
of thumb. This significantly limits the practicality and general applicability of existing methods.
In this work, we focus on HINs which are more challenging, and propose a novel SSL approach,
named SHGP. Different from existing methods, SHGP requires neither positive examples nor negative
examples, thus circumventing the above issues. Specifically, SHGP adopts any HGNN model that is
based on the attention-aggregation scheme as the base encoder, termed the Att-HGNN module.
Its attention coefficients are combined with the structural clustering method LPA (label
propagation algorithm) [21] to form the Att-LPA module. Through performing
structural clustering on HINs, Att-LPA is able to produce clustering labels, which are treated as
pseudo-labels. In turn, these pseudo-labels serve as guidance signals to help Att-HGNN learn better
embeddings as well as better attention coefficients. Thus, the two modules are able to exploit and
enhance each other, finally leading the model to learn discriminative and informative embeddings. In
summary, we have three main contributions as follows:
• We propose a novel SSL method on HINs, SHGP. It innovatively consists of the Att-LPA module
and the Att-HGNN module. The two modules can effectively enhance each other, facilitating the
model to learn effective embeddings.
• To the best of our knowledge, SHGP is the first attempt to perform SSL on HINs without any
positive or negative examples. Therefore, it can directly avoid the laborious investigation of
example generation strategies, improving the model’s generalization ability and flexibility.
• We transfer the object embeddings learned by SHGP to various downstream tasks. The experimental
results show that SHGP can outperform state-of-the-art baselines, even including some semi-
supervised baselines, demonstrating its superior effectiveness.
2 Related work
SSL on HINs.
There are several existing methods [33, 23, 18, 15, 37, 13, 14, 10] that conduct
SSL on HINs. Owing to their contrastive loss functions, all these methods require high-quality
positive and negative examples to effectively learn embeddings. Thus, their effectiveness and
performance hinge on the specific strategies of generating positive examples and negative examples,
which limits their flexibility and generalization ability.
SSL on homogeneous graphs.
Existing SSL methods on homogeneous graphs [29, 7, 41, 25, 20,
19, 43, 36, 35, 30] also need to generate sufficient positive and negative examples to effectively
perform graph contrastive learning. They only handle homogeneous graphs and cannot be easily
applied to HINs. In this work, we seek to perform SSL on HINs without any positive examples or
negative examples.
GNN+LPA methods.
There exist several methods [31, 1, 24] that combine LPA [21] with GNNs.
However, they are all supervised learning methods, and only deal with homogeneous graphs. In this
work, we study SSL on HINs.
Others.
DeepCluster [2] uses K-means to perform clustering in the embedding space. Differently,
our SHGP directly performs structural clustering in the graph space. JOAO [40] explores the
automatic selection of positive example generation strategies, which is still not fully automatic.
HuBERT [9] is an SSL approach for speech representation learning. GIANT [3] leverages graph-
structured self-supervision to extract numerical node features from raw data. MARU [12] learns
object embeddings by exploiting meta-contexts in random walks. Different from them, in this work,
we study how to effectively conduct SSL on HINs. Graph pooling methods, e.g., [22, 39], learn soft
cluster assignments to coarsen graph topology in each model layer. Differently, our method propagates
integer (hard) cluster labels in each layer to perform structural clustering.
3 Preliminaries
We first briefly introduce some concepts about HINs, and then formally describe the problem we
study in this paper.
Heterogeneous Information Network.
An HIN is defined as $G = (\mathcal{V}, \mathcal{E}, \mathcal{A}, \mathcal{R}, \varphi, \psi)$, where
$\mathcal{V}$ is the set of objects, $\mathcal{E}$ is the set of links, $\varphi: \mathcal{V} \to \mathcal{A}$
and $\psi: \mathcal{E} \to \mathcal{R}$ are respectively the object type mapping function and the link type
mapping function, $\mathcal{A}$ denotes the set of object types, and $\mathcal{R}$ denotes the set of
relations (link types), where $|\mathcal{A}| + |\mathcal{R}| > 2$. Let $X = \{X_1, \ldots, X_{|\mathcal{A}|}\}$
denote the set containing all the feature matrices associated with each type of objects. A meta-path $P$ of
length $l$ is defined in the form of $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$
(abbreviated as $A_1 A_2 \cdots A_{l+1}$), which describes a composite relation
$R = R_1 \circ R_2 \circ \cdots \circ R_l$ between object types $A_1$ and $A_{l+1}$, where $\circ$
denotes the composition operator on relations.
We show a toy HIN in the left part of Figure 1. It contains four object types: “Paper” ($P$), “Author”
($A$), “Conference” ($C$) and “Term” ($T$), and three relations: “Publish” between $P$ and $C$, “Write”
between $P$ and $A$, and “Contain” between $P$ and $T$. $APC$ is a meta-path of length two, and
$a_1 p_2 c_2$ is such a path instance, which means that author $a_1$ has published paper $p_2$ in conference $c_2$.
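To make the meta-path notion concrete, a small Python sketch can enumerate instances of the $APC$ meta-path on a toy HIN. The edge lists below are hypothetical, only loosely following the example above, not the actual dataset:

```python
# Toy HIN edges, keyed by relation; object names follow the example above.
write = {"a1": ["p1", "p2"], "a2": ["p2", "p3"]}  # Author -Write- Paper
publish = {"p1": "c1", "p2": "c2", "p3": "c2"}    # Paper -Publish- Conference

def apc_instances(write, publish):
    """Enumerate all path instances of the meta-path A-P-C."""
    return [(a, p, publish[p])
            for a, papers in write.items()
            for p in papers if p in publish]

paths = apc_instances(write, publish)
# Includes ("a1", "p2", "c2"), matching the a1-p2-c2 instance in the text.
```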
SSL on HINs.
Given an HIN $G$, the problem is to learn an embedding vector $h_i \in \mathbb{R}^d$ for each
object $i \in \mathcal{V}$, in a self-supervised manner, i.e., without using any task-specific labels. The pre-trained
embeddings are expected to capture the general-purpose information contained in $G$, and can be
easily transferred to various unseen downstream tasks with only a few task-specific labels.
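This transfer setting can be illustrated with a linear probe: freeze the pre-trained embeddings and fit a classifier using only a few labeled objects. The embeddings below are synthetic random stand-ins, not outputs of any real model:

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in "pre-trained" embeddings for 100 objects (d = 8):
# two latent classes, separated along the all-ones direction.
labels = rng.integers(0, 2, size=100)
H = rng.normal(size=(100, 8)) + labels[:, None] * 2.0

# Use only a few task-specific labels (20 here) to fit the probe.
idx = rng.permutation(100)
train, test = idx[:20], idx[20:]

# Least-squares linear classifier on the frozen embeddings.
A = np.c_[H[train], np.ones(len(train))]
w, *_ = np.linalg.lstsq(A, 2.0 * labels[train] - 1.0, rcond=None)
pred = (np.c_[H[test], np.ones(len(test))] @ w > 0).astype(int)
acc = float((pred == labels[test]).mean())
```

If the pre-training has produced discriminative embeddings, even this tiny labeled set suffices for a strong probe.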
Figure 1: The overall architecture of SHGP. Given an HIN, in each iteration, we use Att-HGNN
to produce embeddings and predictions, and use Att-LPA to produce pseudo-labels. The loss
is computed as the cross-entropy between the predictions and the pseudo-labels. The attention
coefficients (and other parameters) are optimized via gradient descent; the updated coefficients then
serve as the attention-aggregation weights of Att-HGNN and Att-LPA in the next iteration, promoting
them to produce better embeddings and predictions, as well as better pseudo-labels.
4 Methodology
In this section, we present the proposed method SHGP, which consists of two key modules. The Att-
HGNN module is instantiated as any HGNN model that is based on the attention-aggregation scheme. The
Att-LPA module combines the structural clustering method LPA [21] with the attention-aggregation
scheme used in Att-HGNN. The overall model architecture is shown in Figure 1 and explained in the
figure caption. In the following, we describe the procedure of SHGP in detail.
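Before the detailed description, one iteration of the overall scheme can be sketched schematically in NumPy. This is a deliberately simplified, single-layer, homogeneous illustration (all function and variable names are ours, not the authors' implementation); the actual method handles multiple object types and learns the attention coefficients by gradient descent:

```python
import numpy as np

def train_step(features, adj, att, labels_prev):
    """One schematic SHGP iteration: Att-HGNN aggregates features with
    attention weights to get predictions; Att-LPA propagates hard cluster
    labels along the same weighted edges to get pseudo-labels."""
    # Att-HGNN: attention-weighted neighbor aggregation (one linear layer).
    h = (att * adj) @ features
    probs = np.exp(h) / np.exp(h).sum(1, keepdims=True)  # softmax predictions
    # Att-LPA: propagate integer (hard) labels along attention-weighted edges.
    onehot = np.eye(probs.shape[1])[labels_prev]
    pseudo = ((att * adj) @ onehot).argmax(1)
    # Cross-entropy between predictions and pseudo-labels.
    loss = -np.log(probs[np.arange(len(pseudo)), pseudo] + 1e-9).mean()
    return pseudo, loss

# Tiny example: two 2-node communities, uniform attention for the sketch.
adj = np.array([[1, 1, 0, 0], [1, 1, 0, 0],
                [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
features = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])
att = np.ones((4, 4))
pseudo, loss = train_step(features, adj, att, np.array([0, 0, 1, 1]))
```

In the full method, the loss would be backpropagated to update the attention coefficients, which then serve as the aggregation and propagation weights of the next iteration.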