Self-supervised Heterogeneous Graph Pre-training
Based on Structural Clustering
Yaming Yang, Ziyu Guan, Zhe Wang, Wei Zhao
, Cai Xu, Weigang Lu, Jianbin Huang
School of Computer Science and Technology, Xidian University
{yym@, zyguan@, zwang@stu., ywzhao@mail., cxu@, wglu@stu., jbhuang@}xidian.edu.cn
Abstract
Recent self-supervised pre-training methods on Heterogeneous Information Net-
works (HINs) have shown promising competitiveness over traditional semi-
supervised Heterogeneous Graph Neural Networks (HGNNs). Unfortunately, their
performance heavily depends on careful customization of various strategies for gen-
erating high-quality positive examples and negative examples, which notably limits
their flexibility and generalization ability. In this work, we present SHGP, a novel
Self-supervised Heterogeneous Graph Pre-training approach, which does not need
to generate any positive examples or negative examples. It consists of two modules
that share the same attention-aggregation scheme. In each iteration, the Att-LPA
module produces pseudo-labels through structural clustering, which serve as the
self-supervision signals to guide the Att-HGNN module to learn object embeddings
and attention coefficients. The two modules can effectively utilize and enhance each
other, promoting the model to learn discriminative embeddings. Extensive experi-
ments on four real-world datasets demonstrate the superior effectiveness of SHGP
against state-of-the-art unsupervised baselines and even semi-supervised baselines.
We release our source code at: https://github.com/kepsail/SHGP.
1 Introduction
Over the past few years, various semi-supervised graph neural networks (GNNs) have been proposed
to learn graph embeddings. They have achieved remarkable success in many graph analytic tasks.
This success, however, comes at the cost of a heavy reliance on high-quality supervision labels. In
real-world scenarios, labels are usually expensive to acquire, and sometimes even impossible due to
privacy concerns.
To relieve the label scarcity issue in (semi-) supervised learning, and take full advantage of a large
amount of easily available unlabeled data, the self-supervised learning (SSL) paradigm has recently
drawn considerable research interest in the computer vision research community. It leverages the
supervision signal from the data itself to learn generalizable embeddings, which are then transferred
to various downstream tasks with only a few task-specific labels. One of the most common SSL
paradigms is contrastive learning, which learns representations by estimating and maximizing the
mutual information between the input and the output of a deep neural network encoder [8].
For graphs, some recent graph contrastive learning methods [29, 7, 41, 25, 20, 19, 43, 36, 40] have
shown promising competitiveness compared with semi-supervised GNNs. They usually require three
typical steps: (1) constructing positive examples (semantically correlated structural instances) by
strategies such as node dropping and edge perturbation, and negative examples (uncorrelated instances)
by strategies such as feature shuffling and mini-batch sampling; (2) encoding these examples through
graph encoders such as GCN [16]; (3) maximizing/minimizing the similarity between these positive/negative
Corresponding author
Preprint. Under review.
arXiv:2210.10462v2 [cs.LG] 12 Apr 2023
examples. Nevertheless, in the real world, graphs often contain multiple types of objects and multiple
types of relationships between them, which are called heterogeneous graphs, or heterogeneous
information networks (HINs) [27]. Due to the challenges caused by the heterogeneity, existing SSL
methods on homogeneous graphs cannot be straightforwardly applied to HINs. Very recently, several
works have made some efforts to conduct SSL on HINs [33, 23, 18, 15, 13, 14, 37, 10]. In comparison
with SSL methods on homogeneous graphs, the key difference is that they usually have different
example generation strategies, so as to capture the heterogeneous structural properties in HINs.
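The contrastive objective underlying step (3) above can be illustrated with a minimal InfoNCE-style sketch in NumPy. The embeddings here are random stand-ins, and nothing below is specific to any particular cited method; it only shows the mechanics of pulling a positive pair together while pushing negatives apart:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.5):
    """InfoNCE-style loss for a single anchor: high similarity to the
    positive and low similarity to the negatives yields a small loss."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(cos(anchor, positive) / tau)
    neg = sum(np.exp(cos(anchor, n) / tau) for n in negatives)
    return -np.log(pos / (pos + neg))

rng = np.random.default_rng(0)
z = rng.normal(size=8)                           # anchor embedding
z_pos = z + 0.1 * rng.normal(size=8)             # perturbed (positive) view
z_negs = [rng.normal(size=8) for _ in range(5)]  # unrelated objects
loss = info_nce(z, z_pos, z_negs)
```

The quality of `z_pos` and `z_negs` is exactly what the example generation strategies control, which is why they matter so much for these methods.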
The strategies of generating high-quality positive/negative examples are critical to the performance of
existing methods [34, 4, 41, 36]. Unfortunately, whether for homogeneous graphs or heterogeneous
]. Unfortunately, whether for homogeneous graphs or heterogeneous
graphs, the example generation strategies are dataset-specific, and may not be applicable to all
scenarios. This is because real-world graphs are abstractions of things from various domains, e.g.,
social networks, citation networks, etc. They usually have significantly different structural properties
and semantics. Previous works have systematically studied this and found that different strategies
are good at capturing different structural semantics. For example, study [41] observed that edge
perturbation benefits social networks but hurts some biochemical networks, and study [36] observed
that negative examples benefit sparser graphs. Consequently, in practice, the example generation
strategies have to be empirically constructed and investigated through either trial-and-error or rules
of thumb. This significantly limits the practicality and general applicability of existing methods.
In this work, we focus on HINs which are more challenging, and propose a novel SSL approach,
named SHGP. Different from existing methods, SHGP requires neither positive examples nor negative
examples, thus circumventing the above issues. Specifically, SHGP adopts any HGNN model that is
based on the attention-aggregation scheme as the base encoder, termed the Att-HGNN module.
Its attention coefficients are combined with the structural clustering method LPA (label
propagation algorithm) [21] to form the Att-LPA module. Through performing
structural clustering on HINs, Att-LPA is able to produce clustering labels, which are treated as
pseudo-labels. In turn, these pseudo-labels serve as guidance signals to help Att-HGNN learn better
embeddings as well as better attention coefficients. Thus, the two modules are able to exploit and
enhance each other, finally leading the model to learn discriminative and informative embeddings. In
summary, we have three main contributions as follows:
• We propose a novel SSL method on HINs, SHGP. It innovatively consists of the Att-LPA module
and the Att-HGNN module. The two modules can effectively enhance each other, facilitating the
model to learn effective embeddings.
• To the best of our knowledge, SHGP is the first attempt to perform SSL on HINs without any
positive or negative examples. Therefore, it can directly avoid the laborious investigation of
example generation strategies, improving the model’s generalization ability and flexibility.
• We transfer the object embeddings learned by SHGP to various downstream tasks. The experimental
results show that SHGP can outperform state-of-the-art baselines, even including some semi-
supervised baselines, demonstrating its superior effectiveness.
2 Related work
SSL on HINs.
There are several existing methods [33, 23, 18, 15, 37, 13, 14, 10] that conduct
SSL on HINs. Owing to their contrastive loss functions, all these methods require high-quality
positive and negative examples to effectively learn embeddings. Thus, their effectiveness and
performance hinge on the specific strategies of generating positive examples and negative examples,
which limits their flexibility and generalization ability.
SSL on homogeneous graphs.
Existing SSL methods on homogeneous graphs [29, 7, 41, 25, 20,
19, 43, 36, 35, 30] also need to generate sufficient positive and negative examples to effectively
perform graph contrastive learning. They only handle homogeneous graphs and cannot be easily
applied to HINs. In this work, we seek to perform SSL on HINs without any positive examples or
negative examples.
GNN+LPA methods.
There exist several methods [31, 1, 24] that combine LPA [21] with GNNs.
However, they are all supervised learning methods, and only deal with homogeneous graphs. In this
work, we study SSL on HINs.
Others.
DeepCluster [2] uses K-means to perform clustering in the embedding space. Differently,
our SHGP directly performs structural clustering in the graph space. JOAO [40] explores the
automatic selection of positive example generation strategies, which is still not fully automatic.
HuBERT [9] is an SSL approach for speech representation learning. GIANT [3] leverages graph-
structured self-supervision to extract numerical node features from raw data. MARU [12] learns
object embeddings by exploiting meta-contexts in random walks. Different from them, in this work,
we study how to effectively conduct SSL on HINs. Graph pooling methods, e.g., [22, 39], learn soft
cluster assignments to coarsen graph topology in each model layer. Differently, our method propagates
integer (hard) cluster labels in each layer to perform structural clustering.
3 Preliminaries
We first briefly introduce some concepts about HINs, and then formally describe the problem we
study in this paper.
Heterogeneous Information Network.
An HIN is defined as $G = (\mathcal{V}, \mathcal{E}, \mathcal{A}, \mathcal{R}, \varphi, \psi)$, where
$\mathcal{V}$ is the set of objects, $\mathcal{E}$ is the set of links, $\varphi: \mathcal{V} \to \mathcal{A}$
and $\psi: \mathcal{E} \to \mathcal{R}$ are respectively the object type mapping function and the link type
mapping function, $\mathcal{A}$ denotes the set of object types, and $\mathcal{R}$ denotes the set of
relations (link types), where $|\mathcal{A}| + |\mathcal{R}| > 2$. Let $X = \{X_1, \ldots, X_{|\mathcal{A}|}\}$
denote the set containing all the feature matrices associated with each type of objects. A meta-path $P$ of
length $l$ is defined in the form of $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$
(abbreviated as $A_1 A_2 \cdots A_{l+1}$), which describes a composite relation
$R = R_1 \circ R_2 \circ \cdots \circ R_l$ between object types $A_1$ and $A_{l+1}$, where $\circ$
denotes the composition operator on relations.
We show a toy HIN in the left part of Figure 1. It contains four object types: “Paper” ($P$), “Author”
($A$), “Conference” ($C$) and “Term” ($T$), and three relations: “Publish” between $P$ and $C$, “Write”
between $P$ and $A$, and “Contain” between $P$ and $T$. $APC$ is a meta-path of length two, and
$a_1 p_2 c_2$ is such a path instance, which means that author $a_1$ has published paper $p_2$ in conference $c_2$.
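To make the meta-path notion concrete, a small Python sketch can enumerate instances of the $APC$ meta-path on a toy HIN. The edge lists below are hypothetical, only loosely following the example above, not the actual dataset:

```python
# Toy HIN edges, keyed by relation; object names follow the example above.
write = {"a1": ["p1", "p2"], "a2": ["p2", "p3"]}  # Author -Write- Paper
publish = {"p1": "c1", "p2": "c2", "p3": "c2"}    # Paper -Publish- Conference

def apc_instances(write, publish):
    """Enumerate all path instances of the meta-path A-P-C."""
    return [(a, p, publish[p])
            for a, papers in write.items()
            for p in papers if p in publish]

paths = apc_instances(write, publish)
# Includes ("a1", "p2", "c2"), matching the a1-p2-c2 instance in the text.
```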
SSL on HINs.
Given an HIN $G$, the problem is to learn an embedding vector $h_i \in \mathbb{R}^d$ for each
object $i \in \mathcal{V}$, in a self-supervised manner, i.e., without using any task-specific labels. The pre-trained
embeddings are expected to capture the general-purpose information contained in $G$, and can be
easily transferred to various unseen downstream tasks with only a few task-specific labels.
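This transfer setting can be illustrated with a linear probe: freeze the pre-trained embeddings and fit a classifier using only a few labeled objects. The embeddings below are synthetic random stand-ins, not outputs of any real model:

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in "pre-trained" embeddings for 100 objects (d = 8):
# two latent classes, separated along the all-ones direction.
labels = rng.integers(0, 2, size=100)
H = rng.normal(size=(100, 8)) + labels[:, None] * 2.0

# Use only a few task-specific labels (20 here) to fit the probe.
idx = rng.permutation(100)
train, test = idx[:20], idx[20:]

# Least-squares linear classifier on the frozen embeddings.
A = np.c_[H[train], np.ones(len(train))]
w, *_ = np.linalg.lstsq(A, 2.0 * labels[train] - 1.0, rcond=None)
pred = (np.c_[H[test], np.ones(len(test))] @ w > 0).astype(int)
acc = float((pred == labels[test]).mean())
```

If the pre-training has produced discriminative embeddings, even this tiny labeled set suffices for a strong probe.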
Figure 1: The overall architecture of SHGP. Given an HIN, in each iteration, we use Att-HGNN
to produce embeddings and predictions, and use Att-LPA to produce pseudo-labels. The loss
is computed as the cross-entropy between the predictions and the pseudo-labels. The attention
coefficients (and other parameters) are optimized via gradient descent; the updated coefficients then
serve as the attention-aggregation weights of Att-HGNN and Att-LPA in the next iteration, promoting
them to produce better embeddings and predictions, as well as better pseudo-labels.
4 Methodology
In this section, we present the proposed method SHGP, which consists of two key modules. The Att-
HGNN module is instantiated as any HGNN model that is based on the attention-aggregation scheme. The
Att-LPA module combines the structural clustering method LPA [21] with the attention-aggregation
scheme used in Att-HGNN. The overall model architecture is shown in Figure 1 and explained in the
figure caption. In the following, we describe the procedure of SHGP in detail.
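Before the detailed description, one iteration of the overall scheme can be sketched schematically in NumPy. This is a deliberately simplified, single-layer, homogeneous illustration (all function and variable names are ours, not the authors' implementation); the actual method handles multiple object types and learns the attention coefficients by gradient descent:

```python
import numpy as np

def train_step(features, adj, att, labels_prev):
    """One schematic SHGP iteration: Att-HGNN aggregates features with
    attention weights to get predictions; Att-LPA propagates hard cluster
    labels along the same weighted edges to get pseudo-labels."""
    # Att-HGNN: attention-weighted neighbor aggregation (one linear layer).
    h = (att * adj) @ features
    probs = np.exp(h) / np.exp(h).sum(1, keepdims=True)  # softmax predictions
    # Att-LPA: propagate integer (hard) labels along attention-weighted edges.
    onehot = np.eye(probs.shape[1])[labels_prev]
    pseudo = ((att * adj) @ onehot).argmax(1)
    # Cross-entropy between predictions and pseudo-labels.
    loss = -np.log(probs[np.arange(len(pseudo)), pseudo] + 1e-9).mean()
    return pseudo, loss

# Tiny example: two 2-node communities, uniform attention for the sketch.
adj = np.array([[1, 1, 0, 0], [1, 1, 0, 0],
                [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
features = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])
att = np.ones((4, 4))
pseudo, loss = train_step(features, adj, att, np.array([0, 0, 1, 1]))
```

In the full method, the loss would be backpropagated to update the attention coefficients, which then serve as the aggregation and propagation weights of the next iteration.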