Relational Message Passing for Fully Inductive
Knowledge Graph Completion
Yuxia Geng
College of Computer Science and Technology
Zhejiang University, Hangzhou, China
Donghai Laboratory, Zhoushan, China
gengyx@zju.edu.cn
Jiaoyan Chen
Department of Computer Science
The University of Manchester
Manchester, United Kingdom
jiaoyan.chen@manchester.ac.uk
Jeff Z. Pan
School of Informatics
The University of Edinburgh
Edinburgh, United Kingdom
j.z.pan@ed.ac.uk
Mingyang Chen
College of Computer Science and Technology
Zhejiang University
Hangzhou, China
mingyangchen@zju.edu.cn
Song Jiang
NAIE PDU
Huawei Technologies Co., Ltd.
Xi’an, China
jiangsong12@huawei.com
Wen Zhang, Huajun Chen
Zhejiang University, Hangzhou, China
Donghai Laboratory, Zhoushan, China
AZFT Joint Lab for Knowledge Engine
{zhang.wen, huajunsir}@zju.edu.cn
Abstract—In knowledge graph completion (KGC), predicting
triples involving emerging entities and/or relations, which are
unseen when the KG embeddings are learned, has become a
critical challenge. Subgraph reasoning with message passing is
a promising and popular solution. Some recent methods have
achieved good performance, but they (i) usually can only predict
triples involving unseen entities alone, failing to address more
realistic fully inductive situations with both unseen entities and
unseen relations, and (ii) often conduct message passing over
the entities with the relation patterns not fully utilized. In this
study, we propose a new method named RMPI which uses a
novel Relational Message Passing network for fully Inductive
KGC. It passes messages directly between relations to make full
use of the relation patterns for subgraph reasoning with new
techniques on graph transformation, graph pruning, relation-
aware neighborhood attention, addressing empty subgraphs, etc.,
and can utilize the relation semantics defined in the KG’s
ontological schema. Extensive evaluation on multiple benchmarks
has shown the effectiveness of RMPI’s techniques and its better
performance compared with the existing methods that support
fully inductive KGC. RMPI is also comparable to the state-of-the-
art partially inductive KGC methods with very promising results
achieved. Our codes, data and some supplementary experiment
results are available at https://github.com/zjukg/RMPI.
Index Terms—Knowledge Graph, Inductive Knowledge Graph
Completion, Message Passing, Link Prediction, Ontology
I. INTRODUCTION
Knowledge Graphs (KGs) [25] often suffer from incomplete-
ness [14]. Many KG completion (KGC) methods, including the
schema aware ones [41], have been developed to discover
missing facts (triples). Many of them use KG embedding
techniques, encoding the KG entities and relations into a vector
space with their semantics concerned, so that the missing facts
can be inferred by computation on these vector representations
(embeddings) [11], [39]. However, these embedding-based
methods often work in a transductive setting, where the triples
to predict involve only entities and relations that have already
Corresponding author.
Fig. 1: Examples of inductive KGC: (a) a training graph, i.e., a given KG whose embeddings have been learned; (b) a testing graph with unseen entities, for partially inductive completion; (c) a testing graph with both unseen entities and unseen relations (spouse_of), for fully inductive completion. The unseen elements are colored in red.
occurred in the embedding training triples. When some entities or relations are newly added during testing (a.k.a. unseen entities or relations), the whole KG embeddings have to be re-trained, which is not feasible in practice due to the quickly evolving nature and large sizes of many real-world KGs.
Recently, there has been an increasing number of inductive KGC studies which aim to complete triples involving unseen entities or unseen relations without training the KG embeddings from scratch. Among these works, some try to obtain the embeddings of unseen entities or unseen relations using external resources (e.g., textual descriptions) [15], [26], [29], [36], [42] or auxiliary triples which associate unseen entities with seen entities [5], [17], [38]. Although these approaches can work, the additional resources they require are often unavailable or of low quality, and some incur high computation costs, e.g., for text embedding.
A promising and widely investigated direction for addressing inductive KGC with unseen entities is to acquire high-level semantics merely from the graph structure. The relevant approaches often induce entity-independent logical rules that hold among the relations, from the KG, in either statistical [21] or end-to-end differentiable manners [22], [27], [46]. For example, we can induce a rule

father_of(x, y) ∧ son_of(y, z) → husband_of(x, z)

from the training graph in Fig. 1a, and apply
it to infer new triples in a testing graph with unseen entities, e.g.,
(H, husband_of, K) in Fig. 1b. GraIL [31] is a typical and recent method of this kind, with the scalability and rule expressiveness improved by general graph topological features and a relation-aware graph neural network for message passing and learning over local subgraphs (see more details in Section II-B). GraIL and its follow-up methods [7], [20] have shown the feasibility of using message passing over subgraphs for inductive KGC, with promising results achieved on several benchmarks.
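To make the entity-independence of such rules concrete, the following is a minimal sketch (not the paper's or GraIL's code; the entity names and helper function are illustrative) of applying the induced rule father_of(x, y) ∧ son_of(y, z) → husband_of(x, z) to a testing graph whose entities were never seen in training:

```python
def apply_rule(triples):
    """Infer husband_of(x, z) whenever father_of(x, y) and son_of(y, z) hold.

    The rule mentions only relations and variables, so it transfers to any
    testing graph regardless of which concrete entities appear in it.
    """
    father_of = {(h, t) for h, r, t in triples if r == "father_of"}
    son_of = {(h, t) for h, r, t in triples if r == "son_of"}
    inferred = set()
    for x, y in father_of:
        for y2, z in son_of:
            if y == y2 and x != z:
                inferred.add((x, "husband_of", z))
    return inferred

# A testing graph in the spirit of Fig. 1b: H, J, K are unseen entities.
test_graph = {("H", "father_of", "J"), ("J", "son_of", "K")}
print(apply_rule(test_graph))  # {('H', 'husband_of', 'K')}
```

Because no concrete entity occurs in the rule body, the same rule induced from Fig. 1a fires on the unseen entities of Fig. 1b.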
However, these methods mostly assume that all the relations
in the testing stage are seen with embeddings learned, which
is often violated in real-world evolving KGs, especially those
constructed via open information extraction systems e.g. NELL
[23] and those that are publicly editable e.g. Wikidata [35].
Moreover, they often pass messages directly over entities with
the patterns on relations not fully utilized. Although some
strategies have been developed to strengthen the role of relations, there is still much space to explore. For convenience, we refer to the inductive KGC cases investigated in these works, with only unseen entities during testing, as partially inductive KGC, and to those more realistic and more challenging cases with both unseen entities and unseen relations during testing as fully inductive KGC¹. Fig. 1 provides clear examples of these two
cases. Currently, there have been few methods specifically
developed for the latter. The only one we know is MaKEr [10]
which also relies on the graph structure for prediction.
In this study, we aim to address fully inductive KGC with
a method named RMPI, which includes a relation oriented
message passing network for subgraph reasoning. It first
transforms a triple’s surrounding subgraph in the original KG
into a new relation view graph, where inter-relation features
are more straightforwardly represented. It then learns the
embedding of an unseen relation from the relational subgraph,
by the relational message passing network, where novel graph
pruning and neighborhood attention techniques are developed
for both efficiency and effectiveness, with new neighborhood
aggregations being used for addressing the issue of empty
subgraphs. As an example, for an unseen relation spouse_of in Fig. 1c, its embedding for predicting the triple with H and K can be obtained by aggregating the embeddings of the neighboring relations husband_of, father_of and son_of. Furthermore, RMPI
allows the injection of the KG’s ontological schema, which is
quite common (see KGs like DBpedia [3] and NELL [23]) and
contains complementary relation semantics, for more robust
reasoning on fully inductive KGC. In summary, our main
contributions are the following:
• We are among the earliest to investigate fully inductive KGC, considering both subgraph structures and ontological schemas.
• We have proposed a robust KGC model named RMPI, with novel techniques for effective relational message passing and subgraph reasoning, supporting both partially and fully inductive KGC.
• Extensive experiments have been conducted on 4 newly-constructed benchmarks and 14 public benchmarks from different KGs. RMPI and its variants often outperform the baselines, including the state-of-the-art methods.

¹The term fully inductive has been used in some inductive KGC works that only consider unseen entities [1], [31], meaning that the sets of entities seen during training and testing are disjoint, as opposed to the semi-inductive case where unseen entities have to be connected to trained seen entities. In our paper, we name such a fully inductive setting with only unseen entities partially inductive, to distinguish it from the fully inductive setting we investigate.
II. PRELIMINARY
We begin by formally defining the task and notations, and then briefly introduce the background of subgraph-based inductive reasoning used in existing works.
A. Problem Formulation
A KG is often denoted as G = {E, R, T}, where E is a set of entities, R is a set of relations, and T = {(h, r, t) | h, t ∈ E; r ∈ R} is a set of relational facts in the form of RDF² triples. h, r and t are called a triple's head entity (subject), relation (predicate) and tail entity (object), respectively. KGC is then defined as predicting whether an input candidate triple is true or not (i.e., triple classification), or predicting the missing entity/relation in a triple with the other two elements given (i.e., entity/relation prediction), which is often achieved by filling the incomplete triple with a candidate entity/relation and feeding it into the model. It is expected that the true (positive) triples are predicted with higher scores than the false (negative) ones [2]. In a commonly practiced partially inductive KGC setting, a set of unseen entities E′ with E ∩ E′ = ∅ newly emerges during prediction; the goal is then to predict a triple (h, r, t) with r ∈ R and h, t ∈ E′. G is often known as the training graph, while the graph composed of the triples formed by the unseen entities E′ and the relations R is often known as the testing graph.
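The notation above can be sketched in a few lines of code. This is an illustrative toy (the class name, entity names and triples are hypothetical, not from the paper) showing a KG as entity/relation/triple sets and the disjointness condition E ∩ E′ = ∅ of the partially inductive setting:

```python
class KG:
    """A knowledge graph G = {E, R, T} as plain Python sets."""
    def __init__(self, triples):
        self.T = set(triples)                                  # facts (h, r, t)
        self.E = {h for h, _, _ in self.T} | {t for _, _, t in self.T}
        self.R = {r for _, r, _ in self.T}

# Toy training and testing graphs in the spirit of Fig. 1a/1b.
train = KG([("A", "husband_of", "B"), ("A", "father_of", "C"),
            ("C", "son_of", "B")])
test = KG([("H", "father_of", "J"), ("J", "son_of", "K")])

unseen_entities = test.E - train.E        # E' with E ∩ E' = ∅
assert train.E.isdisjoint(unseen_entities)

# Partially inductive KGC: test triples link unseen entities via seen relations.
for h, r, t in test.T:
    assert h in unseen_entities and t in unseen_entities
    assert r in train.R                   # r ∈ R (no unseen relations yet)
```

The final loop encodes exactly the partially inductive constraint: h, t ∈ E′ while r ∈ R.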
In this study, we extend the above partially inductive KGC to fully inductive KGC by introducing a set of unseen relations R′ with R ∩ R′ = ∅ to the testing graph. Specifically, we further investigate two situations: one is a general testing graph involving both seen and unseen relations, while the other is a testing graph involving only unseen relations. Formally, a triple given in the testing graph and a triple to predict can be denoted as (h, r, t) with h, t ∈ E′, and r ∈ (R ∪ R′) or r ∈ R′, respectively. We name the evaluation with the first testing graph testing with semi unseen relations, and the evaluation with the second testing graph testing with fully unseen relations.
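The two evaluation settings can be distinguished mechanically from the relation sets alone. The following sketch (function name and example data are illustrative assumptions, not from the paper) classifies a testing graph by comparing its relations with the training relations R:

```python
def relation_setting(train_relations, test_triples):
    """Classify a testing graph by its relations relative to training."""
    test_relations = {r for _, r, _ in test_triples}
    unseen = test_relations - train_relations   # R' with R ∩ R' = ∅
    if not unseen:
        return "partially inductive"            # all test relations seen
    if test_relations & train_relations:
        return "semi unseen relations"          # r ∈ (R ∪ R'), mixed
    return "fully unseen relations"             # r ∈ R' only

R = {"husband_of", "father_of", "son_of"}
semi = [("H", "father_of", "J"), ("H", "spouse_of", "K")]
fully = [("H", "spouse_of", "K")]
print(relation_setting(R, semi))    # semi unseen relations
print(relation_setting(R, fully))   # fully unseen relations
```

The first branch recovers the partially inductive case of Section II-A as a degenerate instance with R′ = ∅.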
B. Subgraph-based Partially Inductive Reasoning
To predict a target triple (h, r, t) where r ∈ R, h, t ∈ E′, and the embeddings of h and t are both unavailable, methods such as GraIL [31] learn and utilize the structural semantics from the subgraph around this triple in an entity-independent manner. For better understanding, we re-denote such a triple as (u, r_t, v) using the notations widely used in GNNs, where u and v are referred to as the target head entity and target tail entity, respectively, and r_t is the target relation. The basic workflow of GraIL includes three steps. It first extracts the K-hop enclosing subgraph

²Resource Description Framework. See https://www.w3.org/RDF/.
Fig. 2: The overall framework of our Relational Message Passing Network. (The figure depicts, for a target triple (A, husband_of, B): the enclosing subgraph and the one-hop disclosing subgraph, their graph transformation into relation-view graphs, pruned relational message passing with attentive neighborhood aggregation, and graph pre-training over an ontological schema with rdfs:domain (DOM), rdfs:range (RNG), rdfs:subPropertyOf (SPO) and rdfs:subClassOf (SCO) axioms.)
surrounding the target triple, denoted as G(u, r_t, v). It then annotates the entities in the extracted subgraph according to their shortest distances to u and v. Namely, each entity i in the subgraph is labeled with a tuple (d(i, u), d(i, v)), where d(i, u) (resp. d(i, v)) denotes the shortest distance between i and u (resp. v) without counting any path through v (resp. u).
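The first two steps above (subgraph extraction and double-radius node labeling) can be sketched with plain BFS. This is a simplified illustration, not GraIL's implementation: function names are hypothetical, edges are treated as undirected, and for brevity the distances do not exclude paths through the other target entity as GraIL's labeling does.

```python
from collections import deque

def k_hop_neighbors(triples, source, k):
    """Entities within k undirected hops of source, mapped to their distances."""
    adj = {}
    for h, _, t in triples:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:          # do not expand beyond k hops
            continue
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def enclosing_subgraph(triples, u, v, k):
    """K-hop enclosing subgraph around (u, v) plus (d(i,u), d(i,v)) labels."""
    du = k_hop_neighbors(triples, u, k)
    dv = k_hop_neighbors(triples, v, k)
    nodes = set(du) & set(dv)        # entities near both target entities
    sub = [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]
    labels = {i: (du[i], dv[i]) for i in nodes}
    return sub, labels

# Toy usage: a 4-entity cycle; B and D sit between the targets A and C.
triples = [("A", "r1", "B"), ("B", "r2", "C"), ("C", "r3", "D"), ("A", "r4", "D")]
sub, labels = enclosing_subgraph(triples, "A", "C", 2)
print(labels["B"])  # (1, 1): one hop from A and one hop from C
```

The label tuples then serve as entity-independent initial features, which is why no entity embedding from the whole KG is needed.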
It finally summarizes the subgraph through a GNN encoder
and scores the likelihood of the target triple using the encoded
subgraph representation and the target relation’s embedding.
The GNN-based encoder adopts the general message passing
scheme where a node representation is iteratively updated by
combining it with aggregation of its neighbors’ representations,
and considers the multi-relation features of the subgraph. Formally,
the embedding of entity i in the k-th GNN layer is given by:

$$h_i^k = \mathrm{ReLU}\Big(\sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \alpha_{ij}^k W_r^k h_j^{k-1} + W_{self}^k h_i^{k-1}\Big) \qquad (1)$$

$$\alpha_{ij}^k = \sigma\big(A_1^k s + b_1^k\big) \qquad (2)$$

$$s = \mathrm{ReLU}\big(A_2^k \big[h_i^{k-1} \oplus h_j^{k-1} \oplus r^a \oplus r_t^a\big] + b_2^k\big) \qquad (3)$$
where R is the set of relations in the KG, N_i^r denotes the neighbors of entity i under relation r, W_r^k is the relation-specific transformation matrix used in the k-th layer, and W_self^k is the matrix for combining the message from the entity itself. α_ij^k denotes the attention weight at the k-th layer for the edge connecting i and j via relation r, which is computed by a function of the latent embeddings of i and j learned at the previous layer and the learnable attention embeddings r^a and r_t^a of r and the target relation r_t.
The initial embedding h_i^0 is represented using the node labels obtained in the second step (i.e., concatenating the one-hot vectors of d(i, u) and d(i, v)), and the final embedding h_i^K after K layers is output to generate the subgraph representation h_{G(u,r_t,v)}^K and score the triple, as:

$$score(u, r_t, v) = W\big[h_{G(u,r_t,v)}^K \oplus h_u^K \oplus h_v^K \oplus r_t\big] \qquad (4)$$

$$h_{G(u,r_t,v)}^K = \frac{1}{|\mathcal{N}_G|} \sum_{i \in \mathcal{N}_G} h_i^K \qquad (5)$$

where N_G is the set of entities in the subgraph. The subgraph is thus represented by average pooling over all the output entity embeddings, and the likelihood of the target triple is scored through a linear layer, with the subgraph representation, the embedding of the target relation r_t, and the output embeddings of the target entity pair (h_u^K and h_v^K) concatenated and fed.
The extracted enclosing subgraph is informative, with some logical evidence (e.g., paths) that could help deduce the relation between the two entities contained. Meanwhile, the distance-aware initial entity features represent an entity's relative position w.r.t. the target triple's head and tail entities in the subgraph, thus capturing the structural semantics without the need to learn specific embeddings from the whole KG. Finally, the applied GNN layers pass messages between entities iteratively to update the entity features, and the resultant representations are used to score the triple together with the learnable relation embeddings.
III. METHODOLOGY
A. Overview
In this paper, we make full use of the reasoning clues reflected by relations, and propose to pass messages (features) directly from relations to relations iteratively to infer the likelihood of a target triple. During this process, the entity features are not used, while the embedding of an unseen relation can be inferred and optionally further augmented by relation semantics from the KG's ontological schema. To achieve this goal, we propose a Relational Message Passing Network, as shown in Fig. 2. Briefly, given a target triple, it (i) extracts an enclosing subgraph G and transforms it to a new graph that straightforwardly represents the relationships between relations via their co-occurrence in G, (ii) propagates features between relations by multiple message passing layers, which are optimized with a graph pruning strategy for higher computation efficiency and a neighborhood attention mechanism for higher accuracy, and (iii) computes the target triple's score using the output representation of the target relation, during which the inductive embeddings of unseen relations are obtained through