SMiLE: Schema-augmented Multi-level Contrastive Learning for
Knowledge Graph Link Prediction
Miao Peng1, Ben Liu1, Qianqian Xie2, Wenjie Xu1, Hua Wang3, Min Peng1*
1School of Computer Science, Wuhan University, China
2Department of Computer Science, The University of Manchester, United Kingdom
3Centre for Applied Informatics, Victoria University, Australia
{pengmiao,liuben123,vingerxu,pengm}@whu.edu.cn
qianqian.xie@manchester.ac.uk,hua.wang@vu.edu.au
Abstract
Link prediction is the task of inferring miss-
ing links between entities in knowledge graphs.
Embedding-based methods have shown effec-
tiveness in addressing this problem by mod-
eling relational patterns in triples. However,
the link prediction task often requires con-
textual information in entity neighborhoods,
while most existing embedding-based meth-
ods fail to capture it. Additionally, little atten-
tion is paid to the diversity of entity represen-
tations in different contexts, which often leads
to false prediction results. We consider that the schema of a knowledge graph contains such specific contextual information and is beneficial for preserving the consistency of entity representations across contexts. In this pa-
per, we propose a novel Schema-augmented
Multi-level contrastive LEarning framework
(SMiLE) to conduct knowledge graph link pre-
diction. Specifically, we first exploit network
schema as the prior constraint to sample neg-
atives and pre-train our model by employing
a multi-level contrastive learning method to
yield both prior schema and contextual infor-
mation. Then we fine-tune our model under
the supervision of individual triples to learn
subtler representations for link prediction. Ex-
tensive experimental results on four knowledge
graph datasets with thorough analysis of each
component demonstrate the effectiveness of
our proposed framework against state-of-the-
art baselines. The implementation of SMiLE is
available at https://github.com/GKNL/SMiLE.
1 Introduction
Knowledge graph (KG), as a well-structured representation of knowledge, stores a vast amount of human knowledge in the form of triples (head, relation, tail). KGs are essential components for
various artificial intelligence applications, includ-
ing question answering (Diefenbach et al.,2018),
recommendation systems (Wang et al.,2021b), etc.
*Corresponding author
[Figure 1 here: an entity-level KG fragment around Nicole Kidman (with neighbors such as The Hours, Australia, Best Actress, Stephen Daldry, Lane Cove Public School, 66th Cannes, and Phillip Street Theatre) aligned with its type-level view (Actress, Citizen, Movie, Director, Prize, Film Festival, Theatre, School, Country) via relations such as Acted in, Born in, Winner of, Educated at, Directed by, and Worked in.]
Figure 1: An example of a KG fragment. Nicole Kidman has two types, Actress and Citizen, and each of them preserves different information in different contexts.
In the real world, KGs often suffer from the incompleteness problem, meaning that a large number of valid links are missing. In this
situation, link prediction techniques, which aim to
automatically predict whether a relationship exists
between a head entity and a tail entity, are essential
for triple construction and verification.
To address the link prediction problem in KG,
a variety of methods have been proposed. Tradi-
tional rule-based methods like Markov logic net-
works (Richardson and Domingos,2006) and re-
inforcement learning-based method (Xiong et al.,
2017) learn logic rules from KGs to conduct link
prediction. The other mainstream methods are
based on knowledge graph embeddings, includ-
ing translational models like TransE (Bordes et al.,
2013), TransR (Lin et al.,2015) and semantic
matching models like RESCAL (Nickel et al.,
2011), DistMult (Yang et al.,2015). Besides,
embedding-based methods leverage graph neural
networks to explore graph topology (Vashishth
et al.,2020) and utilize type information (Ma et al.,
2017) to enhance representations in KG.
arXiv:2210.04870v3 [cs.CL] 4 Mar 2024
Nevertheless, the aforementioned methods fail
to model the contextual information in entity neigh-
bors. In fact, the context of an entity preserves
specific structural and semantic information, and
link prediction task is essentially dependent on the
contexts related to specific entities and triples. Fur-
thermore, not much attention is paid to the diver-
sity of entity representations in different contexts,
which may often result in false predictions. Quan-
titatively, dataset FB15k has 14579 entities and
154916 triples, and the number of entities with
types is 14417 (98.89%). There are 13853 entities
(95.02%) that have more than two types, and each
entity has 10.02 types on average. For example,
entity Nicole Kidman in Figure 1 has two different
types (Actress and Citizen), expressing different
semantics in two different contexts. Specifically,
the upper left in the figure describes the contextual
information in type level about "Awards and works
of Nicole Kidman as an actress". In this case, it is
well-founded that there exists a relation between
Nicole Kidman and 66th Cannes, and intuitively
the prediction of (Nicole Kidman, ?, Lane Cove
Public School) does not make sense, since there
is no direct relationship between type Actress and
type School. But considering that Nicole Kidman
is also an Australian citizen, it is hence reasonable
to conduct such a prediction.
We argue that the key challenge of preserving
contextual information in embeddings is how to
encapsulate complex contexts of entity neighbor-
hoods. Simply considering all information in the
subgraph of entities as the context may bring in
redundant and noisy information. Schema, as a
high-order meta pattern of KG, contains the type
constraint between entities and relations, and it can
naturally be used to capture the structural and se-
mantic information in context. As for the problem of inconsistent entity representations, it is indispensable to consider the diverse representations of an entity in different contexts. As different schemas define diverse type restrictions between entities, they can preserve subtle and precise semantic information in a specific context. Additionally, to yield consistent and robust entity representations for each contextual semantics, entities in contexts of the same schema are supposed to share similar features while being disparate across different contexts.
To tackle the aforementioned issues, inspired by advanced contrastive learning techniques, we propose a novel schema-augmented multi-level contrastive learning framework for efficient link prediction in KGs. To address the in-
completeness problem of KG schema, we first ex-
tract and build a <head_type, relation, tail_type>
tensor from an input KG (Rosso et al.,2021)
to represent the high-order schema information.
Then, we design a multi-level contrastive learning
method under the guidance of schema. Specifically,
we optimize the contrastive learning objective at the contextual level and the global level of our model separately. At the contextual level, contrasting entities within subgraphs of the same schema learns semantic and structural characteristics in a specific context. At the global level, differences and global connections between contexts of an entity are captured via a cross-view contrast. Overall, we
exploit the aforementioned contrastive strategy to
obtain entity representations with structural and
high-order semantic information in the pre-train
phase and then fine-tune representations of entities
and relations to learn subtler knowledge of KG.
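Both contrastive levels described above can be instantiated with an InfoNCE-style objective; the sketch below is illustrative (the temperature value and cosine similarity are assumptions here, not necessarily the paper's exact formulation):

```python
import numpy as np

def info_nce(anchor, positives, negatives, tau=0.5):
    """InfoNCE-style contrastive loss for one anchor embedding.

    anchor:    (d,) embedding of the target entity
    positives: (p, d) embeddings from the same schema context
    negatives: (n, d) embeddings sampled under the schema constraint
    """
    def sim(a, b):
        # cosine similarity between the anchor and each row of b
        a = a / np.linalg.norm(a)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return b @ a

    pos = np.exp(sim(anchor, positives) / tau)
    neg = np.exp(sim(anchor, negatives) / tau)
    # pull positives closer, push schema-inconsistent negatives apart
    return -np.log(pos.sum() / (pos.sum() + neg.sum()))
```

At the contextual level the positives would come from subgraphs sharing the same schema, while at the global level the contrast would run across an entity's different context views.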
To summarize, we make three major contribu-
tions in this work as follows:
- We propose a novel multi-level contrastive learning framework to preserve contextual information in entity embeddings. Furthermore, we learn different entity representations from different contexts.
- We design a novel approach to sample hard negatives by utilizing the KG schema as a prior constraint, and perform contrastive estimation at both the contextual level and the global level, pulling the embeddings of entities in the same context closer while pushing apart entities in dissimilar contexts.
- We conduct extensive experiments on four different kinds of knowledge graph datasets and demonstrate that our model outperforms state-of-the-art baselines on the link prediction task.
2 Related Work
2.1 KG Inference
To conduct inference like link prediction on in-
complete KG, most traditional methods enumerate
relational paths as candidate logic rules, including
Markov logic network (Richardson and Domin-
gos,2006), rule mining algorithm (Meilicke et al.,
2019) and path ranking algorithm (Lao et al.,2011).
However, these rule-based methods suffer from limited generalization performance due to the time-consuming search over a large rule space.
The other mainstream methods are based on re-
inforcement learning, which defines the problem as
a sequential decision-making process (Xiong et al.,
2017;Lin et al.,2018). They train a pathfinding
agent and then extract logic rules from reasoning
paths. However, the reward signal in these methods
can be exceedingly sparse.
2.2 KG Embedding Models
Various methods have been explored to perform KG inference based on KG embeddings.
Translation-based models including TransE (Bor-
des et al.,2013), TransR (Lin et al.,2015) and
RotatE (Sun et al.,2019) model the relation as a
translation operation from head entity to tail entity.
Semantic matching methods like DistMult (Yang
et al.,2015) and QuatE (Zhang et al.,2019) mea-
sure the authenticity of triples through a similar-
ity score function. GNN-based methods are pro-
posed to comprehensively exploit structural infor-
mation of neighbors by a message-passing mech-
anism. R-GCN (Schlichtkrull et al.,2018) and
CompGCN (Vashishth et al.,2020) employ GCNs
to model multi-relational KG.
More recently, some methods integrate auxil-
iary information into KG embeddings. JOIE (Hao
et al.,2019) considers ontological concepts as sup-
plemental knowledge in representation learning.
TransT (Ma et al.,2017) and TKRL (Xie et al.,
2016) leverage rich information in entity types to
enhance representations. Nevertheless, although these graph-based methods capture relational and structural information, they fail to model the contextual semantics and schema information in a KG.
2.3 Graph Contrastive Learning
Contrastive learning is an effective technique to
learn representation by contrasting similarities be-
tween positive and negative samples (Le-Khac
et al.,2020). More recently, the self-supervised
contrastive learning method has been introduced
into graph representation area. HeCo (Wang et al.,
2021c) proposes a co-contrastive learning strategy
for learning node representations from the meta-
path view and schema view. CPT-KG (Jiang et al.,
2021b) and PTHGNN (Jiang et al.,2021a) optimize
contrastive estimation on node feature level to pre-
train GNNs on heterogeneous graphs. Furthermore,
Ouyang et al. (2021) propose a hierarchical contrastive model to deal with representation learning on imperfect KGs. SimKGC (Wang et al., 2022) ex-
plores a more effective contrastive learning method
for text-based knowledge representation learning
with pre-trained language models.
3 The Proposed SMiLE Framework
In this section, we first present notations related
to this work. Then we introduce the detail and
training strategy of our proposed framework. The
overall architecture of SMiLE is shown in Figure 2.
3.1 Notations
A knowledge graph can be defined as G = (E, R, T, P), where E and R denote the sets of entities and relations, respectively, T represents the collection of triples (s, r, o), and P is the set of all entity types. Each entity s (or o) ∈ E has one or multiple types ts1, ts2, ..., tsn ∈ P.
The goal of our SMiLE model is to learn structure- and context-preserving entity representations to perform effective link prediction in knowledge graphs, i.e., to infer missing links in an incomplete G. Ideally, the probability scores of positive triples are supposed to be higher than those of corrupted negative ones.
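Under these notations, a KG instance G = (E, R, T, P) can be held in plain containers; a minimal sketch (the class and field names are illustrative, not the authors' implementation):

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["s", "r", "o"])  # (head, relation, tail)

class KG:
    def __init__(self, triples, entity_types):
        # T: the collection of (s, r, o) triples
        self.triples = [Triple(*t) for t in triples]
        # E and R are induced from the triples
        self.entities = {t.s for t in self.triples} | {t.o for t in self.triples}
        self.relations = {t.r for t in self.triples}
        # each entity carries one or multiple types drawn from P
        self.entity_types = entity_types  # dict: entity -> set of types
        self.types = set().union(*entity_types.values()) if entity_types else set()

kg = KG(
    triples=[("NicoleKidman", "acted_in", "TheHours"),
             ("NicoleKidman", "born_in", "Australia")],
    entity_types={"NicoleKidman": {"Actress", "Citizen"},
                  "TheHours": {"Movie"},
                  "Australia": {"Country"}},
)
```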
Context Subgraph. Given an entity s, we regard its k-hop neighbors with related edges as its context subgraph, denoted as gc(s). Likewise, we define the context subgraph between two entities s and o as the k-hop neighbors connecting s and o via several relations, represented as gc(s, o).
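A k-hop context subgraph such as gc(s) can be extracted with a breadth-first traversal; a minimal sketch over an edge list (treating relations as traversable in both directions is an assumption here):

```python
from collections import deque

def context_subgraph(triples, s, k):
    """Return the edges of gc(s): all triples whose endpoints lie
    within the k-hop neighborhood of entity s."""
    # adjacency over the undirected view of the graph
    adj = {}
    for h, r, t in triples:
        adj.setdefault(h, []).append((r, t))
        adj.setdefault(t, []).append((r, h))
    # BFS up to depth k starting from s
    depth, queue = {s: 0}, deque([s])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue
        for _, v in adj.get(u, []):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    # keep triples whose endpoints both fall inside the neighborhood
    return [(h, r, t) for h, r, t in triples if h in depth and t in depth]
```

The same routine applied to the union of the neighborhoods of s and o would yield a sketch of gc(s, o).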
Knowledge Graph Schema. The schema of a KG can be defined as S = (P, R), where P is the set of all entity types and R is the set of all relations. Consequently, the schema of a KG can be characterized as a set of entity-typed triples (ts, r, to), meaning that an entity s of type ts is connected to an entity o of type to via a relation r.
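A schema in this triple form can serve as a type-level validity check on candidate triples, which is one way it can act as a prior constraint when sampling negatives; a hedged sketch with toy data (not necessarily the paper's exact sampling procedure):

```python
def schema_consistent(triple, entity_types, schema):
    """Check whether (s, r, o) is compatible with the schema S, i.e.
    some combination of the two entities' types appears in S."""
    s, r, o = triple
    return any((ts, r, to) in schema
               for ts in entity_types[s]
               for to in entity_types[o])

# illustrative toy data based on the Figure 1 example
schema = {("Actress", "winner_of", "Prize")}
entity_types = {"NicoleKidman": {"Actress", "Citizen"},
                "BestActress": {"Prize"},
                "LaneCoveSchool": {"School"}}

ok = schema_consistent(("NicoleKidman", "winner_of", "BestActress"),
                       entity_types, schema)   # schema-consistent
bad = schema_consistent(("NicoleKidman", "winner_of", "LaneCoveSchool"),
                        entity_types, schema)  # violates the type constraint
```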
3.2 Network Schema Construction
Since some existing KGs do not contain a complete schema, inspired by RETA (Rosso et al., 2021), we design a simple but effective approach to construct the schema S from a KG G.
First, for all triples (s, r, o) in the KG, we convert each entity to its corresponding types, so that all entity-typed triples form a typed collection S = {(ts, r, to) | (ts, r, to) ∈ P × R × P}. Noticing that each entity in a KG may have multiple types, we take each combination of entity types in an entity-typed
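The type-conversion step above, including the expansion over every combination of entity types, can be sketched as follows (a minimal illustration, not the authors' implementation):

```python
from itertools import product

def build_schema(triples, entity_types):
    """Build the schema S as a set of (head_type, relation, tail_type)
    triples, expanding every type combination for multi-typed entities."""
    schema = set()
    for s, r, o in triples:
        # an entity with multiple types contributes one typed triple
        # per (head type, tail type) combination
        for ts, to in product(entity_types[s], entity_types[o]):
            schema.add((ts, r, to))
    return schema
```

For a head entity with two types, a single KG triple thus yields two entity-typed schema triples.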