Transformer-based Entity Typing in Knowledge Graphs

Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li∗, Jeff Z. Pan∗
School of Computer and Information Technology, Shanxi University, China
School of Computer Science and Informatics, Cardiff University, UK
ILCC, School of Informatics, University of Edinburgh, UK
zhiweihu@whu.edu.cn,liru@sxu.edu.cn
{gutierrezbasultov,xiangz6}@cardiff.ac.uk
j.z.pan@ed.ac.uk
Abstract
We investigate the knowledge graph entity typing task, which aims at inferring plausible entity types. In this paper, we propose a novel Transformer-based Entity Typing (TET) approach that effectively encodes the content of the neighbors of an entity. More precisely, TET is composed of three different mechanisms: a local transformer, which infers missing types of an entity by independently encoding the information provided by each of its neighbors; a global transformer, which aggregates the information of all neighbors of an entity into a single long sequence to reason about more complex entity types; and a context transformer, which integrates neighbor content based on its contribution to the type inference through information exchange between neighbor pairs. Furthermore, TET uses information about the class membership of types to semantically strengthen the representation of an entity. Experiments on two real-world datasets demonstrate the superior performance of TET compared to the state-of-the-art.
1 Introduction
A knowledge graph (KG) (Pan et al., 2016) is a multi-relational graph encoding factual knowledge in the form (h, r, t), where h and t are the head and tail entities connected via the relation r. In this paper, we consider KGs with minimal schema information, i.e., those containing entity type assertions, of the form (e, has_type, c), as the only schema information, stating that the entity e has type c; e.g., to capture that Barack Obama has type President. Entity type knowledge is widely used in NLP tasks, e.g., in relation extraction (Liu et al., 2014), entity and relation linking (Gupta et al., 2017; Pan et al., 2019), question answering (ElSahar et al., 2018; Hu et al., 2022), and fine-grained entity typing on text (Onoe et al., 2021; Qian et al., 2021; Liu et al., 2021).

∗Contact authors.

Figure 1: A KG with its entity type information, showing the relational neighbors (e.g., write → A Promised Land, degree_award → Juris Doctor) and type neighbors (e.g., Politician, President) of Barack Obama, with missing types and type links marked.

However, entity types are far from complete, since in real-world applications they are continuously emerging. For example, about 10% of entities in FB15k (Bordes et al., 2013) have the type /music/artist, but do not have /people/person (Moon et al., 2017).
In light of this, the Knowledge Graph Entity Typing (KGET) task, which aims at inferring missing entity types in a KG, has recently been investigated. Most existing approaches to KGET are based on either embeddings or graph convolutional networks (GCNs). Despite the considerable progress these methods have made, some important challenges remain to be solved. On the one hand, most embedding-based models (Moon et al., 2017; Zhao et al., 2020; Ge et al., 2021; Zhuo et al., 2022) encode all neighbors of a target entity into a single vector, but in many cases only some neighbors are necessary to infer the correct types. For example, as shown in Figure 1, to predict that the entity Barack Obama has type President, only the neighbor is_leader_of U.S. is needed. Indeed, using too many neighbors, such as graduate_from Columbia University, will introduce noise. The CET model (Pan et al., 2021) overcomes this problem by encoding each neighbor independently.
arXiv:2210.11151v1 [cs.AI] 20 Oct 2022

However, since entities and relations are represented using TransE (Bordes et al., 2013), the direction of the interaction between entity and relation representations is fixed, either from entity to relation or vice versa. As a consequence, certain interactions between neighboring entities and relations are ignored. Also, to predict more complex types, CET directly adds and averages the neighbor representations, weakening the contribution of individual neighbors, since it ignores that different neighbors may contribute differently to different types. For example, as shown in Figure 1, inferring the type 20th-century American writers involves multiple semantic aspects of Barack Obama: it requires jointly considering the neighbors write A Promised Land, was_born_in 1961, and is_leader_of U.S., while the neighbor degree_award Juris Doctor should receive less attention. On the other hand, GCN frameworks for KGET use expressive representations for entities and relations based on their neighboring entities and relations (Jin et al., 2019; Zhao et al., 2022; Zou et al., 2022; Vashishth et al., 2020; Pan et al., 2021). However, a common problem of GCN-based models is that they aggregate information only along the paths starting from the neighbors of the target entity, limiting the representation of interdependence between neighbors that are not directly connected. For example, in Figure 1 the entities Juris Doctor and U.S. are not connected, but combining their information could help infer that American Legal Scholars is a type of Barack Obama. This could be fixed by increasing the number of GCN layers, but at additional computational cost.
The main objective of this paper is to introduce a transformer-based approach to KGET that addresses the highlighted challenges. The transformer architecture (Vaswani et al., 2017) has been essential for NLP, e.g., in pre-trained language models (Devlin et al., 2019; Reimers and Gurevych, 2019; Lan et al., 2020; Wu et al., 2021a), document modeling (Wu et al., 2021b), and link prediction (Wang et al., 2019; Chen et al., 2021). Transformers are well-suited for KGET, as entities and relations in a KG can be regarded as tokens; using a transformer as encoder, one can thus achieve bidirectional deep interaction between entities and relations. Specifically, we propose TET, a Transformer-based Entity Typing model for KGET, composed of the following three inference modules. A local transformer independently encodes the relational and type neighbors of an entity into a sequence, facilitating bidirectional interaction between elements within the sequence, thus addressing the first problem. A global transformer aggregates all neighbors of an entity into a single long sequence to simultaneously consider multiple attributes of the entity, allowing it to infer more ‘complex’ types, thus addressing the third problem. A context transformer aggregates the neighbors of an entity in a differentiated manner according to their contribution, while preserving the graph structure, thus addressing the second problem. Furthermore, we use semantic knowledge about the known types in a KG. In particular, we observe that types are normally clustered in classes. For example, the types medicine/disease, medicine/symptom, and medicine/drug belong to the class medicine. We use this class membership information to replace the ‘generic’ relation has_type with a more fine-grained relation that captures to which class a type belongs, enriching the semantic content of connections between entities and types. To sum up, our contributions are:

- We propose a novel transformer-based framework for inferring missing entity types in KGs, encoding knowledge about entity neighbors from three different perspectives.
- We use class membership of types to replace the single has_type relation with class-membership relations providing fine-grained semantic information.
- We conduct empirical and ablation experiments on two real-world datasets, demonstrating the superiority of TET over existing SoTA models.

Data, code, and an extended version with appendix are available at https://github.com/zhiweihu1103/ET-TET.
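To make the three encoding perspectives concrete, the following toy sketch shows how the input sequences of the three modules could be assembled from an entity's neighbors. This is purely our illustration with hypothetical neighbor data; the actual TET encoders, embeddings, and attention mechanisms are defined in Section 3.2.

```python
from itertools import combinations

# Hypothetical (relation, neighbor) pairs for Barack Obama, taken from Figure 1.
neighbors = [
    ("write", "A Promised Land"),
    ("was_born_in", "1961"),
    ("degree_award", "Juris Doctor"),
]
entity = "Barack Obama"

# Local: one short token sequence per neighbor, encoded independently,
# so a single decisive neighbor is not drowned out by the others.
local_sequences = [[entity, r, f] for (r, f) in neighbors]

# Global: all neighbors flattened into a single long sequence, so the
# encoder can jointly consider multiple aspects of the entity.
global_sequence = [entity] + [tok for (r, f) in neighbors for tok in (r, f)]

# Context: neighbor pairs, so information can be exchanged between
# neighbors that are not directly connected in the graph.
context_pairs = list(combinations(neighbors, 2))

print(local_sequences[0])    # ['Barack Obama', 'write', 'A Promised Land']
print(len(global_sequence))  # 7
print(len(context_pairs))    # 3
```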
2 Related Work
The knowledge graph completion (KGC) task is usually concerned with predicting the missing head or tail entity of a triple. KGET can thus be seen as a specialization of KGC. Existing KGET methods can be classified as embedding-based or GCN-based.
Embedding-based Methods.
ETE (Moon et al.,
2017) learns entity embeddings for KGs by a stan-
dard representation learning method (Bordes et al.,
2013), and further builds a mechanism for infor-
mation exchange between entities and their types.
ConnectE (Zhao et al.,2020) jointly embeds enti-
ties and types into two different spaces and learns
a mapping from the entity space to the type space.
CORE (Ge et al.,2021) utilizes the models Ro-
tatE (Sun et al.,2019) and ComplEx (Trouillon
et al.,2016) to embed entities and types into two
different complex spaces, and develops a regression
model to link them. However, the above methods do not fully consider the known types of entities when training entity representations, which seriously affects the prediction of missing types. Also, types in these methods are represented in a way that does not allow them to be semantically differentiated. CET (Pan et al., 2021)
jointly utilizes information about existing type as-
sertions in a KG and about the neighborhood of
entities by respectively employing an independent-
based mechanism and an aggregated-based one. It
also utilizes a pooling method to aggregate their
inference results. AttEt (Zhuo et al.,2022) designs
an attention mechanism to aggregate the neighbor-
hood knowledge of an entity using type-specific
weights, which are beneficial to capture specific
characteristics of different types. A shortcoming of
these two methods is that, unlike our TET model,
they are not able to cluster types in classes, and are
thus not able to semantically differentiate them in
a fine-grained way.
GCN-based Methods.
Graph Convolutional
Networks (GCNs) have proven effective on mod-
eling graph structures (Kipf and Welling,2017;
Hamilton et al.,2017;Dettmers et al.,2018). How-
ever, directly using GCNs on KGs usually leads to
poor performance since KGs have different kinds
of entities and relations. To address this problem,
RGCN (Schlichtkrull et al.,2018) proposes to ap-
ply relation-specific transformations in GCN’s ag-
gregation. HMGCN (Jin et al.,2019) proposes a
hierarchical multi-graph convolutional network to
embed multiple kinds of semantic correlations be-
tween entities. CompGCN (Vashishth et al.,2020)
uses composition operators from KG-embedding
methods by jointly embedding both entities and
relations in a relational graph. ConnectE-MRGAT
(Zhao et al.,2022) proposes a multiplex relational
graph attention network to learn on heterogeneous
relational graphs, and then utilizes the ConnectE
method for inferring entity types. RACE2T (Zou
et al.,2022) introduces a relational graph attention
network method, utilizing the neighborhood and
relation information of an entity for type inference.
A common problem with these methods is that they
follow a simple single-layer attention formulation,
restricting the information transfer between uncon-
nected neighbors of an entity.
Transformer-based Methods.
To the best of
our knowledge, there are no transformer-based
approaches to KGET. However, two transformer-
based frameworks for the KGC task have been al-
ready proposed: CoKE (Wang et al.,2019) and
HittER (Chen et al.,2021). Our experiments show
that they are not suitable for KGET.
3 Method
In this section, we describe the architecture of our
TET model (cf. Figure 2). We start by introducing
necessary background (Sec. 3.1), then present in
detail the architecture of TET (Sec. 3.2). Finally,
we describe pooling and optimization strategies
(Sec. 3.3 and 3.4).
3.1 Background
In this paper, a knowledge graph (Pan et al., 2016) is represented in a standard format for graph-structured data such as RDF (Pan, 2009). A knowledge graph (KG) G is a tuple (E, R, C, T), where E is a set of entities, C is a set of entity types, R is a set of relation types, and T is a set of triples. Triples in T are either relation assertions (h, r, t), where h, t ∈ E are respectively the head and tail entities of the triple and r ∈ R is the edge of the triple connecting head and tail; or entity type assertions (e, has_type, c), where e ∈ E, c ∈ C, and has_type is the instance-of relation. For e ∈ E, the relational neighbors of e are the set {(r, f) | (e, r, f) ∈ T}. The type neighbors of e are defined as {(has_type, c) | (e, has_type, c) ∈ T}. We will simply say neighbors of e when referring to both the relational and type neighbors of e. The goal of this paper is to address the KGET task, which aims at inferring missing types from C in entity type assertions.
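The definitions above can be illustrated with a minimal sketch, using toy triples from Figure 1 and hypothetical helper names (this is not the authors' code):

```python
HAS_TYPE = "has_type"

# Toy triple set T, mixing relation assertions and entity type assertions.
triples = {
    ("Barack Obama", "write", "A Promised Land"),
    ("Barack Obama", "was_born_in", "1961"),
    ("Barack Obama", HAS_TYPE, "Politician"),
    ("Barack Obama", HAS_TYPE, "President"),
}

def relational_neighbors(e, T):
    """{(r, f) | (e, r, f) in T} over ordinary relation assertions."""
    return {(r, f) for (h, r, f) in T if h == e and r != HAS_TYPE}

def type_neighbors(e, T):
    """{(has_type, c) | (e, has_type, c) in T}."""
    return {(r, c) for (h, r, c) in T if h == e and r == HAS_TYPE}

print(sorted(relational_neighbors("Barack Obama", triples)))
print(sorted(type_neighbors("Barack Obama", triples)))
```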
3.2 Model Architecture
In this section, we introduce the local, global and
context transformer-based modeling components of
our TET model. Before defining these components,
we start by discussing an important observation.
3.2.1 Class Membership
A key observation is that in a KG all type assertions
are uniformly defined using the relation has_type.
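The class-membership observation from the introduction (e.g., medicine/disease, medicine/symptom, and medicine/drug all belong to the class medicine) amounts to reading the class off a type's prefix. The following is a minimal sketch, assuming '/'-separated Freebase-style type names and a hypothetical naming scheme for the refined relations; the paper's actual construction may differ:

```python
def type_class(type_name: str) -> str:
    """Read the class off a '/'-separated type path,
    e.g. '/medicine/disease' -> 'medicine'."""
    return type_name.strip("/").split("/")[0]

def refine_type_assertion(entity, type_name):
    """Replace the generic has_type edge with a class-specific relation
    (hypothetical 'has_type_<class>' naming), e.g.
    (e, has_type, /medicine/disease) -> (e, has_type_medicine, /medicine/disease)."""
    return (entity, f"has_type_{type_class(type_name)}", type_name)

print(refine_type_assertion("Aspirin", "/medicine/drug"))
# -> ('Aspirin', 'has_type_medicine', '/medicine/drug')
```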