RulE: Knowledge Graph Reasoning with Rule Embedding
Xiaojuan Tang,1,3 Song-Chun Zhu,1,2,3 Yitao Liang,*1,3 Muhan Zhang*1,3
1Institute for Artificial Intelligence, Peking University  2Tsinghua University
3National Key Laboratory of General Artificial Intelligence, BIGAI
1xiaojuan@stu.pku.edu.cn  1{muhan,yitaol,s.c.zhu}@pku.edu.cn
3{tangxiaojuan,sczhu,liangyitao,mhzhang}@bigai.ai
Abstract
Knowledge graph reasoning is an important problem for knowledge graphs. In this paper, we propose a novel and principled framework called RulE (stands for Rule Embedding) to effectively leverage logical rules to enhance KG reasoning. Unlike knowledge graph embedding methods, RulE learns rule embeddings from existing triplets and first-order rules by jointly representing entities, relations and logical rules in a unified embedding space. Based on the learned rule embeddings, a confidence score can be calculated for each rule, reflecting its consistency with the observed triplets. This allows us to perform logical rule inference in a soft way, thus alleviating the brittleness of logic. On the other hand, RulE injects prior logical rule information into the embedding space, enriching and regularizing the entity/relation embeddings. This makes KGE alone perform better too. RulE is conceptually simple and empirically effective. We conduct extensive experiments to verify each component of RulE. Results on multiple benchmarks reveal that our model outperforms the majority of existing embedding-based and rule-based approaches. The code is released at https://github.com/XiaojuanTang/RulE.
1 Introduction
Knowledge graphs (KGs) usually store millions of real-world facts and are used in a variety of applications (Wang et al., 2018; Bordes et al., 2014; Xiong et al., 2017). Examples of knowledge graphs include Freebase (Bollacker et al., 2008), WordNet (Miller, 1995) and YAGO (Suchanek et al., 2007). They represent entities as nodes and relations among entities as edges. Each edge encodes a fact in the form of a triplet (head entity, relation, tail entity). However, KGs are usually highly incomplete, making their downstream tasks more challenging. Knowledge graph reasoning, which predicts missing facts by reasoning on existing facts, has thus become a popular research area in artificial intelligence.

*Corresponding authors
There are two prominent lines of work in this area: knowledge graph embedding (KGE) and rule-based KG reasoning. Knowledge graph embedding (KGE) methods such as TransE (Bordes et al., 2013), RotatE (Sun et al., 2019) and BoxE (Abboud et al., 2020) embed entities and relations into a latent space and compute a score for each triplet to quantify its plausibility. KGE is efficient and robust to noise. However, it only uses zeroth-order (propositional) logic to encode existing facts (e.g., "Alice is Bob's wife.") without explicitly leveraging first-order (predicate) logic. First-order logic uses the universal quantifier to represent generally applicable logical rules, for instance, "∀x, y: x is y's wife ⇒ y is x's husband". Those rules are not specific to particular entities (e.g., Alice and Bob) but are generally applicable to all entities. The other line of work, rule-based KG reasoning, in contrast, explicitly applies logical rules to infer new facts (Galárraga et al., 2013, 2015; Yi et al., 2018; Sadeghian et al., 2019; Qu et al., 2020). Unlike KGE, logical rules can achieve interpretable reasoning and generalize to new entities. However, the brittleness of logical rules greatly harms prediction performance. Consider the logical rule (x, works in, y) ⇒ (x, lives in, y) as an example. It is mostly correct. Yet, if somebody works in New York but actually lives in New Jersey, the rule still asserts the wrong fact with absolute certainty.
Considering that the aforementioned two lines of work can complement each other, addressing each other's weaknesses with their own merits, it becomes imperative to study how to integrate logical rules with KGE methods in a principled manner. If we view this integration in a broader context, embedding-based reasoning can be seen as a neural method, while rule-based reasoning can be seen as a symbolic method.
[Figure 1 image: entities Turing and UK; relations BornIn, CityOf and Nationality; Rule1: BornIn ∧ CityOf ⇒ Nationality, supporting the triplet (Turing, Nationality, UK); panels (a) Traditional KGE and (b) Our RulE.]
Figure 1: (a) Traditional KGE methods embed entities and relations as low-dimensional vectors only using existing triplets, by defining operations between entities and relations (e.g., translation); (b) our RulE associates each rule with an embedding and additionally defines mathematical operations between relations and logical rules (e.g., multi-step translation) to leverage first-order logical rules.
Neural-symbolic learning has also been a focus of artificial intelligence research in recent years (Parisotto et al., 2017; Yi et al., 2018; Manhaeve et al., 2018; Xu et al., 2018; Hitzler, 2022).
In the KG domain, such efforts exist too. Some works combine logical rules and KGE by using rules to infer new facts as additional training data for KGE (Guo et al., 2016, 2018) or directly convert some rules into regularization terms for specific KGE models (Ding et al., 2018; Guo et al., 2020). However, they both leverage logical rules merely to enhance KGE training without actually using logical rules to perform reasoning. In this way, they might lose the important information contained in explicit rules, leading to empirically worse performance than state-of-the-art methods.
To address the aforementioned limitations, we propose a simple and principled framework called RulE, which aims to learn rule embeddings by jointly representing entities, relations and logical rules in a unified space. As illustrated in Figure 1, given a KG and logical rules, RulE assigns an embedding to each entity, relation and rule, and defines respective mathematical operators between entities and relations (the traditional KGE part) as well as between relations and rules (the RulE part). It is important to note that we cannot define operators between entities and rules because rules are not specific to particular entities. By jointly optimizing entity, relation and rule embeddings in the same space, RulE allows injecting prior logical rule information to enrich and regularize the embedding space. Our experiments reveal that this joint embedding can boost KGE methods themselves. Additionally, based on the relation and rule embeddings, RulE is able to give a confidence score to each rule, similar to how KGE gives each triplet a confidence score. This confidence score reflects how consistent a rule is with the existing facts, and enables performing logical rule inference in a soft way by softly controlling the contribution of each rule, which alleviates the brittleness of logic.
We evaluate RulE on benchmark link prediction tasks and show superior performance. Experimental results reveal that our model outperforms the majority of existing embedding-based and rule-based methods. We also conduct extensive ablation studies to demonstrate the effectiveness of each component of RulE. All the empirical results verify that RulE is a simple and effective framework for neural-symbolic KG reasoning.
2 Preliminaries
A KG consists of a set of triplets $\mathcal{K} = \{(h, r, t) \mid h, t \in \mathcal{E}, r \in \mathcal{R}\} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$, where $\mathcal{E}$ denotes the set of entities and $\mathcal{R}$ the set of relations. For a testing triplet $(h, r, t)$, we define a query as $q = (h, r, ?)$. The knowledge graph reasoning (link prediction) task is to infer the missing entity $t$ based on the existing facts and rules.
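To make the notation concrete, the following is a minimal Python sketch (with hypothetical entity and relation names) of a KG stored as a set of triplets, together with a link-prediction query whose missing tail is ranked over all candidate entities:

    # Toy KG K as a set of (head, relation, tail) triplets; names are illustrative only.
    kg = {
        ("Turing", "BornIn", "London"),
        ("London", "CityOf", "UK"),
        ("Turing", "Nationality", "UK"),
    }
    entities = {h for h, _, _ in kg} | {t for _, _, t in kg}
    relations = {r for _, r, _ in kg}

    # A query q = (h, r, ?): link prediction scores and ranks every candidate tail.
    query = ("Turing", "Nationality", "?")
    candidates = sorted(entities)
    print(candidates)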
2.1 Embedding-based reasoning
Knowledge graph embedding (KGE) represents entities and relations as embeddings in a continuous space. It calculates a score for each triplet based on these embeddings via a scoring function. The embeddings are trained so that facts observed in the KG have higher scores than those not observed. The learning goal here is to maximize the scores of positive facts (existing triplets) and minimize those of sampled negative samples.

RotatE (Sun et al., 2019) is a representative KGE method with competitive performance on common benchmark datasets. It maps entities into a complex space and defines relations as element-wise rotations in each two-dimensional complex plane. Each entity and each relation is associated with a complex vector, i.e., $\mathbf{h}, \mathbf{r}, \mathbf{t} \in \mathbb{C}^k$, where the modulus of each element in $\mathbf{r}$ is fixed to 1 (multiplying a complex number with a unitary complex number is equivalent to a 2D rotation). If a triplet $(h, r, t)$ holds, it is expected that $\mathbf{t} \approx \mathbf{h} \circ \mathbf{r}$ in the complex space, where $\circ$ denotes the Hadamard (element-wise) product. Formally, the distance function of RotatE is defined as:

$$d(\mathbf{h}, \mathbf{r}, \mathbf{t}) = \|\mathbf{h} \circ \mathbf{r} - \mathbf{t}\|. \tag{1}$$
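As a concrete illustration, here is a minimal NumPy sketch of the RotatE distance in Equation (1); the dimension, random initialization and variable names are illustrative assumptions, not the authors' implementation:

    import numpy as np

    k = 4  # embedding dimension (illustrative)
    rng = np.random.default_rng(0)

    # Entities are arbitrary complex vectors; relations are unit-modulus complex
    # vectors e^{i*theta}, i.e. pure rotations in each 2D complex plane.
    h = rng.normal(size=k) + 1j * rng.normal(size=k)
    t = rng.normal(size=k) + 1j * rng.normal(size=k)
    theta = rng.uniform(-np.pi, np.pi, size=k)
    r = np.exp(1j * theta)  # |r_i| = 1 for every element

    def rotate_distance(h, r, t):
        """d(h, r, t) = || h o r - t ||, with 'o' the element-wise product."""
        return np.linalg.norm(h * r - t)

    print(rotate_distance(h, r, t))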
2.2 Rule-based reasoning
Logical rules are usually expressed as first-order logic formulae, e.g., $\forall x, y, z: (x, r_1, y) \land (y, r_2, z) \Rightarrow (x, r_3, z)$, or $r_1(x, y) \land r_2(y, z) \Rightarrow r_3(x, z)$ for brevity. The left-hand side of the implication "$\Rightarrow$" is called the rule body or premise, and the right-hand side is the rule head or conclusion. Logical rules are often restricted to be closed, which form chains. For a chain rule, successive relations share intermediate entities (e.g., $y$), and the rule head's and rule body's head/tail entities are the same. Chain rules include common logical rules in KGs such as symmetry, inversion, composition, hierarchy, and intersection rules. These rules play an important role in KG reasoning. The length of a rule is the number of atoms (relations) in its rule body. A grounding of a rule is obtained by substituting all variables $x, y, z$ with specific entities. If all triplets in the body of a grounded rule exist in the KG, we get a support of this rule. Rules that have nonzero support are called activated rules. When inferring a query $(h, r, ?)$, rule-based reasoning enumerates relation paths between the head $h$ and each candidate tail, and uses activated rules to infer the answer. See Appendix 9 for illustrative examples.
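To make grounding, support and activation concrete, here is a toy Python sketch (entity and relation names are hypothetical) that grounds a length-2 chain rule against a small triplet set and counts its supports:

    from itertools import product

    # Toy KG; names are hypothetical.
    kg = {
        ("alice", "works_in", "nyc"),
        ("nyc", "located_in", "usa"),
        ("alice", "lives_in", "usa"),
    }
    entities = {e for h, _, t in kg for e in (h, t)}

    # Chain rule: works_in(x, y) ^ located_in(y, z) => lives_in(x, z)
    body = ["works_in", "located_in"]
    head = "lives_in"

    supports = []
    for x, y, z in product(entities, repeat=3):
        # A grounding substitutes concrete entities for the variables x, y, z.
        if (x, body[0], y) in kg and (y, body[1], z) in kg:
            supports.append((x, y, z))  # all body triplets exist: a support

    print("supports:", supports)
    print("rule activated:", len(supports) > 0)
    # Activated groundings propose new facts (x, head, z) for link prediction.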
3 Method
This section introduces our proposed model RulE. RulE is a principled framework to combine KG embedding with logical rules by learning rule embeddings. As illustrated in Figure 2, the training process of RulE consists of three key components. Consider a KG containing triplets and a set of logical rules automatically extracted or predefined by experts. The three components are: 1) Joint entity/relation/rule embedding. We model the relationship between entities and relations as well as the relationship between relations and logical rules to jointly train entity, relation and rule embeddings in a continuous space, as demonstrated in Figure 1. 2) Soft rule reasoning. With the rule and relation embeddings, we calculate a confidence score for each rule, which is used as the weight of activated rules to output a grounding rule score. 3) Finally, we integrate the KGE score calculated from the entity and relation embeddings trained in the first stage and the grounding rule score obtained in the second stage to reason unknown triplets.
3.1 Joint entity/relation/rule embedding
Given a triplet $(h, r, t) \in \mathcal{K}$ and a rule $R \in \mathcal{L}$, we use $\mathbf{h}, \mathbf{r}, \mathbf{t}, \mathbf{R} \in \mathbb{C}^k$ to represent their embeddings, respectively, where $k$ is the dimension of the complex space (following RotatE). Similar to KGE, which encodes the plausibility of each triplet with a scoring function, RulE additionally defines a scoring function for logical rules. Based on the two scoring functions, it jointly learns entity, relation and rule embeddings in the same space by maximizing the plausibility of existing triplets $\mathcal{K}$ (zeroth-order logic) and logical rules $\mathcal{L}$ (first-order logic). The following describes in detail how to model the triplets and logical rules together.
Modeling the relationship between entities and relations  To model triplets, we take RotatE (Sun et al., 2019) due to its simplicity and competitive performance. Its loss function with negative sampling is defined as:

$$\mathcal{L}_t(h, r, t) = -\log \sigma(\gamma_t - d(\mathbf{h}, \mathbf{r}, \mathbf{t})) - \sum_{(h', r, t') \in N} \frac{1}{|N|} \log \sigma(d(\mathbf{h}', \mathbf{r}, \mathbf{t}') - \gamma_t), \tag{2}$$

where $\gamma_t$ is a fixed triplet margin, $d(\mathbf{h}, \mathbf{r}, \mathbf{t})$ is the distance function defined in Equation (1), and $N$ is the set of negative samples constructed by replacing either the head entity or the tail entity with a random entity using a self-adversarial negative sampling approach. Note that RulE is not restricted to particular KGE models. RotatE can be replaced with other models, such as TransE (Bordes et al., 2013) and ComplEx (Trouillon et al., 2016), too.
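For illustration, a short PyTorch-style sketch of the triplet loss in Equation (2); the margin value and the uniform weighting of negatives are assumptions of the sketch (the text above additionally mentions self-adversarial negative sampling):

    import torch
    import torch.nn.functional as F

    def triplet_loss(d_pos, d_neg, gamma_t=9.0):
        # Eq. (2) with uniform 1/|N| weights over negatives.
        # d_pos: distance of the observed triplet (h, r, t).
        # d_neg: tensor of distances for the |N| corrupted triplets.
        pos_term = -F.logsigmoid(gamma_t - d_pos)
        neg_term = -F.logsigmoid(d_neg - gamma_t).mean()
        return pos_term + neg_term

    print(triplet_loss(torch.tensor(2.3), torch.tensor([8.1, 10.4, 7.7])))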
Modeling the relationship between relations and logical rules  A universal first-order logical rule is a rule that holds universally for all entities. Therefore, we cannot relate such a rule to specific entities. Instead, it is a higher-level concept related only to the relations it is composed of. Our modeling strategy is as follows. For a logical rule $R: r_1 \land r_2 \land \ldots \land r_l \Rightarrow r_{l+1}$, we expect that $\mathbf{r}_{l+1} \approx (\mathbf{r}_1 \circ \mathbf{r}_2 \circ \ldots \circ \mathbf{r}_l) \circ \mathbf{R}$. Because the modulus of each element in $\mathbf{r}$ is restricted to 1, the multiple rotations in the complex plane are equivalent to the summation of the corresponding angles. We define $g(\mathbf{r})$ to return the angle vector of relation $\mathbf{r}$ (taking the angle of each element of $\mathbf{r}$). Note that the Hadamard product in Equation (1) is equivalent to the angle summation expressed through $g(\cdot)$ in Equation (3). More interpretations are provided in Appendix 15.
[Figure 2 image: the RulE pipeline, showing triplets and logic rules, initialized embeddings, the triplet loss and rule loss under joint entity/relation/rule embedding, jointly trained (optimized) embeddings, rule grounding and rule confidences, a soft multi-hot encoding passed through an MLP to produce the grounding rule score, and the KGE score combined with it into the final score.]
Figure 2: Architecture of RulE. It consists of three components. 1) We first model the relationship between entities and relations as well as the relationship between relations and logical rules to learn joint entity, relation and rule embeddings in the same continuous space. With the learned rule embeddings ($\mathbf{R}$) and relation embeddings ($\mathbf{r}$), RulE can output a weight ($w$) as the confidence score of each rule. 2) In the soft rule reasoning stage, we construct a soft multi-hot encoding $\mathbf{v}$ based on rule confidences. Specifically, for the triplet $(e_1, r_3, e_6)$, only $R_1$ and $R_3$ can infer the fact, with the grounding paths $e_1 \xrightarrow{r_1} \cdot \xrightarrow{r_2} e_6$ and $e_1 \xrightarrow{r_7} \cdot \xrightarrow{r_8} e_6$ (highlighted with purple and blue). Thus, the value of $v_1$ is $w_1$, $v_3$ is $w_3$, and the others (unactivated rules) are 0. The constructed soft multi-hot encoding then passes through an MLP to output the grounding rule score. 3) Finally, RulE integrates the KGE score calculated from the entity and relation embeddings trained in the first stage and the grounding rule score obtained in the second stage to reason unknown triplets.
Then, the distance function is formulated as follows:

$$d_r(\mathbf{r}_1, \ldots, \mathbf{r}_{l+1}, \mathbf{R}) = \left\| \sum_{i=1}^{l} g(\mathbf{r}_i) + g(\mathbf{R}) - g(\mathbf{r}_{l+1}) \right\|. \tag{3}$$

We also employ negative sampling, the same as when modeling triplets. At this time, it replaces a relation (either in the rule body or the rule head) with a random relation. The loss function for logical rules is defined as:

$$\mathcal{L}_r(\mathbf{r}_1, \ldots, \mathbf{r}_{l+1}, \mathbf{R}) = -\log \sigma(\gamma_r - d_r) - \sum_{(\mathbf{r}'_1, \ldots, \mathbf{r}'_{l+1}, \mathbf{R}) \in M} \frac{1}{|M|} \log \sigma(d'_r - \gamma_r), \tag{4}$$

where $\gamma_r$ is a fixed rule margin and $M$ is the set of negative rule samples.

Note that the above strategy is not the only possible way. For example, when considering the relation order of logical rules (e.g., one's sister's mother is different from one's mother's sister), we design a variant of RulE using a position-aware sum, which shows slightly improved performance on some datasets. See Appendix 14. Nevertheless, we find that Equation (3) is simple and good enough, and thus keep it as the default choice.
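As an illustration of Equation (3), here is a minimal NumPy sketch of the rule distance; the angle-wrapping step and the random unit-modulus initializations are assumptions of the sketch rather than details stated above:

    import numpy as np

    def g(x):
        # Element-wise angle (argument) of a complex vector.
        return np.angle(x)

    def rule_distance(body, head, R):
        # d_r = || sum_i g(r_i) + g(R) - g(r_{l+1}) ||, as in Eq. (3).
        total = sum(g(r) for r in body) + g(R) - g(head)
        # Wrapping angles back into (-pi, pi] is an implementation assumption.
        total = (total + np.pi) % (2 * np.pi) - np.pi
        return np.linalg.norm(total)

    k, rng = 4, np.random.default_rng(0)
    unit = lambda: np.exp(1j * rng.uniform(-np.pi, np.pi, size=k))
    r1, r2, r3, R = unit(), unit(), unit(), unit()
    print(rule_distance([r1, r2], r3, R))  # distance for the rule r1 ^ r2 => r3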
Joint training  Given a KG containing triplets $\mathcal{K}$ and logical rules $\mathcal{L}$, we jointly optimize the two loss functions (2) and (4) to get the final entity, relation and rule embeddings:

$$\mathcal{L} = \sum_{(h, r, t) \in \mathcal{K}} \mathcal{L}_t(h, r, t) + \alpha \sum_{(r_1, \ldots, r_{l+1}, R) \in \mathcal{L}} \mathcal{L}_r(\mathbf{r}_1, \ldots, \mathbf{r}_{l+1}, \mathbf{R}), \tag{5}$$

where $\alpha$ is a hyperparameter to balance the two losses. Note that the two losses act as each other's regularization terms. The rule loss (4) cannot be optimized alone, otherwise there always exist $(\mathbf{r}_1, \ldots, \mathbf{r}_{l+1}, \mathbf{R})$ that can perfectly minimize the loss, leading to meaningless embeddings. However, when jointly optimizing it with the triplet loss, the embeddings will be regularized, and rules more consistent with the triplets tend to have lower losses (by being more easily optimized). On the other hand, the rule loss also provides a regularization to the triplet (KGE) loss by adding additional constraints that relations should satisfy. This additional information enhances the KGE training, leading to entity/relation embeddings more consistent with prior rules.
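A tiny sketch of how the joint objective in Equation (5) combines the two losses; the value of alpha and the per-instance losses below are placeholders, and batching/optimization details are assumptions of the sketch:

    import torch

    def joint_loss(triplet_losses, rule_losses, alpha=0.5):
        # Eq. (5): sum of triplet losses over K plus alpha times the sum of
        # rule losses over L; alpha balances the two objectives.
        return triplet_losses.sum() + alpha * rule_losses.sum()

    # Illustrative per-instance loss values; in practice they come from
    # Eq. (2) and Eq. (4) evaluated on minibatches of triplets and rules.
    print(joint_loss(torch.tensor([0.7, 1.2]), torch.tensor([0.3]), alpha=0.5))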
3.2 Soft rule reasoning
As shown in Figure 2, during soft rule reasoning,
we use the joint relation and rule embeddings to
compute the confidence score of each rule. Similar