Entity Aware Negative Sampling with Auxiliary Loss of False Negative
Prediction for Knowledge Graph Embedding
Sang-Hyun Je
Kakao Enterprise Corp.
shje65@gmail.com
Abstract
Knowledge graph (KG) embedding is widely used in many downstream applications of KGs. Since KGs generally contain only ground-truth triples, it is necessary to construct artificial negative samples for representation learning of KGs. Recently, various methods for sampling high-quality negatives have been studied, because the quality of negative triples has a great effect on KG embedding. In this paper, we propose a novel method called Entity Aware Negative Sampling (EANS), which samples negative entities that resemble the positive one by applying a Gaussian distribution to an aligned entity index space. Additionally, we introduce an auxiliary loss for false negative prediction that can alleviate the impact of sampled false negative triples. The proposed method can generate high-quality negative samples regardless of the negative sample size and effectively mitigates the influence of false negatives. Experimental results on standard benchmarks show that EANS outperforms existing state-of-the-art negative sampling methods on several knowledge graph embedding models. Moreover, the proposed method achieves competitive performance even when the number of negative samples is limited to only one.
1 Introduction
A knowledge graph (KG) is a multi-relational directed graph that contains various entities and their relationships. Each edge of a KG describes a piece of factual information, called a fact or triple. A fact is composed of two entities and the relation corresponding to their relationship, and is represented in the form (head entity, relation, tail entity), denoted as (h, r, t), e.g., (Christopher Nolan, DirectorOf, Interstellar). Freebase (Bollacker et al., 2008), YAGO (Suchanek et al., 2007), and DBpedia (Auer et al., 2007) are examples of large-scale knowledge graphs that contain real-world information.

Figure 1: Overview of the proposed entity aware negative sampling method.

Recently, KGs have been actively used to inject structured knowledge into target systems in various fields such as recommendation (Wang et al., 2019; Xu et al., 2020; Zhang et al., 2018), question answering (Huang et al., 2019; Saxena et al., 2020), and natural language generation (Liu et al., 2021; Wu et al., 2020).
KGs are usually incomplete because there are inevitably missing links between entities. To predict these missing links, many link prediction (a.k.a. knowledge graph embedding) models embed the elements of a KG into a low-dimensional vector space. Since KGs basically contain only true triples, most existing knowledge graph embedding models are trained in a contrastive learning manner that widens the gap between the scores of true and false triples.
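To make this training setup concrete, here is a minimal sketch (not the paper's implementation) of the contrastive objective using a TransE-style score and a margin ranking loss; the function names and the margin value are our illustrative choices:

```python
import torch

def transe_score(h, r, t):
    # TransE-style score: negative L2 distance; a higher score
    # means the triple (h, r, t) is considered more plausible.
    return -torch.norm(h + r - t, p=2, dim=-1)

def margin_loss(pos_score, neg_score, margin=1.0):
    # Push the positive score above the negative score by at least
    # `margin`; trivial negatives already satisfying the margin
    # contribute zero loss (the problem uniform sampling runs into).
    return torch.relu(margin - pos_score + neg_score).mean()

# Toy batch of 4 triples with 50-dimensional embeddings.
h = torch.randn(4, 50)
r = torch.randn(4, 50)
t = torch.randn(4, 50)
t_neg = torch.randn(4, 50)  # corrupted tails

loss = margin_loss(transe_score(h, r, t), transe_score(h, r, t_neg))
```

Note how a negative that scores far below the positive yields exactly zero loss, which is why sampling informative negatives matters as training progresses.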
Obviously, the quality of negative samples is critical to learning knowledge graph embeddings. Nevertheless, constructing negative samples by replacing the head or tail entity of a positive triple with a random entity drawn from all entities of the KG remains the most widely used approach because of its efficiency. Such uniform random sampling can be effective at the beginning of training, but as training progresses, the trivial negative samples lose their effectiveness and yield zero loss for the training model (Wang et al., 2018). To resolve this problem, many studies have proposed ways to construct more meaningful negatives.

arXiv:2210.06242v1 [cs.LG] 12 Oct 2022
To generate high-quality negative samples, we propose an entity aware negative sampling (EANS) method that exploits the entity embeddings of the knowledge graph embedding model itself. EANS can generate high-quality negatives regardless of the negative sample size. The proposed method builds on the assumption that negative triples corrupted with candidate entities similar to the original positive entity will be high-quality negative samples. EANS samples negative entities similar to the positive one by utilizing the distribution of entity embeddings. The generated entity aware negative samples push the model to continuously learn effective representations.
While generating high-quality negative samples, one must be careful about the influence of false negatives: when positive triples are corrupted with entities similar to the positive one, the possibility of generating false negative samples also increases. To alleviate the effect of false negative triples, we propose an auxiliary loss for false negative prediction. The proposed loss mitigates the effect of false negatives by calculating additional prediction scores and reducing the triple scores of false negatives.
We evaluate the proposed EANS method on several well-known knowledge graph embedding models and two widely used benchmarks. The experimental results show that the proposed method achieves remarkable improvements over baseline models and outperforms existing negative sampling methods on several models. Our method also shows comparable performance while using a much smaller number of negative samples; in particular, EANS remains competitive with existing negative sampling methods even with only one negative sample.
2 Related Work
2.1 Knowledge Graph Embedding Models
There are two main streams in knowledge graph embedding: translational distance models and semantic matching models. TransE (Bordes et al., 2013) is the first translational distance based model. Various extensions of TransE, such as TransH (Wang et al., 2014), TransR (Lin et al., 2015), and TransD (Ji et al., 2015), increase its expressive power by projecting entity and relation vectors into various spaces. RESCAL (Nickel et al., 2011), DistMult (Yang et al., 2014), and ComplEx (Trouillon et al., 2016) are the most representative semantic matching based methods. RESCAL treats each relation as a matrix that captures latent semantics of entities. DistMult simplifies RESCAL by constraining relation matrices to diagonal matrices. ComplEx extends DistMult by embedding vectors in complex space. Recently, more complex and sophisticated models (Dettmers et al., 2018; Vashishth et al., 2019; Sun et al., 2019; Lu et al., 2022; Vashishth et al., 2020) have been studied. Such methods introduce various techniques and networks to model the scoring function and extend the embeddings of a KG's elements into various spaces.
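As a brief illustration of the semantic matching family (our sketch, not the original implementations), the DistMult and ComplEx scoring functions can be written in a few lines:

```python
import numpy as np

def distmult_score(h, r, t):
    # DistMult: RESCAL with a diagonal relation matrix, which reduces
    # to an elementwise triple product. Note it is symmetric in h and t.
    return float(np.sum(h * r * t))

def complex_score(h, r, t):
    # ComplEx: real part of the trilinear product in complex space;
    # conjugating t makes the score asymmetric for complex embeddings,
    # so asymmetric relations can be modeled.
    return float(np.real(np.sum(h * r * np.conj(t))))

rng = np.random.default_rng(0)
h, r, t = (rng.standard_normal(8) for _ in range(3))
```

The symmetry of DistMult (swapping h and t leaves the score unchanged) is exactly the limitation that motivates the complex-valued extension.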
2.2 Negative Sampling
To construct meaningful negative samples, Wang et al. (2018) and Cai and Wang (2017) proposed Generative Adversarial Network (GAN) (Goodfellow et al., 2014) based architectures to model the distribution of negative samples. However, these methods need many additional parameters for the extra generator, and GANs can be hard to train because of their instability and degeneracy (Zhang et al., 2019). To address these problems, a caching-based method (Zhang et al., 2019) was proposed that keeps high-quality negative triples with fewer parameters than GAN-based methods. Ahrabian et al. (2020) suggested a method that utilizes the structure of the graph by choosing negative samples from the k-hop neighborhood. Sun et al. (2019) proposed self-adversarial negative sampling, which gives a different weight to each sampled negative according to its triple score. Recently, Hajimoradlou and Kazemi (2022) proposed a training procedure without negative sampling; instead of using negative samples, they fully utilize regularization to train knowledge graph embeddings.
3 Method
In this section, we introduce the proposed entity aware negative sampling (EANS) method. The method consists of two parts: one selects a negative entity by re-ordering the entire entity set to create entity aware negative triples, and the other calculates an additional loss to mitigate the influence of false negatives. The remainder of this section gives details of each part. The entire process is summarized in Algorithm 1.
Algorithm 1 The EANS training procedure
Input: Knowledge graph G = {(h, r, t)}, entity set E, relation set R
1: Initialize embeddings W_e for each e ∈ E and W_r for each r ∈ R
2: for i = 1, ..., max_step do
3:   sample a mini-batch G_batch ⊆ G
4:   for (h, r, t) ∈ G_batch do
5:     get negative entity h′ (or t′), where h′ = int(h (or t) + N(0, 1) · σ)
6:     construct negative triple (h′, r, t) (or (h, r, t′))
7:     update parameters w.r.t. the gradients of the loss function, Eq. 7
8:   end for
9:   if (i mod reorder_step) == 0 then
10:    cluster the entity embeddings W_e using the K-means method
11:    re-order the indices of E based on the clustering labels
12:  end if
13: end for
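A minimal sketch of the two core steps of Algorithm 1, index re-ordering (lines 10–11) and Gaussian index sampling (line 5); the helper names and the modulo wrap-around for out-of-range indices are our illustrative choices, not details from the paper:

```python
import numpy as np

def reorder_by_cluster(labels):
    # Lines 10-11: re-order entity indices so that entities in the same
    # cluster occupy contiguous positions in the aligned index space.
    # Returns an old_index -> new_index mapping.
    order = np.argsort(labels, kind="stable")
    new_index = np.empty_like(order)
    new_index[order] = np.arange(len(order))
    return new_index

def eans_negative(pos_index, num_entities, sigma, rng):
    # Line 5: add scaled Gaussian noise to the aligned index of the
    # positive entity and truncate to an integer, so nearby (similar)
    # entities are sampled most often.
    neg = int(pos_index + rng.normal(0.0, 1.0) * sigma)
    return neg % num_entities  # keep the index valid (our choice)

rng = np.random.default_rng(42)
labels = np.array([1, 0, 1, 0, 2, 2])      # toy cluster labels
mapping = reorder_by_cluster(labels)       # aligned index space
neg = eans_negative(mapping[0], num_entities=6, sigma=1.0, rng=rng)
```

With sigma small, most draws land inside the positive entity's cluster; larger sigma occasionally reaches neighboring clusters, matching the observation in Section 3.1.1 that negatives may also come from outside the corresponding cluster.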
3.1 Entity Aware Negative Sampling (EANS)
3.1.1 Entity Embedding based Clustering
Given a KG, let E be the set of all entities, R the set of relations, and G the set of all true triples. A triple score f(h, r, t) is calculated by the adopted knowledge graph embedding model.

In general uniform random negative sampling, a negative triple, (h′, r, t) or (h, r, t′), is constructed by corrupting the entities of an observed positive triple (h, r, t), where h′, t′ ∈ E. When corrupting the entities, random entities are drawn from E with uniform weights. Since most of the entities in E are not highly related to a given positive triple (h, r, t), it is hard to obtain high-quality negative entities through uniform random sampling.
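For reference, the uniform baseline described above can be sketched as follows (the entity names beyond the paper's Interstellar example are made up):

```python
import random

def uniform_negative(triple, entities, rng):
    # Corrupt the head or the tail of a positive triple with an entity
    # drawn uniformly from the whole entity set E.
    h, r, t = triple
    if rng.random() < 0.5:
        return (rng.choice(entities), r, t)
    return (h, r, rng.choice(entities))

rng = random.Random(0)
neg = uniform_negative(
    ("Christopher Nolan", "DirectorOf", "Interstellar"),
    ["Christopher Nolan", "Interstellar", "Dunkirk", "Paris"],
    rng,
)
```

Because the replacement is uniform over E, most corruptions (e.g., replacing the tail with "Paris") are trivially implausible, which is exactly the weakness EANS targets.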
In order to sample meaningful entities, we design EANS to select negative entities that are highly related to the positive one. The key intuition of our method is that two entities with similar embedding vectors can serve as high-quality negative samples for each other. Therefore, high-quality negative samples can be constructed by corrupting a triple with entities whose embedding vectors are similar to that of the positive entity.
The simplest way to find entities similar to the positive entity is to calculate, at every training step, the distances between the embedding vector of the positive entity and those of all entities in E and search for the nearest neighbors. However, this method consumes large computational resources, and the cost increases dramatically in proportion to the embedding dimension d and the size of the entity set E.
Instead of applying nearest-neighbor search for negative entity selection at every step, we design a clustering-based sampling method. First, our method groups similar entities in advance and samples negative entities based on the clusters. This can be implemented with the K-means (Lloyd, 1982) clustering algorithm. The entity representation e_i of the i-th entity for K-means clustering is constructed by concatenating all entity-specific parameters,

e_i = [W_i^1; W_i^2; ...; W_i^L],  (1)

where the W^l are the entity-specific parameters of the model and ; denotes the concatenation operation. For example, the entity representation of TransD (Ji et al., 2015) consists of an entity embedding and an entity transfer vector, and can be represented as e_i = [W_i^emb; W_i^transfer]. Given a positive entity, a negative entity is chosen based on the cluster to which the positive entity belongs. Importantly, a negative entity is not only selected from the same cluster as the positive entity, but may also be selected from outside the corresponding cluster. Further details are described in Section 3.1.2.
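The concatenation of Eq. 1 followed by K-means might look like the following sketch, here with two hypothetical TransD-style parameter matrices and scikit-learn's KMeans; all shapes and the cluster count are arbitrary toy values:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
num_entities, dim = 100, 16

# Toy entity-specific parameters: an embedding matrix and, as in the
# TransD example, a transfer-vector matrix (hypothetical shapes).
W_emb = rng.standard_normal((num_entities, dim))
W_transfer = rng.standard_normal((num_entities, dim))

# Eq. 1: e_i = [W_i^1; W_i^2; ...] -- concatenate per-entity parameters.
E = np.concatenate([W_emb, W_transfer], axis=1)  # (100, 32)

# Group similar entities in advance; the labels drive index re-ordering.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(E)
```

The resulting cluster labels are what gets refreshed periodically (every one to three epochs in the experiments) rather than at every training step.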
Since the clustering algorithm is applied to entity embeddings that are still being trained, the embeddings change continuously as learning goes on. Therefore, the cluster labels of entities should be updated regularly. However, executing the clustering algorithm at every training step is intractable. Fortunately, this is not necessary: in our experiments, performance was sufficient even when the cluster labels were updated only every one to three epochs.
3.1.2 Entity Index Re-Ordering for Gaussian Sampling in Entity Index Space
The K-means clustering algorithm does not guarantee that data points are evenly divided among the clusters. If a cluster becomes extremely small, the same entity is used too repeatedly as a negative, which adversely affects model learning.