ingful negatives.
To generate high-quality negative samples, we propose an entity aware negative sampling (EANS) method that exploits the entity embeddings of the knowledge graph embedding model itself. EANS can generate high-quality negatives regardless of the negative sample size. The proposed method constructs negative samples based on the assumption that negative triples corrupted with candidate entities similar to the original positive entity will be high-quality negatives. EANS samples negative entities similar to the positive one by exploiting the distribution of entity embeddings. The generated entity aware negative samples push the model to continuously learn effective representations.
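The sampling idea above can be sketched as follows. The function name, the rank-based half-Gaussian weighting, and the parameters are our own illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def entity_aware_negatives(entity_emb, positive_id, n_samples, sigma=0.05, rng=None):
    """Sample negative entity ids whose embeddings are close to the
    positive entity's embedding (illustrative sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    # Re-order all entities by embedding distance to the positive entity.
    dist = np.linalg.norm(entity_emb - entity_emb[positive_id], axis=1)
    order = np.argsort(dist)          # order[0] is the positive entity itself
    n_entities = entity_emb.shape[0]
    # Concentrate probability mass on small ranks (entities most similar
    # to the positive one) with a half-Gaussian over the rank index.
    ranks = np.arange(1, n_entities)  # rank 0 (the positive) is excluded
    weights = np.exp(-0.5 * (ranks / (sigma * n_entities)) ** 2)
    weights /= weights.sum()
    return order[rng.choice(ranks, size=n_samples, p=weights)]
```

Because the distribution is defined over ranks rather than raw distances, the same `sigma` controls how "entity aware" the sampling is for any embedding scale.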
While generating high-quality negative samples, one must be careful about the influence of false negatives. When corrupting positive triples with entities similar to the positive one, the possibility of generating false negative samples also increases. To alleviate the effect of false negative triples, we propose an auxiliary loss for false negative prediction. The proposed loss mitigates the effect of false negatives by computing additional prediction scores and reducing the triple scores of false negatives.
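One way to realize such a correction is sketched below; the sigmoid-based false-negative probability and the margin-style loss are our own illustrative assumptions, not the paper's exact auxiliary loss:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def margin_loss_with_fn_correction(pos_score, neg_scores, margin=1.0):
    """Margin loss over negatives, where negatives scoring close to the
    positive triple are treated as likely false negatives and their
    contribution is reduced (illustrative sketch)."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    # Additional prediction score: how likely each negative is a false
    # negative, judged by its score relative to the positive triple.
    fn_prob = sigmoid(neg_scores - pos_score + margin)
    # Standard margin-based term per negative sample.
    per_neg = np.maximum(0.0, margin + neg_scores - pos_score)
    # Suspected false negatives are down-weighted, shrinking their effect.
    return float(np.mean((1.0 - fn_prob) * per_neg))
```

Negatives whose scores approach the positive triple's score receive a high false-negative probability and therefore contribute less to the loss than under a plain margin objective.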
We evaluate the proposed EANS method on several well-known knowledge graph embedding models and two widely used benchmarks. The experimental results show that the proposed method achieves remarkable improvements over baseline models. Moreover, it outperforms existing negative sampling methods on several models. Our method also shows comparable performance while using a much smaller number of negative samples; in particular, EANS remains competitive with existing negative sampling methods even with only one negative sample.
2 Related Work
2.1 Knowledge Graph Embedding Models
There are two main streams in knowledge graph embedding: translational distance models and semantic matching models. TransE (Bordes et al., 2013) is the first translational distance model. Various extensions of TransE, such as TransH (Wang et al., 2014), TransR (Lin et al., 2015), and TransD (Ji et al., 2015), increase their expressive power by projecting entity and relation vectors into various spaces. RESCAL (Nickel et al., 2011), DistMult (Yang et al., 2014), and ComplEx (Trouillon et al., 2016) are the most representative semantic matching methods. RESCAL treats each relation as a matrix that captures the latent semantics of entities. DistMult simplifies RESCAL by constraining relation matrices to diagonal matrices. ComplEx extends DistMult by mapping embedding vectors into the complex space. Recently, more complex and sophisticated models (Dettmers et al., 2018; Vashishth et al., 2019; Sun et al., 2019; Lu et al., 2022; Vashishth et al., 2020) have been studied. Such methods introduce various techniques and networks to model the scoring function and extend the embeddings of KG elements into various spaces.
2.2 Negative Sampling
To construct meaningful negative samples, Wang et al. (2018) and Cai and Wang (2017) proposed Generative Adversarial Network (GAN) (Goodfellow et al., 2014) based architectures to model the distribution of negative samples. However, these methods require many additional parameters for the extra generator, and GANs can be hard to train because of their instability and degeneracy (Zhang et al., 2019). To address these problems, a caching-based method (Zhang et al., 2019) has been proposed that keeps high-quality negative triples with fewer parameters than GAN-based methods. Ahrabian et al. (2020) suggested a method that utilizes the graph structure by choosing negative samples from the k-hop neighborhood of an entity. Sun et al. (2019) proposed self-adversarial negative sampling, which gives a different weight to each sampled negative according to its triple score. Recently, Hajimoradlou and Kazemi (2022) proposed a training procedure without negative sampling; instead of using negative samples, they fully exploit regularization to train knowledge graph embeddings.
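The self-adversarial weighting of Sun et al. (2019) mentioned above amounts to a temperature-scaled softmax over the negatives' triple scores, which can be sketched as:

```python
import numpy as np

def self_adversarial_weights(neg_scores, alpha=1.0):
    """Weight each negative sample by a softmax over its triple score
    (Sun et al., 2019): higher-scoring (harder) negatives get larger
    weights; alpha is the sampling temperature."""
    z = alpha * np.asarray(neg_scores, dtype=float)
    z = z - z.max()        # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()
```

The resulting weights multiply each negative's loss term, so training focuses on the hardest negatives in the batch without any extra parameters.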
3 Method
In this section, we introduce the proposed entity aware negative sampling (EANS) method. The proposed method consists of two parts: one selects a negative entity by re-ordering all entities to create entity aware negative triples, and the other calculates an additional loss to mitigate the influence of false negatives. The remainder of this section gives the details of each part. The entire process of the proposed method is summarized in Algorithm 1.