Focusing on Context is NICE:
Improving Overshadowed Entity Disambiguation
Vera Provatorova1, Simone Tedeschi2,3, Svitlana Vakulenko4
Roberto Navigli2 and Evangelos Kanoulas1
1University of Amsterdam, 2Sapienza University of Rome
3Babelscape, Italy, 4Amazon Alexa AI
{v.provatorova, e.kanoulas}@uva.nl
{tedeschi, navigli}@diag.uniroma1.it
svitlana.vakulenko@gmail.com
Abstract
Entity disambiguation (ED) is the task of map-
ping an ambiguous entity mention to the corre-
sponding entry in a structured knowledge base.
Previous research showed that entity overshad-
owing is a significant challenge for existing
ED models: when presented with an ambigu-
ous entity mention, the models are much more
likely to rank a more frequent yet less contextu-
ally relevant entity at the top. Here, we present
NICE, an iterative approach that uses entity
type information to leverage context and avoid
over-relying on the frequency-based prior. Our
experiments show that NICE achieves the best
performance results on the overshadowed enti-
ties while still performing competitively on the
frequent entities.
1 Introduction
Entity disambiguation (ED) is the task of mapping
an ambiguous entity mention to the corresponding
entry in a structured knowledge base. Despite ED
being a well-known task, recent work has shown
that the existing methods are still far from achiev-
ing human-level performance: in particular, the
case of entity overshadowing remains a big challenge. An entity e1 overshadows e2 if the two entities share the same surface form m, and e1 is more common than e2, i.e., has a higher prior probability of being linked to m (Provatorova et al., 2021). For example, when given the sentence "Michael Jordan published a paper on machine learning" and the
task of linking Michael Jordan either to the bas-
ketball player (a frequent entity) or to the scientist
(an overshadowed entity), a human will correctly
choose the latter, while a typical model is likely
to ignore the context and give the wrong yet more
popular answer due to over-relying on prior probability. Figure 1 shows another example of entity overshadowing: the entity Rome (TV series) is overshadowed by Rome (city).
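The failure mode described above can be made concrete with a small sketch. This is our own toy illustration with invented link counts, not the paper's data or code: a purely prior-based model always returns the most frequently linked entity for a surface form, so an overshadowed entity can never win.

```python
# Hypothetical illustration with toy link counts (not real statistics):
# a purely prior-based ED model picks the most frequent entity for a
# surface form and ignores the surrounding context entirely.

# Toy mention-entity link counts, e.g. as harvested from Wikipedia anchors.
LINK_COUNTS = {
    "Michael Jordan": {
        "Michael_Jordan_(basketball)": 9500,
        "Michael_I._Jordan_(scientist)": 500,
    }
}

def prior(mention: str, entity: str) -> float:
    """P(entity | mention), estimated from link counts."""
    counts = LINK_COUNTS[mention]
    return counts[entity] / sum(counts.values())

def prior_only_ed(mention: str) -> str:
    """Rank candidates by prior alone -- the failure mode described above."""
    return max(LINK_COUNTS[mention], key=lambda e: prior(mention, e))

# Whatever the sentence says, a prior-only model returns the basketball
# player, so the overshadowed scientist is never predicted.
print(prior_only_ed("Michael Jordan"))  # -> Michael_Jordan_(basketball)
```

With these toy counts the scientist's prior is only 0.05, so no amount of contextual evidence can change the output of a model that ranks by prior alone.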
[Figure 1: An example of entity overshadowing for the sentence "Hinds played Caesar in Rome": two popular ED systems (REL and WAT) predict the most frequent entity, Rome_(city), instead of the correct answer in this context, Rome_(TV_show).]
According to previous research, current ED systems are prone to over-relying on prior probability instead of focusing on context information, which
causes them to underperform on overshadowed en-
tities. In benchmarking experiments performed
by Provatorova et al. (2021), all ED systems under
evaluation appeared to have a large performance
gap between Top and Shadow subsets of the Shad-
owLink dataset, where Top contains most frequent
entities and Shadow contains their overshadowed
counterparts. The results of a human evaluation
experiment in the same study indicate that the chal-
lenge of entity overshadowing is unique to auto-
mated ED methods: human participants achieved
equally good results at disambiguating entities sam-
pled from Top and Shadow. These findings call for
further research in the field of ED, with the goal
of building a method that outperforms existing sys-
tems on overshadowed entities while still achieving
competitive results on standard datasets.
Interestingly, the best results on Shadow in
the benchmarking experiments were achieved by
AIDA (Hoffart et al.,2011), an unsupervised col-
lective entity disambiguation method: while still
affected by overshadowing, this method appeared
to be the best at capturing the context information
in comparison with modern neural approaches.
Specifically, AIDA relies on two main sources of
context information: semantic similarity between
an entity and its context and graph-based related-
ness between the candidate entities of different
mentions. Our study continues this line of work,
incorporating modern neural methods to measure
semantic similarity and adding novel heuristics to
improve candidate filtering and collective disam-
biguation.
We introduce NICE (NER1-enhanced Iterative Combination of Entities), a combined entity disambiguation algorithm designed to tackle the challenge of entity overshadowing by focusing on three
aspects of context-based information: entity types,
entity-context similarity and entity coherence. The
pipeline of NICE includes a NER-enhanced candi-
date filtering module designed to improve robust-
ness on overshadowed entities (Section 2.1), a pre-
scoring module that calculates semantic similarity
between a candidate entity and a mention in con-
text, and an unsupervised iterative disambiguation
algorithm that maximises entity coherence (Section
2.3), combining the relatedness scores between can-
didate entities with the scores of the semantic simi-
larity module (Sections 2.3-2.4). To the best of our
knowledge, our study is the first attempt to build an
entity disambiguation method designed specifically
to tackle the problem of entity overshadowing.
We perform a systematic evaluation of the NICE
method, and use our experimental results to answer
the following research questions:
RQ1: Does focusing on context information improve ED performance on overshadowed entities?
RQ2: Does focusing on context information instead of relying on mention-entity priors in ED allow us to maintain competitive performance on more frequent entities?
RQ3: In what ways do the different aspects of context information contribute to ED performance on overshadowed entities?
We hope that our work will encourage further
studies concerning overshadowed entities. The
source code of the NICE method is provided as
supplementary material and will be released pub-
licly upon acceptance.
2 The NICE method
Our method is based on the assumption that the
main challenge in disambiguating overshadowed
entities stems from over-relying on entity common-
ness, and therefore switching the focus to the con-
text (entity relatedness) can improve the perfor-
mance. We consider three main ways of extracting information from the context: (1) using mention-entity similarity to predict entity types and improve candidate filtering, (2) using word embeddings enhanced with entity types to measure semantic similarity between an entity and its context, and (3) using entity-entity similarity to ensure that the entity disambiguation decisions within one document are coherent (collective disambiguation).

1Named Entity Recognition (Yadav and Bethard, 2018)
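As a rough illustration, the three context signals above can be combined into a single frequency-free candidate score. The following sketch is ours, with illustrative weights and function names, not the paper's actual scoring function:

```python
# A minimal sketch of combining the three context signals; the linear
# weighting and the names are our assumptions, purely for illustration.

def nice_score(candidate, mention_types, context_sim, coherence,
               w_sim=0.5, w_coh=0.5):
    """Score a candidate entity from context only (no frequency prior).

    candidate:     dict with at least a "type" field for the entity
    mention_types: set of entity types predicted by NER for the mention
    context_sim:   semantic similarity of candidate and context, in [0, 1]
    coherence:     relatedness of candidate to the document's other entities
    """
    if candidate["type"] not in mention_types:
        return 0.0  # (1) type-based candidate filtering
    # (2) entity-context similarity + (3) collective coherence
    return w_sim * context_sim + w_coh * coherence
```

Note that the entity's commonness appears nowhere in the score, which is the point: a type-incompatible candidate is rejected outright, and the remaining candidates compete purely on contextual fit.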
2.1 Candidate filtering
Adding the step of filtering candidate entities
before disambiguation brings the benefits of re-
duced inference time and potential improvements
in accuracy. To perform this step in the NICE
method, we follow the work of Tedeschi et al.
(2021) by using entity type information. Given an entity mention m surrounded by textual context (cont_left, cont_right) and a list of candidate entities cands = {e1, . . . , en}, we use a NER classifier to predict the top-k possible entity types of m. Then, we discard all candidate entities whose entity type does not match any of these k classes:

cands_filtered = {e_i : type(e_i) ∈ T̂, e_i ∈ cands},
where T̂ is the set of top-k predicted entity types. If the confidence score of the NER classifier is above a threshold value t, only one class is used instead of k. In the current setup of the NICE method, the number of top predicted classes is k = 3 and the confidence threshold value is t = 1, which means that the classifier always outputs the top-3 entity classes. Figure 2 shows an example of NER-based candidate filtering.
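The filtering rule above can be sketched as follows. The classifier interface and the entity-type lookup are assumptions made for illustration; the paper's actual component is a refined NER classifier:

```python
# Sketch of the NER-based candidate filtering step. The (type, confidence)
# classifier output and the entity_type lookup are illustrative assumptions.

def filter_candidates(cands, ner_predictions, entity_type, k=3, t=1.0):
    """Keep only candidates whose type matches a predicted mention type.

    cands:           list of candidate entity identifiers
    ner_predictions: (entity_type, confidence) pairs, most confident first
    entity_type:     callable mapping an entity to its type
    If the top confidence exceeds the threshold t, only that single class
    is allowed; otherwise the top-k classes are. With t = 1.0 the threshold
    is never exceeded, so the top-3 classes are always used, as in NICE.
    """
    top_type, top_conf = ner_predictions[0]
    if top_conf > t:
        allowed = {top_type}
    else:
        allowed = {etype for etype, _ in ner_predictions[:k]}
    return [e for e in cands if entity_type(e) in allowed]
```

For instance, with predictions [("WORK_OF_ART", 0.6), ("LOC", 0.3), ("PER", 0.05)] and the default t = 1.0, both Rome_(TV_show) and Rome_(city) survive filtering, while lowering t to 0.5 would keep only the WORK_OF_ART candidate.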
To obtain the entity types for the candidates, we use the Wiki2NER dictionary provided by Tedeschi et al. (2021)2. Then, instead of using the NER classifier as provided by Tedeschi et al. (2021), which has been trained only on the AIDA training set and therefore may be biased towards frequent entities as well, we introduce a refined version of it, which is more robust to overshadowing. Specifically, we filter the training set of BLINK (Wu et al., 2020)3
by discarding the entries where the ground truth
answer has the highest popularity score among all
candidate entities. Then, we use the 2M remaining
data entries to fine-tune the classifier. The moti-
vation behind fine-tuning the classifier rather than
training it from scratch is to achieve an improve-
ment in recognising overshadowed entities without
2https://github.com/Babelscape/ner4el
3BLINK is a dataset for ED consisting of 9M entries extracted from Wikipedia.
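The training-data filtering described above can be sketched in a few lines. The entry format (a gold entity id plus per-candidate popularity scores) is our assumption about the data layout, not the paper's actual schema:

```python
# Sketch of the training-data filtering: discard BLINK entries whose gold
# entity is already the most popular candidate, so the remaining data
# emphasises overshadowed entities. The entry format is an assumption.

def keep_for_finetuning(entry):
    """True iff the gold entity is NOT the most popular candidate."""
    most_popular = max(entry["candidates"], key=entry["candidates"].get)
    return entry["gold"] != most_popular

entries = [
    {"gold": "Rome_(city)",
     "candidates": {"Rome_(city)": 0.9, "Rome_(TV_show)": 0.1}},
    {"gold": "Rome_(TV_show)",
     "candidates": {"Rome_(city)": 0.9, "Rome_(TV_show)": 0.1}},
]
kept = [e for e in entries if keep_for_finetuning(e)]
# Only the entry whose gold answer is overshadowed survives.
```

Every surviving entry thus forces the classifier to rely on context rather than popularity, which is exactly the signal the fine-tuning step is meant to strengthen.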