Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction

Zhen Wan1∗  Qianying Liu1∗
Zhuoyuan Mao1  Fei Cheng1  Sadao Kurohashi1  Jiwei Li2
1Kyoto University, Japan
2Zhejiang University, China
{zhenwan, ying, zhuoyuanmao}@nlp.ist.i.kyoto-u.ac.jp
{feicheng, kuro}@i.kyoto-u.ac.jp
{jiwei_li}@zju.edu.cn

∗ Equal contribution.
Abstract

Relation extraction (RE) has achieved remarkable progress with the help of pre-trained language models. However, existing RE models are usually incapable of handling two situations: implicit expressions and long-tail relation types, caused by language complexity and data sparsity, respectively. In this paper, we introduce a simple enhancement of RE using k nearest neighbors (kNN-RE). kNN-RE allows the model to consult training relations at test time through a nearest-neighbor search, and provides a simple yet effective means to tackle the two issues above. Additionally, we observe that kNN-RE serves as an effective way to leverage distant supervision (DS) data for RE. Experimental results show that the proposed kNN-RE achieves state-of-the-art performance on a variety of supervised RE datasets, i.e., ACE05, SciERC, and Wiki80, and outperforms the best model to date on the i2b2 and Wiki80 datasets in the setting that allows using DS data. Our code and models are available at: https://github.com/YukinoWan/kNN-RE.
1 Introduction

Relation extraction (RE) aims to identify the relationship between entities mentioned in a sentence, and is beneficial to a variety of downstream tasks such as question answering and knowledge base population. Recent studies (Zhang et al., 2020; Zeng et al., 2020; Lin et al., 2020; Wang and Lu, 2020; Cheng et al., 2020; Zhong and Chen, 2021) in supervised RE take advantage of pre-trained language models (PLMs) and achieve SOTA performance by fine-tuning PLMs with a relation classifier. However, we observe that existing RE models are usually incapable of handling two RE-specific situations: implicit expressions and long-tail relation types.
Figure 1: Left: the retrieved nearest-neighbor example has a similar structure to the test example but contains the phrase "younger brother", which makes the relation easier to infer. Right: referring to the gold labels of nearest neighbors can reduce the bias toward majority classes. Highlighted words may directly influence the relation prediction.

Implicit expression refers to the situation where a relation is expressed as an underlying message
that is not explicitly stated or shown. For example, for the relation "sibling to", a common expression is "He has a brother James", while an implicit expression could be "He is the youngest son of Liones, comparing with Samuel Liones and Henry Liones". In the latter case, the relation "sibling to" between "Samuel Liones" and "Henry Liones" is not directly expressed, but can be inferred from the fact that both are brothers of the same person. Such an underlying message can easily confuse the relation classifier. The problem of long-tail relation types is caused by data sparsity in training. For example, the widely used supervised RE dataset TACRED (Zhang et al., 2017) includes 41 relation types. The most frequent type "per:title" has 3,862 training examples, while over 22 types have fewer than 300 examples each. The majority types can easily dominate model predictions and lead to low performance on long-tail types.
Inspired by recent studies (Khandelwal et al., 2020; Guu et al., 2020; Meng et al., 2021) that use kNN to retrieve diverse expressions for language generation tasks, we introduce a simple but effective kNN-RE framework to address the two above-mentioned problems. Specifically, we store the training examples as a memory using a vanilla RE model and consult the stored memory at test time through a nearest-neighbor search (Figure 2).
Figure 2: An illustration of kNN-RE. The memory is constructed from pairs of relation representations (Rep.) and relation labels, obtained from the training set or the DS set via the BERT encoder and relation classifier. At inference, the distances d_i = d(x, x_i) between the test representation x and the stored representations x_i (L2 distance with an RBF kernel) are normalized as p(r_i) ∝ exp(−d_i) and aggregated over the nearest k neighbors into p_kNN, which is interpolated with the classifier distribution as λ p_kNN + (1 − λ) p_RE. The blue line denotes the workflow of the vanilla RE model and the black line denotes the workflow of kNN.
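To make the memory construction in Figure 2 concrete, the following is a minimal sketch (ours, not the released code), assuming a hypothetical encode function that maps one example to the fixed-length relation representation produced by the fine-tuned encoder:

import numpy as np

def build_memory(examples, labels, encode):
    """Build the kNN memory: one (representation, label) pair per
    training (or DS) example. `encode` is assumed to return a 1-D
    NumPy feature vector for a single input example."""
    keys = np.stack([encode(x) for x in examples])  # (N, d) representations
    values = np.asarray(labels)                     # (N,) relation label ids
    return keys, values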
As shown in Figure 1, for an implicit expression, the phrase "son of" may mislead the model into an incorrect prediction, while its retrieved nearest neighbor contains the direct expression "brother of", a more explicit expression of the gold label "sibling to". The prediction of long-tail examples, as shown in Figure 1, is usually biased toward the majority class. Nearest-neighbor retrieval provides direct guidance for the prediction by referring to the labels of the test example's nearest neighbors in the training set, and thus can significantly reduce the bias of imbalanced classification.
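The retrieval-and-interpolation step described above can be sketched as follows (our illustration; k, the kernel temperature, and the interpolation weight λ are hyperparameters whose actual values follow the paper's experiments, not this sketch):

import numpy as np

def knn_re_predict(x_rep, keys, values, p_re, k=16, lam=0.5, temp=1.0):
    """Interpolate the vanilla classifier distribution p_re with a
    distribution aggregated from the k nearest stored neighbors.
    x_rep: (d,) test representation; keys: (N, d) stored representations;
    values: (N,) stored label ids; p_re: (C,) classifier distribution."""
    d = np.linalg.norm(keys - x_rep, axis=1)  # L2 distance to every memory entry
    idx = np.argsort(d)[:k]                   # indices of the k nearest neighbors
    w = np.exp(-d[idx] / temp)                # RBF-style weights: p(r_i) ∝ exp(−d_i)
    w /= w.sum()                              # normalize over the k neighbors
    p_knn = np.zeros_like(p_re)
    for weight, label in zip(w, values[idx]):
        p_knn[label] += weight                # aggregate weight per relation label
    return lam * p_knn + (1.0 - lam) * p_re   # interpolated final distribution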
Additionally, we observe that kNN-RE serves as an efficient way to leverage distant supervision (DS) data for RE. DS augments labeled RE datasets by matching knowledge base (KB) relation triplets with entity pairs in raw text in a weak-supervision fashion (Mintz et al., 2009; Lin et al., 2016; Vashishth et al., 2018; Chen et al., 2021). Recent studies (Baldini Soares et al., 2019; Ormándi et al., 2021; Peng et al., 2020; Wan et al., 2022) that apply PLMs to DS-labeled data to improve supervised RE require heavy computation, because they pre-train on DS data, whose size is usually dozens of times that of supervised datasets. To address this issue, we propose a lightweight method that leverages DS data to benefit supervised RE by extending the construction of the stored memory for kNN-RE to DS-labeled data, outperforming the recent best pre-training method with no extra training.
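Concretely, the extension only concatenates a DS-built memory onto the supervised one; the encoder embeds the DS examples once and is never updated. A sketch under the same assumptions as before:

import numpy as np

def merge_memories(train_mem, ds_mem):
    """Combine the supervised memory with a DS-built memory.
    Each memory is a (keys, values) pair as returned by build_memory;
    no additional encoder training is required."""
    (tk, tv), (dk, dv) = train_mem, ds_mem
    return np.concatenate([tk, dk], axis=0), np.concatenate([tv, dv], axis=0)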
In summary, we propose kNN-RE: a flexible kNN framework for the RE task. We conduct experiments with kNN-RE under three different memory settings: training, DS, and the combination of training and DS. The results show that kNN-RE with the training memory obtains a 0.84%-1.15% absolute F1 improvement on five datasets and achieves state-of-the-art (SOTA) F1 scores on three of them (ACE05, SciERC, and Wiki80). In the DS setup, kNN-RE significantly outperforms SOTA DS pre-training methods on two datasets (i2b2, Wiki80) without extra training.

Dataset       # Rel.  # Train  # Dev   # Test
ACE05              6    4,788   1,131   1,151
Wiki80            80   45,330   5,070   5,600
TACRED            41   68,124  22,631  15,509
i2b2 2010VA        8    3,020     111   6,147
SciERC             7    1,861     275     551
Wiki20m           80     303K       -       -
MIMIC-III          8      36K       -       -

Table 1: Statistics of the datasets. Rel. denotes relation types.
2 Methodology

2.1 Background: Vanilla RE model

For the vanilla RE model, we follow the recent SOTA method PURE (Zhong and Chen, 2021). To encode an input example into a fixed-length representation by fine-tuning PLMs such as BERT (Devlin et al., 2019), PURE adds extra marker tokens that highlight the head and tail entities and their types. Specifically, given an example x: "He has a brother James.", the input sequence is "[CLS] [H_PER] He [/H_PER] has a brother [T_PER] James [/T_PER]. [SEP]".
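As an illustration of this input construction, the following sketch (ours; PURE's actual tokenization and special-token handling may differ) inserts typed markers around character-level entity spans:

def add_entity_markers(text, head, tail):
    """Insert typed entity markers around the head and tail spans.
    head and tail are (start, end, type) character spans, head before tail."""
    (hs, he, ht), (ts, te, tt) = head, tail
    return ("[CLS] " + text[:hs]
            + f"[H_{ht}] {text[hs:he]} [/H_{ht}]"
            + text[he:ts]
            + f"[T_{tt}] {text[ts:te]} [/T_{tt}]"
            + text[te:] + " [SEP]")

# Example:
print(add_entity_markers("He has a brother James.", (0, 2, "PER"), (17, 22, "PER")))
# -> [CLS] [H_PER] He [/H_PER] has a brother [T_PER] James [/T_PER]. [SEP]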