Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction

Zhen Wan1∗  Qianying Liu1∗
Zhuoyuan Mao1  Fei Cheng1  Sadao Kurohashi1  Jiwei Li2
1Kyoto University, Japan
2Zhejiang University, China
{zhenwan, ying, zhuoyuanmao}@nlp.ist.i.kyoto-u.ac.jp
{feicheng, kuro}@i.kyoto-u.ac.jp
{jiwei_li}@zju.edu.cn

∗ Equal contribution.
Abstract

Relation extraction (RE) has achieved remarkable progress with the help of pre-trained language models. However, existing RE models are usually incapable of handling two situations: implicit expressions and long-tail relation types, caused by language complexity and data sparsity, respectively. In this paper, we introduce a simple enhancement of RE using k nearest neighbors (kNN-RE). kNN-RE allows the model to consult training relations at test time through a nearest-neighbor search, and provides a simple yet effective means to tackle the two issues above. Additionally, we observe that kNN-RE serves as an effective way to leverage distant supervision (DS) data for RE. Experimental results show that the proposed kNN-RE achieves state-of-the-art performance on a variety of supervised RE datasets, i.e., ACE05, SciERC, and Wiki80, and outperforms the best model to date on the i2b2 and Wiki80 datasets in the setting that allows using DS data. Our code and models are available at: https://github.com/YukinoWan/kNN-RE.
1 Introduction

Relation extraction (RE) aims to identify the relationship between entities mentioned in a sentence, and is beneficial to a variety of downstream tasks such as question answering and knowledge base population. Recent studies (Zhang et al., 2020; Zeng et al., 2020; Lin et al., 2020; Wang and Lu, 2020; Cheng et al., 2020; Zhong and Chen, 2021) in supervised RE take advantage of pre-trained language models (PLMs) and achieve SOTA performance by fine-tuning PLMs with a relation classifier. However, we observe that existing RE models are usually incapable of handling two RE-specific situations: implicit expressions and long-tail relation types.
Figure 1: Left: the retrieved nearest-neighbor example has a similar structure to the test example but contains the phrase "younger brother", which makes the relation easier to infer. Right: referring to the gold labels of nearest neighbors can reduce the bias toward majority classes. Highlighted words may directly influence the relation prediction.

Implicit expression refers to the situation where a relation is expressed as an underlying message
that is not explicitly stated or shown. For example, for the relation "sibling to", a common expression is "He has a brother James", while an implicit expression could be "He is the youngest son of Liones, comparing with Samuel Liones and Henry Liones". In the latter case, the relation "sibling to" between "Samuel Liones" and "Henry Liones" is not directly expressed, but can be inferred from the fact that both are brothers of the same person. Such an underlying message can easily confuse the relation classifier. The problem of long-tail relation types is caused by data sparsity in training. For example, the widely used supervised RE dataset TACRED (Zhang et al., 2017) includes 41 relation types. The most frequent type "per:title" has 3,862 training examples, while over 22 types have fewer than 300 examples each. The majority types can easily dominate model predictions and lead to low performance on long-tail types.
Inspired by recent studies (Khandelwal et al., 2020; Guu et al., 2020; Meng et al., 2021) that use kNN to retrieve diverse expressions for language generation tasks, we introduce a simple but effective kNN-RE framework to address the two above-mentioned problems. Specifically, we store the training examples as a memory using a vanilla RE model and consult the stored memory at test time through a nearest-neighbor search (Figure 2).
Figure 2: An illustration of kNN-RE. The memory is constructed from pairs of relation representations (Rep.) and relation labels, obtained from the training set or the DS set via the BERT encoder and relation classifier. At inference, the distances d_i = d(x, x_i) between the test representation x and the stored representations x_i (L2 distance with an RBF kernel) are normalized as p(r_i) ∝ exp(−d_i) and aggregated over the nearest k neighbors into p_kNN, which is interpolated with the classifier distribution as λ p_kNN + (1 − λ) p_RE. The blue line denotes the workflow of the vanilla RE model and the black line denotes the workflow of kNN.
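To make the memory construction in Figure 2 concrete, the following is a minimal sketch (ours, not the released code), assuming a hypothetical encode function that maps one example to the fixed-length relation representation produced by the fine-tuned encoder:

import numpy as np

def build_memory(examples, labels, encode):
    """Build the kNN memory: one (representation, label) pair per
    training (or DS) example. `encode` is assumed to return a 1-D
    NumPy feature vector for a single input example."""
    keys = np.stack([encode(x) for x in examples])  # (N, d) representations
    values = np.asarray(labels)                     # (N,) relation label ids
    return keys, values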
As shown in Figure 1, for an implicit expression, the phrase "son of" may mislead the model into an incorrect prediction, while its retrieved nearest neighbor contains the direct expression "brother of", a more explicit expression of the gold label "sibling to". The prediction of long-tail examples, as shown in Figure 1, is usually biased toward the majority class. Nearest-neighbor retrieval provides direct guidance for the prediction by referring to the labels of the test example's nearest neighbors in the training set, and thus can significantly reduce the bias of imbalanced classification.
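The retrieval-and-interpolation step described above can be sketched as follows (our illustration; k, the kernel temperature, and the interpolation weight λ are hyperparameters whose actual values follow the paper's experiments, not this sketch):

import numpy as np

def knn_re_predict(x_rep, keys, values, p_re, k=16, lam=0.5, temp=1.0):
    """Interpolate the vanilla classifier distribution p_re with a
    distribution aggregated from the k nearest stored neighbors.
    x_rep: (d,) test representation; keys: (N, d) stored representations;
    values: (N,) stored label ids; p_re: (C,) classifier distribution."""
    d = np.linalg.norm(keys - x_rep, axis=1)  # L2 distance to every memory entry
    idx = np.argsort(d)[:k]                   # indices of the k nearest neighbors
    w = np.exp(-d[idx] / temp)                # RBF-style weights: p(r_i) ∝ exp(−d_i)
    w /= w.sum()                              # normalize over the k neighbors
    p_knn = np.zeros_like(p_re)
    for weight, label in zip(w, values[idx]):
        p_knn[label] += weight                # aggregate weight per relation label
    return lam * p_knn + (1.0 - lam) * p_re   # interpolated final distribution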
Additionally, we observe that kNN-RE serves as an efficient way to leverage distant supervision (DS) data for RE. DS augments labeled RE datasets by matching knowledge base (KB) relation triplets with entity pairs in raw text in a weak-supervision fashion (Mintz et al., 2009; Lin et al., 2016; Vashishth et al., 2018; Chen et al., 2021). Recent studies (Baldini Soares et al., 2019; Ormándi et al., 2021; Peng et al., 2020; Wan et al., 2022) that apply PLMs to DS-labeled data to improve supervised RE require heavy computation, because they pre-train on DS data, whose size is usually dozens of times that of supervised datasets. To address this issue, we propose a lightweight method that leverages DS data to benefit supervised RE by extending the construction of the stored memory for kNN-RE to DS-labeled data, outperforming the recent best pre-training method with no extra training.
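Concretely, the extension only concatenates a DS-built memory onto the supervised one; the encoder embeds the DS examples once and is never updated. A sketch under the same assumptions as before:

import numpy as np

def merge_memories(train_mem, ds_mem):
    """Combine the supervised memory with a DS-built memory.
    Each memory is a (keys, values) pair as returned by build_memory;
    no additional encoder training is required."""
    (tk, tv), (dk, dv) = train_mem, ds_mem
    return np.concatenate([tk, dk], axis=0), np.concatenate([tv, dv], axis=0)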
In summary, we propose kNN-RE: a flexible kNN framework for the RE task. We conduct experiments with kNN-RE under three different memory settings: training, DS, and the combination of training and DS. The results show that kNN-RE with the training memory obtains a 0.84%-1.15% absolute F1 improvement on five datasets and achieves state-of-the-art (SOTA) F1 scores on three of them (ACE05, SciERC, and Wiki80). In the DS setup, kNN-RE significantly outperforms SOTA DS pre-training methods on two datasets (i2b2, Wiki80) without extra training.

Dataset       # Rel.  # Train  # Dev   # Test
ACE05              6    4,788   1,131   1,151
Wiki80            80   45,330   5,070   5,600
TACRED            41   68,124  22,631  15,509
i2b2 2010VA        8    3,020     111   6,147
SciERC             7    1,861     275     551
Wiki20m           80     303K       -       -
MIMIC-III          8      36K       -       -

Table 1: Statistics of the datasets. Rel. denotes relation types.
2 Methodology

2.1 Background: Vanilla RE model

For the vanilla RE model, we follow the recent SOTA method PURE (Zhong and Chen, 2021). To encode an input example into a fixed-length representation by fine-tuning PLMs such as BERT (Devlin et al., 2019), PURE adds extra marker tokens that highlight the head and tail entities and their types. Specifically, given an example x: "He has a brother James.", the input sequence is "[CLS] [H_PER] He [/H_PER] has a brother [T_PER] James [/T_PER]. [SEP]".
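As an illustration of this input construction, the following sketch (ours; PURE's actual tokenization and special-token handling may differ) inserts typed markers around character-level entity spans:

def add_entity_markers(text, head, tail):
    """Insert typed entity markers around the head and tail spans.
    head and tail are (start, end, type) character spans, head before tail."""
    (hs, he, ht), (ts, te, tt) = head, tail
    return ("[CLS] " + text[:hs]
            + f"[H_{ht}] {text[hs:he]} [/H_{ht}]"
            + text[he:ts]
            + f"[T_{tt}] {text[ts:te]} [/T_{tt}]"
            + text[te:] + " [SEP]")

# Example:
print(add_entity_markers("He has a brother James.", (0, 2, "PER"), (17, 22, "PER")))
# -> [CLS] [H_PER] He [/H_PER] has a brother [T_PER] James [/T_PER]. [SEP]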