Generative Entity Typing with Curriculum Learning

Siyu Yuan1, Deqing Yang2, Jiaqing Liang2, Zhixu Li2, Jinxi Liu1, Jingyue Huang2, Yanghua Xiao2
1 School of Data Science, Fudan University, Shanghai, China
2 Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University
Fudan-Aishu Cognitive Intelligence Joint Research Center
1{syyuan21,jxliu22}@m.fudan.edu.cn
2{yangdeqing,liangjiaqing,zhixuli,jingyuehuang18,shawyh}@fudan.edu.cn
Abstract
Entity typing aims to assign types to the entity mentions in given texts. The traditional classification-based entity typing paradigm has two unignorable drawbacks: 1) it fails to assign an entity to the types beyond the predefined type set, and 2) it can hardly handle few-shot and zero-shot situations where many long-tail types have only a few or even no training instances. To overcome these drawbacks, we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated with a pre-trained language model (PLM). However, PLMs tend to generate coarse-grained types after fine-tuning upon the entity typing dataset. In addition, only heterogeneous training data, consisting of a small portion of human-annotated data and a large portion of auto-generated but low-quality data, are provided for model training. To tackle these problems, we employ curriculum learning (CL) to train our GET model on heterogeneous data, where the curriculum can be self-adjusted with self-paced learning according to its comprehension of the type granularity and data heterogeneity. Our extensive experiments upon the datasets of different languages and downstream tasks justify the superiority of our GET model over the state-of-the-art entity typing models. The code has been released at https://github.com/siyuyuan/GET.
1 Introduction
Entity typing aims to assign types to mentions of entities from a predefined type set, which enables machines to better understand natural language and benefits many downstream tasks, such as entity linking (Yang et al., 2019) and text classification (Chen et al., 2019).

Figure 1: A toy example of entity typing through the generation and classification paradigms, respectively. For the input text "In the early 1980s, P & G tried to launch here a concentrated detergent under the Ariel brand name that it markets in Europe" and the entity P & G, generation yields "detergent company*", "detergent manufacturer*" and "company", while classification over the predefined type set {substance, product, continent, company} yields only "company". (*) means the generated types are out of the predefined type set.

Traditional entity typing approaches follow the classification paradigm to classify (assign) the entity into a predefined set of types, which have the following two unignorable drawbacks.
1) Closed Type Set: The classification-based approaches fail to assign the entity to types outside the predefined set. 2) Few-shot Dilemma for Long-tail Types: Although fine-grained entity typing (FET) and ultra-fine entity typing approaches can classify entities into fine-grained types, they can hardly handle few-shot and zero-shot issues. In fact, many long-tail types have only a few or even no training instances in the datasets. For example, in the ultra-fine dataset (Choi et al., 2018), more than 80% of types have fewer than 5 instances, and 25% of types never appear in the training data at all.
To address these drawbacks, in this paper we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated by a pre-trained language model (PLM). Compared to traditional classification-based entity typing methods, PLM-based GET has two advantages. First, instead of being restricted to a predefined closed type set, PLMs can generate more open types for entity mentions thanks to their strong generation capabilities. For example, in Figure 1, fine-grained types such as "large detergent company" and "large detergent manufacturer" can be generated by PLMs for the entity P & G, which contain richer semantics but are seldom included in a predefined type set. Second, PLMs are capable of conceptual reasoning and of handling the few-shot and zero-shot dilemma (Hwang et al., 2021), since massive knowledge has been learned during their pre-training.
However, it is nontrivial to realize PLM-based GET due to the following challenges: 1) Entity typing usually requires generating fine-grained types with richer semantics, which are more beneficial to downstream tasks. However, PLMs are biased towards generating high-frequency vocabulary in the corpus due to their primary learning principle based on statistical associations. As a result, a typical PLM tends to generate high-frequency but coarse-grained types even if we carefully fine-tune the PLM on a fine-grained entity typing dataset (refer to Figure 5 in Section 4). Therefore, how to guide a PLM to generate high-quality and fine-grained types for entities is crucial. 2) It is costly for humans to annotate a great number of samples with fine-grained types. Therefore, most existing works adopt heterogeneous data consisting of a small portion (less than 10%) of human-annotated data and a large portion (more than 90%) of auto-generated low-quality data (e.g., by distant supervision), which greatly hurts the performance of entity typing models (Gong et al., 2021). How to train a PLM to generate desirable types on such low-quality heterogeneous data is also challenging.
The difficulty of using PLMs to generate high-quality fine-grained types based on low-quality heterogeneous training data motivates us to leverage the idea of curriculum learning (CL) (Bengio et al., 2009), which better learns heterogeneous data by ordering the training samples based on their quality and difficulty (Kumar et al., 2019). In this paper, we propose a CL-based strategy to train our GET model. Specifically, we first define a fixed curriculum instruction and partition the training data into several subsets according to the granularity and heterogeneity of samples. Based on the curriculum instruction, CL can control the order of using these training subsets, from coarse-grained and lower-quality ones to fine-grained and higher-quality ones. However, a fixed curriculum ignores the feedback from the training process. Thus, we combine the predetermined curriculum with self-paced learning (SPL) (Kumar et al., 2010), which enables the model to dynamically self-adjust the actual learning order according to the training loss. In this way, our CL-based GET model can make the learning process move towards a better global optimum upon the heterogeneous data to generate high-quality and fine-grained types. Our contributions in this paper are summarized as follows:

• To the best of our knowledge, our work is the first to propose the paradigm of generative entity typing (GET).

• We propose to leverage curriculum learning to train our GET model upon heterogeneous data, where the curriculum can be self-adjusted with self-paced learning.

• Our extensive experiments on the data of different languages and downstream tasks justify the superiority of our GET model.
2 Related Work
Classification-based Entity Typing
The traditional classification-based entity typing methods can be categorized into three classes. 1) Coarse-grained entity typing methods (Weischedel and Brunstein, 2005; Tokarchuk et al., 2021) assign mentions to a small set of coarse types; 2) Fine-grained entity typing (FET) methods (Yuan and Downey, 2018; Onoe et al., 2021) classify mentions into more diverse and semantically richer ontologies; 3) Ultra-fine entity typing methods (Choi et al., 2018; Ding et al., 2021; Dai et al., 2021) use a large open type vocabulary to predict a set of natural-language phrases as entity types based on texts. However, FET and ultra-fine entity typing methods can hardly perform satisfactorily due to the huge predefined type set, and they can hardly handle few-shot and zero-shot issues. Comparatively, our GET model can generate high-quality multi-granularity types, even beyond the predefined set, for the given entity mentions.
Concept Acquisition
Concept acquisition is closely related to entity typing, since it also aims to obtain the types for given entities, and entity types are often recognized as concepts. Concept acquisition can be categorized into the extraction-based and generation-based schemes. The extraction scheme cannot acquire concepts that do not exist in the given text (Yang et al., 2020). The existing approaches of concept generation (Zeng et al., 2021) focus on utilizing existing concept taxonomies or knowledge bases to generate concepts, but neglect to utilize the large corpus. Our GET model can also achieve text-based concept generation.

Figure 2: Our PLM-based GET framework trained with curriculum learning. (The diagram shows three steps: 1. Prompt Construction, which builds prompts from the input data using Hearst patterns; 2. Curriculum Instruction, where a predetermined curriculum built from prior knowledge, i.e., the order of subsets and the length of types, partitions the human-annotated data and the fine-/coarse-grained auto-generated data into subsets D_A, D_B, D_C ordered from easy to difficult; 3. CL-based Learning of T5 with SPL.)
Curriculum Learning
According to the curriculum learning (CL) paradigm, a model is first trained with the easier subsets or subtasks, and then the training difficulty is gradually increased (Bengio et al., 2009) to improve model performance on difficult target tasks, such as domain adaptation (Wang et al., 2021) and training generalization (Huang and Du, 2019). The existing CL methods can be divided into predefined CL (PCL) (Bengio et al., 2009) and automatic CL (ACL) (Kumar et al., 2010). PCL divides the training data by difficulty level using prior knowledge, while ACL, such as self-paced learning (SPL), measures the difficulty according to the model's own losses or to other models.
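To make this concrete, the vanilla SPL objective of Kumar et al. (2010) jointly optimizes the model parameters w and binary per-sample weights v; this is the standard formulation from that work, not necessarily the exact objective used later in this paper:

\min_{w,\; v \in \{0,1\}^n} \;\; \sum_{i=1}^{n} v_i \, L\bigl(y_i, f(x_i; w)\bigr) \;-\; \lambda \sum_{i=1}^{n} v_i

With w fixed, the optimal weights have the closed form v_i = 1 if L(y_i, f(x_i; w)) < λ and v_i = 0 otherwise, so only samples whose current loss is below the "pace" threshold λ participate in training; λ is gradually increased so that harder samples are admitted over time.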
3 Methodology
In this section, we first formalize the task studied in this paper and give an overview of the framework of our GET model. Then, we introduce the details of the model implementation.
3.1 Task Formalization
Given a piece of text X and an entity mention M within it, the task of generative entity typing (GET) is to generate multiple types TS = {T1, T2, ..., TK}, where each Tk (1 ≤ k ≤ K) is a type for M w.r.t. the context of X.
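To illustrate this input-output contract, here is a minimal sketch that wraps a seq2seq PLM (T5, the backbone shown in Figure 2) to generate K candidate types for a mention; the checkpoint name, prompt template and decoding settings are illustrative assumptions rather than the paper's exact configuration.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Illustrative checkpoint; the paper's exact PLM configuration may differ.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def generate_types(text, mention, k=3):
    """Generate K candidate types T_1, ..., T_K for `mention` in context `text`."""
    # Concatenate the context with a cloze prompt built from the mention;
    # <extra_id_0> is T5's sentinel token marking the blank to fill.
    prompt = f"{text} {mention} is a <extra_id_0>."
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    outputs = model.generate(
        input_ids,
        num_beams=k,             # beam search to obtain multiple candidates
        num_return_sequences=k,  # return K type strings
        max_new_tokens=10,
    )
    return [tokenizer.decode(o, skip_special_tokens=True).strip() for o in outputs]
```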
3.2 Framework
Like most of the previous entity typing models (Choi et al., 2018; Lee et al., 2020; Gong et al., 2021), our GET model is trained upon heterogeneous data consisting of a small portion of human-annotated data and a large portion of auto-generated data, due to the difficulty and high cost of human annotation. We will introduce how to obtain our auto-generated data in Section 4.1. The framework of our model learning includes the following three steps, as shown in Figure 2.

1. Prompt Construction: To better leverage the knowledge obtained from the pre-training of the PLM, we employ the prompt mechanism (Liu et al., 2021a) to guide the learning of our PLM-based GET model;

2. Curriculum Instruction: As a key component of CL, the curriculum instruction is responsible for measuring the difficulty of each sample in the heterogeneous training data, and then designing a suitable curriculum for the model training process;

3. CL-based Learning: In this step, our PLM-based GET model is trained with the designed curriculum, and is capable of adjusting its learning progress dynamically through self-paced learning (SPL), as sketched below.
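The following is a minimal sketch of how steps 2 and 3 interact, under simplifying assumptions: a per-sample loss of classification form stands in for the seq2seq objective, and the subsets are already ordered easy to difficult like D_A, D_B, D_C in Figure 2. It shows the interplay of the predetermined curriculum with SPL, not the paper's actual training code.

```python
import torch

def train_with_curriculum(model, subsets, optimizer, lam=1.0, growth=1.3, epochs=1):
    """Train over subsets ordered easy -> difficult (e.g., [D_A, D_B, D_C]),
    with self-paced re-weighting of samples inside each curriculum stage."""
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")  # per-sample losses
    for subset in subsets:                  # predetermined curriculum order
        for _ in range(epochs):
            for inputs, targets in subset:  # each subset yields mini-batches
                optimizer.zero_grad()
                losses = loss_fn(model(inputs), targets)
                # SPL: only samples whose loss is below the pace threshold
                # lambda (i.e., currently "easy") contribute to the update.
                v = (losses.detach() < lam).float()
                if v.sum() > 0:
                    ((v * losses).sum() / v.sum()).backward()
                    optimizer.step()
        lam *= growth  # enlarge the pace so harder samples are admitted later
    return model
```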
3.3 Prompt Construction
To generate the types of given entities by a PLM, we construct prompts in the cloze format from the Hearst patterns listed in Table 1. Specifically, each input text X including an entity mention M is concatenated with a cloze prompt constructed with M, and the PLM is asked to fill in the blank within the cloze prompt. Recall the example in Figure 1: the original text "In the early 1980s, P & G tried to launch here a concentrated detergent under the Ariel brand name that it markets in Europe" can be concatenated with a cloze prompt such as "P & G is a ___" to construct an input prompt for the PLM, which predicts "large detergent company", "large detergent manufacturer" and "company" as the types for P & G to fill the blank.
M is a ___              ___ such as M
M is one of ___         ___, especially M
M refers to ___         ___, including M
M is a member of ___

Table 1: Prompts constructed from Hearst patterns ("___" marks the blank to be filled).
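As a minimal sketch of this construction step, the templates below mirror Table 1; the helper name and the exact concatenation format are illustrative assumptions.

```python
# Cloze templates mirroring Table 1; "___" marks the blank the PLM must fill.
HEARST_TEMPLATES = [
    "{m} is a ___.",
    "___ such as {m}.",
    "{m} is one of ___.",
    "___, especially {m}.",
    "{m} refers to ___.",
    "___, including {m}.",
    "{m} is a member of ___.",
]

def build_prompts(text, mention):
    """Concatenate the input text with each cloze prompt built from the mention."""
    return [f"{text} {t.format(m=mention)}" for t in HEARST_TEMPLATES]

# Example with the P & G sentence from Figure 1:
prompts = build_prompts(
    "In the early 1980s, P & G tried to launch here a concentrated detergent "
    "under the Ariel brand name that it markets in Europe.",
    "P & G",
)
# prompts[0] -> "... P & G is a ___."  (the PLM fills the blank with a type)
```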
3.4 Curriculum Instruction
Curriculum instruction is the core issue of CL: it requires estimating the difficulty of samples in terms of model learning, so as to decide the order in which samples are used for model training.

For our specific PLM-based GET model, we argue that the difficulty of a sample in terms of