Generative Entity Typing with Curriculum Learning

Siyu Yuan1, Deqing Yang2, Jiaqing Liang2, Zhixu Li2, Jinxi Liu1, Jingyue Huang2, Yanghua Xiao2
1 School of Data Science, Fudan University, Shanghai, China
2 Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University
Fudan-Aishu Cognitive Intelligence Joint Research Center
1{syyuan21,jxliu22}@m.fudan.edu.cn
2{yangdeqing,liangjiaqing,zhixuli,jingyuehuang18,shawyh}@fudan.edu.cn
Abstract
Entity typing aims to assign types to the entity mentions in given texts. The traditional classification-based entity typing paradigm has two unignorable drawbacks: 1) it fails to assign an entity to the types beyond the predefined type set, and 2) it can hardly handle few-shot and zero-shot situations where many long-tail types have only a few or even no training instances. To overcome these drawbacks, we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated with a pre-trained language model (PLM). However, PLMs tend to generate coarse-grained types after fine-tuning upon the entity typing dataset. In addition, only heterogeneous training data, consisting of a small portion of human-annotated data and a large portion of auto-generated but low-quality data, are provided for model training. To tackle these problems, we employ curriculum learning (CL) to train our GET model on heterogeneous data, where the curriculum can be self-adjusted with self-paced learning according to its comprehension of the type granularity and data heterogeneity. Our extensive experiments upon the datasets of different languages and downstream tasks justify the superiority of our GET model over the state-of-the-art entity typing models. The code has been released at https://github.com/siyuyuan/GET.
1 Introduction
Entity typing aims to assign types to mentions of entities from a predefined type set, which enables machines to better understand natural language and benefits many downstream tasks, such as entity linking (Yang et al., 2019) and text classification (Chen et al., 2019).

Figure 1: A toy example of entity typing through the generation and classification paradigms, respectively. For the input text "In the early 1980s, P & G tried to launch here a concentrated detergent under the Ariel brand name that it markets in Europe" and the entity P & G, generation yields "detergent company*", "detergent manufacturer*" and "company", while classification over the predefined type set {substance, product, continent, company} yields only "company". (*) means the generated types are out of the predefined type set.

Traditional entity typing approaches follow the classification paradigm to classify (assign) the entity into a predefined set of types, which have the following two unignorable drawbacks.
1) Closed Type Set: The classification-based approaches fail to assign the entity to types outside the predefined set. 2) Few-shot Dilemma for Long-tail Types: Although fine-grained entity typing (FET) and ultra-fine entity typing approaches can classify entities into fine-grained types, they can hardly handle few-shot and zero-shot issues. In fact, many long-tail types have only a few or even no training instances in the datasets. For example, in the ultra-fine dataset (Choi et al., 2018), more than 80% of types have fewer than 5 instances, and 25% of types never appear in the training data at all.
To address these drawbacks, in this paper we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated by a pre-trained language model (PLM). Compared to traditional classification-based entity typing methods, PLM-based GET has two advantages. First, instead of being restricted to a predefined closed type set, PLMs can generate more open types for entity mentions thanks to their strong generation capabilities. For example, in Figure 1, fine-grained types such as "large detergent company" and "large detergent manufacturer" can be generated by PLMs for the entity P & G, which contain richer semantics but are seldom included in a predefined type set. Second, PLMs are capable of conceptual reasoning and of handling the few-shot and zero-shot dilemma (Hwang et al., 2021), since massive knowledge has been learned during their pre-training.
However, it is nontrivial to realize PLM-based GET due to the following challenges: 1) Entity typing usually requires generating fine-grained types with richer semantics, which are more beneficial to downstream tasks. However, PLMs are biased towards generating high-frequency vocabulary in the corpus due to their primary learning principle based on statistical associations. As a result, a typical PLM tends to generate high-frequency but coarse-grained types even if we carefully fine-tune the PLM on a fine-grained entity typing dataset (refer to Figure 5 in Section 4). Therefore, how to guide a PLM to generate high-quality and fine-grained types for entities is crucial. 2) It is costly for humans to annotate a great number of samples with fine-grained types. Therefore, most existing works adopt heterogeneous data consisting of a small portion (less than 10%) of human-annotated data and a large portion (more than 90%) of auto-generated low-quality data (e.g., by distant supervision), which greatly hurts the performance of entity typing models (Gong et al., 2021). How to train a PLM to generate desirable types on such low-quality heterogeneous data is also challenging.
The difficulty of using PLMs to generate high-quality fine-grained types based on low-quality heterogeneous training data motivates us to leverage the idea of curriculum learning (CL) (Bengio et al., 2009), which better learns heterogeneous data by ordering the training samples based on their quality and difficulty (Kumar et al., 2019). In this paper, we propose a CL-based strategy to train our GET model. Specifically, we first define a fixed curriculum instruction and partition the training data into several subsets according to the granularity and heterogeneity of samples. Based on the curriculum instruction, CL can control the order of using these training subsets, from coarse-grained and lower-quality ones to fine-grained and higher-quality ones. However, a fixed curriculum ignores the feedback from the training process. Thus, we combine the predetermined curriculum with self-paced learning (SPL) (Kumar et al., 2010), which enables the model to dynamically self-adjust the actual learning order according to the training loss. In this way, our CL-based GET model can make the learning process move towards a better global optimum upon the heterogeneous data to generate high-quality and fine-grained types. Our contributions in this paper are summarized as follows:

• To the best of our knowledge, our work is the first to propose the paradigm of generative entity typing (GET).

• We propose to leverage curriculum learning to train our GET model upon heterogeneous data, where the curriculum can be self-adjusted with self-paced learning.

• Our extensive experiments on the data of different languages and downstream tasks justify the superiority of our GET model.
2 Related Work
Classification-based Entity Typing
The traditional classification-based entity typing methods can be categorized into three classes. 1) Coarse-grained entity typing methods (Weischedel and Brunstein, 2005; Tokarchuk et al., 2021) assign mentions to a small set of coarse types; 2) Fine-grained entity typing (FET) methods (Yuan and Downey, 2018; Onoe et al., 2021) classify mentions into more diverse and semantically richer ontologies; 3) Ultra-fine entity typing methods (Choi et al., 2018; Ding et al., 2021; Dai et al., 2021) use a large open type vocabulary to predict a set of natural-language phrases as entity types based on texts. However, FET and ultra-fine entity typing methods can hardly perform satisfactorily due to the huge predefined type set, and they can hardly handle few-shot and zero-shot issues. Comparatively, our GET model can generate high-quality multi-granularity types, even beyond the predefined set, for the given entity mentions.
Concept Acquisition
Concept acquisition is closely related to entity typing, since it also aims to obtain the types for given entities, and entity types are often recognized as concepts. Concept acquisition can be categorized into the extraction-based and generation-based schemes. The extraction scheme cannot acquire concepts that do not exist in the given text (Yang et al., 2020). The existing approaches of concept generation (Zeng et al., 2021) focus on utilizing existing concept taxonomies or knowledge bases to generate concepts, but neglect to utilize the large corpus. Our GET model can also achieve text-based concept generation.

Figure 2: Our PLM-based GET framework trained with curriculum learning. (The diagram shows three steps: 1. Prompt Construction, which builds prompts from the input data using Hearst patterns; 2. Curriculum Instruction, where a predetermined curriculum built from prior knowledge, i.e., the order of subsets and the length of types, partitions the human-annotated data and the fine-/coarse-grained auto-generated data into subsets D_A, D_B, D_C ordered from easy to difficult; 3. CL-based Learning of T5 with SPL.)
Curriculum Learning
According to the curriculum learning (CL) paradigm, a model is first trained with the easier subsets or subtasks, and then the training difficulty is gradually increased (Bengio et al., 2009) to improve model performance on difficult target tasks, such as domain adaptation (Wang et al., 2021) and training generalization (Huang and Du, 2019). The existing CL methods can be divided into predefined CL (PCL) (Bengio et al., 2009) and automatic CL (ACL) (Kumar et al., 2010). PCL divides the training data by difficulty level using prior knowledge, while ACL, such as self-paced learning (SPL), measures the difficulty according to the model's own losses or to other models.
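To make this concrete, the vanilla SPL objective of Kumar et al. (2010) jointly optimizes the model parameters w and binary per-sample weights v; this is the standard formulation from that work, not necessarily the exact objective used later in this paper:

\min_{w,\; v \in \{0,1\}^n} \;\; \sum_{i=1}^{n} v_i \, L\bigl(y_i, f(x_i; w)\bigr) \;-\; \lambda \sum_{i=1}^{n} v_i

With w fixed, the optimal weights have the closed form v_i = 1 if L(y_i, f(x_i; w)) < λ and v_i = 0 otherwise, so only samples whose current loss is below the "pace" threshold λ participate in training; λ is gradually increased so that harder samples are admitted over time.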
3 Methodology
In this section, we first formalize the task studied in this paper and give an overview of the framework of our GET model. Then, we introduce the details of the model implementation.
3.1 Task Formalization
Given a piece of text X and an entity mention M within it, the task of generative entity typing (GET) is to generate multiple types TS = {T1, T2, ..., TK}, where each Tk (1 ≤ k ≤ K) is a type for M w.r.t. the context of X.
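To illustrate this input-output contract, here is a minimal sketch that wraps a seq2seq PLM (T5, the backbone shown in Figure 2) to generate K candidate types for a mention; the checkpoint name, prompt template and decoding settings are illustrative assumptions rather than the paper's exact configuration.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Illustrative checkpoint; the paper's exact PLM configuration may differ.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def generate_types(text, mention, k=3):
    """Generate K candidate types T_1, ..., T_K for `mention` in context `text`."""
    # Concatenate the context with a cloze prompt built from the mention;
    # <extra_id_0> is T5's sentinel token marking the blank to fill.
    prompt = f"{text} {mention} is a <extra_id_0>."
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    outputs = model.generate(
        input_ids,
        num_beams=k,             # beam search to obtain multiple candidates
        num_return_sequences=k,  # return K type strings
        max_new_tokens=10,
    )
    return [tokenizer.decode(o, skip_special_tokens=True).strip() for o in outputs]
```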
3.2 Framework
Like most of the previous entity typing models (Choi et al., 2018; Lee et al., 2020; Gong et al., 2021), our GET model is trained upon heterogeneous data consisting of a small portion of human-annotated data and a large portion of auto-generated data, due to the difficulty and high cost of human annotation. We will introduce how to obtain our auto-generated data in Section 4.1. The framework of our model learning includes the following three steps, as shown in Figure 2.

1. Prompt Construction: To better leverage the knowledge obtained from the pre-training of the PLM, we employ the prompt mechanism (Liu et al., 2021a) to guide the learning of our PLM-based GET model;

2. Curriculum Instruction: As a key component of CL, the curriculum instruction is responsible for measuring the difficulty of each sample in the heterogeneous training data, and then designing a suitable curriculum for the model training process;

3. CL-based Learning: In this step, our PLM-based GET model is trained with the designed curriculum, and is capable of adjusting its learning progress dynamically through self-paced learning (SPL), as sketched below.
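The following is a minimal sketch of how steps 2 and 3 interact, under simplifying assumptions: a per-sample loss of classification form stands in for the seq2seq objective, and the subsets are already ordered easy to difficult like D_A, D_B, D_C in Figure 2. It shows the interplay of the predetermined curriculum with SPL, not the paper's actual training code.

```python
import torch

def train_with_curriculum(model, subsets, optimizer, lam=1.0, growth=1.3, epochs=1):
    """Train over subsets ordered easy -> difficult (e.g., [D_A, D_B, D_C]),
    with self-paced re-weighting of samples inside each curriculum stage."""
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")  # per-sample losses
    for subset in subsets:                  # predetermined curriculum order
        for _ in range(epochs):
            for inputs, targets in subset:  # each subset yields mini-batches
                optimizer.zero_grad()
                losses = loss_fn(model(inputs), targets)
                # SPL: only samples whose loss is below the pace threshold
                # lambda (i.e., currently "easy") contribute to the update.
                v = (losses.detach() < lam).float()
                if v.sum() > 0:
                    ((v * losses).sum() / v.sum()).backward()
                    optimizer.step()
        lam *= growth  # enlarge the pace so harder samples are admitted later
    return model
```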
3.3 Prompt Construction
To generate the types of given entities by a PLM, we construct prompts in the cloze format from the Hearst patterns listed in Table 1. Specifically, each input text X including an entity mention M is concatenated with a cloze prompt constructed with M, and the PLM is asked to fill in the blank within the cloze prompt. Recall the example in Figure 1: the original text "In the early 1980s, P & G tried to launch here a concentrated detergent under the Ariel brand name that it markets in Europe" can be concatenated with a cloze prompt such as "P & G is a ___" to construct an input prompt for the PLM, which predicts "large detergent company", "large detergent manufacturer" and "company" as the types for P & G to fill the blank.
M is a ___              ___ such as M
M is one of ___         ___, especially M
M refers to ___         ___, including M
M is a member of ___

Table 1: Prompts constructed from Hearst patterns ("___" marks the blank to be filled).
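As a minimal sketch of this construction step, the templates below mirror Table 1; the helper name and the exact concatenation format are illustrative assumptions.

```python
# Cloze templates mirroring Table 1; "___" marks the blank the PLM must fill.
HEARST_TEMPLATES = [
    "{m} is a ___.",
    "___ such as {m}.",
    "{m} is one of ___.",
    "___, especially {m}.",
    "{m} refers to ___.",
    "___, including {m}.",
    "{m} is a member of ___.",
]

def build_prompts(text, mention):
    """Concatenate the input text with each cloze prompt built from the mention."""
    return [f"{text} {t.format(m=mention)}" for t in HEARST_TEMPLATES]

# Example with the P & G sentence from Figure 1:
prompts = build_prompts(
    "In the early 1980s, P & G tried to launch here a concentrated detergent "
    "under the Ariel brand name that it markets in Europe.",
    "P & G",
)
# prompts[0] -> "... P & G is a ___."  (the PLM fills the blank with a type)
```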
3.4 Curriculum Instruction
Curriculum instruction is the core issue of CL: it requires estimating the difficulty of samples in terms of model learning, so as to decide the order in which samples are used for model training.

For our specific PLM-based GET model, we argue that the difficulty of a sample in terms of