Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5

Irina Bigoulaeva1*, Rachneet Sachdeva1*, Harish Tayyar Madabushi2*,
Aline Villavicencio3, and Iryna Gurevych1
1Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt
2Department of Computer Science, The University of Bath
3Department of Computer Science, The University of Sheffield
www.ukp.tu-darmstadt.de
htm43@bath.ac.uk, a.villavicencio@sheffield.ac.uk
Abstract

We compare sequential fine-tuning with a model for multi-task learning in the context where we are interested in boosting performance on two tasks, one of which depends on the other. We test these models on the FigLang2022 shared task, which requires participants to predict language inference labels on figurative language along with corresponding textual explanations of the inference predictions. Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting. Our findings show that simple sequential fine-tuning of text-to-text models is an extraordinarily powerful method for cross-task knowledge transfer while simultaneously predicting multiple interdependent targets. So much so that our best model achieved the (tied) highest score on the task.1
1 Introduction and Motivation
The transfer of information between supervised learning objectives can be achieved in Pre-trained Language Models (PLMs) using either multi-task learning (MTL) (Caruana, 1997) or sequential fine-tuning (SFT) (Phang et al., 2018). MTL involves simultaneously training a model on multiple learning objectives using a weighted sum of their losses, while SFT involves sequentially training on a set of related tasks. Recent work has extended the SFT approach by converting all NLP problems into text-to-text (i.e., sequence-to-sequence, where both input and output sequences are natural text) problems (Raffel et al., 2019). The resultant model, T5, has achieved state-of-the-art results on a variety of tasks such as question answering, sentiment analysis, and, most relevant to this work, Natural Language Inference (NLI).

*Equal Contribution
1 To ensure reproducibility and to enable other researchers to build upon our work, we make our code and models freely available at https://github.com/Rachneet/cross-task-figurative-explanations
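The difference between the two transfer strategies can be illustrated with a deliberately minimal sketch, in which the "model" is a single parameter and each "task" pulls it toward a different target. This is not the paper's implementation; it is only meant to show how MTL combines per-task losses in one loop while SFT visits tasks one after another.

```python
def grad(param, target):
    """Gradient of the toy per-task loss (param - target)**2."""
    return 2 * (param - target)

def train_mtl(param, targets, weights, lr=0.1, steps=200):
    """Multi-task learning: one training loop, with the per-task
    gradients combined via a weighted sum of the losses."""
    for _ in range(steps):
        g = sum(w * grad(param, t) for w, t in zip(weights, targets))
        param -= lr * g
    return param

def train_sft(param, targets, lr=0.1, steps=200):
    """Sequential fine-tuning: train to convergence on each task
    before moving on to the next one."""
    for t in targets:
        for _ in range(steps):
            param -= lr * grad(param, t)
    return param
```

With two tasks pulling toward 0.0 and 2.0, equal-weight MTL settles on the compromise 1.0, while SFT ends up at the last task's optimum 2.0, which is why task order and forgetting matter for SFT.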
In this work, we focus our efforts on the transfer of information from multiple related tasks for improved performance on a different set of tasks. In addition, we compare the effectiveness of SFT with that of MTL in a context where one of the target tasks is dependent on the other. Given this dependence, we implement an end-to-end multi-task learning model that performs each of the tasks sequentially: an architecture referred to as a hierarchical feature pipeline based MTL architecture (HiFeatMTL, for short) (Chen et al., 2021). While HiFeatMTL has been previously used in different contexts (see Section 3), it has, to the best of our knowledge, not been used with, or compared to, text-to-text models. This is of particular importance, as such models are known to enable transfer learning (Raffel et al., 2019), and it is crucial to determine whether traditional MTL methods can boost cross-task knowledge transfer in such models.
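The defining property of HiFeatMTL, the output of one task serving as an input feature for the next, can be sketched for the text-to-text setting as below. The function names and the prompt templates ("premise: ... hypothesis: ..." and the appended "label: ...") are illustrative assumptions, not the shared task's actual input format; the model arguments stand in for text-to-text generators such as T5.

```python
def hifeatmtl_pipeline(label_model, explanation_model, premise, hypothesis):
    """Hierarchical feature pipeline: stage 1 predicts the NLI label,
    and that prediction is fed back into the input of stage 2, which
    generates the textual explanation."""
    # Stage 1: predict the NLI label from the premise/hypothesis pair.
    stage1_input = f"premise: {premise} hypothesis: {hypothesis}"
    label = label_model(stage1_input)
    # Stage 2: the stage-1 output becomes a feature of the stage-2 input.
    stage2_input = f"{stage1_input} label: {label}"
    explanation = explanation_model(stage2_input)
    return label, explanation
```

Because stage 2 conditions on stage 1's prediction, errors in the label can propagate into the explanation, which is one reason to compare this pipeline against plain sequential fine-tuning.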
Specifically, we participate in the FigLang2022 Shared Task,2 which extends NLI to include a figurative-language hypothesis and additionally requires participants to output a textual explanation (see also Section 2). FigLang2022 is ideally suited for the exploration of knowledge transfer, as PLMs have been shown to struggle with figurative language, so any gains achieved are a result of knowledge transfer. For example, Liu et al. (2022) show that in the zero- and few-shot settings, PLMs perform significantly worse than humans. This is especially the case with idioms (Yu and Ettinger, 2020; Tayyar Madabushi et al., 2021), on which T5 does particularly poorly (see Section 4). Additionally, FigLang2022's emphasis on explanations of the predicted labels provides us with the opportunity to test cross-task knowledge transfer in a setting where one target task depends on the other (HiFeatMTL); this is especially so given the evaluation methods used (detailed in Section 2).

2 https://figlang2022sharedtask.github.io/

arXiv:2210.17301v1 [cs.CL] 31 Oct 2022
We evaluate the effectiveness of boosting performance on the target tasks through the transfer of information from two related tasks: a) eSNLI, which is a dataset consisting of explanations associated with NLI labels, and b) IMPLI, which is an NLI dataset (without explanations) that contains figurative language. More concretely, we set out to answer the following research questions:

1. Can distinct task-specific knowledge be transferred from separate tasks so as to improve performance on a target task? Concretely, can we transfer explanations of literal language from eSNLI and figurative NLI without explanations from IMPLI?

2. Which of the two knowledge transfer techniques (SFT or HiFeatMTL) is more effective in the text-to-text context?
2 The FigLang2022 Shared Task

FigLang2022 is a variation of the NLI task which requires the generation of a textual explanation for the NLI prediction. Additionally, the hypothesis is a sentence that employs one of four kinds of figurative expressions: sarcasm, simile, idiom, or metaphor. A hypothesis can also be a creative paraphrase, which rewords the premise using more expressive, literal terminology. Table 1 shows examples from the task dataset.
Entailment
  Premise:     I respectfully disagree.
  Hypothesis:  I beg to differ. (Idiom)
  Explanation: To beg to differ is to disagree with someone, and in
               this sentence the speaker is respectfully disagreeing.

Contradiction
  Premise:     She was calm.
  Hypothesis:  She was like a kitten in a den of coyotes. (Simile)
  Explanation: A kitten in a den of coyotes would be scared and not calm.

Table 1: An entailment and contradiction pair from the FigLang2022 dataset.
FigLang2022 takes the quality of the generated explanation into consideration when assessing the model's performance by use of an explanation score, which is the average of BERTScore and BLEURT and ranges between 0 and 100. The task leaderboard is based on NLI label accuracy at an explanation score threshold of 60, although the NLI label accuracy is reported at three thresholds of the explanation score (i.e., 0, 50, and 60) so as to provide a glimpse of how the model's NLI and explanation abilities influence each other.
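Under our reading of this metric, a prediction counts toward accuracy at a given threshold only if its NLI label is correct and its explanation score reaches the threshold. The sketch below encodes that reading; the function names and the record fields (`label_correct`, `bertscore`, `bleurt`) are our own illustrative choices, not the shared task's evaluation code.

```python
def explanation_score(bertscore, bleurt):
    """Explanation score as defined by the shared task: the average of
    BERTScore and BLEURT, on a 0-100 scale."""
    return (bertscore + bleurt) / 2

def accuracy_at_threshold(predictions, threshold):
    """NLI label accuracy at an explanation score threshold: a prediction
    whose explanation score falls below the threshold is counted as wrong
    even when its label is correct."""
    correct = sum(
        1 for p in predictions
        if p["label_correct"]
        and explanation_score(p["bertscore"], p["bleurt"]) >= threshold
    )
    return 100 * correct / len(predictions)
```

At threshold 0 this reduces to plain label accuracy, while thresholds 50 and 60 progressively penalize correct labels whose explanations score poorly, which is what makes the two target tasks interdependent in the evaluation.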
3 Related Work
NLI is considered central to the task of Natural Language Understanding, and there has been significant focus on the development of models that can perform well on the task (Wang et al., 2018). This task of language inference has been independently extended to incorporate explanations (Camburu et al., 2018) and figurative language (Stowe et al., 2022) (both detailed below). Chakrabarty et al. (2022) introduced FLUTE, the Figurative Language Understanding and Textual Explanations dataset, which brought together these two aspects.
Previous shared tasks involving figurative language focused on the identification or representation of figurative knowledge: for example, FigLang2020 (Klebanov et al., 2020) and Task 6 of SemEval 2022 (Abu Farha et al., 2022) involved sarcasm detection, and Task 2 of SemEval 2022 (Tayyar Madabushi et al., 2022) involved the identification and representation of idioms.
The generation of textual explanations necessitates the use of generative models such as BART (Lewis et al., 2020) or T5 (Raffel et al., 2019). Narang et al. (2020) introduce WT5, a sequence-to-sequence model that outputs natural-text explanations alongside its predictions, and Erliksson et al. (2021) found T5 to consistently outperform BART in explanation generation.
Of specific relevance to our work are the IMPLI (Stowe et al., 2022) and eSNLI (Camburu et al., 2018) datasets. IMPLI links a figurative sentence, specifically idiomatic or metaphoric, to a literal counterpart, with the NLI relation being either entailment or non-entailment. Stowe et al. (2022) show that idioms are difficult for models to handle, particularly in non-entailment relations. The eSNLI dataset (Camburu et al., 2018) is an explanation dataset for general NLI. It extends the Stanford Natural Language Inference dataset (Bowman et al., 2015) with human-generated text explanations.
Hierarchical feature pipeline based MTL architectures (HiFeatMTL) use the outputs of one task as a feature in the next and are distinct from hierarchical signal pipeline architectures, wherein the