
tunity to test cross-task knowledge transfer in a setting where one target task depends on the other (HiFeatMTL); this is especially so given the evaluation methods used (detailed in Section 2).
We evaluate the effectiveness of boosting performance on the target tasks through the transfer of information from two related tasks: a) eSNLI, a dataset of explanations associated with NLI labels, and b) IMPLI, an NLI dataset (without explanations) that contains figurative language. More concretely, we set out to answer the following research questions:
1. Can distinct task-specific knowledge be transferred from separate tasks so as to improve performance on a target task? Concretely, can we transfer explanations of literal language from eSNLI and figurative NLI without explanations from IMPLI?
2. Which of the two knowledge transfer techniques (SFT or HiFeatMTL) is more effective in the text-to-text context?
2 The FigLang2022 Shared Task
FigLang2022 is a variation of the NLI task that requires the generation of a textual explanation for the NLI prediction. Additionally, the hypothesis is a sentence that employs one of four kinds of figurative expressions: sarcasm, simile, idiom, or metaphor. A hypothesis can also be a creative paraphrase, which rewords the premise using more expressive, literal terminology. Table 1 shows examples from the task dataset.
Entailment
  Premise:     I respectfully disagree.
  Hypothesis:  I beg to differ. (Idiom)
  Explanation: To beg to differ is to disagree with someone, and in this sentence the speaker is respectfully disagreeing.

Contradiction
  Premise:     She was calm.
  Hypothesis:  She was like a kitten in a den of coyotes. (Simile)
  Explanation: A kitten in a den of coyotes would be scared and not calm.

Table 1: An entailment and a contradiction pair from the FigLang2022 dataset.
FigLang2022 takes into consideration the quality of the generated explanation when assessing the model's performance by means of an explanation score, which is the average of BERTScore and BLEURT and ranges between 0 and 100. The task leaderboard is based on NLI label accuracy at an explanation score threshold of 60, although NLI label accuracy is reported at three explanation score thresholds (0, 50, and 60) so as to provide a glimpse of how the model's NLI and explanation abilities influence each other.
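As a minimal sketch of this evaluation scheme, the snippet below computes label accuracy at a given explanation-score threshold, assuming per-example BERTScore and BLEURT values have already been rescaled to 0-100 and that a prediction counts as correct only when its label matches and its explanation score reaches the threshold; the function name and toy data are illustrative, not the official scorer.

```python
from typing import Sequence

def accuracy_at_threshold(
    gold_labels: Sequence[str],
    pred_labels: Sequence[str],
    bert_scores: Sequence[float],    # per-example BERTScore, rescaled to 0-100
    bleurt_scores: Sequence[float],  # per-example BLEURT, rescaled to 0-100
    threshold: float,
) -> float:
    """NLI label accuracy at an explanation-score threshold.

    The explanation score of an example is the average of its BERTScore
    and BLEURT values; a prediction counts as correct only if the label
    matches and the explanation score reaches the threshold.
    """
    correct = 0
    for gold, pred, bs, bl in zip(gold_labels, pred_labels,
                                  bert_scores, bleurt_scores):
        explanation_score = (bs + bl) / 2  # ranges between 0 and 100
        if gold == pred and explanation_score >= threshold:
            correct += 1
    return correct / len(gold_labels)

# Toy example: two correct labels with explanation scores 80 and 52.
gold = ["Entailment", "Contradiction"]
pred = ["Entailment", "Contradiction"]
bert = [88.0, 55.0]
bleurt = [72.0, 49.0]
for t in (0, 50, 60):
    print(f"Acc@{t}: {accuracy_at_threshold(gold, pred, bert, bleurt, t):.2f}")
# Acc@0: 1.00, Acc@50: 1.00, Acc@60: 0.50
```

Reporting accuracy at threshold 0 isolates pure NLI performance, while the 50 and 60 thresholds progressively require that a correct label also comes with a well-rated explanation.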
3 Related Work
NLI is considered central to the task of Natural Language Understanding, and there has been significant focus on the development of models that perform well on it (Wang et al., 2018). The task has been independently extended to incorporate explanations (Camburu et al., 2018) and figurative language (Stowe et al., 2022) (both detailed below). Chakrabarty et al. (2022) introduced FLUTE, the Figurative Language Understanding and Textual Explanations dataset, which brings together these two aspects.
Previous shared tasks involving figurative language focused on the identification or representation of figurative knowledge: for example, FigLang2020 (Klebanov et al., 2020) and Task 6 of SemEval 2022 (Abu Farha et al., 2022) involved sarcasm detection, and Task 2 of SemEval 2022 (Tayyar Madabushi et al., 2022) involved the identification and representation of idioms.
The generation of textual explanations necessitates the use of generative models such as BART (Lewis et al., 2020) or T5 (Raffel et al., 2019). Narang et al. (2020) introduce WT5, a sequence-to-sequence model that outputs natural-text explanations alongside its predictions, and Erliksson et al. (2021) found T5 to consistently outperform BART in explanation generation.
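To make the text-to-text setup concrete, the sketch below shows one way an NLI example with an explanation could be serialized in the WT5 style, where an "explain" prefix on the input prompts the model to emit "explanation: ..." after its predicted label; the specific field names and ordering here are illustrative assumptions, not the official WT5 preprocessing.

```python
def wt5_style_example(premise: str, hypothesis: str,
                      label: str, explanation: str) -> tuple[str, str]:
    # Build a WT5-style (input, target) pair: the "explain" prefix asks the
    # model to append an explanation after its label. Field names and order
    # are illustrative, not the official preprocessing.
    source = f"explain nli premise: {premise} hypothesis: {hypothesis}"
    target = f"{label} explanation: {explanation}"
    return source, target

# Using the entailment example from Table 1:
src, tgt = wt5_style_example(
    "I respectfully disagree.",
    "I beg to differ.",
    "Entailment",
    "To beg to differ is to disagree with someone, and in this "
    "sentence the speaker is respectfully disagreeing.",
)
print(src)
print(tgt)
```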
Of specific relevance to our work are the IMPLI (Stowe et al., 2022) and eSNLI (Camburu et al., 2018) datasets. IMPLI links a figurative sentence, specifically an idiomatic or metaphoric one, to a literal counterpart, with the NLI relation being either entailment or non-entailment. Stowe et al. (2022) show that idioms are difficult for models to handle, particularly in non-entailment relations. The eSNLI dataset (Camburu et al., 2018) is an explanation dataset for general NLI: it extends the Stanford Natural Language Inference dataset (Bowman et al., 2015) with human-generated textual explanations.
Hierarchical feature pipeline based MTL architectures (HiFeatMTL) use the outputs of one task as a feature in the next and are distinct from hierarchical signal pipeline architectures wherein the