in the closed-book setting, this paper proposes a new QG model with two unique characteristics: (i) a contrastive learning loss designed to better capture the semantics of the answers and the semantic relationship between answers and ground-truth questions at the contextual level; and (ii) an answer reconstruction loss designed to measure the answerability of the generated question. Contrastive learning has shown promising results in many NLP tasks (Giorgi et al., 2021; Gao et al., 2021; Yang et al., 2021) and aligns positive pairs better when supervised signals are available (Gao et al., 2021); here we show how to learn question representations by distinguishing the features of correct question-answer pairs from those of incorrectly linked question-answer pairs. Further, to ensure that the generated questions are of good quality and can be answered by the answer used for question generation, we frame the model as a generation-reconstruction process (Cao et al., 2019; Zhu et al., 2020), predicting the original answer from the generated question with a pre-trained seq2seq model. In addition, we introduce a new closed-book dataset with long-form abstractive answers, WikiCQA, to complement existing datasets such as GooAQ (Khashabi et al., 2021) and ELI5 (Fan et al., 2019), and show how to leverage our model to generate synthetic data that improves closed-book question-answering tasks.
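The contrastive objective described above can be sketched as follows. This is a minimal illustration, not the paper's exact loss: we assume an InfoNCE-style formulation in which each correctly linked question-answer pair in a batch is a positive and all other in-batch pairings act as incorrectly linked negatives; the function name and temperature value are our own choices.

```python
import numpy as np

def info_nce_loss(q_emb, a_emb, temperature=0.1):
    """Sketch of an InfoNCE-style contrastive loss over a batch of
    question/answer embeddings. Row i of q_emb and row i of a_emb form a
    positive (correctly linked) pair; all other in-batch combinations
    serve as negatives (incorrectly linked pairs). This is an assumed
    formulation for illustration, not the paper's exact objective."""
    # Normalize rows so dot products are cosine similarities.
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    a = a_emb / np.linalg.norm(a_emb, axis=1, keepdims=True)
    sim = q @ a.T / temperature  # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (the correct pairings) as targets.
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pushes each question embedding toward its own answer and away from the other answers in the batch, which is the pull-together/push-apart behavior the contrastive term is meant to provide.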
Through experiments, we find that the proposed QG model improves on both automatic and human evaluation metrics on WikiCQA and two public datasets. Compared to the baseline, the proposed QG framework improves the ROUGE-L score by up to 2.0%, 2.7%, and 1.8% on WikiCQA, GooAQ-S, and ELI5, respectively, and by 1.3% and 2.6% in terms of relevance and correctness. Furthermore, we leverage the QG framework to generate synthetic QA data from WikiHow summary data and pre-train a closed-book QA model on it in both unsupervised and semi-supervised settings. The performance is evaluated on both seen (WikiCQA) and unseen (GooAQ-S, ELI5) datasets. We find consistent improvements across these datasets, indicating the QG model's effectiveness in enhancing closed-book question-answering tasks.
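For reference, the ROUGE-L scores reported above are based on the longest common subsequence (LCS) between a generated text and a reference. A minimal sketch of a simplified variant (harmonic-mean F1 with naive whitespace tokenization; the original metric's recall-weighted β and stemming are omitted):

```python
def lcs_length(x, y):
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if xi == yj
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[-1][-1]

def rouge_l(candidate, reference):
    """Simplified ROUGE-L F1 over whitespace-tokenized strings."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

Because the LCS preserves word order without requiring contiguity, the metric rewards generated questions that follow the reference's overall phrasing even when individual words differ.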
In conclusion, our contributions can be summarized as follows:
• We propose a contrastive QG model, which, to our knowledge, is the first work to explore contrastive learning for QG in a closed-book setting.
• The proposed model outperforms baselines on three datasets. Human evaluation also indicates that the questions generated by our model are more informative than those of the baselines.
• We leverage the QG model as a data augmentation strategy to generate large-scale QA pairs. Consistent improvements on both seen and unseen datasets indicate the QG model's effectiveness in enhancing closed-book question-answering tasks.
2 Related Work
Many previous works on QG operate in the open-book setting, taking factoid short answers (Rajpurkar et al., 2016) or human-generated short answers (Kočiský et al., 2018) together with the corresponding passages to generate questions (Zhang et al., 2021). Early approaches to question generation rely on rule-based methods (Labutov et al., 2015; Khullar et al., 2018). To bypass hand-crafted rules and sophisticated pipelines in QG, Du et al. (2017) introduce a vanilla RNN-based sequence-to-sequence approach with an attention mechanism. Recently proposed pre-trained transformer-based frameworks (Lewis et al., 2020; Raffel et al., 2020) also improve the performance of QG. In addition, Sultan et al. (2020) show that the lexical and factual diversity of generated questions provides better QA training. However, these successes cannot be directly transferred to the closed-book setting, where the model must generate questions relying solely on the answers. In this work, we explore the widely applicable yet still under-explored closed-book QG setting.
Contrastive Learning aims to pull semantically similar neighbors close and push non-neighbors apart. It has achieved great success in both supervised and unsupervised settings. In pioneering work, the contrastive loss function (Hadsell et al., 2006; Chopra et al., 2005) was proposed as a training objective in deep metric learning, considering both similar and dissimilar pairs. Recently, Chen et al. (2020) propose the SimCLR framework to learn useful visual representations. Viewing contrastive learning as dictionary look-up, He et al. (2020) present Momentum Contrast (MoCo) to build dynamic dictionaries for contrastive learning. Some works apply contrastive learning to