Improving Question Answering with Generation of NQ-like Questions
Saptarashmi Bandyopadhyay
University of Maryland, College Park
saptab1@umd.edu
Shraman Pal
IIT Kharagpur
shramanpal@gmail.com
Hao Zou
University of Minnesota
zou00080@umn.edu
Abhranil Chandra
IIT Kharagpur
abhranil.iitkgp@gmail.com
Jordan Boyd-Graber
University of Maryland, College Park
jbg@umiacs.umd.edu
Abstract
Question Answering (QA) systems require a large amount of annotated data, which is costly and time-consuming to gather. Converting the datasets of existing QA benchmarks is challenging because they differ in format and complexity. To address these issues, we propose an algorithm that automatically generates shorter questions resembling day-to-day human communication, in the style of the Natural Questions (NQ) dataset, from longer trivia questions in the Quizbowl (QB) dataset by leveraging the difference in style between the two datasets. This provides an automated way to generate more data for our QA systems. To ensure quality as well as quantity of data, we detect and remove ill-formed questions using a neural classifier. We demonstrate that in a low-resource setting, using the generated data improves QA performance over the baseline system on both NQ and QB data. Our algorithm improves the scalability of training data while maintaining its quality for QA systems.
1 Introduction
Large-scale data collection is a challenging process in the domains of Question Answering and Information Retrieval because of the need for high-quality annotations, which are scarce and expensive to produce. There are several QA datasets (Joshi et al., 2017; Rajpurkar et al., 2016; Yang et al., 2018; Kwiatkowski et al., 2019; Rodriguez et al., 2021) with significantly different structure and complexity. Large quantities of high-quality data help train more effective machine learning systems.
In this paper, we focus on converting questions normally spoken in a trivia competition into questions resembling day-to-day human communication. Trivia questions span multiple lines, consist of multiple hints given as standalone sentences pointing to an answer, and players can buzz in on any sentence of the question to give an answer. In contrast, questions used in daily human communication are shorter (often a single line). We propose an algorithm that generates multiple short natural questions from every long trivia question by converting each sentence with multiple hints into several shorter questions. We also add a BERT-based (Devlin et al., 2019) quality-control method to filter out ill-formed questions and retain the well-formed ones. We show that our question generation algorithm improves the performance of two question answering (QA) systems in a low-resource setting. We also demonstrate that concatenating the original natural questions with the generated questions improves QA performance. Finally, we show that by using such a method to generate synthetic data, we can achieve higher scores than a system that uses only NQ data.
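To make the overall approach concrete, the following is a minimal sketch of the generate-then-filter loop; the generator and classifier interfaces are hypothetical stand-ins for the question-generation model and the BERT quality classifier, not our exact implementation.

```python
# A minimal sketch of the generate-then-filter pipeline. The `generator` and
# `quality_clf` interfaces are hypothetical stand-ins, not the paper's exact
# implementation.
from typing import List

from nltk.tokenize import sent_tokenize  # requires the NLTK "punkt" tokenizer data

def nq_like_questions(qb_question: str, generator, quality_clf,
                      keep_threshold: float = 0.5) -> List[str]:
    """Turn one multi-sentence QB question into several short NQ-like questions."""
    candidates = []
    # Each sentence of a QB question is a standalone hint toward the answer.
    for hint in sent_tokenize(qb_question):
        # Rewrite each hint as one or more short, natural-sounding questions.
        candidates.extend(generator.generate(hint))
    # Keep only the candidates the classifier judges to be well formed.
    return [q for q in candidates if quality_clf.well_formed_prob(q) >= keep_threshold]
```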
2 Dataset and Data Extraction
We use two popular datasets: the Quizbowl dataset (Rodriguez et al., 2021), henceforth referred to as the QB dataset, and the NQ-Open dataset (Lee et al., 2019), derived from Natural Questions (Kwiatkowski et al., 2019). QB has a total of 119,247 question/answer samples and NQ has 91,434 question/answer samples. For the NQ dataset, we use the same 1,800 dev and 1,769 test question/answer splits as used in the EfficientQA competition (Min et al., 2021).
As our task involves transforming QB questions into NQ-like questions, we extract pairs of questions that are semantically similar. We first extract every possible question-question pair with the same answer using string matching, resulting in 95,651 question-question pairs. From this parallel corpus, we take the last sentence of each QB question and pass it through a pre-trained Sentence-BERT (Reimers and Gurevych, 2019) model along with the corresponding NQ question. We compute the cosine similarity between the [CLS] embeddings to find pairs that are semantically equivalent, setting the threshold to 0.5. From this we extract
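As a concrete illustration of this filtering step, the sketch below uses the sentence-transformers library; the model name is an assumption for illustration, and it uses pooled sentence embeddings as a stand-in for the [CLS] embeddings described above.

```python
# Illustrative sketch of semantic pair filtering with sentence-transformers.
# The model choice and pooled (mean) embeddings are assumptions; the paper
# compares [CLS] embeddings.
from nltk.tokenize import sent_tokenize
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bert-base-nli-mean-tokens")

def filter_pairs(qb_nq_pairs, threshold: float = 0.5):
    """Keep (QB last sentence, NQ question) pairs whose embeddings are similar."""
    kept = []
    for qb_question, nq_question in qb_nq_pairs:
        qb_last = sent_tokenize(qb_question)[-1]  # last sentence of the QB question
        emb_qb, emb_nq = model.encode([qb_last, nq_question], convert_to_tensor=True)
        # Cosine similarity >= 0.5 counts the pair as semantically equivalent.
        if util.cos_sim(emb_qb, emb_nq).item() >= threshold:
            kept.append((qb_last, nq_question))
    return kept
```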