Improving Question Answering with Generation of NQ-like Questions
Saptarashmi Bandyopadhyay
University of Maryland, College Park
saptab1@umd.edu
Shraman Pal
IIT Kharagpur
shramanpal@gmail.com
Hao Zou
University of Minnesota
zou00080@umn.edu
Abhranil Chandra
IIT Kharagpur
abhranil.iitkgp@gmail.com
Jordan Boyd-Graber
University of Maryland, College Park
jbg@umiacs.umd.edu
Abstract
Question Answering (QA) systems require a large amount of annotated data, which is costly and time-consuming to gather. Converting existing QA benchmark datasets is challenging due to their differing formats and complexities. To address these issues, we propose an algorithm to automatically generate shorter questions resembling day-to-day human communication in the Natural Questions (NQ) dataset from longer trivia questions in the Quizbowl (QB) dataset by leveraging the difference in style between the datasets. This provides an automated way to generate more data for our QA systems. To ensure the quality as well as the quantity of the data, we detect and remove ill-formed questions using a neural classifier. We demonstrate that in a low-resource setting, using the generated data improves QA performance over the baseline system on both NQ and QB data. Our algorithm improves the scalability of training data while maintaining the quality of data for QA systems.
1 Introduction
Large-scale data collection is a challenging process in the domains of Question Answering and Information Retrieval because the high-quality annotations these tasks require are scarce and expensive to generate. There are several QA datasets (Joshi et al., 2017; Rajpurkar et al., 2016; Yang et al., 2018; Kwiatkowski et al., 2019; Rodriguez et al., 2021) with significantly different structures and complexities. Large quantities of high-quality data make it possible to train more effective machine learning systems.
In this paper, we focus on converting questions typically asked in a trivia competition into questions resembling day-to-day human communication. Trivia questions span multiple lines, consist of multiple hints given as standalone sentences, and players can buzz on any sentence in the question to give an answer. In contrast, questions used in daily human communication are shorter (often a single line). We propose an algorithm to generate multiple short natural questions from every long trivia question by converting each sentence with multiple hints into several shorter questions. We also add a BERT-based (Devlin et al., 2019) quality control method to filter out ill-formed questions and retain well-formed ones. We show that our algorithm for generating questions improves the performance of two question answering (QA) systems in a low-resource setting. We also demonstrate that concatenating the original natural questions with the generated questions improves QA system performance. Finally, we show that by using such a method to generate synthetic data, we can achieve higher scores than a system that uses only NQ data.
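As a minimal sketch of the quality-control step, the snippet below scores a generated question with a binary BERT sequence classifier via the Hugging Face transformers library. The checkpoint name, the label convention, and the 0.5 threshold are assumptions for illustration; the paper's actual classifier and training details are not shown here.

    # Hypothetical sketch: filter ill-formed generated questions with a
    # binary BERT classifier (label 1 = well-formed). "bert-base-uncased"
    # and the 0.5 threshold are assumptions, not the paper's exact setup;
    # in practice the model would be fine-tuned on labeled questions.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    def keep_question(question: str, threshold: float = 0.5) -> bool:
        inputs = tokenizer(question, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = model(**inputs).logits.softmax(dim=-1)
        return probs[0, 1].item() >= threshold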
2 Dataset and Data Extraction
We use two popular datasets: the Quizbowl dataset (Rodriguez et al., 2021), henceforth referred to as QB, and the NQ-Open dataset (Lee et al., 2019), derived from Natural Questions (Kwiatkowski et al., 2019). QB has a total of 119,247 question/answer samples and NQ has 91,434. For the NQ dataset, we use the same 1,800 dev and 1,769 test question/answer splits as used in the EfficientQA competition (Min et al., 2021).
As our task involves transforming QB questions into NQ-like questions, we extract pairs of questions that are semantically similar. We first extract every possible question-question pair with the same answer using string matching, resulting in 95,651 question-question pairs. From this parallel corpus, we extract the last sentence of each QB question and pass it through a pre-trained Sentence-BERT (Reimers and Gurevych, 2019) model along with the corresponding NQ question. We take the cosine similarity between the [CLS] embeddings and find pairs that are semantically equivalent by setting the threshold to 0.5. From this we extract 19,439 question-question pairs with moderate semantic equivalence. We also use the same index from the last sentence of QB questions paired with NQ questions to retrieve the corresponding full QB questions in paragraph form. From the extracted corpora, we create a smaller dataset of the last sentence of every QB question (and the corresponding full QB paragraph) paired with NQ questions for our low-resource setting. This paired dataset has a total of 1,218 training samples, 93 validation samples, and 563 test samples that are semantically similar. We outline the statistics of the baseline and generated datasets in Table 1.
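As an illustration of this filtering step, the following is a minimal sketch using the sentence-transformers library. The checkpoint name is an assumption (the paper does not specify one), and the library pools token embeddings rather than exposing the [CLS] vector directly, so this approximates rather than reproduces the paper's setup.

    # Minimal sketch of the semantic-similarity filter described above.
    # The "all-MiniLM-L6-v2" checkpoint is an assumed stand-in; only the
    # 0.5 threshold comes from the paper.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def is_semantically_similar(qb_last_sentence: str, nq_question: str,
                                threshold: float = 0.5) -> bool:
        # Embed both questions and keep the pair if cosine similarity
        # meets the threshold.
        emb = model.encode([qb_last_sentence, nq_question],
                           convert_to_tensor=True)
        return util.cos_sim(emb[0], emb[1]).item() >= threshold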
3 Methods to generate NQ-like Questions
We use the following NLP techniques to generate NQ-like questions from our QB dataset:
• Tokenization
• Coreference resolution
• Parse tree output
• Bag-of-words-based question generation
We outline our methods in Algorithm 1. First, we tokenize the long paragraph of a QB question into individual sentences, each of which contains hints to the answer with sufficient context. Using every sentence as annotated by the question id was not sufficient, due to annotation errors around sentence delimiters such as ‘.’:

Initial:    ... and "k." For 10 points, ...
Tokenized:  ... and " k. "
            For 10 points , ...
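A minimal sketch of this sentence-splitting step is shown below, assuming NLTK's Punkt tokenizer (the paper does not name the tokenizer it uses); the toy QB paragraph is invented for illustration.

    # Illustrative only: NLTK's Punkt tokenizer is an assumption; the
    # paper does not specify which sentence tokenizer it uses.
    import nltk
    nltk.download("punkt", quiet=True)
    from nltk.tokenize import sent_tokenize

    qb_paragraph = (
        "This element bonds with oxygen in quartz and is doped to make "
        "semiconductors. For 10 points, name this element with symbol Si."
    )
    for sentence in sent_tokenize(qb_paragraph):
        print(sentence)
    # Quoted abbreviations such as '"k."' can still trigger spurious
    # sentence breaks, which is why the question-id annotations alone
    # were not sufficient.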
However, each sentence contains two to three clues about the answer, whereas NQ questions generally consist of one sentence with one clue. To resolve this, we use coreference resolution on every sentence to obtain clusters of nouns and the pronouns referring to them (Kirstain et al., 2021), together with parse trees. We then split each sentence into parts with similar syntactic structure using the parse tree, based on the ADVCL (adverbial clause) and CC (conjunction) tags. Any pronoun present in a split is replaced with the noun from the clusters found via coreference resolution. After splitting, we check the number of words in every split and append splits with fewer than eight words back onto the original sentence. We finally clean the split sentences of trailing punctuation marks and of words that should not end a sentence, such as ‘and’ and ‘but’. This gives us sentences or phrases that usually contain one clue, similar to NQ.
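As a sketch of the splitting step, the hypothetical spaCy snippet below breaks a sentence at tokens whose dependency label is advcl or conj and merges short splits back, approximating the rules above; the paper's exact parser and rules may differ, and the pronoun-replacement step (which needs the coreference clusters) is omitted for brevity.

    # Hypothetical sketch of parse-tree-based splitting with spaCy.
    # Splitting at advcl/conj verb heads approximates the ADVCL/CC rule
    # described above.
    import spacy

    nlp = spacy.load("en_core_web_sm")

    def split_on_clauses(sentence: str, min_words: int = 8) -> list[str]:
        doc = nlp(sentence)
        # Token indices where an adverbial or conjoined clause begins.
        boundaries = [tok.left_edge.i for tok in doc
                      if tok.dep_ in ("advcl", "conj") and tok.pos_ == "VERB"]
        starts = sorted({0, *boundaries})
        spans = [doc[s:e].text.strip(" ,;") for s, e in
                 zip(starts, starts[1:] + [len(doc)])]
        # Merge splits shorter than min_words back into their neighbor,
        # mirroring the eight-word check in the text.
        merged: list[str] = []
        for span in spans:
            if merged and len(span.split()) < min_words:
                merged[-1] = f"{merged[-1]} {span}"
            else:
                merged.append(span)
        return merged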
To make the outputs question-like, we use a bag-of-words approach, replacing words like ‘this’ with ‘which’ and ‘it’ with ‘what’ for all but the last sentence. The last sentence in QB has a specific syntactic structure: it contains ‘For x points, name this’ followed by an identifier of the answer. We replace this whole phrase according to whatever follows it. For example:

‘name this author’ : ‘who is the author’
‘name this 1985 event’ : ‘which is the 1985 event’
‘name this phenomenon’ : ‘what is the phenomenon’
Algorithm 1 NQ-like Question Generation
1: procedure NQLIKEGEN(sentences)
2:   Individual(sentences)              ▷ split the full QB question into single sentences
3:   Pre_processing(sentences)          ▷ remove punctuation, conjunctions, or any other word not expected at the end of a sentence
4:   Coreference_clusters(sentences)    ▷ forms clusters like [“this country”, “its”]
5:   Parse_Tree(sentences)              ▷ split the conjoined clues based on similar syntactic structure
6:   if Noun, Pronoun not in same_parse_split then
7:     Replace                          ▷ replace the pronoun in the other split with the noun phrase
8:   for QB_last_sentences do
9:     String_Replacement(last_sentence) ▷ replace ‘name this’ with the annotated ‘wh’ question using the vocabulary of nouns formed
10:  for QB_non_last_sentences do
11:    Bag_of_Words                     ▷ use a frequency-based vocabulary table to replace “this” with “which”
12:  return nq_like_questions
To achieve this, we extract the noun phrases following ‘name this’ from 1,000 samples and form a vocabulary of nouns that maps each noun phrase to the appropriate ‘wh’ question.
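As a concrete sketch of these replacement rules, the hypothetical snippet below implements the ‘name this’ rewrite with a small noun-to-wh mapping; the toy WH_MAP stands in for the vocabulary built from 1,000 samples and contains only the three example mappings from the text.

    # Hypothetical sketch of the last-sentence rewrite. WH_MAP is a toy
    # stand-in for the paper's noun vocabulary.
    import re

    WH_MAP = {"author": "who is the", "event": "which is the",
              "phenomenon": "what is the"}

    def rewrite_last_sentence(sentence: str) -> str:
        # Strip the 'For x points, name this' prefix and look up the
        # head noun of what follows to pick a 'wh' phrase.
        match = re.search(r"[Ff]or \d+ points\s*,\s*name this (.+)", sentence)
        if not match:
            return sentence
        rest = match.group(1).rstrip(" .")
        head_noun = rest.split()[-1]  # crude head-noun heuristic
        wh = WH_MAP.get(head_noun, "what is the")
        return f"{wh} {rest}?"

    print(rewrite_last_sentence("For 10 points, name this 1985 event."))
    # -> which is the 1985 event?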