
Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering

Kazutoshi Shinoda¹,²  Saku Sugawara²  Akiko Aizawa¹,²
¹The University of Tokyo
²National Institute of Informatics
shinoda@is.s.u-tokyo.ac.jp
{saku,aizawa}@nii.ac.jp
Abstract

Extractive question answering (QA) models tend to exploit spurious correlations to make predictions when a training set has unintended biases. This tendency results in models not being generalizable to examples where the correlations do not hold. Determining the spurious correlations QA models can exploit is crucial in building generalizable QA models for real-world applications; moreover, methods need to be developed that prevent models from learning such spurious correlations even when a training set is biased. In this study, we discover that the relative position of an answer, defined as the relative distance from an answer span to the closest question-context overlap word, can be exploited by QA models as a superficial cue for making predictions. Specifically, we find that when the relative positions in a training set are biased, performance on examples with relative positions unseen during training is significantly degraded. To mitigate the performance degradation for unseen relative positions, we propose an ensemble-based debiasing method that does not require prior knowledge about the distribution of relative positions. We demonstrate that the proposed method mitigates the models' reliance on relative positions using biased and full versions of the SQuAD dataset. We hope that this study can help enhance the generalization ability of QA models in real-world applications.¹
1 Introduction

Deep learning-based natural language understanding (NLU) models are prone to using spurious correlations in the training set. This tendency results in models' poor generalization to out-of-distribution test sets (McCoy et al., 2019; Geirhos et al., 2020), which is a significant challenge in the field. Question answering (QA) models trained on intentionally biased training sets are more likely to learn solutions based on spurious correlations rather than on causal relationships between inputs and labels. For example, QA models can learn question-answer type matching heuristics (Lewis and Fan, 2019) and absolute-positional correlations (Ko et al., 2020), particularly when a training set is biased toward examples with the corresponding spurious correlations. Collecting a fully unbiased dataset is challenging. Therefore, it is vital to discover possible dataset biases that can degrade generalization and to develop debiasing methods that learn generalizable solutions even when training on unintentionally biased datasets.

¹Our code is available at https://github.com/KazutoshiShinoda/RelativePositionBias.

Context: ... This changed in [1924] with formal requirements developed for graduate degrees, including offering Doctorate (PhD) degrees ...
Question: The granting of Doctorate degrees first occurred in what year at Notre Dame?
Relative Position: -1

Context: ... The other magazine, The Juggler, is released [twice] a year and focuses on student literature and artwork ...
Question: How often is Notre Dame's the Juggler published?
Relative Position: -2

Table 1: Examples taken from SQuAD. Bracketed spans are the answers to the questions; overlapping words are those contained in both the context and the question. In both examples, the answer is found by looking to the right from the nearest overlapping word. See §2.1 for the definition of the relative position.

In extractive QA (e.g., Rajpurkar et al., 2016), in which answers to questions are spans in textual contexts, we find that the relative position of an answer, defined as the relative distance from an answer span to the closest word that appears in both a context and a question, can be exploited as a superficial cue by QA models. See Table 1 for examples.
Figure 1: F1 score for each relative position d in the SQuAD development set. "ALL" in the legend refers to a QA model trained on all the examples in the SQuAD training set. The other terms refer to models trained only on examples for which the respective conditions on d are satisfied. BERT-base was used for the QA models. Accuracy is comparable to ALL for examples with seen relative positions, but worse for the others. Please refer to §2.1 for the definition of d.
Specifically, we find that when the relative positions are intentionally biased in a training set, a QA model's performance tends to degrade on examples whose answers are located at relative positions unseen during training, as shown in Figure 1. For example, when a QA model is trained on examples with negative relative positions, as shown in Table 1, its performance on examples with non-negative relative positions is degraded by 10-20 points, as indicated by the square markers in Figure 1. Similar phenomena were observed when the distribution of the relative positions in the training set was biased in other ways, as shown in Figure 1. This observation implies that the model may preferentially learn to find answers at seen relative positions.
We aim to develop a method that mitigates the performance degradation on subsets with unseen relative positions while maintaining the scores on subsets with seen relative positions, even when the training set is biased with respect to relative positions. To this end, we propose debiasing methods based on an ensemble (Hinton, 2002) of an intentionally biased model and a main model. The biased model makes predictions relying on relative positions, which encourages the main model not to depend solely on relative positions. Our experiments on SQuAD (Rajpurkar et al., 2016) using BERT-base (Devlin et al., 2019) as the main model show that the proposed methods improve the scores for unseen relative positions by 0-10 points. We demonstrate that the proposed method is effective in four settings where the training set is filtered in different ways to be biased with respect to relative positions. Furthermore, when applied to the full training set, our method improves generalization to examples where questions and contexts have no lexical overlap.
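
To make the ensemble idea concrete, below is a minimal Python/PyTorch sketch of one standard ensemble-based debiasing formulation, a product of experts that combines a frozen biased model with the main model during training. The function name, tensor shapes, and the product-of-experts choice are illustrative assumptions on our part; the paper's exact ensembling objective may differ.

    import torch.nn.functional as F

    def product_of_experts_loss(main_logits, biased_logits, answer_positions):
        # main_logits, biased_logits: (batch, seq_len) start- or end-position
        # logits from the main model and the intentionally biased model.
        # answer_positions: (batch,) gold answer token indices.
        # Detach the biased logits so only the main model receives gradients.
        combined = (F.log_softmax(main_logits, dim=-1)
                    + F.log_softmax(biased_logits.detach(), dim=-1))
        # Cross-entropy over the (renormalized) product of the two experts:
        # examples the biased model already answers from relative position
        # alone yield smaller gradients, steering the main model to other cues.
        return F.cross_entropy(combined, answer_positions)

In extractive QA, such a loss would be applied separately to the start- and end-position logits of the answer span.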
2 Relative Position Bias
2.1 Definition
In this study, we call a word that is contained in both the question and the context an overlapping word. Let $d$ be the relative position of the nearest overlapping word from the answer span in extractive QA. If $w$ is a word, $c = \{w^c_i\}_{i=0}^{N}$ is the context, $q = \{w^q_i\}_{i=0}^{M}$ is the question, and $a = \{w^c_i\}_{i=s}^{e}$ ($0 \le s \le e \le N$) is the answer, the relative position $d$ is defined as follows:

$$f(j, s, e) = \begin{cases} j - s, & \text{for } j < s \\ 0, & \text{for } s \le j \le e \\ j - e, & \text{for } j > e \end{cases} \tag{1}$$

$$D = \{\, f(j, s, e) \mid w^c_j \in q \,\} \tag{2}$$

$$d = \operatorname*{arg\,min}_{d' \in D} |d'| \tag{3}$$

where $0 \le j \le N$ denotes the position of the word $w^c_j$ in the context, $f(j, s, e)$ denotes the relative position of $w^c_j$ from $a$, and $D$ denotes the set of relative positions of all overlapping words.² Because QA models favor spans that are located close to the overlapping words (Jia and Liang, 2017), and accuracy deteriorates when the absolute distance between the answer span and the overlapping word is considerable (Sugawara et al., 2018), the element of $D$ with the lowest absolute value, as in Equation 3, is used as the relative position.³
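
As a concrete illustration of Equations (1)-(3), the following is a minimal Python sketch over word-level inputs; the function name and the assumption of pre-tokenized words are ours, and the tie cases mentioned in footnote 3 are not treated specially here.

    def relative_position(context_words, question_words, answer_start, answer_end):
        # context_words: w^c_0 ... w^c_N; question_words: w^q_0 ... w^q_M.
        # answer_start, answer_end: inclusive word indices s and e of the answer.
        overlap = set(question_words)

        def f(j):
            # Equation (1): position of context word j relative to the span [s, e].
            if j < answer_start:
                return j - answer_start
            if j > answer_end:
                return j - answer_end
            return 0

        # Equation (2): relative positions of all overlapping words.
        D = [f(j) for j, w in enumerate(context_words) if w in overlap]
        if not D:
            return None  # no lexical overlap between question and context
        # Equation (3): keep the relative position with the smallest absolute value.
        return min(D, key=abs)

For the first example in Table 1, the overlapping word "in" immediately precedes the answer "1924", so the function returns -1.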
2.2 Distribution of Relative Position d
Figure 2 shows the distribution of the relative position $d$ in the SQuAD (Rajpurkar et al., 2016) training set. It demonstrates that the $d$ values are concentrated around zero. Although this tendency is consistent across other QA datasets, the distributions differ between datasets. See Appendix B for more details. A simple way to compute this distribution is sketched below, after the footnotes.
²Because function words as well as content words are important clues for reading comprehension, $D$ in Equation 2 can contain the positions of both function and content words.

³There are a few cases where $d$ in Equation 3 is not fixed to one value. However, such examples are excluded from the training and evaluation sets for brevity.
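
To obtain a distribution like the one in Figure 2, the per-example definition above can simply be tallied over a training set. Below is a minimal sketch assuming SQuAD-style examples already converted to the word-level fields consumed by relative_position; the field names are illustrative, not from the paper's code.

    from collections import Counter

    def relative_position_distribution(examples):
        # examples: iterable of dicts with word-level fields 'context_words',
        # 'question_words', 'answer_start', and 'answer_end' (illustrative names).
        counts = Counter()
        for ex in examples:
            d = relative_position(ex['context_words'], ex['question_words'],
                                  ex['answer_start'], ex['answer_end'])
            if d is not None:  # skip questions with no overlapping word
                counts[d] += 1
        return counts  # on SQuAD, the mass concentrates around d = 0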