
Look to the Right: Mitigating Relative Position Bias
in Extractive Question Answering
Kazutoshi Shinoda1,2  Saku Sugawara2  Akiko Aizawa1,2
1The University of Tokyo
2National Institute of Informatics
shinoda@is.s.u-tokyo.ac.jp
{saku,aizawa}@nii.ac.jp
Abstract
Extractive question answering (QA) models tend to exploit spurious correlations to make predictions when a training set has unintended biases. This tendency results in models that do not generalize to examples where the correlations do not hold. Determining which spurious correlations QA models can exploit is crucial for building generalizable QA models for real-world applications; moreover, a method needs to be developed that prevents these models from learning spurious correlations even when a training set is biased. In this study, we discovered that the relative position of an answer, which is defined as the relative distance from an answer span to the closest question-context overlap word, can be exploited by QA models as a superficial cue for making predictions. Specifically, we find that when the relative positions in a training set are biased, the performance on examples with relative positions unseen during training is significantly degraded. To mitigate the performance degradation for unseen relative positions, we propose an ensemble-based debiasing method that does not require prior knowledge about the distribution of relative positions. We demonstrate that the proposed method mitigates the models' reliance on relative positions using the biased and full SQuAD dataset. We hope that this study can help enhance the generalization ability of QA models in real-world applications.1
1 Introduction
Deep learning-based natural language understanding (NLU) models are prone to using spurious correlations in the training set. This tendency results in models' poor generalization to out-of-distribution test sets (McCoy et al., 2019; Geirhos et al., 2020), which is a significant challenge in the field.

1Our code is available at https://github.com/KazutoshiShinoda/RelativePositionBias.
Context: ... This changed in [1924] with formal requirements developed for graduate degrees, including offering Doctorate (PhD) degrees ...
Question: The granting of Doctorate degrees first occurred in what year at Notre Dame?
Relative position: −1

Context: ... The other magazine, The Juggler, is released [twice] a year and focuses on student literature and artwork ...
Question: How often is Notre Dame's the Juggler published?
Relative position: −2

Table 1: Examples taken from SQuAD. Words such as "Doctorate" and "Juggler" appear in both the context and the question; the spans in brackets are the answers to the questions. In both examples, the answer is found by looking to the right from an overlapping word. See §2.1 for the definition of the relative position.
Question answering (QA) models trained on intentionally biased training sets are more likely to learn solutions based on spurious correlations rather than on causal relationships between inputs and labels. For example, QA models can learn question-answer type matching heuristics (Lewis and Fan, 2019) and absolute-positional correlations (Ko et al., 2020), particularly when a training set is biased toward examples with the corresponding spurious correlations. Collecting a fully unbiased dataset is challenging. Therefore, it is vital to discover possible dataset biases that can degrade generalization and to develop debiasing methods that learn generalizable solutions even when training on unintentionally biased datasets.
In extractive QA (e.g., Rajpurkar et al., 2016), in which answers to questions are spans in textual contexts, we find that the relative position of an answer, which is defined as the relative distance from an answer span to the closest word that appears in both a context and a question, can be exploited as a superficial cue by QA models. See Table 1 for examples.
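To make the definition concrete, below is a minimal Python sketch of how the relative position could be computed. It assumes whitespace tokenization, case-insensitive matching, and the sign convention implied by Table 1 (negative when the answer lies to the right of its closest overlap word); the function name `relative_position` and these simplifications are ours, and the paper's exact procedure (tokenization, normalization, stopword handling) is specified in §2.1 and may differ.

```python
from typing import List, Optional


def relative_position(context_tokens: List[str],
                      question_tokens: List[str],
                      answer_start: int) -> Optional[int]:
    """Signed distance from the answer's start token to the closest
    context token that also appears in the question.

    Negative values mean the overlap word is to the left of the answer,
    i.e., the answer is found by "looking to the right". This is a
    hypothetical reading of the definition in Sec. 2.1.
    """
    question_vocab = {t.lower() for t in question_tokens}
    overlap_indices = [i for i, t in enumerate(context_tokens)
                       if t.lower() in question_vocab]
    if not overlap_indices:
        return None  # no question-context overlap word exists
    closest = min(overlap_indices, key=lambda i: abs(i - answer_start))
    return closest - answer_start


# First example from Table 1 (simplified context snippet):
ctx = ("This changed in 1924 with formal requirements developed "
       "for graduate degrees").split()
q = ("The granting of Doctorate degrees first occurred in what year "
     "at Notre Dame").split()
print(relative_position(ctx, q, ctx.index("1924")))  # -1: "in" is one token to the left
```

Under these assumptions the sketch reproduces the value −1 for the first example in Table 1, since the overlap word "in" immediately precedes the answer span "1924".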