Counterfactual Multihop QA: A Cause-Effect Approach for Reducing
Disconnected Reasoning
Wangzhen Guo Qinkang Gong Hanjiang Lai
Sun Yat-Sen University
{guowzh6,gongqk}@mail2.sysu.edu.cn laihanj3@mail.sysu.edu.cn
Abstract
Multi-hop QA requires reasoning over multiple supporting facts to answer a question. However, existing QA models often rely on shortcuts, e.g., producing the correct answer from only one fact, rather than on multi-hop reasoning; this is referred to as the disconnected reasoning problem. To alleviate this issue, we propose counterfactual multihop QA, a causal-effect approach that reduces disconnected reasoning. It builds upon explicit modeling of causality: 1) the direct causal effects of disconnected reasoning and 2) the causal effect of true multi-hop reasoning within the total causal effect. With the causal graph, a counterfactual inference is proposed to disentangle the disconnected reasoning from the total causal effect, which provides a new perspective and technique for learning a QA model that exploits true multi-hop reasoning instead of shortcuts. Extensive experiments have been conducted on the benchmark HotpotQA dataset and demonstrate that the proposed method achieves notable improvement in reducing disconnected reasoning. For example, our method achieves 5.8% higher points on its Supp$_s$ score on HotpotQA through true multihop reasoning. The code is available in the supplementary material.
1 Introduction
Multi-hop question answering (QA) (Groeneveld et al., 2020; Ding et al., 2019; Asai et al., 2019; Shao et al., 2020) requires the model to reason over multiple supporting facts to correctly answer a complex question. It is a challenging task, and many datasets, e.g., HotpotQA (Yang et al., 2018), and approaches (Fang et al., 2019; Zhu et al., 2021b) have been proposed for this reasoning task.
*Corresponding Author
One of the main problems of multihop QA models is disconnected reasoning (Trivedi et al., 2020), in which a model exploits reasoning shortcuts (Jiang and Bansal, 2019; Lee et al., 2021) instead of multi-hop reasoning to cheat and obtain the right answer. Taking Fig. 1 as an example, to answer the question “until when in the U.S. Senate”, we should consider two supporting facts to infer the answer: “Devorah Adler $\xrightarrow{\text{Director of Research for}}$ Barack Obama $\xrightarrow{\text{served from 2005 to 2008}}$ 2008”. However, one may also infer the correct answer just by exploiting the question type, e.g., for an “until when” question we can find the corresponding fact “from 2005 to 2008” in the contexts without any reasoning.
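The shortcut described above can be illustrated with a small sketch. The two context sentences below are invented paraphrases of the example (not the actual HotpotQA text), and `shortcut_answer` is a hypothetical heuristic written only for illustration; it is not part of any model discussed here. It answers an “until when” question by pattern matching alone, so the answer is unchanged even when one supporting fact is removed.

```python
import re

# Invented toy sentences loosely paraphrasing the example in the text.
CONTEXT = [
    "Devorah Adler was Director of Research for Barack Obama.",
    "Barack Obama served from 2005 to 2008 in the U.S. Senate.",
]


def shortcut_answer(question, sentences):
    """Hypothetical shortcut: for an 'until when' question, return the end
    year of the first 'from YYYY to YYYY' span, without checking who the
    sentence is about."""
    if "until when" not in question.lower():
        return None
    for sentence in sentences:
        match = re.search(r"from (\d{4}) to (\d{4})", sentence)
        if match:
            return match.group(2)
    return None


question = "until when in the U.S. Senate"
answer_with_both_facts = shortcut_answer(question, CONTEXT)
answer_with_one_fact = shortcut_answer(question, CONTEXT[1:])
print(answer_with_both_facts, answer_with_one_fact)
```

Because the heuristic never links the first fact to the second, dropping the first supporting fact does not change its output, which is the behavior that disconnected reasoning refers to.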
One possible solution for reducing disconnected reasoning is to strengthen the training dataset via extra annotations or adversarial examples, so that the model cannot find the correct answer from only one supporting fact. For example, Jiang and Bansal (2019) constructed adversarial examples to generate better distractor facts. Trivedi et al. (2020) first defined an evaluation measure, DiRe for short, to measure how much a QA model can cheat via disconnected reasoning, and then constructed a transformed dataset to reduce disconnected reasoning. Besides, counterfactual intervention (Lee et al., 2021; Ye et al., 2021) has also been explored to change the distribution of the training dataset. These methods improve the generalizability and interpretability of multi-hop reasoning QA models by balancing the training data, which is known as debiased training in QA (Niu et al., 2021). However, when the existing approaches decrease disconnected reasoning, performance on the original test set also drops significantly. It remains challenging to reduce disconnected reasoning while maintaining the same accuracy on the original test set.
arXiv:2210.07138v1 [cs.AI] 13 Oct 2022
Motivated by causal inference (Pearl and Mackenzie, 2018; Pearl, 2022; Niu et al., 2021), we use counterfactual reasoning to reduce disconnected reasoning in multi-hop QA while retaining robust performance on the original dataset. We formalize a causal graph that reflects the causal relationships between the question ($Q$), the contexts, and the answer ($Y$). To evaluate disconnected reasoning, the contexts are further divided into two subsets: $S$ is one supporting fact and $C$ denotes the remaining supporting facts. Hence, we can formulate disconnected reasoning as two natural direct causal effects of $(Q, S)$ and $(Q, C)$ on $Y$, as shown in Fig. 1. With this causal graph, we propose a novel counterfactual multihop QA that relieves disconnected reasoning by disentangling the two natural direct effects and the true multi-hop reasoning from the total causal effect.
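To make the idea of removing direct effects from the total effect concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the formulation used in this paper: the three branch score vectors, the additive fusion in `fuse`, and the constant `reference` value that stands in for a blocked multi-hop path are all hypothetical choices, in the spirit of counterfactual debiasing (Niu et al., 2021).

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def fuse(z_multihop, z_qs, z_qc):
    """Assumed fusion: add the three branch logits, then normalize."""
    summed = [a + b + c for a, b, c in zip(z_multihop, z_qs, z_qc)]
    return log_softmax(summed)


def debiased_scores(z_multihop, z_qs, z_qc, reference=0.0):
    """Factual prediction minus the counterfactual one in which the
    multi-hop path is blocked (set to a constant) while the two shortcut
    paths (Q, S) and (Q, C) still see their real inputs."""
    total = fuse(z_multihop, z_qs, z_qc)
    blocked = [reference] * len(z_multihop)
    shortcut_only = fuse(blocked, z_qs, z_qc)
    return [t - s for t, s in zip(total, shortcut_only)]


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


# Toy scores over three candidate answers (all values are made up).
z_multihop = [2.0, 0.0, 0.0]  # multi-hop branch prefers candidate 0
z_qs = [0.0, 3.0, 0.0]        # shortcut via (Q, S) prefers candidate 1
z_qc = [0.0, 0.0, 0.0]        # shortcut via (Q, C) is uninformative

factual_choice = argmax(fuse(z_multihop, z_qs, z_qc))
debiased_choice = argmax(debiased_scores(z_multihop, z_qs, z_qc))
print(factual_choice, debiased_choice)
```

In this toy setting the fused prediction follows the shortcut branch, while the subtracted scores follow the multi-hop branch. The prediction with every path set to its reference would appear in both the total effect and the direct effect, so it cancels and is omitted.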
We use the probing dataset generated by Trivedi et al. (2020) and DiRe to measure how much the proposed multi-hop QA model reduces disconnected reasoning. Experimental results show that our approach substantially decreases disconnected reasoning while preserving strong performance on the original test set. The results indicate that the proposed approach can reduce disconnected reasoning and improve true multi-hop reasoning capability.
The main contribution of this paper is threefold. First, our counterfactual multi-hop QA model formulates disconnected reasoning as two direct causal effects on the answer, which offers a new perspective and technique for learning true multi-hop reasoning. Second, our approach achieves notable improvement in reducing disconnected reasoning compared with various state-of-the-art methods. Third, our causal-effect approach is model-agnostic and can be used to reduce disconnected reasoning in many multi-hop QA architectures.
2 Related Work
Multi-hop question answering (QA) requires the model to retrieve the supporting facts to predict the answer. Many approaches and datasets have been proposed to train QA systems. For example, HotpotQA (Yang et al., 2018) is a widely used dataset for multi-hop QA, which comprises a fullwiki setting (Das et al., 2019; Nie et al., 2019; Qi et al., 2019; Chen et al., 2019; Li et al., 2021; Xiong et al., 2020) and a distractor setting (Min et al., 2019b; Nishida et al., 2019; Qiu et al., 2019; Jiang and Bansal, 2019; Trivedi et al., 2020).
In the fullwiki setting, a model first finds relevant facts from all Wikipedia articles and then completes the multi-hop QA task with the retrieved facts. The retrieval model is important in this setting. For instance, SMRS (Nie et al., 2019) and DPR (Karpukhin et al., 2020) emphasized the importance of retrieving relevant information implicitly in the semantic space. Entity-centric (Das et al., 2019), CogQA (Ding et al., 2019) and Golden Retriever (Qi et al., 2019) explicitly used the entities that are mentioned or reformulated in the query keywords to retrieve the next-hop document. Furthermore, PathRetriever (Asai et al., 2019) and HopRetriever (Li et al., 2021) iteratively select documents to form a paragraph-level reasoning path using an RNN. MDPR (Xiong et al., 2020) retrieved passages using only dense query vectors over multiple rounds. These methods hardly discuss the disconnected reasoning problem of QA models.
In the distractor setting, 10 paragraphs are given: two gold paragraphs and eight distractors. Many methods have been proposed to strengthen the model’s capability for multi-hop reasoning, using graph neural networks (Qiu et al., 2019; Fang et al., 2019; Shao et al., 2020), adversarial or counterfactual examples (Jiang and Bansal, 2019; Lee et al., 2021), the sufficiency of the supporting evidence (Trivedi et al., 2020), or pretrained language models (Zhao et al., 2020; Zaheer et al., 2020).
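The composition of a distractor-setting input (two gold paragraphs mixed with eight distractors) can be sketched as a small data structure. The function name, dictionary keys, and placeholder paragraphs below are hypothetical and chosen for illustration; they are not claimed to match the HotpotQA file schema.

```python
import random


def build_distractor_example(question, answer, gold, distractors, seed=0):
    """Assemble a toy distractor-setting example.

    `gold` and `distractors` are lists of (title, sentences) pairs.
    The paragraphs are shuffled so position gives no hint about which
    ones are gold.
    """
    if len(gold) != 2 or len(distractors) != 8:
        raise ValueError("expected 2 gold paragraphs and 8 distractors")
    paragraphs = list(gold) + list(distractors)
    random.Random(seed).shuffle(paragraphs)
    return {
        "question": question,
        "answer": answer,
        "context": paragraphs,
        "gold_titles": {title for title, _ in gold},
    }


# Placeholder content only; no real dataset text is reproduced here.
gold = [("Gold A", ["first supporting fact."]),
        ("Gold B", ["second supporting fact."])]
distractors = [(f"Distractor {i}", ["unrelated sentence."]) for i in range(8)]

example = build_distractor_example("toy question?", "toy answer",
                                   gold, distractors)
num_gold_in_context = sum(title in example["gold_titles"]
                          for title, _ in example["context"])
print(len(example["context"]), num_gold_in_context)
```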
However, Min et al. (2019a) demonstrated that many compositional questions in HotpotQA can be answered with a single hop. This means that QA models can take shortcuts instead of multi-hop reasoning to produce the correct answer. To relieve this issue, Jiang and Bansal (2019) added adversarial examples as hard distractors during training. Recently, Trivedi et al. (2020) proposed an approach, DiRe, to measure the model’s disconnected reasoning behavior and used the supporting sufficiency label to reduce disconnected reasoning. Lee et al. (2021) selected the supporting evidence according to each sentence’s causality to the predicted answer, which guarantees the explainability of the model’s behavior. However, the original performance also drops when disconnected reasoning is reduced.
Causal Inference. Recently, causal inference (Pearl and Mackenzie, 2018; Pearl, 2022) has been applied to many tasks of natural language processing, and it shows promising results and provides strong interpretability and generalizability. The representative works include counterfactual interven-