also obtain robust performance on the original dataset. We formalize a causal graph to reflect the causal relationships between the question (Q), the contexts, and the answer (Y). To evaluate disconnected reasoning, the contexts are further divided into two subsets: S, a single supporting fact, and C, the remaining supporting facts. Hence, we can formulate disconnected reasoning as the two natural direct causal effects of (Q, S) and (Q, C) on Y, as shown in Fig. 1. With the proposed causal graph, we can alleviate disconnected reasoning by disentangling these two natural direct effects and the true multi-hop reasoning from the total causal effect. We propose a novel counterfactual multi-hop QA model to perform this disentanglement.
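In standard counterfactual notation (assumed here for illustration; the paper's own definitions may differ in detail), the decomposition described above can be sketched as follows, where starred values denote counterfactual reference inputs:

\begin{align}
\mathrm{TE} &= Y_{q,s,c} - Y_{q^*,s^*,c^*},\\
\mathrm{NDE}_{Q,S} &= Y_{q,s,c^*} - Y_{q^*,s^*,c^*},\\
\mathrm{NDE}_{Q,C} &= Y_{q,s^*,c} - Y_{q^*,s^*,c^*}.
\end{align}

Here $Y_{q,s,c}$ is the answer outcome under the factual inputs, and each natural direct effect holds one context subset at its reference value. The true multi-hop reasoning effect is then what remains of the total effect $\mathrm{TE}$ after the two natural direct effects are removed.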
We use the probing dataset and the DiRe metric proposed by Trivedi et al. (2020) to measure how much the proposed multi-hop QA model reduces disconnected reasoning. Experimental results show that our approach substantially decreases disconnected reasoning while maintaining strong performance on the original test set, indicating that it reduces disconnected reasoning and improves true multi-hop reasoning capability.
The main contributions of this paper are threefold. First, our counterfactual multi-hop QA model formulates disconnected reasoning as two direct causal effects on the answer, which offers a new perspective and technique for learning true multi-hop reasoning. Second, our approach achieves notable improvements in reducing disconnected reasoning compared to various state-of-the-art methods. Third, our causal-effect approach is model-agnostic and can be used to reduce disconnected reasoning in many multi-hop QA architectures.
2 Related Work
Multi-hop question answering (QA) requires a model to retrieve the supporting facts needed to predict the answer. Many approaches and datasets have been proposed to train QA systems. For example, HotpotQA (Yang et al., 2018) is a widely used multi-hop QA dataset, which provides a fullwiki setting (Das et al., 2019; Nie et al., 2019; Qi et al., 2019; Chen et al., 2019; Li et al., 2021; Xiong et al., 2020) and a distractor setting (Min et al., 2019b; Nishida et al., 2019; Qiu et al., 2019; Jiang and Bansal, 2019; Trivedi et al., 2020).
In the fullwiki setting, a model first retrieves relevant facts from all Wikipedia articles and then completes the multi-hop QA task with the retrieved facts. The retrieval model is therefore crucial in this setting. For instance, SMRS (Nie et al., 2019) and DPR (Karpukhin et al., 2020) exploit the implicit importance of retrieving relevant information in the semantic space. Entity-centric (Das et al., 2019), CogQA (Ding et al., 2019), and Golden Retriever (Qi et al., 2019) explicitly use entities mentioned in, or reformulated from, the query keywords to retrieve the next-hop document. Furthermore, PathRetriever (Asai et al., 2019) and HopRetriever (Li et al., 2021) iteratively select documents to form a paragraph-level reasoning path using an RNN. MDPR (Xiong et al., 2020) repeatedly retrieves passages using only dense query vectors. However, these methods rarely address the QA model's disconnected reasoning problem.
In the distractor setting, ten paragraphs are given: two gold paragraphs and eight distractors. Many methods have been proposed to strengthen the model's multi-hop reasoning capability, using graph neural networks (Qiu et al., 2019; Fang et al., 2019; Shao et al., 2020), adversarial or counterfactual examples (Jiang and Bansal, 2019; Lee et al., 2021), the sufficiency of the supporting evidence (Trivedi et al., 2020), or pretrained language models (Zhao et al., 2020; Zaheer et al., 2020).
However, Min et al. (2019a) demonstrated that many compositional questions in HotpotQA can be answered with a single hop. This means that QA models can take shortcuts instead of performing multi-hop reasoning to produce the correct answer. To mitigate this issue, Jiang and Bansal (2019) added adversarial examples as hard distractors during training. Recently, Trivedi et al. (2020) proposed an approach, DiRe, to measure a model's disconnected reasoning behavior, and used supporting-sufficiency labels to reduce disconnected reasoning. Lee et al. (2021) selected supporting evidence according to each sentence's causal contribution to the predicted answer, which guarantees the explainability of the model's behavior. However, the original performance also drops when reducing disconnected reasoning.
Causal Inference. Recently, causal inference (Pearl and Mackenzie, 2018; Pearl, 2022) has been applied to many natural language processing tasks, showing promising results while providing strong interpretability and generalizability. Representative works include counterfactual interven-