Relational Graph Convolutional Neural Networks for Multihop
Reasoning: A Comparative Study
Ieva Staliūnaitė, Philip John Gorinski, Ignacio Iacobacci
Huawei Noah’s Ark Lab, London
irs38@cam.ac.uk
{philip.john.gorinski,ignacio.iacobacci}@huawei.com
Abstract
Multihop Question Answering is a complex Natural Language Processing
task that requires multiple steps of reasoning to find the correct
answer to a given question. Previous research has explored the use of
models based on Graph Neural Networks for tackling this task. Various
architectures have been proposed, including Relational Graph
Convolutional Networks (RGCN). For these, many node types and relations
between them have been introduced, such as simple entity
co-occurrences, modelling coreferences, or “reasoning paths” from
questions to answers via intermediary entities. Nevertheless, a
thoughtful analysis of which relations, node types, embeddings and
architectures are the most beneficial for this task is still missing.
In this paper we explore a number of RGCN-based Multihop QA models,
graph relations, and node embeddings, and empirically evaluate the
influence of each on Multihop QA performance on the WikiHop dataset.
1 Introduction
As the latest models on question answering tasks have high performance
on most popular benchmarks, datasets for the more complex task of
multihop question answering have been introduced (Welbl et al., 2018;
Yang et al., 2018). The multihop question answering task requires a
model to provide an answer given a question and multiple background
documents. The questions require reasoning across more than one
document for finding the correct answer. For example, to answer the
question given in Figure 1, one needs to reason over background
documents 1 and 3.
In order to solve this problem in an informed way, the system needs to
recognize entities, resolve their coreference, understand the relations
between them, and choose the right entity as the answer, or answer the
question in free form.

Query: applies_to_jurisdiction vice president of the European Parliament
Background document 1: The European Parliament (EP) is the directly elected parliamentary institution of the European Union (EU) [...]
Background document 2: The European Commission (EC) is an institution of the European Union, responsible for proposing legislation [...]
Background document 3: There are fourteen vice-presidents of the European Parliament who sit in for the president in presiding over the plenary of the European Parliament [...]
Answer: European Union
Figure 1: Example WikiHop question requiring reasoning over several documents to find the correct answer.

Author footnote: Now at Department of Computer Science and Technology, University of Cambridge
People perform this task without too much difficulty, for example when
playing the Wikiracing¹ game. We posit that what a human does is start
with a given entity (e.g. Vice President of the European Parliament),
traverse Wikipedia articles by choosing entities in the topic of the
question (e.g. European Parliament) and finally find the answer entity
in that article by searching for the entity that links to the original
entity in the question through the relation mentioned.
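
The hop-by-hop traversal described above can be pictured as a search over a link graph. The following is only a minimal sketch under stated assumptions: the `LINKS` dictionary is hypothetical toy data loosely modelled on Figure 1 (it is not taken from WikiHop), and `hop_path` is an illustrative helper that uses plain breadth-first search with a known target, which is a simplification of what a human or a QA model would do.

```python
from collections import deque

# Hypothetical toy link graph loosely based on Figure 1 (not WikiHop data).
LINKS = {
    "vice president of the European Parliament": ["European Parliament"],
    "European Parliament": [
        "European Union",
        "vice president of the European Parliament",
    ],
    "European Commission": ["European Union"],
    "European Union": [],
}


def hop_path(start, target, links):
    """Breadth-first search for the shortest chain of entities from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of links connects the two entities


path = hop_path(
    "vice president of the European Parliament", "European Union", LINKS
)
print(path)
```

In this toy graph the chain passes through one intermediary entity (European Parliament), which mirrors the two-document reasoning needed for the question in Figure 1.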
Transformer-based language models fine-tuned for QA tasks, including
Multihop QA, reach very high performance on many benchmarks without
explicitly modeling the multiple reasoning hops (Beltagy et al., 2020;
He et al., 2020). However, they are very computationally expensive as
they model relations between all tokens in the input with attention.
This motivates the study of which specialized relations would be the
most efficient at finding the multihop reasoning paths.
¹ en.wikipedia.org/wiki/Wikiracing
arXiv:2210.06418v2 [cs.CL] 13 Oct 2022
To this end, a variety of Graph Neural Network (GNN) approaches have
been proposed in recent years, which explore the use of graph
structures with entities as vertices and relations as edges to address
the Multihop QA task. These model the relationship between the
question, context, and potential answers in a more informed way than
full token attention.
In particular, Relational Graph Convolutional Networks (RGCN;
Schlichtkrull et al., 2018) have been successfully applied in a number
of Multihop QA models. RGCN introduces typed edges between nodes in the
underlying graph, with a convolutional update step of nodes depending
on their neighbours and their respective type(s) of relation.
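
To make the relation-dependent update step concrete, here is a minimal pure-Python sketch of one relational graph-convolution layer. It is an illustration under assumptions, not the implementation used in any of the compared models: the function names, the toy relation labels `match` and `cooccur`, the mean normalisation per relation, and the ReLU activation are all choices made for this example.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(h, edges, W_rel, W_self):
    """One relational graph-convolution update (illustrative sketch).

    h:      dict mapping node -> feature vector (list of floats)
    edges:  list of (source, relation, target); messages flow source -> target
    W_rel:  dict mapping relation -> weight matrix
    W_self: weight matrix applied to the node's own features
    """
    out = {}
    for i in h:
        total = matvec(W_self, h[i])  # self-connection term
        for rel, W in W_rel.items():
            # Neighbours that reach node i through this relation type.
            neigh = [s for (s, r, t) in edges if r == rel and t == i]
            for j in neigh:
                msg = matvec(W, h[j])
                # Average the messages within each relation type.
                total = [a + m / len(neigh) for a, m in zip(total, msg)]
        out[i] = [max(0.0, v) for v in total]  # ReLU non-linearity
    return out


# Toy graph: node "a" receives one message per relation type.
h = {"q": [1.0, 0.0], "a": [0.0, 1.0], "b": [1.0, 1.0]}
edges = [("q", "match", "a"), ("b", "cooccur", "a")]
W_rel = {
    "match": [[1.0, 0.0], [0.0, 1.0]],    # identity
    "cooccur": [[0.0, 1.0], [1.0, 0.0]],  # swaps the two features
}
W_self = [[1.0, 0.0], [0.0, 1.0]]

out = rgcn_layer(h, edges, W_rel, W_self)
print(out)
```

Because each relation type has its own weight matrix, the same neighbour contributes differently depending on how it is connected, which is the property that distinguishes this update from an untyped graph convolution.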
While the general architecture of RGCN is shared by many approaches,
the remaining framework (node types, embeddings, relations, pre- and
post-RGCN layers) varies greatly. However, a principled analysis of the
impact these factors have on model performance still seems to be
missing from the current literature. This paper aims at rectifying this
by shedding some light on the efficacy of RGCN for multihop reasoning
under various conditions.
The main contributions of this paper are as follows: (i) we present a
direct comparison of two strong, related, yet significantly different
RGCN-based architectures for the Multihop QA task; (ii) we derive a new
RGCN-based architecture combining features of the two prior ones; (iii)
we present a principled analysis of the impact on model performance
across three conditions: model architecture, node types and relations,
node embeddings.
To our knowledge, this is the first attempt at a
principled comparison of RGCN-based Multihop
QA approaches.
2 Related Work
A number of graph-based approaches to Multihop QA have been proposed in
recent years. To answer the questions presented in the WikiHop dataset
(Welbl et al., 2018), De Cao et al. (2019) propose to use Relational
Graph Convolutional Networks (Schlichtkrull et al., 2018) by modelling
relations between entities in the query, documents, and candidate
choices. Path-based GCN was introduced by Tang et al. (2020b) for the
same task; it builds on these relations and adds relations over
“reasoning entities”, i.e., Named Entities that co-occur with those
presented in the query and candidate entity set. Furthermore, Tu et al.
(2019) used different types of nodes and edges in the graph, which led
to high performance. They employed entity-, sentence- and
document-level nodes to represent the relevant background information
and connected them on the basis of co-reference in the case of entities
and co-occurrence among all types of nodes. In a similar vein, in order
to answer HotpotQA (Yang et al., 2018) questions, which include answer
span as well as support sentence prediction tasks, Fang et al. (2020)
introduced Hierarchical Graph Networks, which establish a relational
hierarchy between graph nodes on entity, sentence, and paragraph
levels, and use Graph Attention Networks (Veličković et al., 2017) for
information distribution through the graph. HopRetriever (Li et al.,
2020) leverages Wikipedia hyperlinks to model hops between articles via
entities and their implicit relations introduced through the link.
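
As a rough illustration of how typed relations between entity mentions could be assembled into a graph, the sketch below links mentions that appear in the same document and mentions with the same surface form across documents. This is an assumption-laden simplification: the relation names `cooccur` and `match`, the helper `build_typed_edges`, and the use of case-insensitive string equality as a stand-in for coreference are all hypothetical and do not reproduce the exact graph construction of any cited model.

```python
from itertools import combinations


def build_typed_edges(doc_mentions):
    """Build mention nodes and typed undirected edges (illustrative sketch).

    doc_mentions[d] is the list of entity strings mentioned in document d.
    Nodes are (doc_id, mention) pairs; edges are (node_a, relation, node_b).
    """
    nodes = [(d, m) for d, ms in enumerate(doc_mentions) for m in ms]
    edges = []
    for a, b in combinations(nodes, 2):
        if a[0] == b[0]:
            # Two mentions in the same document: co-occurrence relation.
            edges.append((a, "cooccur", b))
        elif a[1].lower() == b[1].lower():
            # Same surface string in different documents: crude coreference proxy.
            edges.append((a, "match", b))
    return nodes, edges


# Toy input loosely based on the entities of Figure 1 (hypothetical mention lists).
docs = [
    ["European Parliament", "European Union"],
    ["European Commission", "European Union"],
    ["European Parliament"],
]
nodes, edges = build_typed_edges(docs)
for edge in edges:
    print(edge)
```

The resulting triples have the same (source, relation, target) shape that a relational graph convolution consumes, so adding or removing a relation type amounts to changing which edges this step emits.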
In contrast, some research has explored the question of whether a
multihop approach is even necessary to solve the task presented in
recent multihop question answering datasets (Min et al., 2019). They
show that using a single hop is sufficient to answer 67% of the
questions in HotpotQA. What is more, Tang et al. (2020a) study the
ability of state-of-the-art models in the task of Multihop QA to answer
subquestions that compose the main question. They show that these
models often fail to answer the intermediate steps, and suggest that
they may not actually be performing the task in a compositional manner.
Furthermore, Groeneveld et al. (2020) provide support for the claim
that the multihop tasks can be solved to a large extent without an
explicit encoding of the compositionality of the question and all the
relations between knowledge sources in different support documents in
HotpotQA. Their pipeline simply predicts the support sentences
separately and predicts the answer span from them using transformer
models for encoding all inputs for the classification. In a similar
vein, Shao et al. (2020) show that self-attention in transformers
performs on par with graph structure on the HotpotQA task, providing
further evidence that this dataset does not require explicit modeling
of multiple hops for high performance.
3 QA Graphs and Architecture
WikiHop. A number of Multihop QA datasets have been released, with the
two largest, WikiHop (Welbl et al., 2018) and HotpotQA (Yang et al.,
2018), most actively used for research. The precise