Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
Wenhao Yu1, Chenguang Zhu2, Zhihan Zhang1, Shuohang Wang2,
Zhuosheng Zhang3, Yuwei Fang2, Meng Jiang1
1University of Notre Dame, Indiana, USA
2Microsoft Cognitive Services Research, Washington, USA
3Shanghai Jiaotong University, Shanghai, China
1{wyu1, zzhang23, mjiang2}@nd.edu;
2{chezhu, shuow, yuwfan}@microsoft.com;3zhangzs@sjtu.edu.cn
Abstract
A common thread of retrieval-augmented
methods in the existing literature focuses on
retrieving encyclopedic knowledge, such as
Wikipedia, which facilitates well-defined en-
tity and relation spaces that can be modeled.
However, applying such methods to common-
sense reasoning tasks faces two unique chal-
lenges, i.e., the lack of a general large-scale
corpus for retrieval and a corresponding ef-
fective commonsense retriever. In this pa-
per, we systematically investigate how to lever-
age commonsense knowledge retrieval to im-
prove commonsense reasoning tasks. We propose a unified framework of Retrieval-Augmented Commonsense reasoning (called RACo), including a newly constructed commonsense corpus with over 20 million documents and novel strategies for training a commonsense retriever. We conducted experiments on four different commonsense reasoning tasks. Extensive evaluation results showed that our proposed RACo can significantly outperform other knowledge-enhanced method counterparts, achieving new SoTA performance on the CommonGen1 and CREAK2 leaderboards. Our code is available at https://github.com/wyu97/RACo.
1 Introduction
Recent work has shown that scaling language models with considerably more data and parameters, such as GPT3-175B (Brown et al., 2020), could drive significant advances in commonsense reasoning tasks. Nevertheless, such models make predictions by only “looking up information” stored in their parameters, making it difficult to determine what knowledge is stored in, or has already been forgotten by, the neural network (Guu et al., 2020). Besides, storage space is limited by the size of the neural network: in order to memorize more world knowledge, one must train ever-larger networks, which can be prohibitively expensive and slow.

1 https://inklab.usc.edu/CommonGen/leaderboard.html
2 https://www.cs.utexas.edu/~yasumasa/creak/leaderboard.html
The solution that may seem obvious at first
glance is to grant language models free access to
open-world sources of commonsense knowledge
in a plug-and-play manner, instead of memorizing
all world knowledge. To achieve this capability,
language models must be able to retrieve relevant
commonsense knowledge from an unbounded set
of situations. Then, the language models can lever-
age the input text, as well as the retrieved informa-
tion to produce the desired output.
Compared with large-scale language model counterparts, e.g., UNICORN (Lourie et al., 2021), retrieval-augmented methods have three remarkable advantages: first, the knowledge is not stored implicitly in the model parameters, but is explicitly acquired in a plug-and-play manner, leading to great scalability; second, the paradigm generates text based on retrieved references, which alleviates the difficulty of generating from scratch (Li et al., 2022); third, the knowledge corpus can be constantly edited and updated by experts, keeping the model aware of the latest information. Besides, compared with knowledge graph inference model counterparts, e.g., QA-GNN (Yasunaga et al., 2021), retrieval-augmented methods allow more flexibility in accessing and using knowledge from different sources, because commonsense knowledge by nature cannot all be contained in a single knowledge graph defined by a certain schema (Yu et al., 2022b).
A common thread of retrieval-augmented methods in the existing literature focuses on retrieving encyclopedic knowledge such as Wikipedia, which lends itself to a well-defined space of entities and relations that can be modeled (Karpukhin et al., 2020; Lewis et al., 2020b; Yu et al., 2022a). However, retrieval-augmented methods for commonsense reasoning have been rarely studied in the literature. In this paper, we propose a unified framework of Retrieval-Augmented Commonsense reasoning (RACo) to solve various commonsense tasks. RACo first retrieves relevant commonsense documents from a large-scale corpus, then combines the input text with the retrieved documents to produce the desired output. However, there are two main challenges in training a RACo model.

arXiv:2210.12887v1 [cs.CL] 23 Oct 2022

                              RACo         AristoRoBERTa            RE-T5                KFCNet             OpenCSR
                              (this work)  (Mihaylov et al., 2018)  (Wang et al., 2021)  (Li et al., 2021)  (Lin et al., 2021)
Number of corpus types        3            1                        1                    1                  1
Number of commonsense tasks   4            1                        1                    1                  1
Number of docs for retrieval  20M          5K                       0.8M                 0.8M               1M

Table 1: Comparison of RACo to a few recent commonsense retrieval works in the field. Our work provides a more comprehensive and larger-scale multi-source commonsense corpus that can generalize to various tasks.
The first challenge to address is what commonsense knowledge to retrieve. Different from encyclopedic knowledge used in open-domain QA tasks, commonsense knowledge is very diverse, covering everyday events and their effects, facts about beliefs and desires, and properties of objects in daily life. Since commonsense involves various aspects, including human interaction and object properties in everyday life, we collected over 20 million commonsense documents from both open-domain knowledge sources (e.g., OMCS) that cover multiple domains of commonsense, and domain-specific sources (e.g., ATOMIC) that focus on particular commonsense types.
The second challenge to address is how to retrieve relevant commonsense knowledge from the corpus. Different from training a dense retriever on Wikipedia (Karpukhin et al., 2020), the heuristic of taking “documents containing correct answers” as positive candidates cannot be used, because the output answer in commonsense reasoning tasks is usually not a substring of the retrieved documents. For example, in binary question answering, the answer is True or False, neither of which appears in the retrieved documents. Therefore, we propose novel strategies to construct question-document pairs for training a commonsense dense retriever.
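The exact pairing strategies are presented in §3.2. As a rough illustration only (the overlap heuristic and function names below are our own simplification, not the paper's procedure), one way to pick a positive document when the answer never appears verbatim is to rank candidates against the concatenation of the question and its ground-truth answer:

```python
import random

def build_training_pairs(question, answer, candidate_docs, num_negatives=1, seed=0):
    """Construct a (question, positive, negatives) example for dense-retriever
    training when the answer (e.g., True/False) is not a substring of any
    document: score candidates by token overlap with question + answer."""
    target = set((question + " " + answer).lower().split())

    def overlap(doc):
        tokens = set(doc.lower().split())
        return len(tokens & target) / max(len(tokens), 1)

    ranked = sorted(candidate_docs, key=overlap, reverse=True)
    positive = ranked[0]  # best-matching document serves as the positive
    rng = random.Random(seed)
    negatives = rng.sample(ranked[1:], min(num_negatives, len(ranked) - 1))
    return {"question": question, "positive": positive, "negatives": negatives}
```

The resulting triples can then feed a standard contrastive objective over a bi-encoder, as in DPR-style training.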
Overall, our main contributions in this work can be summarized as follows:
1. We collected and released a collection of over 20 million documents from three knowledge sources for commonsense knowledge retrieval.
2. We presented a unified framework of Retrieval-Augmented Commonsense reasoning (RACo), and proposed novel strategies for training a strong commonsense knowledge retriever.
3. We evaluated RACo on four types of commonsense reasoning tasks. Our experiments showed RACo can significantly outperform other knowledge-enhanced counterparts, achieving new SoTA on the CommonGen and CREAK leaderboards.
2 Related Work
Though large-scale language models yield state-of-the-art performance on many commonsense reasoning tasks, their pre-training objectives do not explicitly guide the models to reason with commonsense knowledge, such as the relations and composition of daily concepts in our lives (Zhou et al., 2021), leading to unsatisfactory performance in many real-world scenarios (Talmor et al., 2021; Zhu et al., 2022). Existing work has mainly explored two directions to improve commonsense reasoning ability. The first is to pre-train or post-train a language model on commonsense corpora (Bosselut et al., 2019; Lourie et al., 2021;
Zhou et al., 2021). When the commonsense materials are appropriately selected, this simple strategy can deliver significantly better performance than vanilla pre-trained language models (Zhou et al., 2021). Notable methods include COMET (Bosselut et al., 2019), CALM (Zhou et al., 2021), UNICORN (Lourie et al., 2021), etc. Nonetheless, these methods still suffer from the same drawbacks as the pre-trained language models introduced in §1. The second is to explicitly introduce external knowledge from commonsense knowledge graphs to augment the limited textual information (Lin et al., 2019; Ji et al., 2020). A
KG often provides comprehensive and rich entity features and relations, so models can easily traverse links to discover how entities are interconnected to express certain commonsense knowledge. Notable methods include KagNet (Lin et al., 2019), GRF (Ji et al., 2020), QA-GNN (Yasunaga et al., 2021), GreaseLM (Zhang et al., 2022), etc. However, commonsense knowledge spans an unbounded set of facts and situations that usually cannot be covered by a single knowledge graph defined by a certain schema, and reasoning over multiple knowledge graphs is itself a challenging task.
The retrieval-augmented method is a new learning paradigm that fuses pre-trained language models and traditional information retrieval techniques (Lewis et al., 2020b). A few recent methods have explored retrieving in-domain commonsense documents from a task-relevant corpus to improve commonsense reasoning performance (Mihaylov et al., 2018; Wang et al., 2021; Li et al., 2021). We provide a detailed comparison in Table 1. Different from existing methods that focus on retrieving knowledge from an in-domain corpus, our proposed RACo leverages a much larger and more general commonsense corpus collected from multiple sources, which provides supporting evidence for various commonsense reasoning tasks. Meanwhile, we propose several novel strategies for training a commonsense retriever that can generalize to different commonsense reasoning tasks.
3 Proposed Method
In this section, we elaborate on how to leverage commonsense knowledge retrieval over a large-scale corpus to improve various commonsense reasoning tasks, covering commonsense corpus construction (§3.1), the commonsense document retriever (§3.2), and the commonsense document reader (§3.3). The architecture of RACo is shown in Figure 1.
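The reader in this architecture follows the fusion-in-decoder pattern: each retrieved document is paired with the input, encoded independently by the T5 encoder, and the decoder attends over all encoder outputs jointly. A minimal sketch of the input preparation (the "question: ... context: ..." template is an assumption for illustration, not necessarily the exact format used in the paper):

```python
def prepare_fid_inputs(query: str, documents: list) -> list:
    """Build one encoder input per retrieved document. In fusion-in-decoder,
    the T5 encoder processes each (query, document) pair separately; the
    decoder then attends over the concatenated encoder outputs, fusing
    evidence from all documents while generating the answer."""
    return [f"question: {query} context: {doc}" for doc in documents]

# Example from Figure 1: one query, two retrieved support documents.
inputs = prepare_fid_inputs(
    "Only after learning you can gain knowledge.",
    ["Learning causes you to gain knowledge.",
     "Why do people learn? Obtain knowledge"],
)
```

With n retrieved documents the encoder cost grows linearly in n while the decoder sees all evidence at once, which is what lets the reader scale to a handful of support documents.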
3.1 Commonsense Corpus Construction
Commonsense knowledge includes basic facts about situations in everyday life, which are shared by most people and implicitly assumed in communication (Li et al., 2022). Commonsense knowledge has two important properties: it is large and diverse. Regarding scale, many commonsense corpora contain millions of statements. For example, Wiktionary has more than one million word definitions and descriptions in English. Meanwhile, commonsense knowledge is diverse, involving various aspects including human interaction and object properties. For example, OMCS3 covers multiple domains of commonsense such as everyday events and their effects (e.g., mop up the floor if we spill food on it), facts about beliefs and desires (e.g., study hard to win a scholarship), and properties of objects (e.g., a goat has four legs).

3 https://en.wikipedia.org/wiki/OMCS

[Figure 1 depicts the pipeline on an example query, “Only after learning you can gain knowledge.” (answer: True): BERT-based retrievers fetch support documents from the 20M+ commonsense corpus — human annotated facts (HAF), benchmark data (CBD), and relevant web corpus (CRC), e.g., “Learning causes you to gain knowledge.” from OMCS — which a T5 encoder with fusion in the T5 decoder then reads.]

Figure 1: RACo has two major components: (i) a document retriever and (ii) a document reader. Specifically, the document retriever aims to fetch a handful of relevant documents from a large document collection. The document reader takes the input text, as well as the support documents, to produce the desired output.

Corpus        # Instances   Avg. Words
HAF-corpus     3,561,762    11.06 ± 5.86
CBD-corpus     2,881,609    12.78 ± 9.31
CRC-corpus    14,587,486    17.76 ± 10.4

Table 2: Statistics for the commonsense corpus. The total size of these corpora exceeds 20M documents.

The diversity of knowledge is beneficial for
retrieval-augmented methods because it enables
relevance comparison across different sources, and
offers textual knowledge to easily augment the in-
put of generation models by concatenation. To
build a large-scale commonsense corpus covering
diverse sources, we collected commonsense doc-
uments from the following three aspects: (i) hu-
man annotated facts; (ii) commonsense benchmark
datasets; (iii) commonsense relevant web corpus.
The statistics can be found in Table 2.
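Merging the three sources into a single retrieval corpus can be sketched as follows (the JSONL layout and whitespace-normalized deduplication rule are our assumptions for illustration; the released corpus may use a different format):

```python
import hashlib
import json

def build_corpus(sources, out_path):
    """Merge documents from multiple commonsense sources (e.g., HAF, CBD,
    CRC) into one JSONL retrieval corpus, deduplicating by a hash of the
    lowercased, whitespace-normalized text. Returns the number of documents
    written."""
    seen = set()
    count = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for source_name, docs in sources.items():
            for text in docs:
                key = hashlib.md5(" ".join(text.lower().split()).encode()).hexdigest()
                if key in seen:  # skip near-duplicate statements
                    continue
                seen.add(key)
                f.write(json.dumps({"id": count, "source": source_name,
                                    "text": text}) + "\n")
                count += 1
    return count
```

Each line then carries a document id, its source tag (useful for the per-source statistics in Table 2), and the raw text to be indexed by the retriever.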
Human annotated facts (HAF). It contains factual commonsense either annotated by human annotators or written by domain experts, including OMCS (Havasi et al., 2010), ATOMIC (Sap et al., 2019a), and Wiktionary (Meyer and Gurevych, 2012).

Commonsense benchmark datasets (CBD). It includes training data from 19 commonsense benchmark datasets, such as α-NLI (Bhagavatula et al., 2020). See Appendix A.1 for more details.

Commonsense relevant corpus (CRC). It consists of raw statements about commonsense from