
narrative they are constructed to represent. Recent work demonstrates that models trained with heuristically-retrieved commonsense knowledge learn simplified reasoning patterns (Wang et al., 2021) and provide false notions of interpretability (Raman et al., 2021). We posit that inadequate retrieval from large-scale knowledge resources is a key contributor to the spurious reasoning abilities learned by these systems.
Acknowledging the importance of retrieving relevant commonsense knowledge to augment models, we identify a set of challenges that commonsense knowledge retrievers must address. First, retrieved commonsense knowledge must be contextually relevant, rather than generically related to the entities mentioned in the context. Second, relevant commonsense knowledge can often be implicit, e.g., in Figure 1, writing may be a leisure hobby for the cyan speaker, explaining why they “love making up stories”. Finally, knowledge may be ambiguously relevant to a context. The cyan speaker in Figure 1 may write as a relaxing hobby, or be thinking of quitting medical school to pursue a career as a writer. Without knowing the rest of the conversation, both inferences are potentially valid.
To more adequately address these challenges, we introduce the new task of commonsense fact linking,² where models are given contexts and trained to identify situationally-relevant commonsense knowledge from KGs. For this task, we construct a Commonsense Fact linking dataset (ComFact) to benchmark the next generation of models designed to improve commonsense fact retrieval.
ComFact contains ∼293k contextual relevance annotations for four diverse dialogue and storytelling corpora. Our empirical analysis shows that heuristic methods over-retrieve many unrelated facts, yielding poor performance on the benchmark. Meanwhile, models trained on our resource are much more precise extractors, with an average 34.6% absolute F1 boost (though they still fall short of human performance). The knowledge retriever developed on our resource also brings an average 9.8% relative improvement on a downstream dialogue response generation task. These results demonstrate that ComFact is a promising testbed for developing improved fact linkers that benefit downstream NLP applications.
² We follow the prior naming convention for entity linking (Ling et al., 2015) and multilingual fact linking (Kolluru et al., 2021), though the task can also be viewed as information retrieval (IR) from a commonsense knowledge base.
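The fact linking task introduced above can be viewed as scoring each candidate KG fact for situational relevance to a context, then keeping facts that pass a threshold. The sketch below illustrates this framing only; the class and function names, the toy facts, and the word-overlap scorer are all illustrative assumptions, not ComFact's actual interface (a trained model would replace the stand-in scorer).

```python
# Illustrative sketch of fact linking as contextual relevance scoring.
# Names and the toy scorer are hypothetical, not ComFact's actual API.
from dataclasses import dataclass

@dataclass
class Fact:
    head: str      # e.g., "PersonX writes stories"
    relation: str  # e.g., "xIntent"
    tail: str      # e.g., "to relax"

def link_facts(context, facts, score_fn, threshold=0.5):
    """Return the facts whose relevance score to the context passes a threshold."""
    return [f for f in facts if score_fn(context, f) >= threshold]

def overlap_score(context, fact):
    """Trivial stand-in scorer: fraction of fact words appearing in the context."""
    ctx_words = set(context.lower().split())
    fact_words = set(f"{fact.head} {fact.tail}".lower().split())
    return len(ctx_words & fact_words) / max(len(fact_words), 1)

facts = [Fact("PersonX writes stories", "xIntent", "to relax"),
         Fact("PersonX rides a bike", "xEffect", "gets exercise")]
# Only the writing-related fact overlaps with this context.
linked = link_facts("I love making up stories to relax", facts, overlap_score, 0.3)
```

A learned linker differs from this sketch precisely in `score_fn`: instead of surface overlap, it must judge contextual, implicit, and ambiguous relevance as described above.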
2 Related Work
Commonsense Knowledge Graphs Commonsense knowledge graphs (KGs) are standard tools for providing background knowledge to models for various NLP tasks such as question answering (Talmor et al., 2019; Sap et al., 2019b) and text generation (Lin et al., 2020). ConceptNet (Liu and Singh, 2004; Speer et al., 2017), a commonly used commonsense KG, contains high-precision facts collected from crowdsourcing (Singh et al., 2002) and web ontologies (Miller, 1995; Lehmann et al., 2015), but is generally limited to taxonomic, lexical and physical relationships (Davis and Marcus, 2015; Sap et al., 2019a). ATOMIC (Sap et al., 2019a) and ANION (Jiang et al., 2021) are fully crowdsourced, and focus on representing knowledge about social interactions and events. ATOMIC2020 (Hwang et al., 2021) expands on ATOMIC by annotating additional event-centered relations and integrating the facts from ConceptNet that are not easily represented by language models, yielding a rich resource of complex entities. In this work, we construct our ComFact dataset based on ATOMIC2020, the most advanced of these KGs.
Commonsense Fact Linking Knowledge-intensive NLP tasks are often tackled using commonsense KGs to augment the input contexts provided by the dataset (Wang et al., 2019; Ye et al., 2019; Gajbhiye et al., 2021; Yin et al., 2022). Models for various NLP applications benefit from this fact linking, including question answering (Feng et al., 2020; Yasunaga et al., 2021; Zhang et al., 2022), dialogue modeling (Zhou et al., 2018; Wu et al., 2020) and story generation (Guan et al., 2019; Ji et al., 2020). All of the above works typically conduct fact linking using heuristic solutions.
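The heuristic solutions mentioned here typically reduce to surface matching: retrieve every fact whose head entity string-matches a token in the context. The sketch below illustrates this common pattern under assumed names and toy triples (it is not reproduced from any specific cited work), and shows why such matching over-retrieves facts regardless of situational relevance.

```python
# Minimal sketch of heuristic fact linking via string matching.
# Function name and the toy KG triples are illustrative assumptions.
def heuristic_link(context, kg_facts):
    """kg_facts: list of (head, relation, tail) triples.
    Returns every triple whose head shares a word with the context."""
    context_words = set(context.lower().split())
    return [(h, r, t) for (h, r, t) in kg_facts
            if set(h.lower().split()) & context_words]

kg = [("write", "MotivatedByGoal", "relax"),
      ("story", "AtLocation", "book"),
      ("bike", "UsedFor", "exercise")]

# Both the "write" and "story" triples match on surface form alone,
# whether or not they are relevant in this situation.
hits = heuristic_link("I love to write a story", kg)
```

Because the match is purely lexical, such heuristics miss implicit knowledge (no shared token) while pulling in generically related facts, motivating the learned linkers discussed next.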
Recent research explores unsupervised learning approaches for improving on the shortcomings of heuristic commonsense fact linking. Huang et al. (2021) and Zhou et al. (2022) use soft matching based on embedding similarity to link commonsense facts with implicit semantic relatedness. Guan et al. (2020) use knowledge-enhanced pre-training to implicitly incorporate commonsense facts into narrative systems, but their approach reduces the controllability and interpretability of knowledge integration. Finally, several works (Arabshahi et al., 2021; Bosselut et al., 2021; Peng et al., 2021a,b; Tu et al., 2022) use knowledge models (Bosselut et al., 2019; Da et al., 2021; West