Life is a Circus and We are the Clowns:
Automatically Finding Analogies between Situations and Processes
Oren Sultan
The Hebrew University of Jerusalem
oren.sultan@mail.huji.ac.il
Dafna Shahaf
The Hebrew University of Jerusalem
dshahaf@cs.huji.ac.il
Abstract

Analogy-making gives rise to reasoning, abstraction, flexible categorization and counterfactual inference – abilities lacking in even the best AI systems today. Much research has suggested that analogies are key to non-brittle systems that can adapt to new domains. Despite their importance, analogies received little attention in the NLP community, with most research focusing on simple word analogies. Work that tackled more complex analogies relied heavily on manually constructed, hard-to-scale input representations. In this work, we explore a more realistic, challenging setup: our input is a pair of natural language procedural texts, describing a situation or a process (e.g., how the heart works/how a pump works). Our goal is to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity (e.g., blood is mapped to water). We develop an interpretable, scalable algorithm and demonstrate that it identifies the correct mappings 87% of the time for procedural texts and 94% for stories from cognitive-psychology literature. We show it can extract analogies from a large dataset of procedural texts, achieving 79% precision (analogy prevalence in data: 3%). Lastly, we demonstrate that our algorithm is robust to paraphrasing of the input texts.¹
1 Introduction

The ability to find parallels across diverse domains and transfer ideas across them is one of the pinnacles of human cognition. The analogical reasoning process allows us to abstract information, form flexible concepts and solve problems based on our previous experience (Minsky, 1988; Hofstadter and Sander, 2013; Holyoak, 1984) – abilities that current AI systems still lack. Many researchers have suggested that analogy-making is essential for engineering non-brittle AI that can robustly generalize and adapt to diverse domains (Mitchell, 2021b).

¹ Data and code are available in our GitHub repository: https://github.com/orensul/analogies_mining
Surprisingly, despite analogy's important role in the way humans understand language, the problem of recognizing analogies has received relatively little attention in NLP. Most works have focused on SAT-type analogies ("a to b is like c to d"), with recent works (Linzen, 2016; Ushio et al., 2021; Brown et al., 2020) showing that models still struggle with abstract, loosely defined relations.
In this work, we focus on a different type of analogies: analogies between situations or processes. Here the input is two domains (e.g., heart and pump) and the goal is to map objects from the base domain to objects from the target domain. Importantly, the mapping should rely on a common relational structure rather than object attributes, making it challenging for NLP methods.
The most influential work in this line of research is Structure Mapping Theory (SMT) (Gentner, 1983). In SMT and much of its follow-up work (Falkenhainer et al., 1989; Turney, 2008; Forbus et al., 2011), domain descriptions are sets of statements in a highly structured language, e.g.,

CAUSE(PULL(piston), CAUSE(GREATER(PRESSURE(water), PRESSURE(pipe)), FLOW(water, pipe)))

Many have argued that too much human creativity is required to construct these inputs, and that the analogy is already expressed in them (Chalmers et al., 1992). They are also brittle and hard to scale.
In our work, we explore a more realistic, challenging setup. Our input is a pair of procedural texts, describing a situation or a process, expressed in natural language. We develop an algorithm to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity.
arXiv:2210.12197v2 [cs.CL] 19 Jan 2023

Figure 1: A+B: Example input – two analogous texts, describing the animal cell (base) and factory (target), adapted from Mikelskis-Seifert et al. (2008). C: Our algorithm's output. The nodes are entities (clusters of text spans; for the sake of presentation, we show up to two spans per cluster). Edge width represents similarity between entities in terms of the roles they play in the text. For example, the boxes on the left illustrate the similar roles associated with the red and the pink entities. All the mappings our algorithm found are correct, but two are missing (ribosomes/machines and endoplasmic reticulum/hallways). Showing the mapping along with its justification (the similar roles) renders our output easy to interpret.

For example, the two texts in Figure 1 explain the animal cell to students through an analogy to a factory (adapted from Mikelskis-Seifert et al. (2008)).²

Note that in our setting entities and relations are not given upfront, but need to be extracted; they can appear multiple times, often expressed in different terms. Also, entities can play multiple roles throughout the text.
Figure 1C shows our algorithm's output. Nodes are entities (clusters of text spans – mostly correct, except for "factory activities"), and edge width represents similarity in terms of the roles entities play. For example, the boxes on the left illustrate the similar roles associated with the red and the pink entities (represented as questions), showing that "plasma membrane" and "security guards" share one role (control something), and "cell" and "factory" share three. Importantly, including the justification as part of the output renders our algorithm interpretable. We note that all the mappings our algorithm found are correct, but two are missing (ribosomes/machines and endoplasmic reticulum/hallways). Our contributions are:
• We present a novel setting in computational analogy – mapping between procedural texts expressed in natural language. We develop a scalable, interpretable method to find mappings based on relational similarity.

• Our method identifies the correct mappings 87% of the time for procedural texts from the ProPara dataset (Dalvi et al., 2018). Although not designed for them, our method achieves 94% precision on cognitive-psychology stories (Gentner et al., 1993; Ichien et al., 2020).

• We demonstrate our method can be used to mine analogies in the ProPara dataset, achieving 79% precision at the top of the ranking, when analogy prevalence in the data is 3%.

• We show our method is robust to paraphrasing of the input texts.

• We release data and code, including the found mappings, at https://github.com/orensul/analogies_mining.

² Note that this example is from a textbook, given for simplicity of presentation, and thus the texts are particularly well-aligned. Our method can handle much more variation.

We hope this work will pave the way for further NLP work on computational analogy.
2 Problem Formulation

Our framework is based on Gentner's Structure Mapping Theory (SMT) (Gentner, 1983). The central idea of SMT is that an analogy is a mapping of objects from a base domain B into a target domain T that is based on a common relational structure, rather than object attributes. In our setup, the input is two procedural texts describing a situation or a process, expressed in natural language.

Entities. First, we need to extract entities from the texts. Let B = {b1, ..., bn} and T = {t1, ..., tm} be the entities in the base and the target, respectively. In our setting, entities are noun phrases. For example, in the animal cell (Figure 1), some entities include plasma membrane, animal cell, nucleus, energy, mitochondria, proteins, ribosomes.
Relations. Let R be a set of relations. A relation is a set of ordered entity pairs. In this work, we focus on verbs from the input text, but other formulations are possible. Intuitively, relations should capture that the mitochondria provides energy to the cell (in B), just like electrical generators provide energy to the factory (in T). Slightly abusing notation, let R(e1, e2) ∈ 2^R be the set of relations between e1 and e2. For example, R(cell, proteins) contains {synthesize, use, move}. Note that R is an asymmetric function, as the order matters.
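Under these definitions, R can be represented concretely as a map from ordered entity pairs to sets of verbs. A minimal sketch (the triple format and entity names follow the running cell example; the helper is illustrative, not the paper's code):

```python
from collections import defaultdict

def build_relations(triples):
    """Build R from (subject, verb, object) triples extracted from text.
    Keys are ordered entity pairs: R is asymmetric, so (a, b) != (b, a)."""
    R = defaultdict(set)
    for subj, verb, obj in triples:
        R[(subj, obj)].add(verb)
    return R

# Triples from the running example (base domain: the animal cell).
triples = [
    ("cell", "synthesize", "proteins"),
    ("cell", "use", "proteins"),
    ("cell", "move", "proteins"),
    ("mitochondria", "provide", "energy"),
]
R = build_relations(triples)
assert R[("cell", "proteins")] == {"synthesize", "use", "move"}
assert R[("proteins", "cell")] == set()  # reverse order is a distinct key
```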
Similarity. Let sim be a similarity metric between two sets of relations, sim : 2^R × 2^R → [0, ∞). Intuitively, we want similarity to be high if the two sets share many distinct relations. For example, {provide, destroy} should be more similar to {supply, ruin} than to {destroy, ruin}, as the last set does not include any relation similar to provide. Given a pair of entities bi, bj ∈ B and a pair of entities tk, tl ∈ T, we define a similarity function measuring how similar these pairs are, in terms of the relations between them. Since sim is asymmetric, we consider both possible orderings:

sim(bi, bj, tk, tl) = sim(R(bi, bj), R(tk, tl)) + sim(R(bj, bi), R(tl, tk))    (1)
Objective. Our goal is to find a mapping function M : B → T ∪ {⊥} that maps entities from the base to the target. Mapping into ⊥ means the entity was not mapped. The mapping should be consistent: no two base entities can be mapped to the same target entity. We look for a mapping that maximizes the relational similarity between mapped pairs:

M* = argmax_M Σ_{j ∈ [1, n−1]} Σ_{i ∈ [j+1, n]} sim(bj, bi, M(bj), M(bi))

If bi or bj maps to ⊥, sim is defined to be 0.
3 Analogous Matching Algorithm

Our goal in this section is to find the best mapping between B and T. Our algorithm consists of four phases: we begin with basic text processing (Section 3.1). Then, we extract potential entities and relations (Section 3.2). Since entities can be referred to in multiple ways, we next cluster the entities (Section 3.3). Finally, we find a mapping between clusters from B and T (Section 3.4).

We note that our goal in this paper is to present a new task and find a reasonable model for it; many other architectures and design choices are possible and could be explored in future work.
3.1 Text Processing

We begin by chunking the sentences in the input. As our next step is structure extraction, we first want to resolve pronouns. We apply a lightweight co-reference model (Kirstain et al., 2021), which generates clusters of strings (e.g., the plasma membrane, plasma membrane, it), and replace all pronouns by a representative from their cluster – the shortest string which is not a pronoun or a verb.
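The representative-selection rule (shortest cluster string that is neither a pronoun nor a verb) can be sketched as follows; the pronoun list and the token-level replacement are illustrative simplifications of what a real coreference pipeline would supply:

```python
PRONOUNS = {"it", "they", "he", "she", "this", "that", "them", "its"}

def pick_representative(cluster, is_verb=lambda s: False):
    """Return the shortest cluster string that is not a pronoun or a verb.
    is_verb is a stand-in hook for a POS tagger."""
    candidates = [s for s in cluster
                  if s.lower() not in PRONOUNS and not is_verb(s)]
    return min(candidates, key=len) if candidates else None

def resolve_pronouns(tokens, clusters):
    """Replace every pronoun token with its cluster's representative."""
    rep_of = {}
    for cluster in clusters:
        rep = pick_representative(cluster)
        for mention in cluster:
            if mention.lower() in PRONOUNS:
                rep_of[mention.lower()] = rep
    return [rep_of.get(tok.lower(), tok) for tok in tokens]

cluster = ["the plasma membrane", "plasma membrane", "it"]
assert pick_representative(cluster) == "plasma membrane"
assert resolve_pronouns(["it", "controls", "passage"], [cluster]) == \
    ["plasma membrane", "controls", "passage"]
```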
3.2 Structure Extraction

Analogy is based on relational similarity; thus, we now extract relations from the text. This naturally falls under Semantic Role Labeling (SRL) (Gildea and Jurafsky, 2002) – identifying the underlying relationship that words have with the main verb in a clause. In particular, we chose to use QA-SRL (FitzGerald et al., 2018). This model receives a sentence as input and outputs questions and answers about the sentence. See Table 1 for example questions and answers. Intuitively, the spans identified by QA-SRL as answers form the entities, and similar questions that appear in both domains (e.g., "what provides something?") indicate that the two entities (mitochondria, generators) may play similar roles.

We chose to use QA-SRL since it allows the questions themselves to define the set of roles, with no predefined frame or thematic role ontologies. Recent studies show that QA-SRL achieves 90% coverage of PropBank arguments, while capturing much implicit information that is often missed by traditional SRL schemes (Roit et al., 2020).

We focus on questions likely to capture useful relations for our task. We filter out "When" and "Why" questions, "Be" verbs, and questions and answers with a low probability (see Appendix A).
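The filtering step can be sketched as a simple predicate over QA-SRL outputs; the tuple format and the probability threshold are illustrative assumptions (the actual values are given in Appendix A):

```python
def keep_qa(question, verb, prob, min_prob=0.5):
    """Filter QA-SRL output: drop 'When'/'Why' questions, 'be' verbs,
    and low-probability questions/answers. min_prob is a stand-in value."""
    q = question.strip().lower()
    if q.startswith(("when", "why")):
        return False
    if verb.lower() in {"be", "is", "are", "was", "were", "been", "being"}:
        return False
    return prob >= min_prob

assert keep_qa("what provides something?", "provide", 0.9)
assert not keep_qa("When did something provide?", "provide", 0.9)
assert not keep_qa("what is something?", "be", 0.9)
assert not keep_qa("what provides something?", "provide", 0.1)
```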
Text             Verb     Question                      Answer
The animal cell  provide  what provides something?      the mitochondria
                          what provides something?      mitochondria
                          what provides something?      food molecules
                          what does something provide?  the energy needs of the cell
The factory      provide  what provides something?      the electrical generators
                          what provides something?      the electrical generators in the factory
                          what does something provide?  energy

Table 1: Snippet from QA-SRL output on our cell/factory texts.
3.3 Clustering Entities

In classical computational analogy work, entities are explicitly given, each with a unique name ("cell"). However, in our input, entities are often referred to in different ways ("the animal cell", "the cell", "cell"), which might confuse the mapping algorithm. Therefore, in this step we merge those different phrasings, resulting in a new, more refined set of entities. Since we do not know the number of clusters in advance, we use Agglomerative Clustering (Zepeda-Mendoza and Resendis-Antonio, 2013). We manually fine-tuned the linkage threshold that determines the number of clusters (see Appendix B.2 for details). We denote the resulting clusters of entities as B = {b1, ..., bn} and T = {t1, ..., tm}. Figure 2 shows the output clusters for the animal cell text. Most entities are identified correctly, although not all (e.g., clusters 5 and 7 should merge).
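A minimal sketch of threshold-based agglomerative clustering over entity mentions; the token-overlap distance and threshold value are stand-ins for the embedding distance and tuned linkage threshold of Appendix B.2:

```python
def distance(a, b):
    """Stand-in distance: 1 - Jaccard overlap of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(ta & tb) / len(ta | tb)

def agglomerate(mentions, threshold=0.5):
    """Single-linkage agglomerative clustering with a distance threshold:
    repeatedly merge the two closest clusters until no pair is within the
    threshold, so the number of clusters is not fixed upfront."""
    clusters = [[m] for m in mentions]
    while len(clusters) > 1:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        link = lambda ij: min(distance(a, b)
                              for a in clusters[ij[0]] for b in clusters[ij[1]])
        i, j = min(pairs, key=link)
        if link((i, j)) > threshold:
            break
        clusters[i] += clusters.pop(j)
    return clusters

clusters = agglomerate(["the animal cell", "the cell", "cell", "proteins"])
assert sorted(map(sorted, clusters)) == [
    ["cell", "the animal cell", "the cell"], ["proteins"]]
```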
3.4 Find Mappings

Our problem definition (Equation 1) assumes we know all relations between the entities. However, our extraction algorithm is not perfect – e.g., it cannot detect relations described across sentences, or using complex references. Consider these sentences (slight paraphrases of the original texts):

"Animal cells must also produce proteins and other organic molecules necessary for growth and repair. Ribosomes are used for this process." / "The factory synthesizes products from raw materials using machines."

Ideally, we would like to infer that ribosomes produce proteins and machines synthesize products. QA-SRL only gives us partial information, but it is still useful. For example, both proteins and products are associated with similar questions (what is produced?, what is synthesized?), hinting that they might play similar roles. Thus, we propose a heuristic approach to approximate Equation 1.
Intuitively, the similarity score between two entities bi, tk is high if the similarity between their associated questions is high (for example, cell and factory have multiple distinct similar questions). We define this score as the sum of cosine similarities over their associated questions' SBERT (Reimers and Gurevych, 2019) embeddings. We filter out similarities below a threshold (see Appendix B.1). If there exists a sentence involving bi and bj and another sentence involving tk and tl, such that those sentences were used both for mapping bi to tk and bj to tl using the same verb, a complete relation was found. In this case, we increase the score of both mappings. There are multiple ways to do so; we found that a simple scheme of adding a constant α to the similarity of (bi, tk) and (bj, tl) works well in practice (see Appendix B.2 for parameters).
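Assuming precomputed question embeddings (SBERT in the paper), the per-entity-pair score can be sketched as below; the toy vectors, the threshold value, and the function names are illustrative, not the paper's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def entity_score(questions_b, questions_t, embed, threshold=0.5):
    """Sum of above-threshold cosine similarities between the questions
    associated with a base entity and those of a target entity."""
    total = 0.0
    for qb in questions_b:
        for qt in questions_t:
            s = cosine(embed[qb], embed[qt])
            if s >= threshold:
                total += s
    return total

# Toy 2-d "embeddings" standing in for SBERT vectors.
embed = {
    "what provides something?": [1.0, 0.0],
    "what supplies something?": [0.9, 0.1],
    "what is destroyed?":       [0.0, 1.0],
}
assert entity_score(["what provides something?"],
                    ["what supplies something?"], embed) > 0.9
assert entity_score(["what provides something?"],
                    ["what is destroyed?"], embed) == 0.0
```

The α bonus for complete relations would then simply add a constant to the two affected pair scores before the final search.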
We observe that questions are mostly of similar length (in ProPara, about 1/3 of the questions have 3 words, about 1/3 have 4 words, about 1/6 have 2 words and about 1/6 have 5 words). Note that the entities are not part of the questions.
Beam Search. After computing all similarities, we use beam search to find the mapping M (see Appendix B.2 for parameters).

Figure 1 shows our algorithm's top mapping for the factory/cell example. Edge weights represent similarity. All the mappings our algorithm found are correct, but two are missing (ribosomes/machines and endoplasmic reticulum/hallways). Being able to show the user the relations that led to the output mapping renders our method easy to interpret. We name our method Find Mappings by Questions (FMQ).
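One way the final search can be sketched is as a beam over partial mappings, extending one base entity at a time and keeping the top-k candidates by accumulated pair score; the left-to-right extension order, the beam width, and the pair-score table are illustrative assumptions, not the exact variant tuned in Appendix B.2:

```python
def beam_search_mapping(base, target, pair_score, beam_width=3):
    """Beam search over partial mappings: extend each candidate with every
    unused target entity (or None = unmapped), score the extension by its
    pairwise similarity with already-mapped entities, keep the best k."""
    beams = [({}, 0.0)]  # (partial mapping, accumulated score)
    for b in base:
        nxt = []
        for mapping, score in beams:
            unused = [t for t in target if t not in mapping.values()]
            for t in unused + [None]:
                gain = sum(pair_score(b, b2, t, t2)
                           for b2, t2 in mapping.items())
                nxt.append(({**mapping, b: t}, score + gain))
        beams = sorted(nxt, key=lambda x: -x[1])[:beam_width]
    return beams[0]

# Toy symmetric pair scores; unmapped (None) entities contribute 0.
scores = {("cell", "proteins", "factory", "products"): 2.0}

def pair_score(b1, b2, t1, t2):
    if None in (t1, t2):
        return 0.0
    return (scores.get((b1, b2, t1, t2), 0.0)
            + scores.get((b2, b1, t2, t1), 0.0))

mapping, score = beam_search_mapping(["cell", "proteins"],
                                     ["factory", "products"], pair_score)
assert mapping == {"cell": "factory", "proteins": "products"}
assert score == 2.0
```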