objects from a base domain $B$ into a target domain $T$, based on a common relational structure rather than object attributes. In our setup, the input is two procedural texts describing a situation or a process, expressed in natural language.
Entities. First, we need to extract entities from the texts. Let $B = \{b_1, \ldots, b_n\}$ and $T = \{t_1, \ldots, t_m\}$ be the entities in the base and the target, respectively. In our setting, entities are noun phrases. For example, in the animal cell (Figure 1), some entities include plasma membrane, animal cell, nucleus, energy, mitochondria, proteins, ribosomes.
Relations. Let $R$ be a set of relations. A relation is a set of ordered entity pairs. In this work, we focus on verbs from the input text, but other formulations are possible. Intuitively, relations should capture that the mitochondria provides energy to the cell (in $B$), just like electrical generators provide energy to the factory (in $T$). Slightly abusing notation, let $R(e_1, e_2) \in 2^R$ be the set of relations between $e_1$ and $e_2$. For example, $R(\text{cell}, \text{proteins})$ contains {synthesize, use, move}. Note that $R$ is an asymmetric function, as the order matters.
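As an illustration, this relation structure can be stored as a mapping from ordered entity pairs to sets of verbs. The sketch below is our own; the names and helper functions are hypothetical, not the paper's implementation:

```python
from collections import defaultdict

# Ordered pair -> set of relation verbs. Order matters, so
# R[("cell", "proteins")] and R[("proteins", "cell")] are distinct.
R = defaultdict(set)

def add_relation(subj, obj, verb):
    """Record that `verb` relates `subj` to `obj` (in that order)."""
    R[(subj, obj)].add(verb)

add_relation("cell", "proteins", "synthesize")
add_relation("cell", "proteins", "use")
add_relation("cell", "proteins", "move")
add_relation("mitochondria", "cell", "provide")

def relations(e1, e2):
    """R(e1, e2): the set of relations holding between e1 and e2."""
    return R[(e1, e2)]
```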
Similarity. Let sim be a similarity metric between two sets of relations, $\mathrm{sim}: 2^R \times 2^R \to [0, \infty)$. Intuitively, we want similarity to be high if the two sets share many distinct relations. For example, {provide, destroy} should be more similar to {supply, ruin} than to {destroy, ruin}, as the last set does not include any relation similar to provide. Given a pair of entities $b_i, b_j \in B$ and a pair of entities $t_k, t_l \in T$, we define a similarity function measuring how similar these pairs are in terms of the relations between them. Since $R$ is asymmetric, we consider both possible orderings:

$$\mathrm{sim}^*(b_i, b_j, t_k, t_l) = \mathrm{sim}(R(b_i, b_j), R(t_k, t_l)) + \mathrm{sim}(R(b_j, b_i), R(t_l, t_k)) \quad (1)$$
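One simple way to instantiate sim (a sketch of ours, not the paper's exact metric) is to greedily match each relation in the first set to its most similar unused relation in the second, given some word-level similarity function `word_sim`, and sum the matched scores. This rewards sets sharing many distinct relations, as desired:

```python
def sim(rels_a, rels_b, word_sim):
    """Similarity between two relation sets: greedily pair each relation
    in rels_a with its most similar unused relation in rels_b and sum
    the matched scores."""
    available = set(rels_b)
    total = 0.0
    for r in rels_a:
        if not available:
            break
        best = max(available, key=lambda s: word_sim(r, s))
        score = word_sim(r, best)
        if score > 0:           # only consume a match that actually helps
            total += score
            available.remove(best)
    return total

def sim_star(bi, bj, tk, tl, relations, word_sim):
    """Equation (1): sum the similarity over both entity orderings."""
    return (sim(relations(bi, bj), relations(tk, tl), word_sim)
            + sim(relations(bj, bi), relations(tl, tk), word_sim))
```

With a word similarity that scores synonyms highly, {provide, destroy} indeed comes out closer to {supply, ruin} than to {destroy, ruin}, matching the intuition above.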
Objective. Our goal is to find a mapping function $M : B \to T \cup \{\bot\}$ that maps entities from base to target. Mapping into $\bot$ means the entity was not mapped. The mapping should be consistent – no two base entities can be mapped to the same entity. We look for a mapping that maximizes the relational similarity between mapped pairs:

$$M^* = \arg\max_M \sum_{j \in [1, n-1]} \sum_{i \in [j+1, n]} \mathrm{sim}^*(b_j, b_i, M(b_j), M(b_i))$$

If $b_i$ or $b_j$ maps to $\bot$, $\mathrm{sim}^*$ is defined to be 0.
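For very small inputs, this objective can be evaluated by exhaustive search over consistent mappings. The brute-force sketch below (ours; the paper's actual algorithm in Section 3 is more scalable) uses `None` as a stand-in for $\bot$:

```python
from itertools import permutations

def best_mapping(base, target, sim_star):
    """Exhaustively search for M* maximizing summed pairwise relational
    similarity. Padding targets with None lets base entities stay
    unmapped; permutations keep the mapping injective on real targets."""
    padded = list(target) + [None] * len(base)
    best_score, best_map = float("-inf"), None
    seen = set()
    for perm in permutations(padded, len(base)):
        if perm in seen:        # repeated Nones yield duplicate assignments
            continue
        seen.add(perm)
        M = dict(zip(base, perm))
        score = sum(
            sim_star(base[j], base[i], M[base[j]], M[base[i]])
            for j in range(len(base) - 1)
            for i in range(j + 1, len(base))
            # sim* is defined to be 0 when either entity maps to None
            if M[base[j]] is not None and M[base[i]] is not None
        )
        if score > best_score:
            best_score, best_map = score, M
    return best_map, best_score
```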
3 Analogous Matching Algorithm
Our goal in this section is to find the best mapping between $B$ and $T$. Our algorithm consists of four phases: we begin with basic text processing (Section 3.1). Then, we extract potential entities and relations (Section 3.2). Since entities can be referred to in multiple ways, we next cluster the entities (Section 3.3). Finally, we find a mapping between clusters from $B$ and $T$ (Section 3.4).
We note that our goal in this paper is to present a
new task and find a reasonable model for it; many
other architectures and design choices are possible
and could be explored in future work.
3.1 Text Processing
We begin by chunking the sentences in the input. As our next step is structure extraction, we first want to resolve pronouns. We apply a lightweight co-reference model (Kirstain et al., 2021), which generates clusters of strings (e.g., the plasma membrane, plasma membrane, it), and replace all pronouns by a representative from their cluster – the shortest string which is not a pronoun or a verb.
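The representative-selection rule can be sketched as follows. This is our own illustration; in practice a POS tagger would supply the verb check, which is stubbed out here:

```python
PRONOUNS = {"it", "they", "its", "this", "that", "he", "she", "them"}

def representative(cluster, is_verb=lambda s: False):
    """Pick a cluster's representative: the shortest mention that is
    neither a pronoun nor a verb (is_verb is a stand-in for a tagger)."""
    candidates = [m for m in cluster
                  if m.lower() not in PRONOUNS and not is_verb(m)]
    return min(candidates, key=len) if candidates else None

def resolve(tokens, clusters, is_verb=lambda s: False):
    """Replace each pronoun token with its cluster's representative."""
    rep_of = {}
    for cluster in clusters:
        rep = representative(cluster, is_verb)
        for mention in cluster:
            if mention.lower() in PRONOUNS and rep is not None:
                rep_of[mention] = rep
    return [rep_of.get(tok, tok) for tok in tokens]
```

On the example cluster above, "plasma membrane" is chosen over "the plasma membrane" because it is shorter, and "it" is excluded as a pronoun.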
3.2 Structure Extraction
Analogy is based on relational similarity; thus, we now extract relations from the text. This naturally falls under Semantic Role Labeling (SRL) (Gildea and Jurafsky, 2002) – identifying the underlying relationship that words have with the main verb in a clause. In particular, we chose to use QA-SRL (FitzGerald et al., 2018). This model receives a sentence as input and outputs questions and answers about the sentence. See Table 1 for example questions and answers. Intuitively, the spans identified by QA-SRL as answers form the entities, and similar questions that appear in both domains (e.g., "what provides something?") indicate that the two entities (mitochondria, generators) may play similar roles.
We chose to use QA-SRL since it allows the
questions themselves to define the set of roles, with
no predefined frame or thematic role ontologies.
Recent studies show that QA-SRL achieves 90% coverage of PropBank arguments, while capturing much implicit information that is often missed by traditional SRL schemes (Roit et al., 2020).
We focus on questions likely to capture useful relations for our task. We filter out "When" and "Why" questions, "be" verbs, and questions and answers with a low probability (see Appendix A).
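The filtering step above might look like the following sketch. The field names and the probability threshold are illustrative assumptions, not the paper's exact values (those are in its Appendix A):

```python
def keep_question(qa, min_prob=0.5):
    """Filter a QA-SRL question-answer record: drop 'When'/'Why'
    questions, 'be' verbs, and low-probability questions or answers.
    Field names and the 0.5 threshold are illustrative."""
    wh = qa["question"].split()[0].lower()
    if wh in {"when", "why"}:
        return False
    if qa["verb"].lower() in {"be", "is", "are", "was", "were", "been", "being"}:
        return False
    if qa["question_prob"] < min_prob or qa["answer_prob"] < min_prob:
        return False
    return True
```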