EACL’23 Findings
Zero-Shot On-the-Fly Event Schema Induction
Rotem Dror
, Haoyu Wang, and Dan Roth
Department of Computer and Information Science
University of Pennsylvania
{rtmdrr,why16gzl,danroth}@seas.upenn.edu
Abstract
What are the events involved in a pandemic outbreak? What steps should be taken when planning a wedding? The answers to these questions can be found by collecting many documents on the complex event of interest, extracting relevant information, and analyzing it. We present a new approach1 in which large language models are utilized to generate source documents that allow predicting, given a high-level event definition, the specific events, arguments, and relations between them to construct a schema that describes the complex event in its entirety. Using our model, complete schemas on any topic can be generated on-the-fly without any manual data collection, i.e., in a zero-shot manner. Moreover, we develop efficient methods to extract pertinent information from texts and demonstrate in a series of experiments that these schemas are considered more complete than human-curated ones in the majority of examined scenarios. Finally, we show that this framework is comparable in performance to previous supervised schema induction methods that rely on collecting real texts, and even reaches the best score in the prediction task.
1 Introduction
Event processing refers to tracking, analyzing, and drawing conclusions from streams of information about events. This event analysis aims at identifying meaningful events (such as opportunities or threats) in real-time situations and responding appropriately. Event processing can also be utilized to gain a deep understanding of the specific steps, arguments, and relations between them that are involved in a complex event. The information above can be consolidated into a graphical representation called an event schema (Li et al., 2021). For instance, in Fig. 1, the graph representation of events and participants assists in gaining an understanding of the complex event of kidnapping and could help compose a reaction plan if needed.
* Indicating equal contribution.
1 https://cogcomp.seas.upenn.edu/page/publication_view/995
The NLP community has devoted much effort to understanding events that are described in a document or in a collection of documents for this purpose. These efforts include identifying event triggers (Lu and Roth, 2012; Huang et al., 2018; Wadden et al., 2019; Han et al., 2019), extracting event arguments (Punyakanok et al., 2008; Peng et al., 2016; Lin et al., 2020; Zhang et al., 2021a), and predicting the relations between events, e.g., temporal, coreferential, causal, or hierarchical relations (Do et al., 2012; Lee et al., 2012; Glavaš et al., 2014; Ning et al., 2018; Wang et al., 2020; Zhang et al., 2020a; Trong et al., 2022).
Previous works on event schema induction relied on the information extracted from manually collected documents to build the schema graph. For instance, Li et al. (2020) learn an auto-regressive language model (LM) over paths in the instance graphs depicting events, arguments, and relations of instances of the complex events, and then construct a schema graph by merging the top-k ranked paths. Their approach, however, requires access to many documents on each topic of interest, which can be extremely laborious and time-consuming to obtain.
In this paper, our goal is to allow creating schemas on-the-fly by taking as input only the name of the complex event of interest (like a “pandemic outbreak” or an “armed robbery”). To avoid manually collecting many documents on the topic of the schema, we utilize pre-trained text generators, e.g., GPT-3 (Brown et al., 2020), to obtain documents of diverse genres on the desired topic (examples presented in Fig. 2). These documents are then processed to extract pertinent information from which a schema is constructed. The fact that we do not collect any data makes our learning framework zero-shot, since we do not rely on any human-collected articles or example schemas.
arXiv:2210.06254v2 [cs.CL] 27 Mar 2023
[Figure 1 graph: Kidnapping, with Preparation as a sub-event. Nodes: Kidnapper plans AND collects information; Kidnapper looks for victim; Kidnapper chooses victim OR location; Kidnapper kidnaps victim; Kidnapper transports victim; Kidnapper hides victim; Kidnapper asks for ransom OR makes demands.]
Figure 1: An example schema for the event of Kidnapping. The regular arrows represent temporal relations and
the dashed arrows represent hierarchical relations (PARENT-CHILD).
In addition to eliminating the need to collect data, we also make the information extraction process faster by implementing new and efficient methods for identifying temporal and hierarchical relations between events mentioned in the text. These two steps are the most time-consuming in the process of schema induction and could take up to 2 hours each using state-of-the-art models proposed by Zhou et al. (2021); Wang et al. (2021). By sending the whole text as input instead of two sentences at a time, our proposed model shortens the inference time significantly to several minutes without incurring a major loss in performance.
The process of generating texts is explained in Section 3, the process of extracting relevant and salient information is described in Section 4, and the construction of schema graphs is introduced in Section 5. To evaluate our zero-shot schema generator, we conduct experiments on a benchmark dataset for schema induction, LDC2020E25, and provide a new dataset for further evaluation called Schema-11. Additionally, we design a subject-matter expert Turing test, a.k.a. Feigenbaum test (Feigenbaum, 2003), to determine whether our algorithm could mimic experts’ responses. We also demonstrate that documents generated by GPT-3 are informative and useful for the task of schema induction. The experiments and results are presented in Section 6. The contributions of our work include:
1. Predicting an entire schema given the name of a complex event without collecting data.
2. Implementing a novel and efficient One-Pass approach for identifying temporal and hierarchical relations between events.
3. Presenting a method for automatically inducing logical relations between events based on temporal relations.
4. Offering a Feigenbaum test for evaluation on a new schema dataset, Schema-11.
2 Related Work
Schema Induction: Early schema induction efforts focused on identifying the triggers and participants of atomic events without considering relations between atomic events that comprise complex schemas (Chambers, 2013; Cheung et al., 2013; Nguyen et al., 2015; Sha et al., 2016; Yuan et al., 2018). More recent work focuses on inducing schemas for pairs of events (Li et al., 2020) and multiple events (Zhang et al., 2021b; Li et al., 2021), but they require access to large corpora for the induction process. In this work, we induce schemas on-the-fly in a zero-shot manner. As is standard in state-of-the-art (SOTA) works (Li et al., 2020, 2021; Wen et al., 2021; Lawley and Schubert, 2022), we output all the essential information about relations between events and arguments extracted from the text, in addition to logical and hierarchical relations not studied previously in schema induction.
Script Learning: Early script learning work concentrated on chains of events with a single protagonist (Chambers and Jurafsky, 2008, 2009; Jans et al., 2012; Rudinger et al., 2015; Granroth-Wilding and Clark, 2016) and was later extended to multiple protagonists (Pichotta and Mooney, 2014; Peng and Roth, 2016; Pichotta and Mooney, 2016; Modi, 2016; Weber et al., 2018, 2020; Zhang et al., 2020b). All of these works assume there exists a single line of events that describes all occurrences within a complex event. This work does not limit itself to generating single-chained schemas; we also consider more complex graphs as schema outputs. In addition, none of these works deal with zero-shot scenarios that do not require training data.
Pre-Trained Generation Models: Large-scale pre-trained text generation models such as GPT-2 (Radford et al., 2019), GPT-3 (Brown et al., 2020), BART (Lewis et al., 2020), T5 (Raffel et al., 2020), i.a., have been used in many NLP tasks. These models are often seen as few-shot learners (Brown et al., 2020) and are therefore used as inference methods. However, these text generation models are not explicitly trained to perform inference, but to produce the most likely sequence of words to follow a certain prompt, similar to language models. In our work, we use these large pre-trained LMs as text generators. The generated documents on a particular topic are leveraged as a corpus for extracting the schema of the given topic. We rely on the intuition that the generated text will include salient and stereotypical information that is expected to be mentioned in the context of the topic (e.g., for the topic of “planning a wedding”, we assume most documents will include “order catering”).
3 Data Generation
The schema induction process begins with generating texts using large LMs as text generation models. These texts are joined to form a knowledge base for the schema, including all of the potential information that the schema may present. One could, of course, create this knowledge base by crawling the web for real news articles or Wikipedia entries related to a certain topic.
We argue, however, that in addition to the obvious advantages of not having to rely on the availability of data online and not having to crawl the entire web for relevant documents on each topic, the generated data from these large generative models is more efficient in reporting salient events than random events described in the news, i.e., generated texts are more likely to mention important information than real documents do.
                          Generated Text   Real Text
# events / # tokens       12.52%           6.31%
# arguments / # tokens    5.45%            3.01%

Table 1: The ratio of relevant events and relevant argument roles identified in generated texts and real texts for the scenario of IED attack.
Our analysis shows that the generated stories contain a higher percentage of relevant tokens than real news articles that are used for schema induction. To demonstrate this phenomenon, we compare manually collected documents with those that are automatically generated using GPT-3 for the event of Improvised Explosive Device (IED) Attack (Li et al., 2021). To identify salient events and arguments concerning IED attacks, we adopt the DARPA KAIROS Phase 1 (v3.0) ontology2, a fine-grained ontology for schema learning, with 24 entity types, 67 event types, and 85 argument roles.
We calculate the number of relevant event triggers and arguments identified in the text, where a relevant mention is one whose type appears in the ontology. The results shown in Table 1 demonstrate that the quality of the generated texts in terms of conciseness and appearance of important details is higher than that of real texts. For example, the ratio of relevant events per token is more than twice as high in generated texts as it is in real texts. Hence we are able to not only generate a schema for every given topic without putting any effort into searching the web, but the information we generate is also better suited for our end task of depicting all of the important aspects of a complex event.
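The ratios in Table 1 reduce to a count of ontology-matched mentions over total tokens. The sketch below illustrates that computation on toy inputs; the input format (token lists plus sets of matched positions) is an assumption for illustration, not the paper's actual data structures:

```python
# Sketch of the Table 1 computation: the fraction of tokens that are
# ontology-matched event triggers (or argument heads). Inputs are
# illustrative toy values standing in for the IE pipeline's output.

def relevance_ratio(docs, relevant_spans):
    """docs: list of token lists; relevant_spans: list of sets of
    (doc_idx, token_idx) positions whose extracted type is in the ontology."""
    total_tokens = sum(len(doc) for doc in docs)
    total_relevant = sum(len(spans) for spans in relevant_spans)
    return total_relevant / total_tokens

# Toy example: 2 documents, 3 ontology-matched event triggers overall.
docs = [["the", "bomb", "exploded", "near", "the", "market"],
        ["police", "arrested", "a", "suspect", "after", "the", "attack"]]
event_positions = [{(0, 2)}, {(1, 1), (1, 6)}]  # "exploded", "arrested", "attack"
print(round(relevance_ratio(docs, event_positions), 3))
```

The same function applied once with event positions and once with argument positions yields the two rows of Table 1.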
Given a topic for which we want to create a schema, we generate multiple texts that discuss the topic event using the OpenAI GPT-3 API3 with the Davinci-instruct-beta-v3 model, and we also experiment with the Hugging Face GPT-2 API4. We use three prompting methods to generate documents of diverse genres, as follows:
News Articles: We begin by generating a headline using the prompt: “Write a news headline about topic.” The output from this prompt is then used in the following prompt: “Write a news story titled headline.” The output from the second prompt is added to the pool of generated texts. The process is repeated 30 times. See example in Fig. 2b.
2 The full ontology definition can be accessed at this link: https://bit.ly/3mIWJoN.
3 https://openai.com/blog/openai-api/.
4 https://huggingface.co/gpt2
Prompt: What are the steps involved in holding a summit meeting? 1.
Output:
1. Plan the agenda and objectives of the meeting.
2. Select a venue that is appropriate for the meeting’s purpose and size.
3. Send out invitations and ensure all participants have the appropriate information.
4. Prepare for the meeting by creating materials and briefing the participants.
5. Hold the meeting and follow up on any action items that are generated.
(a) Example generated text of steps involved in the event of “holding a summit meeting”.
First prompt: Write a news headline about filing a patent.
Output: New Patent Filing Could Speed Up Drug Development.
Second prompt: Write a news story titled "New Patent Filing Could Speed Up Drug Development".
Output: The United States Patent and Trademark Office has just published a new patent that could speed up the drug development process. The patent, filed by a team of researchers from the DPharm Institute, covers a new drug screening process that could help identify potential new drugs much more quickly. The new process, which is based on a computer algorithm, is said to be up to 10 times faster than traditional methods. It can identify potential new drugs by screening through large databases of existing drugs and drug candidates. The DPharm Institute is already in the process of licensing the patent to several pharmaceutical companies, and is planning to commercialize the technology over the next few years.
(b) Example generation of news-like text for the event of “filing a patent”.
Prompt: Describe how to 3D print an object?
Output: To 3D print an object, you need to first design the object in a 3D modeling program. Once the object is designed, you need to export it as a 3D printable file format. Next, you need to choose a 3D printer and print settings. Finally, you need to print the object.
(c) Example generation of a How-To article for the event of “3D printing”.
Figure 2: Examples of generated texts using different prompting methods. The highlighted tokens display relevant events that will be extracted in the information extraction step.
How-To Articles: We use the prompt: “Describe how to topic.” to generate wikiHow-like instruction articles. The process is repeated 30 times. See example in Fig. 2c.
Direct Step-by-Step Schema: We use the prompt: “What are the steps involved in topic? 1.”5 to directly generate a schema. We run this process once. See example in Fig. 2a.
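The three prompting methods amount to simple string templates whose outputs are sent to a text-generation API (for the news genre, the first output is substituted into the second template). A minimal sketch of the templates, with the actual LM call left out since its exact signature is not specified here:

```python
# Sketch of the three prompt templates used for document generation.
# The LM call itself is omitted; for news articles, the headline
# returned by the first prompt is substituted into the second.

def news_headline_prompt(topic: str) -> str:
    return f"Write a news headline about {topic}."

def news_story_prompt(headline: str) -> str:
    return f'Write a news story titled "{headline}".'

def howto_prompt(topic: str) -> str:
    return f"Describe how to {topic}."

def steps_prompt(topic: str) -> str:
    # The trailing "1." nudges the LM to continue a numbered list of steps.
    return f"What are the steps involved in {topic}? 1."

print(steps_prompt("planning a wedding"))
```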
Generating documents of various genres enables our model to induce comprehensive schemas on any given topic. Considering that some events are more likely to be in the news (e.g., elections, pandemic outbreaks) while others are more technical in nature and are hence less newsworthy (such as earning a Ph.D. degree or planning a wedding), we generate diverse texts and then use a ranking model to choose the most relevant documents.
The ranking process includes embedding the texts and the topic with the model proposed in Reimers and Gurevych (2019), and then calculating the cosine similarity between each text and the topic embeddings. Only the 30 texts closest to the topic are selected, together with the output from the direct step-by-step schema. The following section describes the next step in generating a schema: extracting relevant information from the texts.
5 The “1.” in the prompt is for the LM to automatically complete the steps.
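The ranking step is a cosine-similarity top-k selection over sentence embeddings. The sketch below operates on precomputed embedding vectors; the sentence encoder itself (Sentence-BERT in the paper) is assumed and not shown:

```python
import numpy as np

def rank_documents(topic_emb, doc_embs, k=30):
    """Return indices of the k documents most cosine-similar to the topic.

    topic_emb: (d,) embedding of the topic string.
    doc_embs: (n, d) embeddings of the generated documents.
    Both are assumed to come from a sentence encoder such as Sentence-BERT.
    """
    topic = topic_emb / np.linalg.norm(topic_emb)
    docs = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = docs @ topic                    # cosine similarity per document
    return np.argsort(-sims)[:k].tolist()  # most similar first

# Toy check: the document pointing the same way as the topic ranks first.
topic = np.array([1.0, 0.0])
docs = np.array([[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])
print(rank_documents(topic, docs, k=2))
```

In the paper's setting, k=30 and the direct step-by-step output is always appended to the selected set.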
4 Information Extraction
For each document, we extract event triggers, arguments, and relations between the events that are important and relevant to the schema topic. We do not work with a predefined ontology that defines in advance what events and arguments are salient, because we allow generating a schema on any topic. Instead, we employ a statistical approach: we extract all the information and later filter it down to include just the frequent items. The steps involved in our information extraction pipeline are as follows:
Semantic Role Labeling (SRL): We use the SOTA SRL system6 trained on CoNLL12 (Pradhan et al., 2012) and the NomBank dataset (Meyers et al., 2004) to extract both verb and nominal event triggers and arguments.
Named Entity Recognition (NER): We employ the SOTA NER model (Guo and Roth, 2021) to extract and map entities (potential arguments of events) into entity types defined in the CoNLL 2002 dataset (Tjong Kim Sang, 2002) and the LORELEI project (Strassel and Tracey, 2016).
Constituency Parsing: The arguments extracted by SRL can be clauses and long phrasal nouns, hence we employ the AllenNLP7 constituency parsing model for argument head word extraction.
6 https://cogcomp.seas.upenn.edu/page/demo_view/SRLEnglish
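Head-word extraction collapses a long argument phrase to its head noun. The sketch below uses a simplified heuristic over POS-tagged tokens (rightmost noun before any preposition) rather than the full constituency-based head finding the paper employs; it is illustrative only, and the POS tagger is assumed:

```python
# Simplified head-word heuristic for an SRL argument span: take the
# rightmost noun-tagged token before any preposition, so PP modifiers
# ("of the red truck") do not shift the head. This approximates, but is
# not, the constituency-parse-based head finding used in the paper.

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def head_word(tagged_span):
    """tagged_span: list of (token, POS) pairs for one argument phrase."""
    core = []
    for token, pos in tagged_span:
        if pos == "IN":  # stop at the first preposition
            break
        core.append((token, pos))
    for token, pos in reversed(core or tagged_span):
        if pos in NOUN_TAGS:
            return token
    return (core or tagged_span)[-1][0]  # fall back to the last token

arg = [("the", "DT"), ("driver", "NN"), ("of", "IN"),
       ("the", "DT"), ("red", "JJ"), ("truck", "NN")]
print(head_word(arg))
```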
Coreference Resolution: We use the SOTA model (Yu et al., 2022) for event and entity coreference resolution to identify within-document coreferential relations.
Temporal Relation Extraction: We first try to use SOTA models (Ning et al., 2019; Zhou et al., 2021) to predict the temporal relations8 between all possible pairs of extracted events, but since the SOTA models accept two sentences containing events as input, the inference time9 for an n-event document is O(n^2), making the schema induction process several hours long.
One-Pass Model: We develop a One-Pass model that takes the document as input and uses the contextual representation of events to predict relations between them. A document D is represented as a sequence of tokens D = [t1, ..., e1, ..., e2, ..., tn], where some of the tokens belong to the set of annotated event triggers, i.e., E_D = {e1, e2, ..., ek}, whereas the rest are other lexemes. We employ the transformer-based language model Big Bird (Zaheer et al., 2020) to encode a whole document and obtain the contextualized representations for all the event mentions. These representations are fed into a multi-layer perceptron in a pairwise fashion, and the cross-entropy loss for each pair is calculated and accumulated for a batch of documents. As shown in Tab. 2, the inference time is shortened 63-186 times on average, while the performance of the One-Pass model is comparable to SOTA models.
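The pairwise scoring stage of the One-Pass model can be sketched as follows. The document encoder is replaced by pre-extracted event vectors and the MLP weights are random, so this shows the data flow only; the dimensions and layer sizes are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: d-dim event representations, 4 relation labels
# (e.g., BEFORE, AFTER, EQUAL, VAGUE for the temporal case).
d, hidden, n_labels = 8, 16, 4

# Stand-ins for the contextualized event vectors a document encoder
# (Big Bird in the paper) would produce for k event mentions; the
# document is encoded once, so these come from a single forward pass.
k = 3
event_reprs = rng.normal(size=(k, d))

# A tiny randomly initialized MLP over concatenated event-pair vectors.
W1 = rng.normal(size=(2 * d, hidden))
W2 = rng.normal(size=(hidden, n_labels))

def score_pairs(reprs):
    """Score every ordered event pair; the expensive encoding is done once,
    so only this cheap MLP runs O(k^2) times."""
    scores = {}
    for i in range(len(reprs)):
        for j in range(len(reprs)):
            if i == j:
                continue
            pair = np.concatenate([reprs[i], reprs[j]])
            logits = np.maximum(pair @ W1, 0.0) @ W2  # ReLU MLP
            scores[(i, j)] = int(np.argmax(logits))   # predicted label id
    return scores

preds = score_pairs(event_reprs)
print(len(preds))  # k * (k - 1) ordered pairs
```

The speedup comes from this split: the O(n^2) pairwise step no longer re-runs the large LM, only the small MLP.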
Hierarchical Relation Extraction: The extremely long inference time of SOTA models for predicting hierarchical relations (PARENT-CHILD, CHILD-PARENT, COREF, NOREL) (Zhou et al., 2020; Wang et al., 2021) also impairs the efficiency of our schema induction system. Thus we use the same One-Pass methodology to extract hierarchical relations. We observe that the inference time is greatly shortened, and the One-Pass model achieves comparable results to previous models while taking up less GPU memory (see Tab. 2).
7 https://demo.allennlp.org/constituency-parsing.
8 The possible temporal relations (start-time comparison) are: BEFORE, AFTER, EQUAL, and VAGUE.
9 The inference time is mostly spent on obtaining the contextual representation of events using large fine-tuned LMs.

Corpus    Model                F1 score   Speed     GPU Memory
HiEve     Zhou et al. (2020)   0.489      -         -
HiEve     Wang et al. (2021)   0.522      41.68s    4515MiB
HiEve     One-Pass model       0.472      0.65s     2941MiB
MATRES    Ning et al. (2019)   0.767      30.12s    4187MiB
MATRES    Zhou et al. (2021)   0.821      89.36s    9311MiB
MATRES    One-Pass model       0.768      0.48s     2419MiB

Table 2: Performance comparison between the One-Pass model and SOTA models for event temporal and hierarchical relation extraction. We report F1 scores on benchmark datasets (HiEve for hierarchical relations, MATRES for temporal relations), speed (average inference time for 100 event pairs), and required GPU memory during inference. The One-Pass models are 63-186 times faster than SOTA models and take up only 26%-65% of the GPU memory required by SOTA models.
After processing the data using the procedure described above, we get a list of events, their arguments, and relations between the events. We concentrate on events and relations that frequently appear in the generated texts, since we assume those are the most important to add to the schema (without any other source of information that could identify what is salient). We describe the process of building a schema in the following section.
5 Schema Induction
To consolidate the information extracted from the previous step, we build a schema as follows:
Make a list of events and relations: To compare similar event mentions in different texts, we compare the event trigger itself (whether they are the same verb or coreferential verbs10) and the NER types of its arguments. For example, the trigger “(take) precautions” appeared in 5 documents generated for the topic of Pandemic Outbreak. In two documents the subject of the verb phrase “take precautions” was “residents”, in another two it was “people”, and in the last one it was “public”. Nevertheless, the NER type is identical in all cases (PER), and thus we set the frequency of “(take) precautions” to 5. Similarly, we calculate the frequency of the temporal and hierarchical relations. We only consider relations and events that appeared
10 We only consider coreferential and hierarchical relations if they appear in more than 2 documents.
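The merging step above amounts to counting event triggers across documents, keyed by the trigger and the NER type of its subject. A minimal sketch (the mention representation here is an illustrative assumption about the IE pipeline's output format):

```python
from collections import Counter

# Sketch of merging event mentions across generated documents: mentions
# with the same trigger and the same subject NER type count as one
# schema event, regardless of the literal subject string.

def event_frequencies(documents):
    """documents: list of docs, each a list of
    (trigger, subject_ner_type) mentions extracted by the IE pipeline."""
    counts = Counter()
    for doc in documents:
        # Count each (trigger, type) at most once per document.
        counts.update({(trigger, ner) for trigger, ner in doc})
    return counts

docs = [
    [("take precautions", "PER")],  # subject: "residents"
    [("take precautions", "PER")],  # subject: "people"
    [("take precautions", "PER")],  # subject: "public"
    [("wash hands", "PER")],
]
freqs = event_frequencies(docs)
print(freqs[("take precautions", "PER")])
```

The resulting counts feed the frequency threshold that decides which events and relations enter the schema.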