EACL’23 Findings
Zero-Shot On-the-Fly Event Schema Induction
Rotem Dror
, Haoyu Wang, and Dan Roth
Department of Computer and Information Science
University of Pennsylvania
{rtmdrr,why16gzl,danroth}@seas.upenn.edu
Abstract
What are the events involved in a pandemic outbreak? What steps should be taken when planning a wedding? The answers to these questions can be found by collecting many documents on the complex event of interest, extracting relevant information, and analyzing it. We present a new approach1 in which large language models are utilized to generate source documents that allow predicting, given a high-level event definition, the specific events, arguments, and relations between them to construct a schema that describes the complex event in its entirety. Using our model, complete schemas on any topic can be generated on-the-fly without any manual data collection, i.e., in a zero-shot manner. Moreover, we develop efficient methods to extract pertinent information from texts and demonstrate in a series of experiments that these schemas are considered more complete than human-curated ones in the majority of examined scenarios. Finally, we show that this framework is comparable in performance to previous supervised schema induction methods that rely on collecting real texts, and even reaches the best score in the prediction task.
1 Introduction
Event processing refers to tracking, analyzing, and drawing conclusions from streams of information about events. This event analysis aims at identifying meaningful events (such as opportunities or threats) in real-time situations and responding appropriately. Event processing can also be utilized to gain a deep understanding of the specific steps, arguments, and relations between them that are involved in a complex event. The information above can be consolidated into a graphical representation called an event schema (Li et al., 2021). For instance, in Fig. 1, the graph representation of events and participants assists in gaining an understanding of the complex event of kidnapping and could help compose a reaction plan if needed.
* Indicating equal contribution.
1 https://cogcomp.seas.upenn.edu/page/publication_view/995
The NLP community has devoted much effort to understanding events that are described in a document or in a collection of documents for this purpose. These efforts include identifying event triggers (Lu and Roth, 2012; Huang et al., 2018; Wadden et al., 2019; Han et al., 2019), extracting event arguments (Punyakanok et al., 2008; Peng et al., 2016; Lin et al., 2020; Zhang et al., 2021a), and predicting the relations between events, e.g., temporal, coreferential, causal, or hierarchical relations (Do et al., 2012; Lee et al., 2012; Glavaš et al., 2014; Ning et al., 2018; Wang et al., 2020; Zhang et al., 2020a; Trong et al., 2022).
Previous works on event schema induction relied on the information extracted from manually collected documents to build the schema graph. For instance, Li et al. (2020) learn an auto-regressive language model (LM) over paths in the instance graphs depicting events, arguments, and relations of instances of the complex events, and then construct a schema graph by merging the top-k ranked paths. Their approach, however, requires access to many documents on each topic of interest, which can be extremely laborious and time-consuming to obtain.
In this paper, our goal is to allow creating schemas on-the-fly by taking as input only the name of the complex event of interest (like a “pandemic outbreak” or an “armed robbery”). To avoid manually collecting many documents on the topic of the schema, we utilize pre-trained text generators, e.g., GPT-3 (Brown et al., 2020), to obtain documents of diverse genres on the desired topic (examples presented in Fig. 2). These documents are then processed to extract pertinent information from which a schema is constructed. The fact that we do not collect any data makes our learning framework zero-shot, since we do not rely on any human-collected articles or example schemas.
arXiv:2210.06254v2 [cs.CL] 27 Mar 2023
[Figure 1 graph: Kidnapping, with Preparation as a sub-event. Nodes: Kidnapper plans AND collects information; Kidnapper looks for victim; Kidnapper chooses victim OR location; Kidnapper kidnaps victim; Kidnapper transports victim; Kidnapper hides victim; Kidnapper asks for ransom OR makes demands.]
Figure 1: An example schema for the event of Kidnapping. The regular arrows represent temporal relations and
the dashed arrows represent hierarchical relations (PARENT-CHILD).
In addition to eliminating the need to collect data, we also make the information extraction process faster by implementing new and efficient methods for identifying temporal and hierarchical relations between events mentioned in the text. These two steps are the most time-consuming in the process of schema induction and could take up to 2 hours each using state-of-the-art models proposed by Zhou et al. (2021); Wang et al. (2021). By sending the whole text as input instead of two sentences at a time, our proposed model shortens the inference time significantly to several minutes without incurring a major loss in performance.
The process of generating texts is explained in Section 3, the process of extracting relevant and salient information is described in Section 4, and the construction of schema graphs is introduced in Section 5. To evaluate our zero-shot schema generator, we conduct experiments on a benchmark dataset for schema induction, LDC2020E25, and provide a new dataset for further evaluation called Schema-11. Additionally, we design a subject-matter expert Turing test, a.k.a. Feigenbaum test (Feigenbaum, 2003), to determine whether our algorithm could mimic experts’ responses. We also demonstrate that documents generated by GPT-3 are informative and useful for the task of schema induction. The experiments and results are presented in Section 6. The contributions of our work include:
1. Predicting an entire schema given the name of a complex event without collecting data.
2. Implementing a novel and efficient One-Pass approach for identifying temporal and hierarchical relations between events.
3. Presenting a method for automatically inducing logical relations between events based on temporal relations.
4. Offering a Feigenbaum test for evaluation on a new schema dataset, Schema-11.
2 Related Work
Schema Induction: Early schema induction efforts focused on identifying the triggers and participants of atomic events without considering relations between atomic events that comprise complex schemas (Chambers, 2013; Cheung et al., 2013; Nguyen et al., 2015; Sha et al., 2016; Yuan et al., 2018). More recent work focuses on inducing schemas for pairs of events (Li et al., 2020) and multiple events (Zhang et al., 2021b; Li et al., 2021), but they require access to large corpora for the induction process. In this work, we induce schemas on-the-fly in a zero-shot manner. As is standard in state-of-the-art (SOTA) works (Li et al., 2020, 2021; Wen et al., 2021; Lawley and Schubert, 2022), we output all the essential information about relations between events and arguments extracted from the text, in addition to logical and hierarchical relations not studied previously in schema induction.
Script Learning: Early script learning work concentrated on chains of events with a single protagonist (Chambers and Jurafsky, 2008, 2009; Jans et al., 2012; Rudinger et al., 2015; Granroth-Wilding and Clark, 2016) and was later extended to multiple protagonists (Pichotta and Mooney, 2014; Peng and Roth, 2016; Pichotta and Mooney, 2016; Modi, 2016; Weber et al., 2018, 2020; Zhang et al., 2020b). All of these works assume there exists a single line of events that describes all occurrences within a complex event. This work does not limit itself to generating single-chained schemas; we also consider more complex graphs as schema outputs. In addition, none of these works deal with zero-shot scenarios that do not require training data.
Pre-Trained Generation Models: Large-scale pre-trained text generation models such as GPT-2 (Radford et al., 2019), GPT-3 (Brown et al., 2020), BART (Lewis et al., 2020), T5 (Raffel et al., 2020), i.a., have been used in many NLP tasks. These models are often seen as few-shot learners (Brown et al., 2020) and are therefore used as inference methods. However, these text generation models are not explicitly trained to perform inference, but to produce the most likely sequence of words to follow a certain prompt, similar to language models. In our work, we use these large pre-trained LMs as text generators. The generated documents on a particular topic are leveraged as a corpus for extracting the schema of the given topic. We rely on the intuition that the generated text will include salient and stereotypical information that is expected to be mentioned in the context of the topic (e.g., for the topic of “planning a wedding”, we assume most documents will include “order catering”).
3 Data Generation
The schema induction process begins with generating texts using large LMs as text generation models. These texts are joined to form a knowledge base for the schema, including all of the potential information that the schema may present. One could, of course, create this knowledge base by crawling the web for real news articles or Wikipedia entries related to a certain topic.
We argue, however, that in addition to the obvious advantages of not having to rely on the availability of data online and not having to crawl the entire web for relevant documents on each topic, the generated data from these large generative models is more efficient in reporting salient events than random events described in the news, i.e., generated texts are more likely to mention important information than real documents do.
                          Generated Text   Real Text
# events / # tokens       12.52%           6.31%
# arguments / # tokens    5.45%            3.01%

Table 1: The ratio of relevant events and relevant argument roles identified in generated texts and real texts for the scenario of IED attack.
Our analysis shows that the generated stories contain a higher percentage of relevant tokens than real news articles that are used for schema induction. To demonstrate this phenomenon, we compare manually collected documents with those that are automatically generated using GPT-3 for the event of Improvised Explosive Device (IED) Attack (Li et al., 2021). To identify salient events and arguments concerning IED attacks, we adopt the DARPA KAIROS Phase 1 (v3.0) ontology2, a fine-grained ontology for schema learning, with 24 entity types, 67 event types, and 85 argument roles.
We calculate the number of relevant event triggers and arguments identified in the text, where a relevant mention is one whose type appears in the ontology. The results shown in Table 1 demonstrate that the quality of the generated texts in terms of conciseness and appearance of important details is higher than that of real texts. For example, the ratio of relevant events per token is more than twice as high in generated texts as it is in real texts. Hence we are able to not only generate a schema for every given topic without putting any effort into searching the web, but the information we generate is also better suited for our end task of depicting all of the important aspects of a complex event.
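The ratios in Table 1 reduce to a count of ontology-matched mentions over total tokens. The sketch below illustrates that computation on toy inputs; the input format (token lists plus sets of matched positions) is an assumption for illustration, not the paper's actual data structures:

```python
# Sketch of the Table 1 computation: the fraction of tokens that are
# ontology-matched event triggers (or argument heads). Inputs are
# illustrative toy values standing in for the IE pipeline's output.

def relevance_ratio(docs, relevant_spans):
    """docs: list of token lists; relevant_spans: list of sets of
    (doc_idx, token_idx) positions whose extracted type is in the ontology."""
    total_tokens = sum(len(doc) for doc in docs)
    total_relevant = sum(len(spans) for spans in relevant_spans)
    return total_relevant / total_tokens

# Toy example: 2 documents, 3 ontology-matched event triggers overall.
docs = [["the", "bomb", "exploded", "near", "the", "market"],
        ["police", "arrested", "a", "suspect", "after", "the", "attack"]]
event_positions = [{(0, 2)}, {(1, 1), (1, 6)}]  # "exploded", "arrested", "attack"
print(round(relevance_ratio(docs, event_positions), 3))
```

The same function applied once with event positions and once with argument positions yields the two rows of Table 1.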
Given a topic for which we want to create a schema, we generate multiple texts that discuss the topic event using the OpenAI GPT-3 API3 with the Davinci-instruct-beta-v3 model, and we also experiment with the Hugging Face GPT-2 API4. We use three prompting methods to generate documents of diverse genres, as follows:
News Articles: We begin by generating a headline using the prompt: “Write a news headline about topic.” The output from this prompt is then used in the following prompt: “Write a news story titled headline.” The output from the second prompt is added to the pool of generated texts. The process is repeated 30 times. See example in Fig. 2b.
2 The full ontology definition can be accessed at this link: https://bit.ly/3mIWJoN.
3 https://openai.com/blog/openai-api/.
4 https://huggingface.co/gpt2
Prompt: What are the steps involved in holding a summit meeting? 1.
Output:
1. Plan the agenda and objectives of the meeting.
2. Select a venue that is appropriate for the meeting’s purpose and size.
3. Send out invitations and ensure all participants have the appropriate information.
4. Prepare for the meeting by creating materials and briefing the participants.
5. Hold the meeting and follow up on any action items that are generated.
(a) Example generated text of steps involved in the event of “holding a summit meeting”.
First prompt: Write a news headline about filing a patent.
Output: New Patent Filing Could Speed Up Drug Development.
Second prompt: Write a news story titled "New Patent Filing Could Speed Up Drug Development".
Output: The United States Patent and Trademark Office has just published a new patent that could speed up the drug development process. The patent, filed by a team of researchers from the DPharm Institute, covers a new drug screening process that could help identify potential new drugs much more quickly. The new process, which is based on a computer algorithm, is said to be up to 10 times faster than traditional methods. It can identify potential new drugs by screening through large databases of existing drugs and drug candidates. The DPharm Institute is already in the process of licensing the patent to several pharmaceutical companies, and is planning to commercialize the technology over the next few years.
(b) Example generation of news-like text for the event of “filing a patent”.
Prompt: Describe how to 3D print an object?
Output: To 3D print an object, you need to first design the object in a 3D modeling program. Once the object is designed, you need to export it as a 3D printable file format. Next, you need to choose a 3D printer and print settings. Finally, you need to print the object.
(c) Example generation of a How-To article for the event of “3D printing”.
Figure 2: Examples of generated texts using different prompting methods. The highlighted tokens display relevant events that will be extracted in the information extraction step.
How-To Articles: We use the prompt: “Describe how to topic.” to generate wikiHow-like instruction articles. The process is repeated 30 times. See example in Fig. 2c.
Direct Step-by-Step Schema: We use the prompt: “What are the steps involved in topic? 1.”5 to directly generate a schema. We run this process once. See example in Fig. 2a.
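The three prompting methods amount to simple string templates whose outputs are sent to a text-generation API (for the news genre, the first output is substituted into the second template). A minimal sketch of the templates, with the actual LM call left out since its exact signature is not specified here:

```python
# Sketch of the three prompt templates used for document generation.
# The LM call itself is omitted; for news articles, the headline
# returned by the first prompt is substituted into the second.

def news_headline_prompt(topic: str) -> str:
    return f"Write a news headline about {topic}."

def news_story_prompt(headline: str) -> str:
    return f'Write a news story titled "{headline}".'

def howto_prompt(topic: str) -> str:
    return f"Describe how to {topic}."

def steps_prompt(topic: str) -> str:
    # The trailing "1." nudges the LM to continue a numbered list of steps.
    return f"What are the steps involved in {topic}? 1."

print(steps_prompt("planning a wedding"))
```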
Generating documents of various genres enables our model to induce comprehensive schemas on any given topic. Considering that some events are more likely to be in the news (e.g., elections, pandemic outbreaks) while others are more technical in nature and are hence less newsworthy (such as earning a Ph.D. degree or planning a wedding), we generate diverse texts and then use a ranking model to choose the most relevant documents.
The ranking process includes embedding the texts and the topic with the model proposed in Reimers and Gurevych (2019), and then calculating the cosine similarity between each text and the topic embeddings. Only the 30 texts closest to the topic are selected, together with the output from the direct step-by-step schema. The following section describes the next step in generating a schema: extracting relevant information from the texts.
5 The “1.” in the prompt is for the LM to automatically complete the steps.
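The ranking step is a cosine-similarity top-k selection over sentence embeddings. The sketch below operates on precomputed embedding vectors; the sentence encoder itself (Sentence-BERT in the paper) is assumed and not shown:

```python
import numpy as np

def rank_documents(topic_emb, doc_embs, k=30):
    """Return indices of the k documents most cosine-similar to the topic.

    topic_emb: (d,) embedding of the topic string.
    doc_embs: (n, d) embeddings of the generated documents.
    Both are assumed to come from a sentence encoder such as Sentence-BERT.
    """
    topic = topic_emb / np.linalg.norm(topic_emb)
    docs = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = docs @ topic                    # cosine similarity per document
    return np.argsort(-sims)[:k].tolist()  # most similar first

# Toy check: the document pointing the same way as the topic ranks first.
topic = np.array([1.0, 0.0])
docs = np.array([[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])
print(rank_documents(topic, docs, k=2))
```

In the paper's setting, k=30 and the direct step-by-step output is always appended to the selected set.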
4 Information Extraction
For each document, we extract event triggers, arguments, and relations between the events that are important and relevant to the schema topic. We do not work with a predefined ontology that defines in advance what events and arguments are salient, because we allow generating a schema on any topic. Instead, we employ a statistical approach: we extract all the information and later filter it down to include just the frequent items. The steps involved in our information extraction pipeline are as follows:
Semantic Role Labeling (SRL): We use the SOTA SRL system6 trained on CoNLL12 (Pradhan et al., 2012) and the NomBank dataset (Meyers et al., 2004) to extract both verb and nominal event triggers and arguments.
Named Entity Recognition (NER): We employ the SOTA NER model (Guo and Roth, 2021) to extract and map entities (potential arguments of events) into entity types defined in the CoNLL 2002 dataset (Tjong Kim Sang, 2002) and the LORELEI project (Strassel and Tracey, 2016).
Constituency Parsing: The arguments extracted by SRL can be clauses and long phrasal nouns, hence we employ the AllenNLP7 constituency parsing model for argument head word extraction.
6 https://cogcomp.seas.upenn.edu/page/demo_view/SRLEnglish
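Head-word extraction collapses a long argument phrase to its head noun. The sketch below uses a simplified heuristic over POS-tagged tokens (rightmost noun before any preposition) rather than the full constituency-based head finding the paper employs; it is illustrative only, and the POS tagger is assumed:

```python
# Simplified head-word heuristic for an SRL argument span: take the
# rightmost noun-tagged token before any preposition, so PP modifiers
# ("of the red truck") do not shift the head. This approximates, but is
# not, the constituency-parse-based head finding used in the paper.

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}

def head_word(tagged_span):
    """tagged_span: list of (token, POS) pairs for one argument phrase."""
    core = []
    for token, pos in tagged_span:
        if pos == "IN":  # stop at the first preposition
            break
        core.append((token, pos))
    for token, pos in reversed(core or tagged_span):
        if pos in NOUN_TAGS:
            return token
    return (core or tagged_span)[-1][0]  # fall back to the last token

arg = [("the", "DT"), ("driver", "NN"), ("of", "IN"),
       ("the", "DT"), ("red", "JJ"), ("truck", "NN")]
print(head_word(arg))
```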
Coreference Resolution: We use the SOTA model (Yu et al., 2022) for event and entity coreference resolution to identify within-document coreferential relations.
Temporal Relation Extraction: We first try to use SOTA models (Ning et al., 2019; Zhou et al., 2021) to predict the temporal relations8 between all possible pairs of extracted events, but since the SOTA models accept two sentences containing events as input, the inference time9 for an n-event document is O(n^2), making the schema induction process several hours long.
One-Pass Model: We develop a One-Pass model that takes the document as input and uses the contextual representation of events to predict relations between them. A document D is represented as a sequence of tokens D = [t1, ..., e1, ..., e2, ..., tn], where some of the tokens belong to the set of annotated event triggers, i.e., E_D = {e1, e2, ..., ek}, whereas the rest are other lexemes. We employ the transformer-based language model Big Bird (Zaheer et al., 2020) to encode a whole document and obtain the contextualized representations for all the event mentions. These representations are fed into a multi-layer perceptron in a pairwise fashion, and the cross-entropy loss for each pair is calculated and accumulated for a batch of documents. As shown in Tab. 2, the inference time is shortened 63-186 times on average, while the performance of the One-Pass model is comparable to SOTA models.
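The pairwise scoring stage of the One-Pass model can be sketched as follows. The document encoder is replaced by pre-extracted event vectors and the MLP weights are random, so this shows the data flow only; the dimensions and layer sizes are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: d-dim event representations, 4 relation labels
# (e.g., BEFORE, AFTER, EQUAL, VAGUE for the temporal case).
d, hidden, n_labels = 8, 16, 4

# Stand-ins for the contextualized event vectors a document encoder
# (Big Bird in the paper) would produce for k event mentions; the
# document is encoded once, so these come from a single forward pass.
k = 3
event_reprs = rng.normal(size=(k, d))

# A tiny randomly initialized MLP over concatenated event-pair vectors.
W1 = rng.normal(size=(2 * d, hidden))
W2 = rng.normal(size=(hidden, n_labels))

def score_pairs(reprs):
    """Score every ordered event pair; the expensive encoding is done once,
    so only this cheap MLP runs O(k^2) times."""
    scores = {}
    for i in range(len(reprs)):
        for j in range(len(reprs)):
            if i == j:
                continue
            pair = np.concatenate([reprs[i], reprs[j]])
            logits = np.maximum(pair @ W1, 0.0) @ W2  # ReLU MLP
            scores[(i, j)] = int(np.argmax(logits))   # predicted label id
    return scores

preds = score_pairs(event_reprs)
print(len(preds))  # k * (k - 1) ordered pairs
```

The speedup comes from this split: the O(n^2) pairwise step no longer re-runs the large LM, only the small MLP.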
Hierarchical Relation Extraction: The extremely long inference time of SOTA models for predicting hierarchical relations (PARENT-CHILD, CHILD-PARENT, COREF, NOREL) (Zhou et al., 2020; Wang et al., 2021) also impairs the efficiency of our schema induction system. Thus we use the same One-Pass methodology to extract hierarchical relations. We observe that the inference time is greatly shortened, and the One-Pass model achieves comparable results to previous models while taking up less GPU memory (see Tab. 2).
7 https://demo.allennlp.org/constituency-parsing.
8 The possible temporal relations (start-time comparison) are: BEFORE, AFTER, EQUAL, and VAGUE.
9 The inference time is mostly spent on obtaining the contextual representation of events using large fine-tuned LMs.

Corpus    Model                F1 score   Speed     GPU Memory
HiEve     Zhou et al. (2020)   0.489      -         -
HiEve     Wang et al. (2021)   0.522      41.68s    4515MiB
HiEve     One-Pass model       0.472      0.65s     2941MiB
MATRES    Ning et al. (2019)   0.767      30.12s    4187MiB
MATRES    Zhou et al. (2021)   0.821      89.36s    9311MiB
MATRES    One-Pass model       0.768      0.48s     2419MiB

Table 2: Performance comparison between the One-Pass model and SOTA models for event temporal and hierarchical relation extraction. We report F1 scores on benchmark datasets (HiEve for hierarchical relations, MATRES for temporal relations), speed (average inference time for 100 event pairs), and required GPU memory during inference. The One-Pass models are 63-186 times faster than SOTA models and take up only 26%-65% of the GPU memory required by SOTA models.
After processing the data using the procedure described above, we get a list of events, their arguments, and relations between the events. We concentrate on events and relations that frequently appear in the generated texts, since we assume those are the most important to add to the schema (without any other source of information that could identify what is salient). We describe the process of building a schema in the following section.
5 Schema Induction
To consolidate the information extracted from the previous step, we build a schema as follows:
Make a list of events and relations: To compare similar event mentions in different texts, we compare the event trigger itself (whether they are the same verb or coreferential verbs10) and the NER types of its arguments. For example, the trigger “(take) precautions” appeared in 5 documents generated for the topic of Pandemic Outbreak. In two documents the subject of the verb phrase “take precautions” was “residents”, in another two it was “people”, and in the last one it was “public”. Nevertheless, the NER type is identical in all cases (PER), and thus we set the frequency of “(take) precautions” to 5. Similarly, we calculate the frequency of the temporal and hierarchical relations. We only consider relations and events that appeared
10 We only consider coreferential and hierarchical relations if they appear in more than 2 documents.
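The merging step above amounts to counting event triggers across documents, keyed by the trigger and the NER type of its subject. A minimal sketch (the mention representation here is an illustrative assumption about the IE pipeline's output format):

```python
from collections import Counter

# Sketch of merging event mentions across generated documents: mentions
# with the same trigger and the same subject NER type count as one
# schema event, regardless of the literal subject string.

def event_frequencies(documents):
    """documents: list of docs, each a list of
    (trigger, subject_ner_type) mentions extracted by the IE pipeline."""
    counts = Counter()
    for doc in documents:
        # Count each (trigger, type) at most once per document.
        counts.update({(trigger, ner) for trigger, ner in doc})
    return counts

docs = [
    [("take precautions", "PER")],  # subject: "residents"
    [("take precautions", "PER")],  # subject: "people"
    [("take precautions", "PER")],  # subject: "public"
    [("wash hands", "PER")],
]
freqs = event_frequencies(docs)
print(freqs[("take precautions", "PER")])
```

The resulting counts feed the frequency threshold that decides which events and relations enter the schema.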