EtriCA: Event-Triggered Context-Aware Story Generation
Augmented by Cross Attention
Chen Tang1, Chenghua Lin2, Henglin Huang1, Frank Guerin1 and Zhihao Zhang3
1Department of Computer Science, The University of Surrey, UK
2Department of Computer Science, The University of Sheffield, UK
3School of Economics and Management, Beihang University, Beijing, China
{chen.tang,hh01034,f.guerin}@surrey.ac.uk
c.lin@sheffield.ac.uk,zhhzhang@buaa.edu.cn
Abstract
One of the key challenges of automatic story
generation is how to generate a long narrative
that can maintain fluency, relevance, and
coherence. Despite recent progress, current story
generation systems still face the challenge of
how to effectively capture contextual and event
features, which has a profound impact on a
model’s generation performance. To address
these challenges, we present EtriCA, a novel
neural generation model, which improves the
relevance and coherence of the generated stories
through residually mapping context features to
event sequences with a cross-attention mech-
anism. Such a feature capturing mechanism
allows our model to better exploit the logical
relatedness between events when generating
stories. Extensive experiments based on both
automatic and human evaluations show that our
model significantly outperforms state-of-the-art
baselines, demonstrating the effectiveness of our
model in leveraging context and event features.
1 Introduction
Story Generation aims to generate fluent, relevant and
coherent narratives conditioned on a given context.
As the task is notoriously difficult, a common strategy
is to employ storylines composed of events to support
the generation process (Yao et al.,2019;Chen et al.,
2021;Alhussain and Azmi,2021;Tang et al.,2022b).
This process imitates the behaviour of human writers: a story starts from a sketch of keywords containing events, and the writer then unfolds the story following the track of the planned event sequences.
Despite recent progress, existing approaches are
still ineffective in exploiting planned events when
generating stories. Usually, pre-trained generation
models, e.g., BART (Goldfarb-Tarrant et al.,2020;
Clark and Smith,2021;Huang et al.,2022) are em-
ployed to generate stories after event planning. How-
*Corresponding author.
Figure 1: Conditioned on leading context and reference
events (extracted from reference stories), existing gen-
eration models still suffer from problems of relevance
and coherence. For instance, we fine-tune BART (Lewis
et al.,2020) to generate stories. The leading context
and reference text in this example are collected from
ROC Stories (Mostafazadeh et al.,2016). Some conflicts
among them are observed and coloured.
ever, as shown by the conflicts in Figure 1, the sep-
arate sentences generated by BART look reasonable,
but there are several issues observed considering the
whole story: as a commonsense story, if the car needs to be “fixed and replaced”, then it is too broken to “drive around”; “Ken” should not drive the car “very fast” in the “snow”; if “Ken” “got stuck in the ditch” or “lost traction”, he cannot then be “driving long distances”. We hypothesise that these problems come
from the inadequacy of capturing contextual features
when keeping track of event sequences, because (i) the
planned events generally lack background information,
e.g., Ken (the character) and snow (the scene) and (ii)
training stories may have the same events but different reference stories, which may lead to confusion during inference if the story-specific scenario is not considered.
arXiv:2210.12463v1 [cs.CL] 22 Oct 2022
Therefore, to address these challenges we propose EtriCA, a novel Event-Triggered Context-Aware end-to-end framework for story generation. Given
both leading context and planned events, EtriCA can
more effectively capture contextual and event features
from inputs than state-of-the-art (abbr. SOTA)
baseline models. Traditional generation models struggle to learn contextual representations while implicitly keeping track of the state of events, due to the feature differences between events and contexts. As an
abstract storyline, an event sequence only contains
schematic information related to actions (e.g. the
verb), while the context usually records story-specific
details, e.g., the scene and characters in a story.
To comprehensively leverage both features, we
draw inspiration from prior work dealing with infor-
mation fusion (Chen et al.,2018;Xing et al.,2020;He
et al.,2020;You et al.,2020;Wang et al.,2021;Tang
et al.,2022a) to encode heterogeneous features with
a cross attention mechanism (Gheini et al.,2021). We
aim to inform our model of the context background
when the neural module unfolds each event into a
narrative. We propose a novel neural module that
learns to implicitly map contextual features to event
features through information fusion on their numeric
vector spaces (we call this process contextualising
events). The whole process is illustrated in Figure 2.
With the contextualised event features, an autoregres-
sive decoder is employed to dynamically generate
stories by learning to unfold the contextualised events.
We also introduce an auxiliary task of Sentence
Similarity Prediction (Guan et al.,2021) to enhance
the coherence between event-driven sentences.
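As a rough sketch, the contextualising step described above can be viewed as single-head scaled dot-product cross attention (event states as queries, leading-context states as keys/values) followed by a residual connection. The NumPy formulation below is an illustrative simplification under our own assumptions (single head, no learned projections, made-up dimensions), not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def contextualise_events(event_h, context_h):
    """Fuse context features into event features via cross attention.

    event_h:   (m, d) hidden states of the planned events (queries).
    context_h: (n, d) hidden states of the leading context (keys/values).
    """
    d = event_h.shape[-1]
    # Each event position attends over all context tokens.
    scores = event_h @ context_h.T / np.sqrt(d)   # (m, n)
    attn = softmax(scores, axis=-1)               # rows sum to 1
    fused = attn @ context_h                      # (m, d) context summary per event
    # Residual mapping: add the attended context onto the event features.
    return event_h + fused

rng = np.random.default_rng(0)
events = rng.normal(size=(4, 8))    # 4 planned events, feature size 8
context = rng.normal(size=(6, 8))   # 6 leading-context tokens
out = contextualise_events(events, context)
print(out.shape)  # (4, 8)
```

The contextualised features `out` would then be fed to the autoregressive decoder in place of the raw event features.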
To support research on event-driven story gener-
ation, we propose a new task formulated by writing
stories according to a given leading context and event
sequence. We improve the event extraction framework
of Chen et al. (2021) by exploiting dependency
parsing to capture event related roles from sentences,
instead of using heuristic rules. We also present two
datasets where multi-sentence narratives from existing
datasets are paired with event sequences using our au-
tomatic event extraction framework. Importantly, our
task formulation can also benefit the study of control-
lable story generation, considering there is increasing
interest in storyline-based neural generative frame-
works (Xu et al.,2020;Ghazarian et al.,2021;Chen
et al.,2021). According to our extensive experiments,
EtriCA performs better than baseline models consid-
ering the metrics of fluency, coherence, and relevance.
Our contributions1 can be summarised as follows:
• A new task formulation for event-driven story writing, which requires the generation model to write stories according to a given leading context and event sequence.
• We annotate event sequences on two existing popular datasets for our new task, and introduce new automatic metrics based on semantic embeddings to measure the coherence and relevance of the generated stories.
• We propose a neural generation model, EtriCA, which leverages the context and event sequence with an enhanced cross-attention based feature capturing mechanism and sentence-level representation learning.
• We conduct a range of experiments to demonstrate the advantages of our proposed approach, and comprehensively analyse the underlying characteristics contributing to writing a more fluent, relevant, and coherent story.
2 Related Work
2.1 Neural Story Generation
Before the surge of deep learning techniques, story
generation models only generated simple sentences
and heavily relied on manual designs (McIntyre and
Lapata,2009;Woodsend and Lapata,2010;McIntyre
and Lapata,2010;Huang and Huang,2013;Kybartas
and Bidarra,2016). With the advent of neural story generation, end-to-end neural models, especially
pre-trained models, e.g., BART (Lewis et al.,2020)
and GPT-2 (Radford et al.,2019), are widely em-
ployed as the main module of story writing (Rashkin
et al.,2020;Guan et al.,2020;Goldfarb-Tarrant et al.,
2020;Clark and Smith,2021). However, it is hard to guarantee logical correctness for naive Seq2Seq models as the generated text grows longer, so recent work explores multi-step generation, which implements neural models in traditional generative
pipelines (Guan et al.,2021). For example, Yao et al.
(2019); Goldfarb-Tarrant et al. (2020); Chen et al.
(2021) split story generation into planning (inputs to
events) and writing (events to stories), and leverage
two neural generation models to learn them.
2.2 Event Planning for Story Generation
At the planning stage, prior research (Yao et al.,2019;
Rashkin et al.,2020;Goldfarb-Tarrant et al.,2020;
1The related code is available at https://github.com/tangg555/EtriCA-storygeneration
Figure 2: The overview of the event feature contextualising process. The leading context coloured in red contains some
important information which affects the generation process, e.g., the weather "snows" may lead to "accident". These
implicit clues help the neural generator to disambiguate the context of events. We firstly fuse both context and event
features, and then feed them to the generator.
Figure 3: An example illustrating the process of event extraction. TOK is the basic unit of a sentence, POS is the part of speech, and DEP stands for dependencies between tokens. Through dependency parsing, the event trigger (also recognised as the root of the sentence) filters all significant roles to represent a complete action. Meanwhile, extracted neighbouring events are considered to have temporal relations.
Jhamtani and Berg-Kirkpatrick,2020;Ghazarian
et al.,2021) mostly focused on extracting event
sequences from the reference text as the ground truths
of plot planning, and then leveraged neural models
(Radford et al.,2019;Lewis et al.,2020) to predict
events with a given leading context or titles. Events can be represented in many formats, e.g., verbs, tuples, keywords, etc. Among them, a straightforward approach is to extract verbs as events (Jhamtani and Berg-Kirkpatrick, 2020; Guan et al., 2020; Kong et al., 2021), which is also the method we follow. However, verbs alone are insufficient to preserve information integrity. For instance, semantic roles such as negation (not) are significant for correct understanding. Peng and Roth (2016) and Chen et al. (2021) use heuristic rules to include these semantic roles, but such rules do not cover all the key roles. Therefore, inspired by related work (Rusu et al., 2014; Björne and Salakoski, 2018; Huang et al., 2018) in open-domain event extraction, we propose an event extraction workflow based on dependency parsing to capture the essential components of verb phrases in sentences as events.
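A minimal sketch of such a dependency-based extractor is given below, assuming each sentence has already been run through an off-the-shelf dependency parser (e.g. spaCy). The token fields and the particular set of role relations are our own illustrative assumptions, not the authors' exact specification:

```python
# Dependency relations treated as significant roles of the event trigger
# (an illustrative set; the paper's exact role inventory may differ).
EVENT_ROLES = {"nsubj", "nsubjpass", "dobj", "neg", "prt", "aux"}

def extract_event(tokens):
    """Extract one event (trigger verb plus its key roles) from a parsed sentence.

    `tokens` is a list of dicts: {"i": index, "text": str, "pos": str,
    "dep": dependency label, "head": index of the head token}.
    """
    # The event trigger is the root of the dependency tree (usually a verb).
    root = next(t for t in tokens if t["dep"] == "ROOT")
    # Keep the trigger and its direct dependents that fill significant roles.
    kept = [t for t in tokens
            if t["i"] == root["i"]
            or (t["head"] == root["i"] and t["dep"] in EVENT_ROLES)]
    # Restore surface order so the event reads as a verb phrase.
    return " ".join(t["text"] for t in sorted(kept, key=lambda t: t["i"]))

# Hand-annotated parse of "Ken did not fix the car" (simplified).
sentence = [
    {"i": 0, "text": "Ken", "pos": "PROPN", "dep": "nsubj", "head": 3},
    {"i": 1, "text": "did", "pos": "AUX",   "dep": "aux",   "head": 3},
    {"i": 2, "text": "not", "pos": "PART",  "dep": "neg",   "head": 3},
    {"i": 3, "text": "fix", "pos": "VERB",  "dep": "ROOT",  "head": 3},
    {"i": 4, "text": "the", "pos": "DET",   "dep": "det",   "head": 5},
    {"i": 5, "text": "car", "pos": "NOUN",  "dep": "dobj",  "head": 3},
]
print(extract_event(sentence))  # the negation "not" is preserved in the event
```

Note how the determiner "the" is dropped while the negation "not" survives, which addresses the information-integrity problem of verb-only events.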
3 Methodology
3.1 Task Formulation
Under the umbrella of controllable story generation
we define the following task: write a story that
leverages both the given leading context and a
given planned event sequence. Our primary goal
is to investigate how to consider the context while
keeping track of the given event sequence with
neural generation models, so we expand the original
context-aware story generation settings of Guan et al.
(2021) by adding an event sequence, for each leading
context, as the storyline to follow.
Input: The input for each sample includes a leading context C = {c1, c2, ..., cn}, which acts as the first sentence of a story, and an event sequence E = {e1, e2, ..., em} as a storyline to build up a sketch for a story. ci denotes the i-th token of the leading context.