
reference stories, which may lead to confusion during
inference if the story-specific scenario is not considered.
Therefore, to address these challenges we propose
EtriCA, a novel Event-Triggered Context-Aware
end-to-end framework for story generation. Given
both leading context and planned events, EtriCA can
more effectively capture contextual and event features
from inputs than state-of-the-art (abbr. SOTA)
baseline models. Traditional generation models
struggle to learn contextual representations while
implicitly keeping track of the state of events, owing
to the differing features of events and contexts. As an
abstract storyline, an event sequence contains only
schematic information related to actions (e.g., the
verb), while the context usually records story-specific
details, e.g., the scene and characters in a story.
To comprehensively leverage both features, we
draw inspiration from prior work dealing with
information fusion (Chen et al., 2018; Xing et al., 2020;
He et al., 2020; You et al., 2020; Wang et al., 2021;
Tang et al., 2022a) to encode heterogeneous features
with a cross-attention mechanism (Gheini et al., 2021).
aim to inform our model of the context background
when the neural module unfolds each event into a
narrative. We propose a novel neural module that
learns to implicitly map contextual features to event
features through information fusion on their numeric
vector spaces (we call this process contextualising
events). The whole process is illustrated in Figure 2.
With the contextualised event features, an
autoregressive decoder is employed to dynamically
generate stories by learning to unfold the contextualised events.
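The contextualising operation described above can be illustrated as a scaled dot-product cross attention in which event vectors act as queries over the leading-context vectors. The following is a minimal NumPy sketch under simplifying assumptions: the learned projection matrices, multiple attention heads, and layer normalisation of the actual model are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def contextualise_events(events, context):
    """Cross attention: event vectors (queries) attend over
    context vectors (keys/values), fusing story-specific
    details into the schematic event representations."""
    d_k = events.shape[-1]
    scores = events @ context.T / np.sqrt(d_k)   # (n_events, n_ctx)
    weights = softmax(scores, axis=-1)           # rows sum to 1
    fused = weights @ context                    # (n_events, d)
    # residual connection keeps the original event signal
    return events + fused

rng = np.random.default_rng(0)
ev = rng.normal(size=(4, 8))    # 4 planned events
ctx = rng.normal(size=(6, 8))   # 6 leading-context tokens
out = contextualise_events(ev, ctx)
assert out.shape == (4, 8)
```

In the full model, these fused representations would feed the autoregressive decoder; here the residual connection simply preserves the original event signal alongside the attended context.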
We also introduce an auxiliary task of Sentence
Similarity Prediction (Guan et al., 2021) to enhance
the coherence between event-driven sentences.
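One plausible shape for this auxiliary objective, sketched here as an assumption rather than the exact formulation, is to regress the cosine similarity of adjacent sentence representations onto target similarities obtained from an off-the-shelf sentence encoder:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def sentence_similarity_loss(sent_reps, target_sims):
    """MSE between predicted similarities of adjacent sentence
    representations and target similarities (a hypothetical
    form of the auxiliary task; the original may differ)."""
    preds = [cosine(sent_reps[i], sent_reps[i + 1])
             for i in range(len(sent_reps) - 1)]
    return float(np.mean((np.array(preds) - np.array(target_sims)) ** 2))

reps = [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
loss = sentence_similarity_loss(reps, [1.0, 0.0])  # near zero
```

Training with such a loss encourages the decoder's sentence-level representations to reflect how strongly consecutive event-driven sentences should relate, which is the intuition behind the coherence enhancement.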
To support research on event-driven story generation,
we propose a new task formulated as writing
stories according to a given leading context and event
sequence. We improve the event extraction framework
of Chen et al. (2021) by exploiting dependency
parsing to capture event-related roles from sentences,
instead of using heuristic rules. We also present two
datasets where multi-sentence narratives from existing
datasets are paired with event sequences using our
automatic event extraction framework. Importantly,
our task formulation can also benefit the study of
controllable story generation, considering the increasing
interest in storyline-based neural generative
frameworks (Xu et al., 2020; Ghazarian et al., 2021;
Chen et al., 2021). According to our extensive experiments,
EtriCA outperforms baseline models on
the metrics of fluency, coherence, and relevance.
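The dependency-based role capture described above can be sketched as follows. To keep the example self-contained, the parse is given as pre-computed (index, word, lemma, dependency label, head index) tuples standing in for the output of a real dependency parser such as spaCy; the nsubj/dobj role inventory here is an illustrative assumption, not the framework's full rule set.

```python
# Each token: (index, word, lemma, dep_label, head_index)
def extract_event(parsed_sentence):
    """Pick the root verb and its dependency-attached arguments
    (nsubj, dobj) to form a schematic event tuple."""
    root = next(t for t in parsed_sentence if t[3] == "ROOT")
    args = {t[3]: t[2] for t in parsed_sentence if t[4] == root[0]}
    return (args.get("nsubj", "-"), root[2], args.get("dobj", "-"))

sent = [
    (0, "Jenny", "jenny", "nsubj", 1),
    (1, "bought", "buy", "ROOT", 1),
    (2, "a", "a", "det", 3),
    (3, "dress", "dress", "dobj", 1),
]
assert extract_event(sent) == ("jenny", "buy", "dress")
```

Reading roles off the dependency graph in this way avoids the brittleness of hand-written surface-pattern rules, since the parser already normalises word order and attachment.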
Our contributions¹ can be summarised as follows:
• A new task formulation for event-driven story
writing, which requires the generation model to
write stories according to a given leading context
and event sequence.
• We annotate event sequences on two existing
popular datasets for our new task, and introduce new
automatic metrics based on semantic embeddings
to measure the coherence and relevance of the
generated stories.
• We propose a neural generation model, EtriCA,
which leverages the context and event sequence
with an enhanced cross-attention based feature
capturing mechanism and sentence-level representation
learning.
• We conduct a range of experiments to demonstrate
the advantages of our proposed approach, and
comprehensively analyse the underlying characteristics
contributing to writing a more fluent, relevant, and
coherent story.
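As an illustration of the embedding-based metrics mentioned in the second contribution, coherence and relevance can be sketched as mean cosine similarities over sentence embeddings. This is an assumed simplified form, not necessarily the exact metric definitions; the sentence embeddings could come from any off-the-shelf encoder.

```python
import numpy as np

def _cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def coherence(sent_embs):
    """Mean cosine similarity between adjacent story sentences."""
    return float(np.mean([_cos(sent_embs[i], sent_embs[i + 1])
                          for i in range(len(sent_embs) - 1)]))

def relevance(context_emb, sent_embs):
    """Mean cosine similarity between the leading context and
    each generated sentence."""
    return float(np.mean([_cos(context_emb, s) for s in sent_embs]))

v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])
```

Under this formulation, a story whose sentences drift semantically from one another scores low on coherence, while one that ignores the leading context scores low on relevance.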
2 Related Work
2.1 Neural Story Generation
Before the surge of deep learning techniques, story
generation models could only generate simple sentences
and relied heavily on manual designs (McIntyre and
Lapata, 2009; Woodsend and Lapata, 2010; McIntyre
and Lapata, 2010; Huang and Huang, 2013; Kybartas
and Bidarra, 2016). Since the advent of neural story
generation, end-to-end neural models, especially
pre-trained models such as BART (Lewis et al., 2020)
and GPT-2 (Radford et al., 2019), have been widely
employed as the main module for story writing (Rashkin
et al., 2020; Guan et al., 2020; Goldfarb-Tarrant et al.,
2020; Clark and Smith, 2021). However, it is hard
to guarantee logical correctness for naive Seq2Seq
models as the generated text grows longer, so
recent work explores multi-step generation, which
implements neural models within traditional generative
pipelines (Guan et al., 2021). For example, Yao et al.
(2019), Goldfarb-Tarrant et al. (2020), and Chen et al.
(2021) split story generation into planning (inputs to
events) and writing (events to stories), and leverage
two neural generation models to learn them.
2.2 Event Planning for Story Generation
At the planning stage, prior research (Yao et al., 2019;
Rashkin et al., 2020; Goldfarb-Tarrant et al., 2020;
¹The related code is available at https://github.com/tangg555/EtriCA-storygeneration