
Detecting Narrative Elements in Informational Text

Effi Levi¹, Guy Mor², Tamir Sheafer²,³, Shaul R. Shenhav²

¹ Institute of Computer Science, The Hebrew University of Jerusalem

² Department of Political Science, The Hebrew University of Jerusalem

³ Department of Communication and Journalism, The Hebrew University of Jerusalem

efle@cs.huji.ac.il

{guy.mor|tamir.sheafer|shaul.shenhav}@mail.huji.ac.il

Abstract

Automatic extraction of narrative elements from text, combining narrative theories with computational models, has been receiving increasing attention over the last few years. Previous works have utilized the oral narrative theory of Labov and Waletzky to identify various narrative elements in personal story texts. Instead, we direct our focus to informational texts, specifically news stories.

We introduce NEAT (Narrative Elements AnnoTation), a novel NLP task for detecting narrative elements in raw text. For this purpose, we designed a new multi-label narrative annotation scheme, better suited for informational text (e.g. news media), by adapting elements from the narrative theory of Labov and Waletzky (Complication and Resolution) and adding a new narrative element of our own (Success). We then used this scheme to annotate a new dataset of 2,209 sentences, compiled from 46 news articles from various category domains¹. We trained a number of supervised models in several different setups over the annotated dataset to identify the different narrative elements, achieving an average F1 score of up to 0.77. The results demonstrate the holistic nature of our annotation scheme as well as its robustness to domain category.

1 Introduction

Automatic extraction of narrative elements from texts is a multidisciplinary field of research, combining narrative theories with computational models, which has been receiving increasing attention over the last few years. Examples include modeling narrative structures for story generation (Gervás et al., 2006), using unsupervised methods to detect narrative event chains (Chambers and Jurafsky, 2008) and detecting content zones (Baiamonte et al., 2016) in news articles, using semantic features to detect narreme boundaries in fictitious prose (Delmonte and Marchesini, 2017), identifying turning points in movie plots (Papalampidi et al., 2019) and using temporal word embeddings to analyze the evolution of characters in the context of a narrative plot (Volpetti et al., 2020).

¹ https://github.com/efle/NEAT

A recent and more specific line of work focuses on using the theory laid out by Labov and Waletzky (1967) and later refined by Labov (2013) to characterize narrative elements in personal experience texts. Swanson et al. (2014) relied on Labov and Waletzky (1967) to annotate a corpus of 50 personal stories from weblog posts, and tested several models over hand-crafted features to classify clauses into three narrative clause types: orientation, evaluation and action. Ouyang and McKeown (2014) constructed a corpus from 20 oral narratives of personal experience collected by Labov (2013), and utilized logistic regression over hand-crafted features to detect instances of complicating actions. More recently, Li et al. (2017) utilized a combination of ideas from Labov and Waletzky (1967) and Freytag (1894) to annotate a collection of short stories, and Saldias and Roy (2020) used convolutional neural networks (CNNs) to classify clauses from spoken personal texts into the same three narrative clause types as Swanson et al. (2014).

While these works concentrated their effort on narrative analysis of personal experience texts, we direct our focus to detecting narrative patterns in informational texts, such as news stories. The social impact of news stories distributed by the media, and their role in creating and shaping public opinion, incentivized our efforts to adapt narrative analysis approaches to this domain. To the best of our knowledge, this is the first attempt to automatically detect narrative elements based on Labov and Waletzky (1967) and later works by Labov (1972, 2013) in news articles.

arXiv:2210.03028v1 [cs.CL] 6 Oct 2022

In this work, we introduce NEAT (Narrative Elements AnnoTation), a novel NLP task for detecting narrative elements in raw text. For this purpose, we adapted two elements from the narrative theory presented in Labov and Waletzky (1967) and Labov (1972, 2013), namely Complication and Resolution, while adding a new narrative element, Success, to create a new multi-label narrative annotation scheme. This scheme was designed with two main objectives in mind. First, capturing elements oriented towards discourse structure, rather than semantic content. Second, possessing the flexibility required to capture narrative characteristics within a wide variety of text types, specifically informational text (as opposed to personal experience), and not only literary and well-structured stories. We used this scheme to annotate a newly-constructed dataset of 2,209 sentences, compiled from 46 English news articles; each sentence was tagged with a subset of the three narrative elements (or, in some cases, none of them), thus defining a novel multi-label classification task.
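To make the multi-label formulation concrete, the following is a minimal sketch of how a sentence's tag subset could be represented as a multi-hot vector. The names `LABELS`, `encode` and `decode`, and the label order, are assumptions made for illustration; they are not taken from the NEAT repository.

```python
from typing import Iterable, List

# Label order is an arbitrary choice made for this sketch.
LABELS = ("Complication", "Resolution", "Success")


def encode(tags: Iterable[str]) -> List[int]:
    """Turn a subset of narrative tags into a multi-hot vector over LABELS."""
    tag_set = set(tags)
    unknown = tag_set - set(LABELS)
    if unknown:
        raise ValueError(f"unknown narrative element(s): {sorted(unknown)}")
    return [int(label in tag_set) for label in LABELS]


def decode(vector: List[int]) -> List[str]:
    """Recover the tag names from a multi-hot vector."""
    return [label for label, bit in zip(LABELS, vector) if bit]


# A sentence may carry several elements at once, or none at all.
print(encode({"Complication", "Resolution"}))  # [1, 1, 0]
print(encode(set()))                           # [0, 0, 0]
print(decode([0, 0, 1]))                       # ['Success']
```

Under this representation the empty subset is a valid target, which is what distinguishes the task from ordinary three-way classification.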

We explored two different approaches towards solving our new task: splitting it into three unrelated binary classification tasks (Complication, Resolution and Success), and jointly learning the three narrative categories as a multi-label classification task. We experimented with three supervised models, each based on fine-tuning a different pre-trained language model: BERT (Devlin et al., 2018), RoBERTa (Liu et al., 2019) and DistilBERT (Sanh et al., 2020), achieving an average F1 score of up to 0.77. An analysis of the results indicates that our narrative categories are strongly connected and form a coherent narrative scheme which is more than just the sum of its parts. Additional experimentation with cross-domain classification demonstrates the task's robustness to domain category, suggesting that our annotation scheme is grounded more in discourse characteristics than in semantic context.
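As an illustration of how an average F1 score over the three categories can be computed, here is a minimal sketch that scores each label as its own binary problem and then takes the unweighted mean. The assumption that the reported average is a macro-average over labels is ours, and the toy `gold` and `pred` rows are hypothetical, not results from the paper.

```python
from typing import List, Sequence

LABELS = ("Complication", "Resolution", "Success")


def binary_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 of the positive class for one label (0.0 if there are no true positives)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold_rows: List[List[int]], pred_rows: List[List[int]]) -> float:
    """Unweighted mean of the per-label F1 scores (assumed averaging scheme)."""
    scores = []
    for j in range(len(LABELS)):
        gold_col = [row[j] for row in gold_rows]
        pred_col = [row[j] for row in pred_rows]
        scores.append(binary_f1(gold_col, pred_col))
    return sum(scores) / len(scores)


# Hypothetical multi-hot rows, one per sentence, in LABELS order.
gold = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]]
pred = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
print(round(macro_f1(gold, pred), 3))  # 0.778
```

The same scoring applies to both setups: three separate binary classifiers each produce one column of `pred`, whereas a joint multi-label model produces all three columns at once.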

The remainder of this paper is organized as follows: Section 2 gives a theoretical background and describes the adjustments we have made to the scheme in Labov (2013) in order to adapt it to informational text. Section 3 provides a complete description of the dataset and of the processes and methodologies which were used to construct and annotate it, along with a short analysis and some examples of annotated sentences. Section 4 describes the experiments conducted on the dataset, and Section 5 provides an analysis and a discussion of the results. Finally, Section 6 contains a summary of our contributions as well as several potential directions for future work.

2 Narrative Analysis

2.1 Background

Ever since the emergence of formalism and structuralist literary criticism (Propp, 1968) and throughout the development of narratology (Genette, 1980; Fludernik, 2009; Chatman, 1978; Rimmon-Kenan, 2003), narrative structure has been the focus of extensive theoretical and empirical research. While most of these studies were conducted in the context of literary analysis, the interest in narrative structures has made inroads into the social sciences (Shenhav, 2015). The classical work by Labov and Waletzky (1967) on oral narratives, as well as later works (Labov, 1972, 2013), exemplify this stream of research by providing a schema for the overall structure of narratives, according to which a narrative construction encompasses the following building blocks (Labov, 1972, 2013): abstract (what the narrative is about), orientation (information on the time, the place, the persons and the behavior involved), complicating action (or simply complication; the forward progression of narrative clauses), evaluation (establishing the narrative's "point"), resolution (what finally happened), and coda (bringing the time of reference back to the present time of narration). These building blocks provide useful and influential guidelines for the analysis of oral narratives.

2.2 Adaptation

Despite the substantial influence of Labov and Waletzky (1967) and Labov (2013), scholars in the field of communication have noticed that this overall structure does not necessarily comply with the form of informational text, such as news stories (Thornborrow and Fitzgerald, 2004; Van Dijk, 1988), and consequently proposed modified narrative structures (Thornborrow and Fitzgerald, 2004).

Unlike well-tailored narrative texts, such as personal experience texts, narrativity in informational text is somewhat more challenging to capture, as it does not necessarily follow conventional or predefined genre-related structures. This requires a flexible coding scheme, unconstrained by a specific type of text. Instead, it should be open to a wide range of text types (such as informational text), and allow the presence of micro stories, encompassing any combination of all narrative categories even at the sentence level. We set out to accomplish that via two objectives: first, formalizing narrative categories which are oriented towards discourse structure, rather than semantic context. Second, defining our task as a multi-label one, to allow the flexibility required to capture sentence-level narrative characteristics. Special consideration was given to the variety of contents, forms and writing styles typical of media texts. For example, we required a coding scheme that would fit laconic or problem-driven short reports (too short for a full-fledged "Labovian" narrative style), as well as complicated texts with multiple story-lines moving from one story to another. We addressed this challenge by focusing on two of Labov's six elements, complicating action and resolution, considered to be the most fundamental and relevant for informational text analysis (Labov, 2013). There are several reasons for our focus on these particular elements: first, it is in line with the understanding that worth-telling stories usually consist of protagonists facing and resolving problematic experiences (Eggins and Slade, 2005). Moreover, these elements resonate with what Entman (2004) considers to be the most important framing functions: problem definition and remedy.

                        Complication   Resolution   Success
# Sentences             1,092          541          312
Proportion in Dataset   49%            24%          14%

Table 1: Overview of the NEAT dataset. Note that the categories are not mutually exclusive, due to the multi-label nature of the annotation scheme.
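The proportions in Table 1 follow directly from the sentence counts and the corpus size of 2,209 sentences. The short check below recomputes them; the names `TOTAL_SENTENCES`, `COUNTS` and `proportions` are introduced here for illustration only.

```python
from typing import Dict

TOTAL_SENTENCES = 2209  # corpus size reported in the paper
COUNTS = {"Complication": 1092, "Resolution": 541, "Success": 312}  # Table 1


def proportions(counts: Dict[str, int], total: int) -> Dict[str, int]:
    """Share of sentences carrying each label, as a whole-number percentage."""
    return {label: round(100 * n / total) for label, n in counts.items()}


print(proportions(COUNTS, TOTAL_SENTENCES))
# {'Complication': 49, 'Resolution': 24, 'Success': 14}
```

Because a sentence can carry more than one label, these percentages are not expected to sum to 100, and their sum says nothing on its own about how many sentences carry no label.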

In order to adapt the original complicating action and resolution categories to informational content, we designed our annotation scheme as follows. Complicating action (hence, Complication) was defined in our narrative scheme as an event, series of events or situation that points to problems or tensions. Resolution refers to the way the story is resolved or to the release of the tension. An improvement from, or a manner of coping with, an existing or a hypothetical situation was also considered to be a Resolution. This choice was made in order to follow the often tentative or speculative notion of future resolutions in news stories (Thornborrow and Fitzgerald, 2004; Bell, 1991). We have therefore included in this category any temporary or partial resolutions. The transitional characteristic of the Resolution motivated us to add a new category defined as Success. Unlike Resolution, which refers, implicitly or explicitly, to a prior situation, this category was designed to capture any description or indication of an achievement or a desirable outcome.

3 The Dataset

3.1 Pilot Study

We started by conducting a pilot study, for the purpose of formalizing an annotation scheme and training our annotators. For this study, sample sentences were gathered from print news articles, published between 1995 and 2017 and collected via LexisNexis. These were used to refine the annotation scheme described in Section 2.2, as well as to perform extensive training for our annotators.

Following the conclusion of the pilot study, we used the sentences which were collected and manually annotated during the pilot to train a multi-label classifier. This classifier was later used to provide labeled candidates for the annotators during the annotation stage of the NEAT dataset, in order to optimize annotation rate and accuracy. The pilot samples were then discarded.

3.2 News Articles

The news articles for the dataset were sampled from leading news websites in the English language, all published between 2017 and 2020. The result is a corpus of 2,209 sentences taken from 46 news articles, with an average of 48 sentences per article (σ² = 39.44), and an average of 20.2 tokens per sentence (σ² = 11.2). The articles are semantically diverse, as they were sampled from a wide array of domain categories.

3.3 Preprocessing

The news articles' content was extracted using diffbot. The texts were scraped and split into sentences using the Punkt unsupervised sentence segmenter (Kiss and Strunk, 2006). Remaining segmentation errors were manually corrected.
