

An Ordinal Latent Variable Model of Conflict Intensity

Niklas Stoehr (ETH Zürich), Lucas Torroba Hennigen (MIT), Josef Valvoda (University of Cambridge), Robert West (EPFL), Ryan Cotterell (ETH Zürich), Aaron Schein (The University of Chicago)

niklas.stoehr@inf.ethz.ch, lucastor@mit.edu, jv406@cam.ac.uk, robert.west@epfl.ch, ryan.cotterell@inf.ethz.ch, schein@uchicago.edu

Abstract

Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of “who did what to whom” micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual–cooperative scale. It is based only on the action category (“what”) and disregards the subject (“who”) and object (“to whom”) of an event, as well as contextual information, like associated casualty count, that should contribute to the perception of an event’s “intensity”. To address these shortcomings, we take a latent variable-based approach to measuring conflict intensity. We introduce a probabilistic generative model that assumes each observed event is associated with a latent intensity class. A novel aspect of this model is that it imposes an ordering on the classes, such that higher-valued classes denote higher levels of intensity. The ordinal nature of the latent variable is induced from naturally ordered aspects of the data (e.g., casualty counts) where higher values naturally indicate higher intensity. We evaluate the proposed model both intrinsically and extrinsically, showing that it obtains good held-out predictive performance.

https://github.com/niklasstoehr/ordinal-conflict-intensity

1 Introduction

On a scale from $-10$ for conflictual to $+10$ for cooperative, which of the following events should be considered more “intense”: “Soldiers injured two civilians” or “Rebels detained fifty soldiers”?

Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in the automated collection and coding of events have produced massive and systematized data sets of micro-records that enable data-driven approaches to monitoring conflict. While the “intensity” of a given event has traditionally been assessed by human expert raters, the tremendous quantity of events collected every day makes case-by-case analysis unmanageable. As a consequence, there is a strong demand for automated and model-based methods to aggregate events into meaningful “conflict intensity” measures.

| CAMEO code | Action name | Goldstein value | Avg. # casualties |
|---|---|---|---|
| 19 | fight | -10.0 | 9.31 |
| 20 | mass violence | -10.0 | 42.20 |
| 18 | assault | -9.0 | 11.47 |
| 15 | force posture | -7.2 | 0.13 |
| 17 | coerce | -7.0 | 1.44 |
| 14 | protest | -6.5 | 2.06 |
| 13 | threaten | -6.0 | 0.13 |
| 10 | demand | -5.0 | 0.01 |
| 12 | reject | -4.0 | 0.00 |
| 16 | reduce relations | -4.0 | 0.00 |
| 9 | investigate | -2.0 | 0.04 |
| 11 | disapprove | -2.0 | 0.03 |
| 1 | public statement | 0.0 | 0.19 |
| 4 | consult | 1.0 | 0.03 |
| 2 | appeal | 3.0 | 0.00 |
| 5 | diplomatic cooperation | 3.5 | 0.03 |
| 3 | intent to cooperate | 4.0 | 0.05 |
| 8 | yield | 5.0 | 0.10 |
| 6 | material cooperation | 6.0 | 0.01 |
| 7 | provide aid | 7.0 | 0.00 |

Table 1: The Goldstein scale is an expert-based intensity ranking of the CAMEO action categories, ranging from $-10.0$ (conflictual) to $+10.0$ (cooperative). The scale disregards casualty counts that are typically considered in conflict assessment. Here, we display casualty counts as reported in the NAVCO dataset.

One of the most frequently used measures is the Goldstein scale (Goldstein, 1992). Major event datasets like IDEA (Bond et al., 2003), KEDS (Schrodt, 2008), GDELT (Leetaru and Schrodt, 2013), ICEWS (Boschee et al., 2015), Phoenix (Beieler, 2016) and NAVCO (Lewis et al., 2016) all rely on it. The Goldstein scale assigns intensity scores between $-10.0$ and $+10.0$ on a conflictual–cooperative scale to the action categories defined by the Conflict and Mediation Event Observations (CAMEO) event coding scheme (Schrodt, 2012). CAMEO specifies 204 low-level event types which are summarized into 20 high-level action categories. The Goldstein scale ranks “use unconventional mass violence” and “fight” as the most conflictual of the 20 high-level action categories ($-10.0$) and “provide aid” ($+7.0$) as the most cooperative; see Tab. 1.

arXiv:2210.03971v2 [cs.LG] 4 Jun 2023

Despite its usage, the Goldstein scale has many well-known shortcomings (King and Lowe, 2003; Schrodt, 2019). In particular, it applies only to action categories, and does not account for any contextual information of a given event, like which actors are involved, or how many fatalities resulted, among other bits of context that should contribute to the perception of an event’s “intensity”.

This paper takes a latent variable-based approach to measuring conflict intensity. We introduce a probabilistic generative model that assumes each observed event $n$ is associated with a latent intensity class $z_n$. A novel aspect of this model is that it imposes an ordering on the classes, such that higher values of $z_n$ denote higher levels of intensity. The ordinal nature of $z_n$ is induced from naturally ordered aspects of the data (e.g., casualty counts) where higher values naturally indicate higher intensity. The model effectively learns to interpolate the ordered (i.e., cardinal or ordinal) elements of the data while inferring correlation structure with the non-ordered (e.g., categorical) elements of the data (e.g., actor types).

We start with a discussion of the Goldstein scale and introduce a political event dataset annotated with Goldstein values in §2 and §3. Then, we propose our model with an ordinal latent variable in §4. We evaluate the performance of the model intrinsically (§5) and extrinsically (§6) and find that it improves over measures based on the original Goldstein scale or heuristics based on the raw data.

2 Limitations of the Goldstein Scale

The Goldstein scale is a widely-used measure of the conflictual versus cooperative nature of interactions between countries (Goldstein, 1992). The scale was created by a panel of international relations experts who ranked descriptions of interactions. It was initially created to score action categories in the WEIS event ontology (McClelland, 1984) and was later adapted to CAMEO (Schrodt, 2012).

The Goldstein scale applies only to the action category of an event (e.g., “fight” or “trade”). Thus, two different “fight” events, which might involve two different pairs of actors, occur at different times, or differ dramatically with respect to the number of associated fatalities, will still be assigned the same Goldstein value. The Goldstein scale is thus a poor measure of a conflict’s perceived “intensity”, as it ignores much of the information that contributes to that perception. Recent work in conflict studies, for instance, operationalizes “intensity” primarily using casualty counts (Chaudoin et al., 2017; Zhong et al., 2023), which the Goldstein scale ignores entirely.

In Tab. 1, we show the empirical distribution of assigned Goldstein values alongside the empirical distribution of casualty counts in a dataset of conflict events. The Goldstein scale is very coarse-grained; while it ostensibly ranges between $-10.0$ and $+10.0$, only a small number of discrete values ever occur, with many action categories assigned the same value. For the purpose of measuring conflict intensity, a finer-grained and more contextual scale is desirable.
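The coarseness of the scale can be checked directly against the values listed in Tab. 1. The following is a minimal sketch rather than part of the original analysis: the dictionary copies the Goldstein values from Tab. 1, while the two example events and their field names (`cameo_code`, `subject`, `object`, `casualties`) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Goldstein values of the 20 high-level CAMEO action categories (copied from Tab. 1).
GOLDSTEIN = {
    19: -10.0, 20: -10.0, 18: -9.0, 15: -7.2, 17: -7.0,
    14: -6.5, 13: -6.0, 10: -5.0, 12: -4.0, 16: -4.0,
    9: -2.0, 11: -2.0, 1: 0.0, 4: 1.0, 2: 3.0,
    5: 3.5, 3: 4.0, 8: 5.0, 6: 6.0, 7: 7.0,
}


def goldstein_score(event):
    """Score an event by its action category alone; every other field is ignored."""
    return GOLDSTEIN[event["cameo_code"]]


# How many distinct values occur, and which values are shared by several categories?
value_counts = Counter(GOLDSTEIN.values())
n_distinct = len(value_counts)
tied_values = sorted(v for v, k in value_counts.items() if k > 1)

# Two hypothetical "fight" events that differ in actors and casualty counts.
event_a = {"cameo_code": 19, "subject": "military", "object": "civilian", "casualties": 2}
event_b = {"cameo_code": 19, "subject": "political", "object": "military", "casualties": 50}
same_score = goldstein_score(event_a) == goldstein_score(event_b)
```

Because `goldstein_score` reads only the action code, the two events receive an identical score regardless of who is involved or how many casualties are reported, which is the limitation discussed in this section.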

3 Conflict Event Data

This paper considers the publicly available Nonviolent and Violent Campaigns and Outcomes (NAVCO) data collection (Chenoweth et al., 2018), specifically, the latest release NAVCO 3.0 from November 2019 which comprises $N = 112{,}089$ events between December 1990 and December 2012. An exemplary event description is “On 19 May 2012, soldiers injured two civilians in Afghanistan”. Each part of this description has been parsed by human coders into standardized, structural features. We color-code the features that correspond to the semantic roles subject, predicate, quantifier, object, which are the focus of our modeling approach. Each data point $n$ thus consists of a four-element tuple $\{s_n, p_n, q_n, o_n\}$. We note that events are further coded for their location (in this case, Afghanistan) and time (19 May 2012), among other bits of contextual information. Let us discuss each feature in more detail:

Subject. NAVCO contains columns termed “actor3”, “actor6” and “actor9” which code for the subject (or agent) of a given action. The actor types are defined by the CAMEO actor codebook. We first merge the higher-level categories “actor3” and “actor6”, resulting in a set of distinct actor types, and then map all actor types into one of $S = 4$ classes: $s_n \in \{\text{civilian}, \text{military}, \text{governmental}, \text{political}\}$. We present our 4-class actor type mapping in Tab. 3 of the appendix.

[Figure 1: four panels showing the subject type distribution (military, political, government, civilian), the predicate (Goldstein value) distribution over $[0, 1]$, the quantifier (casualty count) distribution over victim counts, and the object type distribution (military, government, political, civilian); the vertical axes show counts.]

Figure 1: Data distributions of the NAVCO 3.0 dataset (Chenoweth et al., 2018). Goldstein values and casualty counts have intrinsic intensity orderings. Goldstein values are reversed and transformed, so that $1.0$ represents the most conflictual. We model the subject $s_n$ as Categorical, the predicate $p_n$ as Beta, the quantifier $q_n$ as Zero-inflated Geometric and the object $o_n$ as Categorical.

Predicate (Goldstein values). NAVCO codes each event description into one of the 20 CAMEO action categories in the column “verb10”, which is by extension associated with a Goldstein value $p_n$. Throughout, we refer to the Goldstein value $p_n$ as an action’s “predicate”, since there is a one-to-one mapping between action categories and Goldstein values. We scale Goldstein values $p_n$ to the $[0, 1]$ range and invert them (i.e., $p_n \leftarrow 1 - p_n$) so that higher values close to $1$ represent more conflictual action categories.

Footnote: NAVCO features a 21st action category which we exclude since it is not specified by the CAMEO taxonomy and thus has no Goldstein value.

Quantifier (casualty counts). Each event description is annotated with human-verified fatality and wounded counts. We add the two and refer to the resulting value $q_n \in \mathbb{N}_{+}$ as an event’s “quantifier” or its “casualty count”. In Tab. 1, we give the average number of casualties associated with each action alongside its Goldstein value; as intuition might suggest, actions that Goldstein scores as more conflictual (e.g., “fight” ($-10.0$)) coincide with more casualties on average.

Object. Similar to its subject, NAVCO codes for the direct object or “target” of a conflict action using the CAMEO coding scheme; these codes are found in the columns “target3” and “target6”. We map those into the $O = 4$ classes so that $o_n \in \{\text{civilian}, \text{military}, \text{governmental}, \text{political}\}$.

Contextual information: location and time. Finally, each event is further annotated with a timestamp and location, which we use to design extrinsic evaluation tasks in §6.
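The feature construction described in this section can be sketched as follows. This is an illustration under stated assumptions, not the released preprocessing code: the actor lookup is a hypothetical placeholder (the actual mapping is given in Tab. 3 of the appendix), the record keys `fatalities` and `wounded` are hypothetical names, “verb10” is assumed to hold the integer CAMEO code, and the rescaling assumes min-max scaling over the nominal range $[-10, 10]$, which the text does not specify.

```python
# Hypothetical actor-code lookup; the real 4-class mapping is in Tab. 3 of the appendix.
ACTOR_CLASS = {
    "MIL": "military",
    "GOV": "governmental",
    "CVL": "civilian",
    "OPP": "political",
}

# Subset of the Goldstein values from Tab. 1, keyed by CAMEO action code.
GOLDSTEIN = {19: -10.0, 17: -7.0, 7: 7.0}


def rescale_goldstein(value, lo=-10.0, hi=10.0):
    """Scale a Goldstein value to [0, 1] and invert it, so that 1.0 is most conflictual.

    Assumption: min-max scaling over the nominal range [lo, hi].
    """
    return 1.0 - (value - lo) / (hi - lo)


def to_event_tuple(record):
    """Turn one coded record into the tuple (s_n, p_n, q_n, o_n)."""
    s = ACTOR_CLASS[record["actor3"]]                     # subject class
    p = rescale_goldstein(GOLDSTEIN[record["verb10"]])    # predicate in [0, 1]
    q = int(record["fatalities"]) + int(record["wounded"])  # casualty count
    o = ACTOR_CLASS[record["target3"]]                    # object class
    return s, p, q, o


# "Soldiers injured two civilians", coded here as a "fight" event for illustration.
record = {"actor3": "MIL", "verb10": 19, "fatalities": 0, "wounded": 2, "target3": "CVL"}
event = to_event_tuple(record)
```

Under these assumptions, a “fight” event maps to a predicate value of 1.0, the most conflictual end of the rescaled range, while the casualty count is simply the sum of the fatality and wounded counts.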

4 Ordinal Latent Variable Model

We operationalize conflict intensity as a latent variable that expresses the association between the observed variables subject ($s_n$), predicate ($p_n$), quantifier ($q_n$) and object ($o_n$). Each data point is a tuple $\{s_n, p_n, q_n, o_n\}$ representing an event. Our Bayesian latent variable model is depicted in Fig. 2. We assume the following generative story. For each event $n$, we assume that its event intensity class $z_n \in \{1, \dots, C\}$ is a Categorical random variable

$$z_n \sim \mathrm{Categorical}\big(\pi^{(z)}\big) \tag{1}$$

where $\pi^{(z)}$ is a $C$-dimensional discrete distribution over latent intensity classes. We place a Dirichlet prior over $\pi^{(z)}$

$$\pi^{(z)} \sim \mathrm{Dirichlet}\big(\alpha^{(z)}\big) \tag{2}$$

with concentration parameter $\alpha^{(z)} \in \mathbb{R}^{C}$. Conditioned on $z_n$, we assume each of the observed sites per event tuple $s_n$, $p_n$, $q_n$ and $o_n$ are then drawn as

$$s_n \mid z_n \sim \mathrm{Categorical}\big(\pi^{(s)}_{z_n}\big) \tag{3}$$

$$p_n \mid z_n \sim \mathrm{Beta}\big(\omega^{(p)}_{z_n}, \kappa^{(p)}_{z_n}\big) \tag{4}$$

$$q_n \mid z_n \sim \text{Zero-inflated Geometric}\big(\delta^{(q)}_{z_n}, b^{(q)}_{z_n}\big) \tag{5}$$

$$o_n \mid z_n \sim \mathrm{Categorical}\big(\pi^{(o)}_{z_n}\big) \tag{6}$$
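The generative story in Eqs. (1)–(6) can be illustrated by ancestral sampling. The sketch below makes several assumptions that the text does not confirm: all parameter values are hypothetical, classes are indexed from 0 rather than 1, the Beta parameters $\omega$ and $\kappa$ are treated as the two standard shape parameters, and the Zero-inflated Geometric is read as “emit 0 with probability $\delta$, otherwise draw a Geometric($b$) number of failures before the first success”. The ordering constraint on the classes is not modeled here.

```python
import math
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_c, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_zero_inflated_geometric(delta, b, rng):
    """Assumed form: 0 with probability delta, else a Geometric(b) count on {0, 1, ...}."""
    if rng.random() < delta:
        return 0
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - b))  # inverse-transform sampling


def sample_event(params, rng):
    """Sample one event (z_n, s_n, p_n, q_n, o_n) following Eqs. (1) and (3)-(6)."""
    z = sample_categorical(params["pi_z"], rng)                       # Eq. (1)
    s = sample_categorical(params["Pi_s"][z], rng)                    # Eq. (3)
    p = rng.betavariate(params["omega"][z], params["kappa"][z])       # Eq. (4)
    q = sample_zero_inflated_geometric(params["delta"][z], params["b"][z], rng)  # Eq. (5)
    o = sample_categorical(params["Pi_o"][z], rng)                    # Eq. (6)
    return z, s, p, q, o


rng = random.Random(0)
C = 3  # hypothetical number of intensity classes
params = {
    "pi_z": sample_dirichlet([1.0] * C, rng),                         # Eq. (2)
    "Pi_s": [[0.4, 0.2, 0.2, 0.2], [0.25, 0.25, 0.25, 0.25], [0.1, 0.6, 0.2, 0.1]],
    "Pi_o": [[0.3, 0.3, 0.2, 0.2], [0.25, 0.25, 0.25, 0.25], [0.6, 0.2, 0.1, 0.1]],
    "omega": [1.0, 2.0, 5.0],
    "kappa": [5.0, 2.0, 1.0],
    "delta": [0.9, 0.5, 0.1],
    "b": [0.8, 0.3, 0.05],
}
events = [sample_event(params, rng) for _ in range(1000)]
```

With these hypothetical values, higher-indexed classes place more Beta mass near 1 and produce larger casualty counts, which loosely mirrors the intended reading of higher classes as more intense.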

Here $\pi^{(s)}_{z_n}$ and $\pi^{(o)}_{z_n}$ are the discrete distributions for class $z_n$ over $S$ subject and $O$ object types, respectively. They are given as row vectors in the matrices $\Pi^{(s)} \in (0,1)^{C \times S}$ and $\Pi^{(o)} \in (0,1)^{C \times O}$ that
