

EventGraph at CASE 2021 Task 1: A General Graph-based Approach to Protest Event Extraction

Huiling You,1 David Samuel,1 Samia Touileb,2 and Lilja Øvrelid1

1 University of Oslo

2 University of Bergen

{huiliny, davisamu, liljao}@ifi.uio.no

samia.touileb@uib.no

Abstract

This paper presents our submission to the 2022 edition of the CASE 2021 shared task 1, subtask 4. The EventGraph system adapts an end-to-end, graph-based semantic parser to the task of Protest Event Extraction and more specifically subtask 4 on event trigger and argument extraction. We experiment with various graphs, encoding the events as either “labeled-edge” or “node-centric” graphs. We show that the “node-centric” approach yields best results overall, performing well across the three languages of the task, namely English, Spanish, and Portuguese. EventGraph is ranked 3rd for English and Portuguese, and 4th for Spanish. Our code is available at: https://github.com/huiling-y/eventgraph_at_case

1 Introduction

The automated extraction of socio-political event information from text constitutes an important NLP task, with a number of application areas for social scientists, policy makers, etc. The task involves analysis at different levels of granularity: document-level, sentence-level, and the fine-grained extraction of event triggers and arguments within a sentence. The CASE 2022 Shared Task 1 on Multilingual Protest Event Detection extends the 2021 shared task (Hürriyetoğlu et al., 2021a) with additional data in the evaluation phase and features four subtasks: (i) document classification, (ii) sentence classification, (iii) event sentence coreference, and (iv) event extraction.

The task of event extraction involves the detection of explicit event triggers and corresponding arguments in text. Current classification-based approaches typically model the task either as a pipeline of classifiers (Ji and Grishman, 2008; Li et al., 2013; Liu et al., 2020; Du and Cardie, 2020; Li et al., 2020) or with joint modeling approaches (Yang and Mitchell, 2016; Nguyen et al., 2016; Liu et al., 2018; Wadden et al., 2019; Lin et al., 2020).

In this paper, we present the EventGraph system and its application to Task 1 Subtask 4 in the 2022 edition of the CASE 2021 shared task. EventGraph is a joint framework for event extraction, which encodes events as graphs and solves event extraction as semantic graph parsing. We show that it is beneficial to model the relation between event triggers and arguments and to approach event extraction via structured prediction instead of sequence labelling. Our system performs well on the three languages, achieving competitive results and consistently ranking among the top four systems.

In the following, we briefly describe the data supplied by the shared task organizers and present Subtask 4 in some more detail. We then go on to present an overview of the EventGraph system focusing on the encoding of the data to semantic graphs and the model architecture. We experiment with several different graph encodings and provide a more detailed analysis of the results.

2 Data and task

Our contribution is to subtask 4, which falls under shared task 1 – the detection and extraction of socio-political and crisis events. While most subtasks of shared task 1 have sentence-level annotations, subtask 4 has been annotated at the token level, with the annotators given the document-level context. Subtask 4 focuses on the extraction of event triggers and event arguments related to contentious politics and riots (Hürriyetoğlu et al., 2021a). This subtask has previously been approached as a sequence labeling problem combining various methods of fine-tuning pre-trained language models (Hürriyetoğlu et al., 2021a).
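To make the sequence-labeling view concrete, the following is a minimal sketch of how token-level tags can be grouped into labeled spans. It assumes a BIO tagging scheme (`B-`/`I-` prefixes and `O`), which the text above does not specify; the function name `bio_to_spans` and the tags assigned to the example tokens are hypothetical and are not taken from the released data.

```python
from typing import List, Optional, Tuple

Span = Tuple[str, int, int]  # (label, start token index, exclusive end index)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Group BIO tags into (label, start, end) spans; end is exclusive."""
    spans: List[Span] = []
    label: Optional[str] = None
    start = 0
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        # A stray "I-" tag with no open span is treated as the start of a span.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


# Hypothetical tagging of the opening tokens of an example sentence.
tokens = ["Chale", "was", "allegedly", "chased", "by", "a", "group"]
tags = ["B-target", "O", "O", "B-trigger", "O", "O", "B-participant"]

spans = bio_to_spans(tags)
labeled = [(label, " ".join(tokens[s:e])) for label, s, e in spans]
```

A sequence labeler of this kind recovers the spans and their types, but it does not by itself state which argument belongs to which trigger.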

The data supplied for Subtask 4 is identical to that of the 2021 edition of the task, as presented in Hürriyetoğlu et al. (2021a). The data is part of the multilingual extension of the GLOCON dataset (Hürriyetoğlu et al., 2021b) with data from English, Portuguese, and Spanish. The source of the data is protest event coverage in news articles from specific countries: China and South Africa (English), Brazil (Portuguese), and Argentina (Spanish). The data has been doubly annotated by graduate students in political science with token-level information regarding event triggers and arguments. Hürriyetoğlu et al. (2021a) report the token-level inter-annotator agreement to be between 0.35 and 0.60. Disagreements between annotators were subsequently resolved by an annotation supervisor. Table 1 shows the number of news articles for each of the languages in the task, distributed over the training and test sets. This clearly shows that the majority of the data is in English with only a fraction of articles in Portuguese and Spanish.

arXiv:2210.09770v1 [cs.CL] 18 Oct 2022

[Figure 1 panels, left to right: Labeled-edge representation; Node-centric representation; Node-centric-split representation]

Figure 1: Graph representations of the sentence “Chale was allegedly chased by a group of about 30 people and was hacked to death with pangas, axes and spears.”

Relevant statistics for the different event component annotations for Subtask 4 are presented in Table 1, detailing the number of triggers, participants, and various other types of argument components, such as place, target, organizer, etc. Once again, the table also illustrates the comparative imbalance in data across the three languages.
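The imbalance can be quantified directly from the training-set sentence counts in Table 1. The short calculation below only illustrates that arithmetic; the variable names are illustrative.

```python
# Training sentences per language, copied from the top half of Table 1.
train_sentences = {"English": 2925, "Portuguese": 78, "Spanish": 91}

total = sum(train_sentences.values())
shares = {lang: 100 * n / total for lang, n in train_sentences.items()}

for lang, pct in shares.items():
    print(f"{lang}: {pct:.1f}% of training sentences")
```

English accounts for roughly 94.5% of the 3,094 training sentences, with Portuguese and Spanish at about 2.5% and 2.9% respectively.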

3 System overview

We use our system, EventGraph, which adapts an end-to-end graph-based semantic parser to solve the task of extracting socio-political events. In what follows, we give more details about the graph representation and the model architecture of our system.

3.1 Graph representations

We represent each sentence as an event graph, which contains event trigger(s) and arguments as nodes. In an event graph, edges are restricted to connect the trigger(s) and their corresponding arguments. However, since our system can take general graphs as input, the precise graph representation that works best for this task must be determined empirically. We here explore two different graph encoding methods, where the labels for triggers and arguments are represented either as edge labels or node labels, namely “labeled-edge” and “node-centric”. Since sentences in the data may contain information about several events with arguments shared across these, we also experiment with a version of the “node-centric” approach where multiple triggers give rise to separate nodes in the graph. The intuition behind this is that it is easier for the model to predict a node anchoring to a single span than to several disjoint spans.

              English        Portuguese    Spanish
train         732 (2,925)    29 (78)       29 (91)
dev           76 (323)       4 (9)         1 (15)
test          179 (311)      50 (190)      50 (192)

trigger       4,595          122           157
participant   2,663          73            88
place         1,570          61            15
target        1,470          32            64
organizer     1,261          19            25
etime         1,209          41            40
fname         1,201          48            49

Table 1: Top: Number of articles (sentences) for the different languages in Subtask 4 (Hürriyetoğlu et al., 2021a). About 10 percent (in terms of sentences) of the official training data is used as the development split. Bottom: Counts for the different event components in Subtask 4 training data for English, Portuguese, and Spanish (Hürriyetoğlu et al., 2021a).

• Labeled-edge: labels for event trigger(s) and arguments are represented as edge labels; multiple triggers are merged into one node, as shown by the first graph of Figure 1.

• Node-centric: labels for event trigger(s) and arguments are represented as node labels;
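As a rough illustration of the encodings described in this section, the sketch below builds all three variants for the Figure 1 sentence as plain Python dictionaries. The data layout (`nodes`, `edges`, text spans used as anchors) and the function names are assumptions made for illustration and are not the format used by EventGraph. In the split variant, attaching every argument to every trigger is likewise an assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Illustrative annotation for the Figure 1 sentence.
triggers: List[str] = ["chased", "hacked to death"]
arguments: List[Tuple[str, str]] = [  # (label, text span)
    ("target", "Chale"),
    ("participant", "group"),
    ("participant", "people"),
]


def labeled_edge(triggers: List[str], arguments: List[Tuple[str, str]]) -> Dict[str, list]:
    """Labels live on edges; all triggers are merged into a single node."""
    nodes = ["<root>", ", ".join(triggers)] + [span for _, span in arguments]
    edges = [(0, 1, "trigger")]
    edges += [(1, 2 + i, label) for i, (label, _) in enumerate(arguments)]
    return {"nodes": nodes, "edges": edges}


def node_centric(
    triggers: List[str], arguments: List[Tuple[str, str]], split: bool = False
) -> Dict[str, list]:
    """Labels live on nodes; edges are unlabeled.

    With split=True each trigger span gets its own node and, as an
    assumption of this sketch, every argument attaches to every trigger.
    """
    trigger_spans = triggers if split else [", ".join(triggers)]
    nodes = [("<root>", None)]
    nodes += [("trigger", span) for span in trigger_spans]
    first_arg = len(nodes)  # index of the first argument node
    nodes += list(arguments)
    trigger_ids = range(1, first_arg)
    edges = [(0, t) for t in trigger_ids]
    edges += [(t, first_arg + i) for t in trigger_ids for i in range(len(arguments))]
    return {"nodes": nodes, "edges": edges}


le = labeled_edge(triggers, arguments)
nc = node_centric(triggers, arguments)
nc_split = node_centric(triggers, arguments, split=True)
```

In the split variant each trigger node anchors to one contiguous span (“chased” or “hacked to death”), whereas the merged trigger node in the other two variants anchors to two disjoint spans.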
