Full-Text Argumentation Mining on Scientific Publications
Arne Binder1Bhuvanesh Verma2Leonhard Hennig1
1German Research Center for Artificial Intelligence (DFKI)
2University of Potsdam
1{arne.binder, leonhard.hennig}@dfki.de
2bhuvanesh.verma@uni-potsdam.de
Abstract
Scholarly Argumentation Mining (SAM) has recently gained attention due to its potential to help scholars with the rapid growth of published scientific literature. It comprises two subtasks: argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE), both of which are challenging since they require, e.g., the integration of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure (Al Khatib et al., 2021). While previous work has focused on dataset construction and baseline methods for specific document sections, such as abstract or results, full-text scholarly argumentation mining has seen little progress. In this work, we introduce a sequential pipeline model combining ADUR and ARE for full-text SAM, and provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks. We establish a new SotA for ADUR on the Sci-Arg corpus, outperforming the previous best reported result by a large margin (+7% F1). We also present the first results for ARE, and thus for the full AM pipeline, on this benchmark dataset. Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges, and that data annotation needs to be more consistent.
1 Introduction
Argumentation Mining (AM) is concerned with the detection of the argumentative structure of text (Stede and Schneider, 2018). It is commonly organized into two subtasks: 1) recognition of argumentative discourse units (ADUs), i.e. detecting argumentative spans of text and classifying them into types such as claim or premise, and 2) determining which ADUs have a relationship to each other, and of what kind, e.g. support or attack. Consider the following example, where the premise P supports the claim C:
[Dot-product attention is much faster than additive attention]C, since [it can be implemented using highly optimized matrix multiplication code]P.1
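The two subtasks can be made concrete with a minimal data model for ADUs and relations. This is our own illustration, not the paper's implementation; the class and field names are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ADU:
    """An argumentative discourse unit: a typed span of the document text."""
    start: int   # character offset (inclusive)
    end: int     # character offset (exclusive)
    label: str   # e.g. "own_claim", "background_claim", "data"

@dataclass(frozen=True)
class Relation:
    """A directed argumentative relation between two ADUs."""
    head: ADU
    tail: ADU
    label: str   # e.g. "supports", "contradicts"

claim_text = "Dot-product attention is much faster than additive attention"
premise_text = "it can be implemented using highly optimized matrix multiplication code"
text = claim_text + ", since " + premise_text + "."

# Subtask 1 (ADUR) yields the typed spans; subtask 2 (ARE) links them.
claim = ADU(text.index(claim_text), text.index(claim_text) + len(claim_text), "own_claim")
premise = ADU(text.index(premise_text), text.index(premise_text) + len(premise_text), "data")
rel = Relation(head=premise, tail=claim, label="supports")
```

The premise is the head and the claim the tail, mirroring the directionality "P supports C" in the example above.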
Since the amount of published scientific literature is growing exponentially (Fortunato et al., 2018), there has recently been increased interest in scholarly argumentation mining (SAM). Understanding the argumentative structure is key, not just to efficiently digest such work, but also to assess its quality (Walton, 2001). Solving scholarly AM is challenging, because it requires, among other things, the use of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure (Al Khatib et al., 2021). This is even harder when handling full text, which is often less concise and standardized than, for example, abstracts.
Previous work in SAM has focused on dataset construction (Teufel and Moens, 1999; Lauscher et al., 2018b), ADU recognition (Lauscher et al., 2018a; Li et al., 2021), and the analysis of specific document sections, such as abstract or results (Dasigi et al., 2017; Accuosto and Saggion, 2019; Mayer et al., 2020). However, to get a thorough understanding of a scientific publication, all parts of the document matter: ideally, they back up the main argumentation, and they usually contain details that are relevant for the knowledgeable reader, so they should not be neglected. However, since the task is very complex, even for humans, little training data for full-text SAM is available.
Pretrained Language Models (PLMs) such as SciBERT (Beltagy et al., 2019) may help to address the above challenges because they encode substantial linguistic and domain knowledge and have better long-range capabilities, allowing for improved contextualisation, especially when training data is scarce. We hence propose a PLM-based model for full-text SAM.

1 Replicated from Vaswani et al. (2017).

arXiv:2210.13084v1 [cs.CL] 24 Oct 2022

Figure 1: Example with argumentative structure from the Sci-Arg dataset.

To summarize, our contributions in this work are:

- We are the first to investigate PLMs for full-text SAM, and to present a sequential pipeline for both ADU recognition and argumentative RE on full-text scientific publications (Section 3).
- Our experimental results show that a SciBERT-based ADU recognition model improves over the state of the art by +7% F1 score. We present the first relation extraction baseline for the Sci-Arg corpus and achieve a strong 0.74 F1 (Section 5.1).
- Our detailed error analysis reveals open challenges and possible ways of improvement (Section 5.2).
2 Preliminaries
We first define the two tasks of ADUR and ARE,
and discuss differences to the standard Information
Extraction (IE) tasks of Named Entity Recognition
(NER) and Relation Extraction (RE).
An Argumentative Discourse Unit (ADU) can be defined as a "span of text that plays a single role for an argument being analyzed and is demarcated by neighboring text spans that play a different role, or none at all" (Stede and Schneider, 2018). It is the smallest unit of argumentation, and may span anything from an in-sentence clause up to multiple full sentences. ADU recognition requires both detecting argumentative spans and classifying them into predefined categories. Typically, this is realised as a sequence tagging task similar to NER, where a sequence of tokens X = {t1, t2, ..., tN} is assigned a corresponding N-length sequence of labels Y = {l1, l2, ..., lN} with li ∈ C, where C is the set of tags that result from converting the ADU types into a tagging scheme like BIO2.2 In scholarly AM, common ADU classes are (Own / Background) Claim, and Evidence, Data, or Warrant (Green, 2014; Lauscher et al., 2018b).

2 BIO2: Begin, Inside, Outside of an entity.
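The span-to-tag conversion can be sketched as follows. This is a minimal illustration with invented function names, operating on token-level spans; real data additionally requires mapping character offsets to token indices.

```python
def spans_to_bio2(num_tokens, spans):
    """Convert ADU spans into a BIO2 tag sequence: B-<type> on the first
    token of a span, I-<type> inside it, and O everywhere else.

    spans: list of (start, end, adu_type) with token indices, end exclusive.
    """
    tags = ["O"] * num_tokens
    for start, end, adu_type in spans:
        tags[start] = f"B-{adu_type}"
        for i in range(start + 1, end):
            tags[i] = f"I-{adu_type}"
    return tags

# Tokens: ["Furthermore", "execution", "time", "increased", "quite", "substantially"]
# with an own-claim span covering tokens 1..5:
tags = spans_to_bio2(6, [(1, 6, "own_claim")])
# → ['O', 'B-own_claim', 'I-own_claim', 'I-own_claim', 'I-own_claim', 'I-own_claim']
```

The inverse direction (decoding predicted tags back into spans) follows the same scheme: a span starts at each B tag and extends over the following I tags of the same type.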
In contrast to named entities in NER, ADUs typically vary much more in length. They are also highly context-dependent and often discontinuous. ADUR is furthermore related to discourse segmentation, but depends more on broader context and semantics than on linguistic structure. Elementary Discourse Units (EDUs), the building blocks in the context of Rhetorical Structure Theory (Mann and Thompson, 1988), are more fine-grained and of shorter length, and they usually cover the complete text, which is less often the case for argumentative units.
Argumentative Relation Extraction is usually defined as classifying a pair of ADUs, head and tail, as either an instance of one of the target types or the artificial NO-RELATION type. In other words, the task is to assign a label y ∈ C ∪ {NO-RELATION} to a given input X = {T, h, t}, where C is the set of relation types, T is the text, and h = (sh, eh, lh) and t = (st, et, lt) describe the candidate head and tail entities, with s and e being the start and end indices with respect to T and l the entity type. Typical relation types for SAM are Supports, Mentions, Attacks, Contradicts, and Contrasts (Lauscher et al., 2018b; Accuosto and Saggion, 2019; Nicholson et al., 2021).
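Framed this way, training instances can be derived by pairing ADUs and looking up the gold annotation, falling back to the artificial NO-RELATION class for unannotated pairs. A sketch with our own function and variable names:

```python
def build_pair_labels(adus, gold_relations):
    """Label every ordered ADU pair (head, tail) with its gold relation
    type, or NO-RELATION if the pair is not annotated.

    adus: list of ADU identifiers.
    gold_relations: dict mapping (head_id, tail_id) to a relation type.
    """
    labels = {}
    for h in adus:
        for t in adus:
            if h == t:
                continue  # an ADU cannot relate to itself
            labels[(h, t)] = gold_relations.get((h, t), "NO-RELATION")
    return labels

# One claim and two data units; only p1 supports the claim.
pairs = build_pair_labels(
    adus=["c1", "p1", "p2"],
    gold_relations={("p1", "c1"): "supports"},
)
```

Since annotated relations are sparse, this construction yields a heavily imbalanced label distribution dominated by NO-RELATION, which is one motivation for the distance-based candidate filtering described in Section 3.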
ARE is very similar to standard RE, but SAM relations are often marked by syntactic cues such as connectors, e.g. "because", "however", or "but", whereas in common RE, content words like verbs and nouns are the typical relation triggers. This makes ARE challenging, because these connectors do not always realise argumentative structure, but also mark other aspects of discourse. Consider, for example, the different meanings of "while" in the following sentences:

1. While I love a romantic dinner, I also like fast food.
2. While I prepare dinner, I watch a movie.

Here, the "while" in sentence 1) has a contrastive meaning, whereas in sentence 2) it denotes a temporal aspect.
Figure 2: Model setup for (a) ADUR (top) and (b) ARE (bottom).
(a) ADU Recognition: tokens are embedded with a frozen PLM, further contextualized with a trained BiLSTM, followed by a CRF to calculate the tag sequence.
(b) Argumentative RE: tokens are embedded with a frozen PLM; ADU tags and argument tags are embedded with simple embedding matrices. The embeddings are concatenated, contextualized with a BiLSTM, and converted into a single vector that is classified by a single fully connected layer.
3 Models
We propose a pipeline of two distinct models, one for each subtask, which are described in the following.
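The sequential composition of the two models can be sketched abstractly as follows. The function names are placeholders for the trained models, not the paper's API, and the toy stand-ins below exist only to make the sketch executable.

```python
def run_sam_pipeline(tokens, recognize_adus, classify_pair):
    """Sequential SAM pipeline: first predict ADU spans (subtask 1),
    then classify every ordered candidate ADU pair into a relation
    type or NO-RELATION (subtask 2)."""
    adus = recognize_adus(tokens)
    relations = []
    for head in adus:
        for tail in adus:
            if head is tail:
                continue
            label = classify_pair(tokens, head, tail)
            if label != "NO-RELATION":
                relations.append((head, tail, label))
    return adus, relations

# Toy stand-ins for the two trained models:
toy_adus = lambda toks: [(0, 2, "own_claim"), (3, 5, "data")]
toy_rel = lambda toks, h, t: "supports" if h[2] == "data" else "NO-RELATION"

adus, rels = run_sam_pipeline(["a", "b", "c", "d", "e"], toy_adus, toy_rel)
```

A consequence of the pipeline design is error propagation: spans missed or mistyped by the first stage cannot be recovered by the second.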
ADU Recognition (ADUR). The architecture of the ADUR model is visualized in Figure 2a. We first embed the token sequence with a frozen PLM encoder. For sequences that exceed the maximum input length of the embedding model, we process the sequence piece-wise and concatenate the results afterwards. The embedded tokens are then fed into a BiLSTM (Schuster and Paliwal, 1997). Finally, a Conditional Random Field (CRF) (Lafferty et al., 2001) is used to obtain the label probabilities for each token. We use the combination of a frozen PLM with a trainable contextualization layer (LSTM) on top because its training requires fewer resources than fine-tuning the PLM, and initial tests have shown similar performance.3
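The piece-wise handling of over-long inputs amounts to chunking the sequence, encoding each chunk independently, and concatenating the per-token outputs. A sketch of the chunking logic, where the encoder is a toy stand-in (a real frozen PLM would return contextual vectors per token):

```python
def encode_long_sequence(tokens, encode, max_len):
    """Encode a token sequence that may exceed the encoder's maximum
    input length: split it into non-overlapping chunks of at most
    max_len tokens, encode each chunk separately, and concatenate
    the per-token embeddings."""
    embeddings = []
    for i in range(0, len(tokens), max_len):
        chunk = tokens[i:i + max_len]
        embeddings.extend(encode(chunk))  # one vector per token
    return embeddings

# Toy "encoder": maps each token to a 1-d feature (its character length).
toy_encode = lambda chunk: [[float(len(tok))] for tok in chunk]

embs = encode_long_sequence(["full", "text", "argumentation"] * 200,
                            toy_encode, max_len=512)
```

Note that tokens near a chunk boundary lose cross-chunk context under this scheme; the downstream BiLSTM, which runs over the full concatenated sequence, partially compensates for this.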
Argumentative RE (ARE). The model architecture for the relation extraction subtask is shown in Figure 2b. ARE is implemented as a classification task, where a pair of candidate ADUs is selected and marked in the input token sequence. To reduce combinatorial complexity, only ADU pairs with a distance smaller than some threshold d are considered. Similar to ADU recognition, we first embed the token sequence in a window of k tokens around the candidate entity pair with a frozen PLM model. We also create non-contextualized embeddings for the ADU and argument tags of the tokens within the window. As argument tags, we simply use head and tail labels to mark the candidate entity tokens. All three embedding sequences are concatenated token-wise and fed into a BiLSTM. The result is converted into a single vector using a Convolutional Neural Network (CNN) with max-pooling, which is then classified as one of the relation labels by a linear projection with softmax.

3 Note that the training dataset is relatively small, so restricting the number of trainable parameters seems to mitigate overfitting.

Train Test Total
ADUs
background claim 2563 661 3224
own claim 4608 1241 5849
data 3346 858 4204
Relations
supports 4426 1260 5686
contradicts 551 133 684
semantically same 36 3 39
parts of same 1000 269 1269
Table 1: Label counts for the Sci-Arg dataset.
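The distance-based candidate filtering and the head/tail argument tagging can be sketched as follows. All names are our own, and measuring span distance as the token gap between the two spans is one plausible choice; the paper does not pin down the exact definition.

```python
def candidate_pairs(adus, d):
    """Yield ordered ADU pairs whose span distance is below threshold d.
    adus: list of (start, end) token spans, end exclusive."""
    for i, (s1, e1) in enumerate(adus):
        for j, (s2, e2) in enumerate(adus):
            if i == j:
                continue
            gap = max(s1, s2) - min(e1, e2)  # tokens between the two spans
            if gap < d:
                yield (i, j)

def argument_tags(num_tokens, head, tail):
    """Mark the candidate head and tail tokens with B-Arg1/I-Arg1 and
    B-Arg2/I-Arg2 tags; all other tokens get O."""
    tags = ["O"] * num_tokens
    for (start, end), name in ((head, "Arg1"), (tail, "Arg2")):
        tags[start] = f"B-{name}"
        for k in range(start + 1, end):
            tags[k] = f"I-{name}"
    return tags

# The far-away third span is filtered out by the distance threshold:
pairs = list(candidate_pairs([(0, 3), (10, 12), (400, 405)], d=50))
tags = argument_tags(6, head=(4, 6), tail=(0, 2))
```

The resulting tag sequence is what the argument-tag embedding matrix in Figure 2b consumes, alongside the ADU tags and the PLM token embeddings.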
4 Experimental Setup
Dataset. We use the Sci-Arg dataset (Lauscher et al., 2018b) for model training and evaluation. It is the only available full-text argumentation mining dataset for scientific publications. It contains 40 full-text publications annotated with ADUs and argumentative relations. Figure 1 shows an example excerpt, and Table 1 summarizes the main dataset statistics. The PARTS OF SAME relation type is used to model non-contiguous spans. The label counts differ slightly from the values published in Lauscher et al. (2018b), because annotations in one file (A28) caused parsing errors and were excluded. Furthermore, non-contiguous spans are not merged. We create a train/test split by using the first 30 documents for training and the remaining 9 for evaluation.
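The PARTS OF SAME relation links fragments of a single non-contiguous ADU, so merging fragments into units amounts to computing connected components over those links. A sketch using union-find, with our own names:

```python
def merge_parts_of_same(spans, parts_of_same):
    """Group span ids that are connected by PARTS OF SAME links into
    single argumentative units.

    spans: list of span identifiers.
    parts_of_same: list of (id_a, id_b) link pairs.
    """
    parent = {s: s for s in spans}

    def find(x):  # union-find root lookup with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in parts_of_same:
        parent[find(a)] = find(b)

    groups = {}
    for s in spans:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())

# Spans a, c, d form one non-contiguous unit; b stands alone.
units = merge_parts_of_same(["a", "b", "c", "d"], [("a", "c"), ("c", "d")])
```

Connected components (rather than treating each link separately) are needed because a unit may consist of more than two fragments, chained by several PARTS OF SAME links.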