Full-Text Argumentation Mining on Scientific Publications
Arne Binder1Bhuvanesh Verma2Leonhard Hennig1
1German Research Center for Artificial Intelligence (DFKI)
2University of Potsdam
1{arne.binder, leonhard.hennig}@dfki.de
2bhuvanesh.verma@uni-potsdam.de
Abstract
Scholarly Argumentation Mining (SAM) has recently gained attention due to its potential to help scholars with the rapid growth of published scientific literature. It comprises two subtasks: argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE), both of which are challenging since they require, e.g., the integration of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure (Al Khatib et al., 2021). While previous work has focused on dataset construction and baseline methods for specific document sections, such as abstract or results, full-text scholarly argumentation mining has seen little progress. In this work, we introduce a sequential pipeline model combining ADUR and ARE for full-text SAM, and provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks. We establish a new SotA for ADUR on the Sci-Arg corpus, outperforming the previous best reported result by a large margin (+7% F1). We also present the first results for ARE, and thus for the full AM pipeline, on this benchmark dataset. Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges, and that data annotation needs to be more consistent.
1 Introduction
Argumentation Mining (AM) is concerned with the detection of the argumentative structure of text (Stede and Schneider, 2018). It is commonly organized into two subtasks: 1) recognition of argumentative discourse units (ADUs), i.e. detecting argumentative spans of text and classifying them into types such as claim or premise, and 2) determining which ADUs have a relationship to each other, and of what kind, e.g. support or attack. Consider the following example, where the premise P supports the claim C:
[Dot-product attention is much faster than additive attention]C, since [it can be implemented using highly optimized matrix multiplication code]P.1
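The two subtasks can be made concrete with a minimal data model for ADUs and relations. This is our own illustration, not the paper's implementation; the class and field names are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ADU:
    """An argumentative discourse unit: a typed span of the document text."""
    start: int   # character offset (inclusive)
    end: int     # character offset (exclusive)
    label: str   # e.g. "own_claim", "background_claim", "data"

@dataclass(frozen=True)
class Relation:
    """A directed argumentative relation between two ADUs."""
    head: ADU
    tail: ADU
    label: str   # e.g. "supports", "contradicts"

claim_text = "Dot-product attention is much faster than additive attention"
premise_text = "it can be implemented using highly optimized matrix multiplication code"
text = claim_text + ", since " + premise_text + "."

# Subtask 1 (ADUR) yields the typed spans; subtask 2 (ARE) links them.
claim = ADU(text.index(claim_text), text.index(claim_text) + len(claim_text), "own_claim")
premise = ADU(text.index(premise_text), text.index(premise_text) + len(premise_text), "data")
rel = Relation(head=premise, tail=claim, label="supports")
```

The premise is the head and the claim the tail, mirroring the directionality "P supports C" in the example above.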
Since the amount of published scientific literature is growing exponentially (Fortunato et al., 2018), there has recently been increased interest in scholarly argumentation mining (SAM). Understanding the argumentative structure is key, not just to efficiently digest such work, but also to assess its quality (Walton, 2001). Solving scholarly AM is challenging, because it requires, among other things, the use of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure (Al Khatib et al., 2021). This is even harder when handling full text, which is often less concise and standardized than, for example, abstracts.
Previous work in SAM has focused on dataset construction (Teufel and Moens, 1999; Lauscher et al., 2018b), ADU recognition (Lauscher et al., 2018a; Li et al., 2021), and the analysis of specific document sections, such as abstract or results (Dasigi et al., 2017; Accuosto and Saggion, 2019; Mayer et al., 2020). However, to get a thorough understanding of a scientific publication, all parts of the document matter: ideally, they back up the main argumentation, and they usually contain details that are relevant for the knowledgeable reader, so they should not be neglected. However, since the task is very complex, even for humans, little training data for full-text SAM is available.
Pretrained Language Models (PLMs) such as SciBERT (Beltagy et al., 2019) may help to address the above challenges because they encode substantial linguistic and domain knowledge and have better long-range capabilities, allowing for improved contextualisation, especially when training data is scarce. We hence propose a PLM-based model for full-text SAM.

1 Replicated from Vaswani et al. (2017).

arXiv:2210.13084v1 [cs.CL] 24 Oct 2022

Figure 1: Example with argumentative structure from the Sci-Arg dataset.

To summarize, our contributions in this work are:

- We are the first to investigate PLMs for full-text SAM, and to present a sequential pipeline for both ADU recognition and argumentative RE on full-text scientific publications (Section 3).
- Our experimental results show that a SciBERT-based ADU recognition model improves over the state of the art by +7% F1 score. We present the first relation extraction baseline for the Sci-Arg corpus and achieve a strong 0.74 F1 (Section 5.1).
- Our detailed error analysis reveals open challenges and possible ways of improvement (Section 5.2).
2 Preliminaries
We first define the two tasks of ADUR and ARE,
and discuss differences to the standard Information
Extraction (IE) tasks of Named Entity Recognition
(NER) and Relation Extraction (RE).
An Argumentative Discourse Unit (ADU) can be defined as a "span of text that plays a single role for an argument being analyzed and is demarcated by neighboring text spans that play a different role, or none at all" (Stede and Schneider, 2018). It is the smallest unit of argumentation, and may span anything from an in-sentence clause up to multiple full sentences. ADU recognition requires both detecting argumentative spans and classifying them into predefined categories. Typically, this is realised as a sequence tagging task similar to NER, where a sequence of tokens X = {t1, t2, ..., tN} is assigned a corresponding N-length sequence of labels Y = {l1, l2, ..., lN} with li ∈ C, where C is the set of tags that result from converting the ADU types into a tagging scheme like BIO2.2 In scholarly AM, common ADU classes are (Own / Background) Claim, and Evidence, Data, or Warrant (Green, 2014; Lauscher et al., 2018b).

2 BIO2: Begin, Inside, Outside of an entity.
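The span-to-tag conversion can be sketched as follows. This is a minimal illustration with invented function names, operating on token-level spans; real data additionally requires mapping character offsets to token indices.

```python
def spans_to_bio2(num_tokens, spans):
    """Convert ADU spans into a BIO2 tag sequence: B-<type> on the first
    token of a span, I-<type> inside it, and O everywhere else.

    spans: list of (start, end, adu_type) with token indices, end exclusive.
    """
    tags = ["O"] * num_tokens
    for start, end, adu_type in spans:
        tags[start] = f"B-{adu_type}"
        for i in range(start + 1, end):
            tags[i] = f"I-{adu_type}"
    return tags

# Tokens: ["Furthermore", "execution", "time", "increased", "quite", "substantially"]
# with an own-claim span covering tokens 1..5:
tags = spans_to_bio2(6, [(1, 6, "own_claim")])
# → ['O', 'B-own_claim', 'I-own_claim', 'I-own_claim', 'I-own_claim', 'I-own_claim']
```

The inverse direction (decoding predicted tags back into spans) follows the same scheme: a span starts at each B tag and extends over the following I tags of the same type.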
In contrast to named entities in NER, ADUs typically vary much more in length. They are also highly context-dependent and often discontinuous. ADUR is furthermore related to discourse segmentation, but depends more on broader context and semantics than on linguistic structure. Elementary Discourse Units (EDUs), the building blocks in the context of Rhetorical Structure Theory (Mann and Thompson, 1988), are more fine-grained and of shorter length, and they usually cover the complete text, which is less often the case for argumentative units.
Argumentative Relation Extraction is usually defined as classifying a pair of ADUs, head and tail, as either an instance of one of the target types or the artificial NO-RELATION type. In other words, the task is to assign a label y ∈ C ∪ {NO-RELATION} to a given input X = {T, h, t}, where C is the set of relation types, T is the text, and h = (sh, eh, lh) and t = (st, et, lt) describe the candidate head and tail entities, with s and e being the start and end indices with respect to T and l the entity type. Typical relation types for SAM are Supports, Mentions, Attacks, Contradicts, and Contrasts (Lauscher et al., 2018b; Accuosto and Saggion, 2019; Nicholson et al., 2021).
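Framed this way, training instances can be derived by pairing ADUs and looking up the gold annotation, falling back to the artificial NO-RELATION class for unannotated pairs. A sketch with our own function and variable names:

```python
def build_pair_labels(adus, gold_relations):
    """Label every ordered ADU pair (head, tail) with its gold relation
    type, or NO-RELATION if the pair is not annotated.

    adus: list of ADU identifiers.
    gold_relations: dict mapping (head_id, tail_id) to a relation type.
    """
    labels = {}
    for h in adus:
        for t in adus:
            if h == t:
                continue  # an ADU cannot relate to itself
            labels[(h, t)] = gold_relations.get((h, t), "NO-RELATION")
    return labels

# One claim and two data units; only p1 supports the claim.
pairs = build_pair_labels(
    adus=["c1", "p1", "p2"],
    gold_relations={("p1", "c1"): "supports"},
)
```

Since annotated relations are sparse, this construction yields a heavily imbalanced label distribution dominated by NO-RELATION, which is one motivation for the distance-based candidate filtering described in Section 3.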
ARE is very similar to standard RE, but SAM relations are often marked by syntactic cues such as connectors, e.g. "because", "however", or "but", whereas in common RE, content words like verbs and nouns are the typical relation triggers. This makes ARE challenging, because these connectors do not always realise argumentative structure, but also mark other aspects of discourse. Consider, for example, the different meanings of "while" in the following sentences:

1. While I love a romantic dinner, I also like fast food.
2. While I prepare dinner, I watch a movie.

Here, the "while" in sentence 1) has a contrastive meaning, whereas in sentence 2) it denotes a temporal aspect.
Figure 2: Model setup for (a) ADUR (top) and (b) ARE (bottom).
(a) ADU Recognition: tokens are embedded with a frozen PLM, further contextualized with a trained BiLSTM, followed by a CRF to calculate the tag sequence.
(b) Argumentative RE: tokens are embedded with a frozen PLM; ADU tags and argument tags are embedded with simple embedding matrices. The embeddings are concatenated, contextualized with a BiLSTM, and converted into a single vector that is classified by a single fully connected layer.
3 Models
We propose a pipeline of two distinct models, one for each subtask, which are described in the following.
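The sequential composition of the two models can be sketched abstractly as follows. The function names are placeholders for the trained models, not the paper's API, and the toy stand-ins below exist only to make the sketch executable.

```python
def run_sam_pipeline(tokens, recognize_adus, classify_pair):
    """Sequential SAM pipeline: first predict ADU spans (subtask 1),
    then classify every ordered candidate ADU pair into a relation
    type or NO-RELATION (subtask 2)."""
    adus = recognize_adus(tokens)
    relations = []
    for head in adus:
        for tail in adus:
            if head is tail:
                continue
            label = classify_pair(tokens, head, tail)
            if label != "NO-RELATION":
                relations.append((head, tail, label))
    return adus, relations

# Toy stand-ins for the two trained models:
toy_adus = lambda toks: [(0, 2, "own_claim"), (3, 5, "data")]
toy_rel = lambda toks, h, t: "supports" if h[2] == "data" else "NO-RELATION"

adus, rels = run_sam_pipeline(["a", "b", "c", "d", "e"], toy_adus, toy_rel)
```

A consequence of the pipeline design is error propagation: spans missed or mistyped by the first stage cannot be recovered by the second.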
ADU Recognition (ADUR). The architecture of the ADUR model is visualized in Figure 2a. We first embed the token sequence with a frozen PLM encoder. For sequences that exceed the maximum input length of the embedding model, we process the sequence piece-wise and concatenate the results afterwards. The embedded tokens are then fed into a BiLSTM (Schuster and Paliwal, 1997). Finally, a Conditional Random Field (CRF) (Lafferty et al., 2001) is used to obtain the label probabilities for each token. We use the combination of a frozen PLM with a trainable contextualization layer (LSTM) on top because its training requires fewer resources than fine-tuning the PLM, and initial tests have shown similar performance.3
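The piece-wise handling of over-long inputs amounts to chunking the sequence, encoding each chunk independently, and concatenating the per-token outputs. A sketch of the chunking logic, where the encoder is a toy stand-in (a real frozen PLM would return contextual vectors per token):

```python
def encode_long_sequence(tokens, encode, max_len):
    """Encode a token sequence that may exceed the encoder's maximum
    input length: split it into non-overlapping chunks of at most
    max_len tokens, encode each chunk separately, and concatenate
    the per-token embeddings."""
    embeddings = []
    for i in range(0, len(tokens), max_len):
        chunk = tokens[i:i + max_len]
        embeddings.extend(encode(chunk))  # one vector per token
    return embeddings

# Toy "encoder": maps each token to a 1-d feature (its character length).
toy_encode = lambda chunk: [[float(len(tok))] for tok in chunk]

embs = encode_long_sequence(["full", "text", "argumentation"] * 200,
                            toy_encode, max_len=512)
```

Note that tokens near a chunk boundary lose cross-chunk context under this scheme; the downstream BiLSTM, which runs over the full concatenated sequence, partially compensates for this.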
Argumentative RE (ARE). The model architecture for the relation extraction subtask is shown in Figure 2b. ARE is implemented as a classification task, where a pair of candidate ADUs is selected and marked in the input token sequence. To reduce combinatorial complexity, only ADU pairs with a distance smaller than some threshold d are considered. Similar to ADU recognition, we first embed the token sequence in a window of k tokens around the candidate entity pair with a frozen PLM model. We also create non-contextualized embeddings for the ADU and argument tags of the tokens within the window. As argument tags, we simply use head and tail labels to mark the candidate entity tokens. All three embedding sequences are concatenated token-wise and fed into a BiLSTM. The result is converted into a single vector using a Convolutional Neural Network (CNN) with max-pooling, which is then classified as one of the relation labels by a linear projection with softmax.

3 Note that the training dataset is relatively small, so restricting the number of trainable parameters seems to mitigate overfitting.

Train Test Total
ADUs
background claim 2563 661 3224
own claim 4608 1241 5849
data 3346 858 4204
Relations
supports 4426 1260 5686
contradicts 551 133 684
semantically same 36 3 39
parts of same 1000 269 1269
Table 1: Label counts for the Sci-Arg dataset.
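The distance-based candidate filtering and the head/tail argument tagging can be sketched as follows. All names are our own, and measuring span distance as the token gap between the two spans is one plausible choice; the paper does not pin down the exact definition.

```python
def candidate_pairs(adus, d):
    """Yield ordered ADU pairs whose span distance is below threshold d.
    adus: list of (start, end) token spans, end exclusive."""
    for i, (s1, e1) in enumerate(adus):
        for j, (s2, e2) in enumerate(adus):
            if i == j:
                continue
            gap = max(s1, s2) - min(e1, e2)  # tokens between the two spans
            if gap < d:
                yield (i, j)

def argument_tags(num_tokens, head, tail):
    """Mark the candidate head and tail tokens with B-Arg1/I-Arg1 and
    B-Arg2/I-Arg2 tags; all other tokens get O."""
    tags = ["O"] * num_tokens
    for (start, end), name in ((head, "Arg1"), (tail, "Arg2")):
        tags[start] = f"B-{name}"
        for k in range(start + 1, end):
            tags[k] = f"I-{name}"
    return tags

# The far-away third span is filtered out by the distance threshold:
pairs = list(candidate_pairs([(0, 3), (10, 12), (400, 405)], d=50))
tags = argument_tags(6, head=(4, 6), tail=(0, 2))
```

The resulting tag sequence is what the argument-tag embedding matrix in Figure 2b consumes, alongside the ADU tags and the PLM token embeddings.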
4 Experimental Setup
Dataset. We use the Sci-Arg dataset (Lauscher et al., 2018b) for model training and evaluation. It is the only available full-text argumentation mining dataset for scientific publications. It contains 40 full-text publications annotated with ADUs and argumentative relations. Figure 1 shows an example excerpt, and Table 1 summarizes the main dataset statistics. The PARTS OF SAME relation type is used to model non-contiguous spans. The label counts differ slightly from the values published in Lauscher et al. (2018b), because annotations in one file (A28) caused parsing errors and were excluded. Furthermore, non-contiguous spans are not merged. We create a train/test split by using the first 30 documents for training and the remaining 9 for evaluation.
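The PARTS OF SAME relation links fragments of a single non-contiguous ADU, so merging fragments into units amounts to computing connected components over those links. A sketch using union-find, with our own names:

```python
def merge_parts_of_same(spans, parts_of_same):
    """Group span ids that are connected by PARTS OF SAME links into
    single argumentative units.

    spans: list of span identifiers.
    parts_of_same: list of (id_a, id_b) link pairs.
    """
    parent = {s: s for s in spans}

    def find(x):  # union-find root lookup with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in parts_of_same:
        parent[find(a)] = find(b)

    groups = {}
    for s in spans:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())

# Spans a, c, d form one non-contiguous unit; b stands alone.
units = merge_parts_of_same(["a", "b", "c", "d"], [("a", "c"), ("c", "d")])
```

Connected components (rather than treating each link separately) are needed because a unit may consist of more than two fragments, chained by several PARTS OF SAME links.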