an RNN (Wang et al., 2020; Jing et al., 2021) or Transformer-based encoder (Cui et al., 2020; Kwon et al., 2021) to supplement graph-based features.
Some works integrated neural networks with reinforcement learning (Dong et al., 2018; Gu et al., 2022) or unsupervised learning frameworks (Liang et al., 2021). In general, taking a pre-trained Transformer-based language model as the starting point to encode the textual segments of a document is currently the state-of-the-art approach among neural extractive models. Therefore, Transformer-based models, namely RoBERTa (Liu et al., 2019) and BART (Lewis et al., 2020), are used as the basic building blocks in this paper.
2.2 Sub-sentential Extractive Summarization
Most previous work on the extractive task focused on generating sentence-level summaries, though some of it (Xiao et al., 2020; Cho et al., 2020; Ernst et al., 2022) utilized sub-sentential features. Early works by Marcu (1999); Alonso i Alemany and Fuentes Fort (2003); Yoshida et al. (2014); Li et al. (2016) explored extracting discourse-level textual segments as the summary, but those approaches were tested on small datasets. More recent works by Liu and Chen (2019); Xu et al. (2020); Huang and Kurohashi (2021) were evaluated on relatively larger datasets. However, none of those works justified whether discourse-level textual segments are a better extractive text unit than sentences. To fill this gap, we provide justification for this research question from both theoretical and experimental perspectives in this paper.
2.3 Flexible Extractive Summarization
The extractive summarization task is usually formulated as extracting the top-$k$ salient textual segments from a document. Using a fixed $k$ for all documents results in a lack of variety in the length of the generated summary. A few works (Jia et al., 2020; Zhong et al., 2020; Chen et al., 2021) managed to output summaries of varying lengths. However, these approaches either require extra hyper-parameter searching on a validation set to find a valid threshold, or formulate the problem as selecting a subset of the top-$k$ sentences, which restricts the summary length to a small range because the number of candidate subsets explodes combinatorially as $k$ grows. In this paper, we propose a model with varying $k$ values but without an explicit limitation on the length or the need for hyper-parameter searching.
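To make the contrast concrete, the following minimal sketch (with made-up salience scores and a hypothetical tuned threshold; it is illustrative only and not the model proposed here) shows fixed top-$k$ selection versus threshold-based selection:

```python
# Illustrative only: fixed top-k extraction vs. threshold-based selection,
# using made-up per-segment salience scores for one document.
scores = [0.91, 0.72, 0.55, 0.23, 0.11]   # hypothetical scores

# Fixed top-k: every document yields exactly k segments, regardless of content.
k = 3
top_k_idx = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Threshold-based: the number of selected segments varies per document,
# but the threshold itself must be tuned on a validation set.
threshold = 0.5                            # hypothetical tuned value
thresh_idx = [i for i, s in enumerate(scores) if s >= threshold]

print(sorted(top_k_idx), thresh_idx)       # [0, 1, 2] [0, 1, 2]
```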
3 Oracle Analysis of EDUs and Sentences
Oracle analysis refers to the analysis of the oracle summary, whose definition is given in Section 3.1. We conducted oracle analysis from both theoretical and experimental perspectives to justify and quantify that a discourse-level summary achieves higher scores on automatic evaluation metrics than a sentence-level summary.
3.1 Theoretical Formulation
Elementary Discourse Unit (EDU), the discourse-level textual segment in this paper, refers to a terminal node in the Rhetorical Structure Theory (RST) (Mann and Thompson, 1988) tree, which describes the discourse structure of a piece of text. EDUs are non-overlapping, adjacent text spans in the piece of text, and a single EDU is essentially a segment of a complete sentence, i.e., the sentence itself or a clause in the sentence (Zeldes et al., 2019). Namely, a sentence can always be expressed with multiple EDUs, i.e., for the $s$-th sentence in a document, we have $sent_s = [edu_{s1}, \ldots, edu_{sm}]$. Consequently, a one-way property from sentence to EDU regarding expressiveness is derived.
Expressiveness Property
For any given subset of sentences in a document, i.e., $[sent_i, \ldots, sent_j, \ldots, sent_k]$, there is always a subset of EDUs in the document, i.e., $[edu_{i1}, \ldots, edu_{im}, \ldots, edu_{j1}, \ldots, edu_{jm}, \ldots, edu_{k1}, \ldots, edu_{km}]$, having identical content.
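As a small illustration of this decomposition (the sentence below is hypothetical, not drawn from any dataset), a sentence can be recovered exactly by concatenating its adjacent, non-overlapping EDUs:

```python
# Hypothetical sentence sent_s and its EDU segmentation [edu_s1, edu_s2]:
# the main clause and the subordinate clause are adjacent, non-overlapping
# spans whose concatenation recovers the original sentence.
sent_s = "The committee rejected the proposal because it lacked funding."
edu_s1 = "The committee rejected the proposal "
edu_s2 = "because it lacked funding."
assert edu_s1 + edu_s2 == sent_s
```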
Oracle Summary
The set of salient textual segments that greedily yields the highest ROUGE score(s) with respect to the reference summary is the oracle summary of a document. It signifies the upper bound of performance that an extractive summarization model can achieve on the ROUGE metrics. Denote the sentence-level oracle summary as $OS_{sent}$ and the EDU-level oracle summary as $OS_{edu}$. Based on the aforementioned property and definition, Theorem 1 can be derived; its detailed proof is provided below.
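The greedy construction behind this definition can be sketched as follows. This is a minimal, self-contained illustration that uses a simple $n$-gram F1 overlap as a stand-in for an actual ROUGE implementation, so the function names and details are ours rather than from any cited toolkit:

```python
from collections import Counter

def ngram_f1(candidate, reference, n=1):
    """N-gram F1 overlap between candidate and reference: a simple
    stand-in for ROUGE-N F1, used only for illustration."""
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall) if overlap else 0.0

def greedy_oracle(segments, reference):
    """Greedily add the segment (sentence or EDU) that most improves the
    score, stopping when no remaining segment yields a further gain."""
    selected, best_score = [], 0.0
    remaining = list(segments)
    while remaining:
        scored = [(ngram_f1(" ".join(selected + [seg]), reference), i)
                  for i, seg in enumerate(remaining)]
        score, idx = max(scored)
        if score <= best_score:
            break
        selected.append(remaining.pop(idx))
        best_score = score
    return selected
```

Running the same routine over the sentences of a document yields $OS_{sent}$, and over its EDUs yields $OS_{edu}$.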
Theorem 1. Given a document $D$ and its reference summary $R$, for any derived $OS_{sent}$, there is always an $OS_{edu}$ such that $\mathrm{ROUGE}_{F1}(R, OS_{edu}) \geq \mathrm{ROUGE}_{F1}(R, OS_{sent})$.
Proof. For ROUGE-N, let $f_n$ be a function that generates the set of $n$-grams of a string $s$, and let $g$ be a function that calculates the number of overlapping elements between two sets $x$ and $y$,