
tion with the summary. Finally, we define the degree of multi-text merging of an MDS summary as a function of the amount of summary information not covered by each subset of documents.
We apply our automated measure to evaluate the degree of multi-text merging in four prominent MDS datasets (DUC, TAC, MultiNews and WCEP), as well as in the output of five recent systems. Our results show that some existing datasets barely involve multi-text merging, because the reference summary information mostly appears in a single document. Unsurprisingly, the length of the summary has a substantial impact on the amount of multi-text merging, since longer summaries cover more detailed information, which tends to be spread across documents.
Taken together, our work is the first to measure and empirically analyze multi-text merging in MDS datasets and model summaries. We suggest that future work use our methodology to develop better datasets and to improve the degree of multi-text merging in MDS models.
2 A Measure for Multi-text Merging
2.1 Motivating Analysis
The common dataset structure for an MDS instance is a topic that consists of a set of source documents $D = \{D_1, \ldots, D_n\}$ and a summary $S$. To motivate our measure, we first analyze the degree of multi-text merging on a sample of topics. To that end, we leverage the Summary-Source-Alignment dataset of Ernst et al. (2021), in which human annotators aligned all propositions in reference summaries with corresponding propositions in the source documents that cover the same information, as exemplified in Table 1. Given these alignments on 9 MDS topics from MultiNews (Fabbri et al., 2019), each composed of 4 source documents, we find that a single source document alone suffices to cover 70% of the summary propositions, while 2 documents cover 95% of them. The remaining source documents thus hardly contribute any substantial information to the summary.
Motivated by this analysis, we develop an automated measure that allows us to evaluate the degree of multi-text merging in entire MDS datasets and in system summaries. Our measure operates in the following steps. We first define the coverage score for a given subset of source documents (§2.2). Then, to approximate the minimum number of documents required to cover increasing portions of the summary information, we greedily construct, for each possible number of source documents, the subset of source documents with the highest coverage score (§2.3). Finally, we measure the total amount of summary information covered across all subset sizes, yielding a corresponding coverage curve (§2.4).
2.2 Relative Coverage Score
Let $D^* \subseteq D$ be a subset of the source documents. We define the relative coverage of $D^*$ as the proportion of information that is covered by $D^*$, normalized by the information covered by all source documents $D$:

$$\text{cov}(D^*, D, S) = \frac{s(D^*, S)}{s(D, S)} \qquad (1)$$
For the absolute coverage score $s(D^*, S)$, we aim to approximate the human annotation of summary-source proposition alignment of Ernst et al. (2021), which is based on the well-established Pyramid scheme (Nenkova and Passonneau, 2004). Specifically, we follow their automated scheme: (1) we extract all propositions from the summary and all source documents using OpenIE (Banko et al., 2008);² (2) we compute the similarity score between the propositions in the summary and the source documents using SUPERPAL, an NLI model fine-tuned on proposition alignment (Ernst et al., 2021); (3) $s(D^*, S)$ is defined as the number of propositions in $S$ that are aligned with some proposition in $D^*$.
We consider the proportion $s(D^*, S)/s(D, S)$, rather than the absolute coverage $s(D^*, S)$, for two main reasons. First, as both reference and system summaries are known to include hallucinated information (Maynez et al., 2020), we need to discard it in our measure in order to properly estimate the amount of information that each single source document actually provides to the summary. Second, normalizing the coverage score mitigates potential omissions of the alignment model.
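To make the score concrete, here is a minimal Python sketch of Equation (1). The proposition extractor and aligner below are trivial stand-ins, assumed only for illustration, for the OpenIE and SUPERPAL components described above; this is not the actual implementation.

    # Minimal sketch of the relative coverage score in Eq. (1).
    # extract_propositions and align_score are trivial stand-ins for
    # OpenIE and SuperPAL, assumed here only for illustration.

    def extract_propositions(text):
        # Stand-in for OpenIE: treat each sentence as one proposition.
        return [s.strip() for s in text.split(".") if s.strip()]

    def align_score(summary_prop, source_prop):
        # Stand-in for SuperPAL: token overlap instead of an NLI model.
        s = set(summary_prop.lower().split())
        d = set(source_prop.lower().split())
        return len(s & d) / max(len(s), 1)

    def absolute_coverage(doc_subset, summary, threshold=0.5):
        # s(D*, S): summary propositions aligned with some source proposition.
        source_props = [p for doc in doc_subset for p in extract_propositions(doc)]
        return sum(1 for sp in extract_propositions(summary)
                   if any(align_score(sp, dp) >= threshold for dp in source_props))

    def relative_coverage(doc_subset, all_docs, summary):
        # cov(D*, D, S) = s(D*, S) / s(D, S); normalization discards summary
        # content not grounded in any source document (e.g., hallucinations).
        total = absolute_coverage(all_docs, summary)
        return absolute_coverage(doc_subset, summary) / total if total else 0.0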
2.3 Maximally-Covering Document Subsets
Given an MDS topic with $n$ source documents, we aim to measure the maximal coverage of the summary content by a document subset of size $k \le n$. To that end, we form $n$ subsets of source
² We use the AllenNLP implementation of Stanovsky et al. (2018) to extract the OpenIE tuples. Following Ernst et al. (2021, 2022), we convert each OpenIE tuple into a proposition string by concatenating the predicate and its arguments in their original order.
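As an illustration of the greedy construction in §2.3, the following Python sketch grows one subset per size $k$ by always adding the document that most increases relative coverage. This incremental reading of the procedure is our assumption, not the authors' released code, and it reuses the relative_coverage stand-in sketched for §2.2.

    # Illustrative greedy construction of maximally-covering subsets and
    # the resulting coverage curve, reusing relative_coverage from above.

    def greedy_coverage_curve(docs, summary):
        subset, remaining, curve = [], list(docs), []
        for _ in range(len(docs)):
            # Add the document whose inclusion maximizes relative coverage.
            best = max(remaining,
                       key=lambda d: relative_coverage(subset + [d], docs, summary))
            subset.append(best)
            remaining.remove(best)
            curve.append(relative_coverage(subset, docs, summary))
        return curve  # relative coverage at each subset size k = 1..n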