How “Multi” is Multi-Document Summarization?
Ruben Wolhandler, Arie Cattan, Ori Ernst, Ido Dagan
Computer Science Department, Bar Ilan University
{rwolhandler,arie.cattan,oriern}@gmail.com dagan@cs.biu.ac.il
Abstract
The task of multi-document summarization (MDS) aims at models that, given multiple documents as input, are able to generate a summary that combines dispersed information, originally spread across these documents. Accordingly, it is expected that both reference summaries in MDS datasets, as well as system summaries, would indeed be based on such dispersed information. In this paper, we argue for quantifying and assessing this expectation. To that end, we propose an automated measure for evaluating the degree to which a summary is "dispersed", in the sense of the number of source documents needed to cover its content. We apply our measure to empirically analyze several popular MDS datasets, with respect to their reference summaries, as well as the output of state-of-the-art systems. Our results show that certain MDS datasets barely require combining information from multiple documents, as a single document often covers the full summary content. Overall, we advocate using our metric for assessing and improving the degree to which summarization datasets require combining multi-document information, and similarly the degree to which summarization models actually meet this challenge.¹
1 Introduction
Multi-document Summarization (MDS) consists of creating a short and concise summary that includes the salient information in a set of related documents. Beyond the challenges of single-document summarization, a summary of multiple texts is expected to combine and assemble information spread across several input texts. Table 1 illustrates such an example, where the summary combines multiple facts about global warming drawn from the different documents. While the main fact ("melting ice") is described in all source documents, secondary information such as "the rising water" often appears only in certain document(s).

Equal contribution.
¹ Our code is available at https://github.com/ariecattan/multi_mds.

Doc 1: Indigenous Arctic people urged European countries to step up the fight against global warming, saying it is threatening their societies. The Arctic Council said that the amount of sea ice around the North Pole has decreased about 8 percent in 30 years because of global warming.
Doc 2: One of the topics discussed at the global warming conference is the decrease of the sea ice in the Arctic.
Doc 3: Glaciologists worry most about the Arctic ice sheet: if gradually melted, it could raise ocean levels worldwide by about five meters; it is unclear whether the melting stems directly from global warming or from more localized conditions.
Summary: Global warming has caused the Arctic ice to melt considerably. These changes are threatening the indigenous Arctic population and could raise ocean levels worldwide.

Table 1: An example of a summary of multiple documents. The proposition "melting ice" (in blue) appears in all source documents, while "the threat for the Arctic population" (in ochre) and "the rising water" (in red) are mentioned only in documents 1 and 3, respectively.
In order to develop MDS models that effectively merge information from various sources, reference summaries in MDS datasets need to be based on such information, dispersed across the source documents. However, to the best of our knowledge, while existing datasets assume that this property is realized, automatically measuring the degree of multi-text merging has not been investigated in the literature.
In this work, we suggest quantifying the degree to which a summary is "dispersed" in terms of the minimum number of documents needed to cover its content. Accordingly, we develop an automated method for measuring this aspect for any MDS summary. To that end, we first identify the potential provenance of the summary information in all source documents. Then, for each possible number of documents, we form the subset of documents that includes the largest amount of information
aligned with the summary. Finally, we define the degree of multi-text merging of an MDS summary as a function of the amount of summary information not covered by each subset of documents.
We apply our automated measure to evaluate the degree of multi-text merging in four prominent MDS datasets (DUC, TAC, MultiNews and WCEP), as well as in the output of five recent systems. Our results show that some existing datasets barely involve multi-text merging, because the reference summary information mostly appears in a single document. Unsurprisingly, the length of the summary has a substantial impact on the amount of multi-text merging, since longer summaries cover more detailed information, which tends to be spread across documents.
Taken together, our work is the first to measure and empirically analyze multi-text merging in MDS datasets and model summaries. We suggest that future work use our methodology to develop better datasets and to improve the degree of multi-text merging in MDS models.
2 A Measure for Multi-text Merging
2.1 Motivating Analysis
The common dataset structure for an MDS instance is a topic that consists of a set of source documents D = {D_1, ..., D_n} and a summary S. To motivate our measure, we first analyze the degree of multi-text merging on a sample of topics. To that end, we leverage the Summary-Source-Alignment dataset of Ernst et al. (2021), in which human annotators aligned all propositions in reference summaries with corresponding propositions in the source documents that cover the same information, as exemplified in Table 1. Given these alignments on 9 MDS topics from MultiNews (Fabbri et al., 2019), each composed of 4 source documents, we find that a single source document suffices to cover 70% of the summary propositions on its own, while 2 documents cover 95% of them. The remaining source documents thus hardly contribute any substantial information to the summary.
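To make this analysis concrete, the following is a minimal Python sketch that computes, from such gold alignments, the best summary coverage achievable by any subset of k documents. The alignment format (each summary proposition mapped to the set of documents containing an aligned source proposition), the function name, and the toy example are illustrative assumptions, not the actual format of the Ernst et al. (2021) data.

from itertools import combinations

def best_coverage(alignments, doc_ids, k):
    # alignments: summary proposition -> set of document ids that contain an
    # aligned source proposition (hypothetical format, for illustration only).
    # Exhaustive search over subsets is feasible since topics have few documents.
    best = 0.0
    for subset in combinations(doc_ids, k):
        chosen = set(subset)
        covered = sum(1 for docs in alignments.values() if docs & chosen)
        best = max(best, covered / len(alignments))
    return best

# Toy topic mirroring Table 1: "melting ice" appears in all documents,
# the threat to the Arctic population only in Doc 1, the rising water only in Doc 3.
alignments = {"melting ice": {"d1", "d2", "d3"},
              "threat to population": {"d1"},
              "rising water": {"d3"}}
print(best_coverage(alignments, ["d1", "d2", "d3"], 1))  # ~0.67: best single document covers 2 of 3
print(best_coverage(alignments, ["d1", "d2", "d3"], 2))  # 1.0: Docs 1 and 3 together cover everything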
Motivated by this analysis, we develop an automated measure for evaluating the degree of multi-text merging in entire MDS datasets and in system summaries. Our measure operates in the following steps. We first define the coverage score for a given subset of source documents (§2.2). Then, to approximate the minimum number of documents required to cover increasing portions of the summary information, we greedily construct, for each possible number of source documents, the subset of source documents with the highest coverage score (§2.3). Finally, we measure the amount of summary information covered at each subset size, yielding a corresponding coverage curve (§2.4).
2.2 Relative Coverage Score
Let D' ⊆ D be a subset of the source documents. We define the relative coverage of D' as the proportion of summary information that is covered by D', normalized by the information covered by the full set of source documents D:

    cov(D', D, S) = s(D', S) / s(D, S)    (1)
For the absolute coverage score s(D', S), we aim to approximate the human annotation of summary-source proposition alignment in Ernst et al. (2021), which is based on the well-established Pyramid scheme (Nenkova and Passonneau, 2004). Specifically, we follow their automated scheme: (1) we extract all propositions from the summary and from all source documents using OpenIE (Banko et al., 2008);² (2) we compute the similarity score between the propositions in the summary and those in the source documents using SuperPAL, an NLI model fine-tuned on proposition alignment (Ernst et al., 2021); (3) s(D', S) is defined as the number of propositions in S that are aligned with some proposition in D'.
We consider the proportion s(D', S) / s(D, S), rather than the absolute coverage s(D', S), for two main reasons. First, since both reference and system summaries are known to include hallucinated information (Maynez et al., 2020), such content needs to be discarded by our measure in order to properly estimate the amount of information that each single source document actually provides to the summary. Second, normalizing the coverage score mitigates potential omissions of the alignment model.
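To illustrate Equation (1), the sketch below counts the summary propositions that align with at least one proposition from a document subset and normalizes by the coverage of the full document set. It assumes propositions have already been extracted (step 1 above) and takes an arbitrary align_score function standing in for SuperPAL; the function names, the list-of-strings format, and the 0.5 threshold are assumptions for illustration only.

def absolute_coverage(subset_props, summary_props, align_score, threshold=0.5):
    # s(D', S): number of summary propositions aligned with at least one
    # proposition extracted from the document subset D'.
    return sum(
        1
        for s_prop in summary_props
        if any(align_score(s_prop, d_prop) >= threshold for d_prop in subset_props)
    )

def relative_coverage(subset_props, all_props, summary_props, align_score):
    # cov(D', D, S) = s(D', S) / s(D, S), as in Equation (1).
    total = absolute_coverage(all_props, summary_props, align_score)
    if total == 0:  # no summary proposition aligns with any source document
        return 0.0
    return absolute_coverage(subset_props, summary_props, align_score) / total

Normalizing by s(D, S) rather than by the total number of summary propositions reflects the two motivations above: hallucinated summary content that aligns with no source document, as well as alignments missed by the model, does not count against any particular document subset.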
2.3 Maximally-Covering Document Subsets
Given an MDS topic with n source documents, we aim to measure the maximal coverage of the summary content by a document subset of size k ≤ n. To that end, we form n subsets of source documents
² We use the AllenNLP implementation of Stanovsky et al. (2018) to extract the OpenIE tuples. Following Ernst et al. (2021, 2022), we convert each OpenIE tuple into a proposition string by concatenating the predicate and its arguments in their original order.
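A possible sketch of the greedy construction outlined in §2.1, reusing the hypothetical absolute_coverage helper from above: for each subset size it adds the document that most increases coverage and records the resulting relative coverage. This is an illustrative approximation of the procedure, under the same assumed data format, rather than the released implementation.

def greedy_coverage_curve(docs_props, summary_props, align_score):
    # docs_props: document id -> list of extracted propositions (assumed format).
    # Returns the greedy selection order and the relative coverage for each
    # subset size k = 1..n, i.e. the coverage curve described in §2.4.
    all_props = [p for props in docs_props.values() for p in props]
    # Coverage of the full document set, s(D, S); guard against division by zero.
    total = absolute_coverage(all_props, summary_props, align_score) or 1

    selected, curve = [], []
    remaining = set(docs_props)
    while remaining:
        # Add the document whose inclusion yields the highest coverage score.
        best_doc = max(
            remaining,
            key=lambda d: absolute_coverage(
                [p for doc in selected + [d] for p in docs_props[doc]],
                summary_props, align_score),
        )
        selected.append(best_doc)
        remaining.remove(best_doc)
        covered = absolute_coverage(
            [p for doc in selected for p in docs_props[doc]],
            summary_props, align_score)
        curve.append(covered / total)  # cov(D'_k, D, S) for k = len(selected)
    return selected, curve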