
though it improves the overall performance of abstractive summarization in some cases (Dou et al., 2021): 1) Extractive summaries are not reliable guidance. When a document contains too many summary-worthy sentences, selecting only a subset of them is prone to information loss; when it contains too few or none, the selected extractive summaries can be noisy and confusing to the model. 2) Extractive summaries are not flexible enough to adapt to different cases. The number and allocation of salient content pieces can vary across documents. Rather than extracting a fixed number of sentences, flexible guidance should select salient content based on the properties of each document. An imperfect selection process may also introduce further model biases, such as positional or length biases (Zhong et al., 2019). As the summarization process can differ across documents (Grusky et al., 2018; Koupaee and Wang, 2018), reliable guidance should allow flexible content selection and be adaptive to documents with different levels of abstractiveness.
In this paper, we propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
Salience is the degree to which a sentence contributes to the central idea of a document, and its allocation describes how salience is distributed among all sentences in the document. To estimate the salience allocation, a linear classifier is trained on top of the encoder. This estimation is incorporated into the decoder through Salience-Aware Cross-Attention (SACA), which provides the flexibility to decide how much signal to accept from the salience guidance during abstractive summarization. The ground-truth salience label is assigned to each sentence based on its similarity to the ground-truth summary, while the number of salience degrees and their cut-off thresholds are determined from the corpus to balance informativeness and prediction accuracy. To further improve the robustness of the summarization model, we apply label smoothing between adjacent salience degrees during training and use the expectation of salience as a more robust salience estimate.
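To make these components concrete, below is a minimal PyTorch sketch of the ideas just described; it is not the authors' implementation. The function and variable names, the placeholder thresholds, and the exact way the salience embedding enters the cross-attention (added to the keys here) are our assumptions for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

def assign_salience_labels(sim_scores, thresholds=(0.1, 0.3, 0.6)):
    """Bucket each sentence's similarity to the gold summary (e.g. a ROUGE
    score) into discrete salience degrees. The thresholds here are
    placeholders; the paper derives them from corpus statistics."""
    return torch.bucketize(sim_scores, torch.tensor(thresholds))

def smooth_adjacent_labels(labels, num_degrees, eps=0.1):
    """Label smoothing restricted to adjacent salience degrees: the gold
    degree keeps 1 - eps of the mass; its immediate neighbors share eps."""
    one_hot = F.one_hot(labels, num_degrees).float()
    left = F.pad(one_hot[..., 1:], (0, 1))    # mass for the lower neighbor
    right = F.pad(one_hot[..., :-1], (1, 0))  # mass for the higher neighbor
    neighbors = left + right
    neighbors = neighbors / neighbors.sum(-1, keepdim=True).clamp(min=1.0)
    return (1.0 - eps) * one_hot + eps * neighbors

class SalienceHead(nn.Module):
    """Linear classifier over sentence-level encoder states that predicts a
    distribution over num_degrees salience degrees and its expectation."""
    def __init__(self, hidden_size, num_degrees=4):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_degrees)
        self.register_buffer("degree_values",
                             torch.arange(num_degrees, dtype=torch.float))

    def forward(self, sent_states):
        # sent_states: (batch, num_sents, hidden)
        probs = F.softmax(self.classifier(sent_states), dim=-1)
        # Expected salience degree: a soft estimate that is more robust
        # than taking the argmax degree.
        expected = probs @ self.degree_values  # (batch, num_sents)
        return probs, expected

class SalienceAwareCrossAttention(nn.Module):
    """Single-head cross-attention in which a learned embedding of each
    source token's (soft) salience degree is added to the keys, so the
    decoder can modulate how much it follows the salience guidance."""
    def __init__(self, hidden_size, num_degrees=4):
        super().__init__()
        self.q = nn.Linear(hidden_size, hidden_size)
        self.k = nn.Linear(hidden_size, hidden_size)
        self.v = nn.Linear(hidden_size, hidden_size)
        self.salience_emb = nn.Embedding(num_degrees, hidden_size)

    def forward(self, dec_states, enc_states, salience_probs):
        # salience_probs: (batch, src_len, num_degrees), broadcast from
        # the sentence level to the tokens of each sentence beforehand.
        soft_emb = salience_probs @ self.salience_emb.weight  # (B, S, H)
        q = self.q(dec_states)                   # (B, T, H)
        k = self.k(enc_states + soft_emb)        # (B, S, H)
        v = self.v(enc_states)                   # (B, S, H)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        return F.softmax(scores, dim=-1) @ v     # (B, T, H)

At test time, where gold labels are unavailable, the predicted distribution from SalienceHead would presumably take their place as the guidance fed to the decoder.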
The technical contributions of this work are three-fold. First, we develop a new method for abstractive summarization on a Transformer-based encoder-decoder architecture, using the allocation of salience expectation as flexible guidance (§3). Our method provides reliable guidance that adapts well to articles with different levels of abstractiveness (§5.1). Second, we show the effectiveness and reliability of our proposed method compared to existing methods in both automatic (§4.2) and human evaluation (§5.3). Third, empirical results on more than one million news articles reveal a natural fifteen-fifty salience split for news article sentences (§4.3), providing a useful insight for composing news articles.
2 Related Work
Joint extractive and abstractive summarization.
Extractive summarization and abstractive summarization are two general paradigms of text summarization (See et al., 2017; Grusky et al., 2018). Extractive summarization ensures the faithfulness of the generated summary but cannot properly summarize documents when rephrasing is needed (Liu and Liu, 2009). Abstractive summarization, by comparison, is more flexible but may suffer from hallucination (Maynez et al., 2020).
A series of studies attempt to benefit from the advantages of both paradigms by combining them. Hsu et al. (2018) encourage the word-level attention of an abstractive summarization model to be consistent with the relative sentence-level extraction probability from an extractive summarization model. More recent studies show that conducting abstractive summarization with extractive summaries as part of the input leads to better performance (Saito et al., 2020; Pilault et al., 2020; Dou et al., 2021). Extractive summarization can also serve as an effective content selector for abstractive summarization when summarizing long documents (Manakul and Gales, 2021). Some studies (Gehrmann et al., 2018; Li et al., 2020; Saito et al., 2020) instead extract keywords or phrases rather than summary-worthy sentences as guidance, but their performance is not as good as that of methods using sentences (Dou et al., 2021).
Our work extends strict extractive-summary guidance to a soft guidance based on salience allocation. The proposed guidance is more flexible, reliable, and adaptive, leading to better performance.
Selective attention.
Selective attention is a psychological concept referring to the differential processing of simultaneous sources of information (Johnston and Dark, 1986). Incorporating prior knowledge through selective attention is widely explored in natural language processing, especially in