

Readability Controllable Biomedical Document Summarization

Zheheng Luo and Qianqian Xie∗ and Sophia Ananiadou

NaCTeM, The University of Manchester

{zheheng.luo, qianqian.xie, sophia.ananiadou}@manchester.ac.uk

Abstract

Different from general documents, it is recognised that the ease with which people can understand a biomedical text is eminently varied, owing to the highly technical nature of biomedical documents and the variance of readers’ domain knowledge. However, existing biomedical document summarization systems have paid little attention to readability control, leaving users with summaries that are incompatible with their levels of expertise. In recognition of this urgent demand, we introduce a new task of readability controllable summarization for biomedical documents, which aims to recognise users’ readability demands and generate summaries that better suit their needs: technical summaries for experts and plain language summaries (PLS) for laypeople. To establish this task, we construct a corpus consisting of biomedical papers with technical summaries and PLSs written by the authors, and benchmark multiple advanced controllable abstractive and extractive summarization models based on pre-trained language models (PLMs) with prevalent controlling and generation techniques. Moreover, we propose a novel masked language model (MLM) based metric and its variant to effectively evaluate the readability discrepancy between lay and technical summaries. Experimental results from automated and human evaluations show that though current control techniques allow for a certain degree of readability adjustment during generation, the performance of existing controllable summarization methods is far from desirable in this task.

1 Introduction

Automatic summarization for biomedical documents (Guo et al., 2020; DeYoung et al., 2021), such as clinical literature (Wang et al., 2020b; DeYoung et al., 2021), provides an efficient way for readers to acquire desirable biomedical information quickly. Unlike general documents, biomedical documents are characterised by abundant scientific jargon (Plavén-Sigray et al., 2017) and complex language structures (Friedman et al., 2002). Therefore, readers such as non-experts and professionals seek textual information at different readability levels, since the variance in their biomedical knowledge affects how easily they understand biomedical papers. For example, an in-domain expert might require accurate and clear technical summaries with medical jargon and professional language, to quickly grasp the main contributions of biomedical papers. In contrast, lay readers usually require plain language summaries with fewer technical terms and more context of the research, which are easier to understand. Nevertheless, current biomedical summarization systems offer only technical abstracts (Sotudeh et al., 2020; DeYoung et al., 2021; Xie et al., 2022b,a; Bishop et al., 2022) or lay language summaries (Guo et al., 2020; Chandrasekaran et al., 2020). Because they do not treat readability as an aspect to be controlled during summary generation (He et al., 2020), they fail to generate summaries that are compatible with users’ differing levels of expertise. We argue that it is urgent to develop biomedical summarization approaches that can not only condense biomedical documents into concise summaries but also adjust the readability level of summaries to improve the dissemination of scientific information.

Figure 1: Example of our task. Summaries are generated according to users’ demand for readability.

∗Corresponding author

arXiv:2210.04705v3 [cs.CL] 1 May 2023

Our research aims to tackle this problem. We thus propose a novel task of readability controllable biomedical document summarization, which is to automatically recognize users’ readability demands and generate summaries that are compatible with their expertise level and needs, as shown in Figure 1. Specifically, in a binary readability control setting, the task is to produce technical summaries for experts and plain language summaries (PLS) for laypeople. The task is challenging because: 1) it requires the model to accurately recognize different readability demands from limited guiding signals; 2) it requires a suitable selection of content from long biomedical documents for various readers, guided by their readability demands; and 3) it requires the model to learn not only lexical and syntactic adjustment but also paraphrasing according to users’ needs, since professionals pay more attention to clarity and accuracy while non-experts prefer summaries that are easier to understand.

To approach this task, we build the first corpus of 28,124 biomedical papers with technical and plain language summaries written by the authors, then conduct a thorough analysis of the collected data, including statistics, readability metrics, and textual features. Next, we examine several controlling techniques on prevalent pre-trained language models (PLMs) and evaluate their performance on our dataset. Apart from automatic assessment, we also conduct a human evaluation because current metrics for readability and text generation are of limited efficacy. To better characterize readability differences between technical summaries and PLSs, we further propose a novel masked noun phrase-based text complexity metric and its variant, both based on the masked language model (MLM). The metric is superior in modelling the semantic structure of biomedical texts compared to traditional metrics and existing MLM-based metrics.
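The general idea of a masked noun phrase-based complexity score can be illustrated with a minimal sketch. This is only one plausible reading, not the authors' definition, which is not given in this section. It assumes the noun phrase spans are already known, that the score is the average negative log-probability of the masked tokens, and that the MLM is hidden behind a hypothetical `predictor` callable. The `toy_predictor` below is a stand-in lookup table, not a real language model.

```python
import math
from typing import Callable, Sequence, Tuple

Span = Tuple[int, int]  # [start, end) token indices of one noun phrase
# Hypothetical interface: (masked tokens, position, original token) -> probability
MaskedPredictor = Callable[[Sequence[str], int, str], float]

MASK = "[MASK]"


def masked_span_complexity(
    tokens: Sequence[str],
    noun_phrases: Sequence[Span],
    predictor: MaskedPredictor,
) -> float:
    """Average negative log-probability of recovering masked noun-phrase tokens.

    Higher values mean the masked tokens are harder to predict from context,
    which this sketch treats as a proxy for higher text complexity.
    """
    losses = []
    for start, end in noun_phrases:
        masked = list(tokens)
        masked[start:end] = [MASK] * (end - start)  # mask the whole phrase
        for pos in range(start, end):
            p = predictor(masked, pos, tokens[pos])
            losses.append(-math.log(max(p, 1e-12)))  # guard against log(0)
    if not losses:
        raise ValueError("no noun phrases to score")
    return sum(losses) / len(losses)


# Toy stand-in for an MLM (assumption): common words are easy to predict.
COMMON_WORDS = {"heart", "blood"}


def toy_predictor(masked_tokens: Sequence[str], position: int, original: str) -> float:
    return 0.5 if original in COMMON_WORDS else 0.01


plain = ["the", "heart", "pumps", "blood"]
technical = ["the", "myocardium", "ejects", "haemoglobin"]
spans = [(1, 2), (3, 4)]
plain_score = masked_span_complexity(plain, spans, toy_predictor)
technical_score = masked_span_complexity(technical, spans, toy_predictor)
```

With a real MLM, the `predictor` would return the model's probability for the original token at the masked position. The spans would come from a noun phrase chunker.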

Overall, our main contributions are summarised as follows: (1) We introduce a novel task of readability controllable biomedical document summarization. (2) We build a corpus of 28,124 biomedical papers with their technical and plain language summaries, which will facilitate further exploration in this task. The corpus can be downloaded from http://www.nactem.ac.uk/readability/. (3) We propose an MLM-based text complexity metric, which surpasses existing readability evaluation metrics on our dataset. (4) We examine controlling techniques, including prompts and multi-heads, on both extractive and abstractive methods to adjust readability during summarization, and find that their performance is far from satisfactory. To the best of our knowledge, this is the first effort to consider readability as a controllable attribute in scientific document summarization.

2 Related Work

2.1 Biomedical Text Summarization

Neural networks and PLMs have been explored for biomedical document summarization in recent years, due to their success in general text summarization (Cohan et al., 2018; Liu and Lapata, 2019a; Zhang et al., 2019; Wang et al., 2021). Sotudeh et al. (2020) improved radiology report summarization by incorporating medical ontology into a sequence-to-sequence summarizer. Wallace et al. (2020) investigated the BART model (Lewis et al., 2020) with domain-specific pre-training strategies and input decorations for multi-document summarization of randomized controlled trials (RCTs). Progress in biomedical summarization has also been driven by the emergence of in-domain corpora. Cohan et al. (2018) and Wang et al. (2020b) compiled a large amount of biomedical literature with their abstracts as summaries. DeYoung et al. (2021) investigated whether systematic reviews could be summarised from their cited clinical trials. Guo et al. (2020) combined summarization and simplification by generating plain language summaries conditioned on abstracts of systematic reviews.

2.2 Controllable Text Summarization

Recent efforts on controllable text summarization mostly focus on news articles. Fan et al. (2018) leveraged PLMs with special tokens prepended to the input to control the length, entities, and style of the generated summary. Zheng et al. (2020) and He et al. (2020) further extended prompts, keywords, and entities as guiding markers. Chan et al. (2021) proposed a constrained Markov decision process (CMDP) based method to control attributes of summarization. Other works have tried exerting control in decoding. HydraSum (Goyal et al., 2021) distributed different values of an attribute across multiple decoders and leveraged a gate mechanism to gain control over properties such as abstractness and length. Amplayo et al. (2021) and Amplayo and Lapata (2021) focused on aspect control in opinion summarization of reviews. To the best of our knowledge, our work is the first effort to consider readability as a controllable attribute in scientific document summarization, which is important for specific domains, especially biomedical science.
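The special-token style of control described above can be sketched in a few lines. This is a minimal illustration applied to readability, not the implementation of any cited system. The token strings `<technical>` and `<plain>` and the helper names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical control tokens; a real system would also register them
# in the tokenizer vocabulary of the underlying model.
CONTROL_TOKENS = {"technical": "<technical>", "plain": "<plain>"}


def add_control_token(document: str, readability: str) -> str:
    """Prepend the control token for the requested readability level."""
    if readability not in CONTROL_TOKENS:
        raise ValueError(f"unknown readability level: {readability!r}")
    return f"{CONTROL_TOKENS[readability]} {document.strip()}"


def build_training_pairs(
    document: str, technical_summary: str, plain_summary: str
) -> List[Tuple[str, str]]:
    """Turn one document with two reference summaries into two (input, target) pairs."""
    return [
        (add_control_token(document, "technical"), technical_summary),
        (add_control_token(document, "plain"), plain_summary),
    ]


pairs = build_training_pairs(
    "We study gene X in mice.",
    "Gene X knockout attenuates phenotype Y.",
    "Switching off one gene made the mice healthier.",
)
```

At inference time the same token would be prepended to request one style or the other, so a single model serves both audiences.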

2.3 Readability Metrics

Readability is defined as the ease with which a reader can understand a piece of text. Many factors are involved in determining readability, such as lexical and syntactic sophistication, discourse cohesion, and background knowledge (Crossley et al., 2017). Prior work on lay summarization (Guo et al., 2020) evaluated its corpus with traditional readability formulas such as the Flesch-Kincaid Grade Level (Kincaid et al., 1975), which is ineffective at revealing readability differences in scientific writing. Martinc et al. (2021b) showed the potential of PLMs in estimating text readability. Devaraj et al. (2021) used an MLM-based metric to better classify technical abstracts and PLSs of medical reviews. In this work, we propose an advanced MLM-based metric to manifest the readability differences among summaries in our corpus and to evaluate the output of the tested models.
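For reference, the Flesch-Kincaid Grade Level mentioned above is computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below implements that formula. The sentence splitter and the vowel-group syllable counter are rough heuristics assumed here for illustration, so its scores will differ somewhat from those of established readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


plain_grade = flesch_kincaid_grade("The drug helps the heart. It is safe.")
technical_grade = flesch_kincaid_grade(
    "Pharmacological intervention attenuates myocardial dysfunction."
)
```

The formula depends only on sentence length and syllable counts. It has no notion of whether a term is domain jargon, which is one reason surface formulas of this kind can miss differences between technical and lay scientific text.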

3 Task Overview

Definition. The objective of this task is to generate summaries of biomedical documents at different readability levels based on users’ demands. Let $D = \{d_1, d_2, \cdots, d_k\}$ denote the set of source documents, where each document $d_i = \{x_{i,1}, x_{i,2}, \cdots, x_{i,n}\}$ is represented by a sequence of $n$ tokens. $S_i$ stands for the target summary of document $d_i$ and is represented by a sequence of $m$ tokens, $\{s_{i,1}, s_{i,2}, \cdots, s_{i,m}\}$, with $m \ll n$, and $r$ denotes the readability level the user might want. The task can be formulated as a conditional generation problem as follows:

$$P(S \mid D, r) = \prod_{i=1}^{k} P(S_i \mid d_i, r) \tag{1}$$

which maximizes the probability of generating $S$ given the document set $D$ and the readability demand $r$. In this work, since the exploration of readability controllable summarization is still at an initial stage, we start with single-document input and a binary readability control between "technical" and "plain language", and leave more fine-grained control to future work. One value of $r$ denotes the demand for a technical summary suitable for experts, while the other denotes the demand for a plain language summary (PLS) for lay readers. Thus, each input document $d_i$ has both a technical target summary and a plain language target summary for training the model. Additionally, a technical summary and a PLS generated from the same document by the same model will be referred to as a pair of summaries in this paper.

Dataset    docs       avg. doc length    avg. abs length    avg. PLS length
PubMed     133,000    3,016              203                -
CDSR       7,805      -                  714                374
Ours       28,124     6,697              287                204

Table 1: Statistics of our PLOS dataset compared with the existing biomedical summarization corpora PubMed (Cohan et al., 2018) and CDSR (Guo et al., 2020).
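The factorization in Equation (1) can be made concrete with a short sketch: the probability of the whole summary set is a product over documents, so its logarithm is a sum of per-document terms. The `Example` type, the `SummaryModel` callable, and the constant toy model below are hypothetical stand-ins for a trained summarizer and are used only to show the arithmetic.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    document: str  # d_i
    summary: str   # S_i, the target summary at the requested readability level


# Hypothetical interface: (summary, document, readability) -> P(S_i | d_i, r)
SummaryModel = Callable[[str, str, str], float]


def corpus_log_likelihood(
    examples: Sequence[Example], readability: str, model: SummaryModel
) -> float:
    """log P(S | D, r) = sum_i log P(S_i | d_i, r), following Equation (1)."""
    total = 0.0
    for ex in examples:
        p = model(ex.summary, ex.document, readability)
        if not 0.0 < p <= 1.0:
            raise ValueError(f"model returned an invalid probability: {p}")
        total += math.log(p)
    return total


def toy_model(summary: str, document: str, readability: str) -> float:
    """Toy stand-in (assumption): assigns every summary probability 0.5."""
    return 0.5


examples = [Example("doc one", "summary one"), Example("doc two", "summary two")]
log_likelihood = corpus_log_likelihood(examples, "plain", toy_model)
```

Working in log space avoids numerical underflow when the product runs over many documents.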

Evaluation. The most commonly used metric for evaluating summarization models is ROUGE (Lin, 2004), which has served as a standard in various text generation tasks. However, a recent study (Bhandari et al., 2020) has shown that ROUGE scores do not always agree with human evaluation when assessing generated summaries. Also, traditional readability metrics are found unable to show the significant readability difference between technical summaries and their simplified counterparts (Devaraj et al., 2021). Thus, we conducted both automatic and human evaluations to assess the readability and general qualities of generated summaries.
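As a reminder of what ROUGE measures, the following sketch computes ROUGE-N precision, recall, and F1 from clipped n-gram overlap. It assumes lowercased whitespace tokenization with no stemming, so its scores will not exactly match those of the official ROUGE toolkit.

```python
from collections import Counter
from typing import Dict, List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate: str, reference: str, n: int = 1) -> Dict[str, float]:
    """ROUGE-N from clipped n-gram overlap between candidate and reference."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the cat sat", "the cat sat on the mat", n=1)
```

Because the score depends only on surface n-gram overlap, a summary can score well while reading at the wrong level for its audience. This is consistent with the choice above to pair ROUGE with human evaluation.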

4 Dataset Description

4.1 Data Compilation

We constructed the corpus from peer-reviewed biomedical research papers, together with their technical summaries and PLSs, published in PLOS journals (https://journals.plos.org/plosone/), including PLOS Biology, PLOS Computational Biology, PLOS Genetics, PLOS Medicine, PLOS Neglected Tropical Diseases, and PLOS Pathogens, which cover a broad range of biomedical research subjects. The PLSs are placed under the section Author Summary in the format of the PLOS articles and written by the authors following the requirement of PLOS
