
can not only condense biomedical documents into
concise summaries but also adjust the readability
level of summaries to improve the dissemination
of scientific information.
Our research aims to tackle this problem. We thus
propose a novel task of readability controllable
biomedical document summarization, which is to
automatically recognize users’ readability demands
and generate summaries that are compatible
with their expertise level and needs, as shown in
Figure 1. Specifically, in a binary readability
control setting, the goal is to produce technical
summaries for experts and plain language summaries
(PLS) for laypeople. The task is challenging since:
1) it requires the model to accurately recognize dif-
ferent readability demands from limited guiding
signals, 2) it requires a suitable selection of con-
tent from long biomedical documents for various
readers guided by their readability demands, 3) it
requires the model to learn not only lexical and
syntactic adjustment but also paraphrasing according
to users’ needs, since professionals pay more
attention to clarity and accuracy while non-experts
prefer summaries that are easier to understand.
To approach this task, we build the first corpus
consisting of 28,124 biomedical papers with
technical and plain language summaries written by
the authors, then conduct a thorough analysis of the
collected data including statistics, readability met-
rics, and textual features. Next, we examine sev-
eral controlling techniques on prevalent pre-trained
language models (PLMs) and evaluate their performance
on our dataset. In addition to automatic
assessment, we conduct human evaluation because
current metrics for readability and text generation
are inadequate. To better characterize
readability differences between technical summaries
and PLS, we further propose a novel masked noun
phrase-based text complexity metric and its variant
based on the masked language model (MLM). The
proposed metric is superior in modelling the semantic
structure of biomedical texts compared to traditional
metrics and existing MLM-based metrics.
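The general idea of scoring complexity through masked noun phrases can be illustrated with a minimal sketch. It is not the formulation proposed in this paper, which is not given in this section. It assumes that noun-phrase spans are supplied by an external chunker, and that a hypothetical `predict` callable returns the probability a masked language model assigns to each original token of a masked span. The toy predictor below is a stand-in so that the example runs without a real MLM.

```python
import math
from typing import Callable, List, Sequence, Tuple

MASK = "[MASK]"

# Hypothetical predictor interface (an assumption, not the paper's API):
# takes the masked token sequence and the original tokens of the masked
# span, and returns one probability per original token.
Predictor = Callable[[List[str], List[str]], List[float]]


def masked_np_complexity(
    tokens: Sequence[str],
    np_spans: Sequence[Tuple[int, int]],
    predict: Predictor,
) -> float:
    """Average negative log-probability of masked noun-phrase tokens.

    Each span is (start, end) with end exclusive. A higher score means
    the noun phrases are harder to recover from context.
    """
    total, count = 0.0, 0
    for start, end in np_spans:
        original = list(tokens[start:end])
        masked = list(tokens[:start]) + [MASK] * (end - start) + list(tokens[end:])
        for p in predict(masked, original):
            total += -math.log(max(p, 1e-12))  # clamp to avoid log(0)
            count += 1
    return total / count if count else 0.0


# Toy stand-in for an MLM: a few everyday words are "easy" to predict,
# every other word is "hard". A real model would use the masked context.
COMMON_WORDS = {"the", "drug", "heart", "attack"}


def toy_predict(masked: List[str], original: List[str]) -> List[float]:
    return [0.5 if w.lower() in COMMON_WORDS else 0.01 for w in original]


plain = "the drug lowers the risk of heart attack".split()
plain_spans = [(0, 2), (6, 8)]  # "the drug", "heart attack"
technical = "the drug reduces myocardial infarction incidence".split()
technical_spans = [(0, 2), (3, 6)]  # "the drug", "myocardial infarction incidence"

plain_score = masked_np_complexity(plain, plain_spans, toy_predict)
technical_score = masked_np_complexity(technical, technical_spans, toy_predict)
```

Under these assumptions the technical sentence receives the higher score, because its noun phrases are less predictable for the stand-in model.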
Overall, our main contributions are summarised
as follows: (1) We introduce a novel task of
readability controllable biomedical document
summarization. (2) We build a corpus (which can be
downloaded from http://www.nactem.ac.uk/readability/)
with 28,124 biomedical papers with their technical
and plain language summaries, which will facilitate
further exploration in this task. (3) We propose an
MLM-based text complexity metric, which surpasses
existing readability evaluation metrics on our dataset.
(4) We examine controlling techniques including
prompts and multi-heads on both extractive and
abstractive methods to adjust readability during
summarization and find that the performance is far
from satisfactory. To the best of our knowledge, this is the
first effort to consider readability as a controllable
attribute in scientific document summarization.
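Prompt-based control, as mentioned in contribution (4), can be sketched as prepending a control token to the input so that one model learns both readability levels. This is a minimal illustration under stated assumptions: the token strings `<technical>` and `<plain>` and the helper names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical control tokens; the actual prompt strings are not
# specified in this section.
CONTROL_TOKENS: Dict[str, str] = {"expert": "<technical>", "lay": "<plain>"}


def build_controlled_input(document: str, audience: str) -> str:
    """Prepend the readability control token for the given audience."""
    if audience not in CONTROL_TOKENS:
        raise ValueError(f"unknown audience: {audience!r}")
    return f"{CONTROL_TOKENS[audience]} {document.strip()}"


def build_training_pairs(
    document: str, technical_summary: str, plain_summary: str
) -> List[Tuple[str, str]]:
    """Turn one paper with two reference summaries into two
    (input, target) pairs, one per readability level."""
    return [
        (build_controlled_input(document, "expert"), technical_summary),
        (build_controlled_input(document, "lay"), plain_summary),
    ]
```

At inference time, the same document can then be paired with either token to request a technical summary or a PLS from a single summarizer.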
2 Related Work
2.1 Biomedical Text Summarization
Neural networks and PLMs have been explored
for biomedical document summarization in recent
years, due to their success in general text summarization
(Cohan et al., 2018; Liu and Lapata, 2019a;
Zhang et al., 2019; Wang et al., 2021). Sotudeh
et al. (2020) improved radiology report summa-
rization by incorporating medical ontology into a
sequence-to-sequence summarizer. Wallace et al.
(2020) investigated the BART model (Lewis et al.,
2020) with domain-specific pre-training strategies
and input decorations for multi-document summa-
rization of randomized controlled trials (RCTs).
Progress in biomedical summarization has also
been driven by the emergence of in-domain
corpora. Cohan et al. (2018) and Wang et al.
(2020b) compiled a large amount of biomedical
literature with their abstracts as summaries. DeYoung
et al. (2021) investigated whether systematic reviews
could be summarised from their cited clinical trials.
Guo et al. (2020) combined summarization and
simplification by generating plain language summaries
conditioned on abstracts of systematic reviews.
2.2 Controllable Text Summarization
Recent efforts on controllable text summarization
mostly focus on news articles. Fan et al. (2018)
leveraged PLMs with special tokens prepended to
the input, to control the length, entities, and style
of the generated summary. Zheng et al. (2020)
and He et al. (2020) further extended the guiding
markers to prompts, keywords, and entities. Chan et al.
(2021) proposed a constrained Markov decision
process (CMDP) based method to control attributes
of the generated summaries. Other works have tried
exerting control during decoding. HydraSum (Goyal et al., 2021)
distributed different values of an attribute into mul-
tiple decoders and leveraged a gate mechanism to
gain control over properties such as abstractness