pression, a task that shortens a source sentence while retaining its basic meaning (Jing, 2000; Knight and Marcu, 2000; McDonald, 2006). For example, the compression task has been formulated as integer linear programming optimization over syntactic trees (Clarke and Lapata, 2006a), or as a sequence labelling problem using recurrent neural networks (RNNs) (Filippova et al., 2015; Klerke et al., 2016; Kamigaito et al., 2018). These methods rely, explicitly or implicitly, on dependency grammar. Pre-trained language models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2019) can encode features beyond dependency structure (Kamigaito and Okumura, 2020), bringing predicted sentences closer to reference sentences.
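To make the labelling formulation concrete, the sketch below frames compression as binary keep/delete tagging with a bidirectional RNN. It is an illustrative toy model, not the architecture of any cited work; the vocabulary size and hyperparameters are placeholders.

import torch
import torch.nn as nn

class DeletionTagger(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # A bidirectional LSTM reads the whole sentence, so each
        # keep/delete decision sees both left and right context.
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)  # logits for {delete, keep}

    def forward(self, token_ids):            # (batch, seq_len)
        states, _ = self.lstm(self.emb(token_ids))
        return self.out(states)              # (batch, seq_len, 2)

tagger = DeletionTagger()
logits = tagger(torch.randint(0, 10000, (1, 12)))  # random 12-token input
keep_mask = logits.argmax(dim=-1)                  # 1 = keep the token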
All of these methods rely on parallel datasets that label the spans to be deleted. However, what gets deleted in sentence compression differs from what gets deleted in revision. Filippova and Altun (2013) created the Google dataset from the titles and first sentences of news articles, so the information retained in each first sentence depends on its title. While this construction is useful for removing excessive information, the deleted spans are not necessarily wordiness.
Deleting does not solve everything in revision. We can revise "in this report I will conduct a study of ants and the setup of their colonies" to "in this report I will study ants and their colonies", taking advantage of the noun-and-verb homograph "study". However, the more concise version "this report studies ants" (Commnet) requires changing "study" to its third-person singular form, which deletion alone cannot achieve.
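The snippet below replays this example as deletion labels over tokens (labels chosen by hand for illustration): the verb reading of the homograph "study" survives, but no sequence of deletions can produce the inflected form "studies".

# Hand-picked keep (1) / delete (0) labels for the example sentence.
tokens = ("in this report i will conduct a study of ants "
          "and the setup of their colonies").split()
keep = [1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1]
compressed = " ".join(t for t, k in zip(tokens, keep) if k)
print(compressed)  # in this report i will study ants and their colonies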
3.2 Replacing as in Paraphrase Generation
Word choice matters as well, so we also revise by paraphrasing weak expressions into stronger words. Paraphrase generation rewrites a sentence's grammar and reselects its words while retaining its meaning. Paraphrasing matters in academic writing because it helps avoid plagiarism. Rule-based or statistical machine paraphrasing substitutes words with synonyms found in lexical databases and decodes syntax according to template sentences; such rigid methods may undermine creativity (Bui et al., 2021). Pre-trained neural language models like GPT (Radford et al., 2019) or BART (Lewis et al., 2020) paraphrase more accurately (Hegde and Patil, 2020). Through paraphrasing, we can replace the verb phrase "conduct a study" with the verb "study" in the example above, rather than deleting words and relying on the noun-and-verb homograph to keep the sentence syntactically correct.
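As a sketch of this neural approach, the snippet below runs beam-search paraphrasing with a seq2seq model through the transformers library. The checkpoint path is hypothetical and stands in for any BART- or T5-style model fine-tuned for paraphrase generation.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "path/to/paraphrase-finetuned-bart"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

source = ("in this report i will conduct a study of ants "
          "and the setup of their colonies")
inputs = tokenizer(source, return_tensors="pt")
# Beam search lets the decoder rewrite freely, e.g. replacing the verb
# phrase "conduct a study" with an inflected verb rather than deleting.
output_ids = model.generate(**inputs, num_beams=5, max_length=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))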
Machine revision is a kind of paraphrase generation, but the converse is not true: current paraphrase generation does not require concision in the generated sentences. Automatically annotated datasets for paraphrasing include ParaNMT (Wieting and Gimpel, 2018) and Twitter (Lan et al., 2017), as well as repurposed noisy datasets such as MSCOCO (Lin et al., 2014) and WikiAnswers (Fader et al., 2013). We may adapt parallel paraphrase datasets to train revision models, as investigated in Section 5.
3.3 Other Related Tasks
Summarization produces a shorter text from one or several documents while retaining most of their meaning (Paulus et al., 2018), which makes it similar to sentence compression. In practice, however, summarization welcomes novel words, allows specifying the output length (Kikuchi et al., 2016), and removes much more information than sentence compression does. Datasets include XSum (Narayan et al., 2018), CNN/DM (Hermann et al., 2015), WikiHow (Koupaee and Wang, 2018), NYT (Sandhaus, 2008), DUC-2004 (Over et al., 2007), and Gigaword (Rush et al., 2015), where summaries are generally shorter than one-tenth the length of the source documents. On the other hand, sentence summarization (Chopra et al., 2016) applies summarization methods to sentence compression datasets, retaining more information and possibly generating new words.
Text simplification modifies vocabulary and syntax for easier reading while retaining approximate meaning (Omelianchuk et al., 2021). Hand-crafted syntactic rules (Siddharthan, 2006; Carroll et al., 1999; Chandrasekar et al., 1996) and simplification driven by aligned sentences (Yatskar et al., 2010) have been explored. Corpora such as Turk (Xu et al., 2016) and PWKP (Zhu et al., 2010) are compiled from Wikipedia and Simple English Wikipedia (Coster and Kauchak, 2011). Rules for simplification may deviate from those for revision; e.g., text simplification sometimes encourages prepositional phrases (Xu et al., 2016). Still, adapting these approaches may benefit academic revision for concision.
Fluency editing (Napoles et al., 2017) not only corrects grammatical errors but also paraphrases the text to sound more native. Its paraphrasing component is constrained so that outputs represent a higher level of English proficiency than the inputs. As a constrained paraphrase task, fluency editing may