Bilingual Synchronization: Restoring Translational Relationships with
Editing Operations
Jitao Xu† Josep Crego‡ François Yvon†
†Université Paris-Saclay, CNRS, LISN, 91400, Orsay, France
‡SYSTRAN, 5 rue Feydeau, 75002, Paris, France
{jitao.xu,francois.yvon}@limsi.fr, josep.crego@systrangroup.com
Abstract
Machine Translation (MT) is usually viewed as a one-shot process that generates the target language equivalent of some source text from scratch. We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM) and TM cleaning. Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
1 Introduction
Neural Machine Translation (NMT) systems have made tangible progress in recent years (Bahdanau et al., 2015; Vaswani et al., 2017), as they have started to produce usable translations in production environments. In autoregressive approaches, NMT is generally viewed as a one-shot process, which generates the target translation based on the source-side input alone. Recently, Non-autoregressive Machine Translation (NAT) models have been proposed that perform iterative refinement decoding (Lee et al., 2018; Ghazvininejad et al., 2019; Gu et al., 2019), where translations are generated through an iterative revision process, starting from a possibly empty initial hypothesis.
This paper focuses on the revision part of the machine translation (MT) process and considers bilingual synchronization (Bi-sync), which we define as follows: given a source sentence (f) and a target sentence (˜e), which may or may not be mutual translations, the task is to compute a revised version e of ˜e, such that e is an actual translation of f. This is necessary when the source side of an existing translation is edited, requiring the target to be updated so as to keep both sides synchronized. Bi-sync subsumes standard MT, where the synchronization starts with an empty target (˜e = []). Other interesting cases occur when parts of the initial target can be reused, so that the synchronization only requires a few changes.
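To make the task concrete, the following sketch (independent of any model discussed in this paper) uses Python's `difflib` to list the token-level edit operations that turn an outdated target ˜e into the synchronized translation e; the example sentences are invented for illustration:

```python
from difflib import SequenceMatcher

def sync_edits(initial_target, revised_target):
    """List the token-level edit operations that turn the
    initial target ~e into the valid translation e."""
    e_tilde, e = initial_target.split(), revised_target.split()
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=e_tilde, b=e).get_opcodes():
        if tag != "equal":  # keep only actual edits
            ops.append((tag, e_tilde[i1:i2], e[j1:j2]))
    return ops

# Suppose the source was edited from "the cat sleeps" to
# "the black cat sleeps": the target needs one insertion
# to stay synchronized with the new source.
print(sync_edits("le chat dort", "le chat noir dort"))
# → [('insert', [], ['noir'])]
```

When the initial target is empty, every token of e appears as an insertion, which matches the observation that standard MT is the special case ˜e = [].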
Bi-sync encompasses several tasks: synchronization is needed in interactive MT (IMT, Knowles and Koehn, 2016) and bilingual editing (Bronner et al., 2012), with ˜e the translation of a previous version of f; in MT with lexical constraints (Hokamp and Liu, 2017), where ˜e contains target-side constraints (Susanto et al., 2020; Xu and Carpuat, 2021); in Translation Memory (TM) based approaches (Bulte and Tezcan, 2019), where ˜e is a TM match for a similar example; and in automatic post-editing (APE) (do Carmo et al., 2021), where ˜e is an MT output.
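In all of these settings, the initial target ˜e is available at inference time. One simple way to expose it to a standard encoder-decoder, in the spirit of the concatenation approach of Bulte and Tezcan (2019), is to join f and ˜e into a single input string. A minimal sketch, where the separator token name is an assumption for illustration, not the exact choice of any cited system:

```python
def make_bisync_input(src, init_tgt, sep="<sep>"):
    """Concatenate the source f and the initial target ~e into a
    single input sequence for a sequence-to-sequence model.
    An empty initial target reduces the task to standard MT input."""
    return f"{src} {sep} {init_tgt}" if init_tgt else src

# TM-style input: the source plus a fuzzy match as initial target.
print(make_bisync_input("the black cat sleeps", "le chat dort"))
# → the black cat sleeps <sep> le chat dort

# Standard MT: empty initial target, only the source remains.
print(make_bisync_input("the black cat sleeps", ""))
# → the black cat sleeps
```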
We consider here several implementations of sequence-to-sequence models dedicated to these situations, contrasting an autoregressive model with a non-autoregressive approach. The former is similar to Bulte and Tezcan (2019), where the source sentence and the initial translation are concatenated as one input sequence; the latter uses the Levenshtein Transformer (LevT) of Gu et al. (2019). We also study various ways to generate appropriate training samples (f, ˜e, e). Our experiments consider several tasks, including TM cleaning, which attempts to fix and synchronize noisy segments in a parallel corpus. This setting is more difficult than Bi-sync, as many initial translations are already correct and need to be left unchanged. Our results suggest that one single AR system, once fine-tuned, can favorably compare with dedicated systems for each of these tasks. To recap, our main contributions are: (a) the generalization of several tasks subsumed by a generic synchronization objective, allowing us to develop a unified perspective about otherwise unrelated subdomains of MT; (b) the design of a
arXiv:2210.13163v1 [cs.CL] 24 Oct 2022