
University of Cape Town’s WMT22 System: Multilingual Machine
Translation for Southern African Languages
Khalid N. Elmadani Francois Meyer Jan Buys
Department of Computer Science
University of Cape Town
{ahmkha009,myrfra008}@myuct.ac.za, jbuys@cs.uct.ac.za
Abstract
This paper describes the University of Cape Town's submission to the constrained track of the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages. Our system is a single multilingual translation model that translates between English and 8 South / South East African languages, as well as between specific pairs of the African languages. We used several techniques suited for low-resource machine translation (MT), including overlap BPE, back-translation, synthetic training data generation, and adding more translation directions during training. Our results show the value of these techniques, especially for directions where very little or no bilingual training data is available.¹

¹ Our model is available at https://github.com/Khalid-Nabigh/UCT-s-WMT22-shared-task.
1 Introduction
Southern African languages are underrepresented in NLP research, in part because most of them are low-resource languages: It is not always possible to find high-quality datasets that are large enough to train effective deep learning models (Kreutzer et al., 2021). The WMT22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages (Adelani et al., 2022) presented an opportunity to apply one of the most promising recent developments in NLP — multilingual neural machine translation — to Southern African languages. For many languages, the parallel corpora released for the shared task are the largest publicly available datasets yet. For some translation directions (e.g. between Southern African languages), no parallel corpora were previously available.
In this paper we present our submission to the shared task. Our system is a Transformer-based encoder-decoder (Vaswani et al., 2017) that translates between English and 8 South / South East African languages (Afrikaans, Northern Sotho, Shona, Swati, Tswana, Xhosa, Xitsonga, Zulu) and in 8 additional directions (Xhosa to Zulu, Zulu to Shona, Shona to Afrikaans, Afrikaans to Swati, Swati to Tswana, Tswana to Xitsonga, Xitsonga to Northern Sotho, Northern Sotho to Xhosa). We trained a single model with shared encoder and decoder parameters and a shared subword vocabulary.
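One common way to realise such a single many-to-many model (a general illustration, not necessarily the exact convention used in our system) is to prepend a target-language tag to each source sentence, so that the shared encoder-decoder can route the same input to different output languages; the tag format and language codes below are hypothetical.

```python
def tag_source(source_sentence, tgt_lang_code):
    """Prepend a target-language token so one shared encoder-decoder
    can route the same source text to different output languages.

    The tag format ("<2xho>") is illustrative; the shared subword
    vocabulary would simply include one such token per target language.
    """
    return f"<2{tgt_lang_code}> {source_sentence}"

# The same English sentence routed to Xhosa vs. Zulu:
print(tag_source("Good morning", "xho"))  # <2xho> Good morning
print(tag_source("Good morning", "zul"))  # <2zul> Good morning
```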
We applied several methods aimed at improving
translation performance in a low-resource setting.
We experimented with BPE (Sennrich et al., 2016b) and overlap BPE (Patil et al., 2022), the latter of which increases the representation of low-resource language tokens in the shared subword vocabulary. We used initial multilingual and bilingual models to generate back-translated sentences (Sennrich et al., 2016a) for subsequent training.
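As a rough sketch of the back-translation step (not the exact pipeline used here), an existing reverse-direction model translates monolingual target-side text back into the source language, yielding synthetic source-target pairs; `translate_fn` below is a hypothetical stand-in for any such trained model.

```python
def back_translate(monolingual_target, translate_fn, src_lang, tgt_lang):
    """Create synthetic (source, target) pairs for the src->tgt direction.

    `translate_fn` stands in for any trained reverse-direction model
    (tgt -> src); it is a placeholder, not the exact system in this paper.
    """
    synthetic_pairs = []
    for target_sentence in monolingual_target:
        # Translate the authentic target-side sentence back into the
        # source language to obtain a synthetic source sentence.
        synthetic_source = translate_fn(target_sentence, src=tgt_lang, tgt=src_lang)
        synthetic_pairs.append((synthetic_source, target_sentence))
    return synthetic_pairs
```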
First, we trained a model to translate between English and the 8 Southern African languages. Then we added the 8 additional translation directions and continued training. For some of these additional directions no parallel corpora were available, so we generated synthetic training data with our existing model. By downsampling some of the parallel corpora to ensure a balanced dataset, we were able to train our model effectively in the new directions, while retaining performance in the old directions.
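A minimal sketch of the downsampling step, assuming the corpora are held as lists of sentence pairs keyed by direction and capped at a fixed per-direction budget (both assumptions for illustration, not details from the paper):

```python
import random

def downsample_corpora(corpora_by_direction, max_pairs_per_direction, seed=0):
    """Randomly subsample each direction's corpus so no direction dominates.

    `corpora_by_direction` maps a direction name (e.g. "eng-xho") to a list
    of (source, target) pairs; the per-direction cap is illustrative.
    """
    rng = random.Random(seed)
    balanced = {}
    for direction, pairs in corpora_by_direction.items():
        if len(pairs) > max_pairs_per_direction:
            balanced[direction] = rng.sample(pairs, max_pairs_per_direction)
        else:
            balanced[direction] = list(pairs)
    return balanced
```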
We describe the development of our model and report translation performance at each training stage. Our final results compare favourably to existing works with overlapping translation directions. While there is considerable disparity in performance across languages, our model nonetheless achieves results that indicate some degree of effective MT across all directions (most BLEU scores are above 10 and most chrF++ scores are above 40). We also discuss our findings regarding techniques for low-resource MT. We found overlap BPE and back-translation to improve performance for most translation directions. Furthermore, our results confirm the value of multilingual models, which proves