also other neural approaches) to translationese classification substantially outperform handcrafted feature-engineering-based approaches using SVMs.
However, to date, two important questions remain open: (i) it is not clear whether the substantial performance differences are due to the learned vs. handcrafted features, to the classifiers (SVM, the BERT classification head, or full BERT), or to the combination of both; and (ii) it is not clear what the neural feature and representation learning approaches actually learn, and how that explains their superior classification performance.
The contributions of our paper are as follows:
1. We address (i) by carefully crossing features and classifiers: feeding BERT-learned features to feature-engineering models (SVMs), feeding the BERT classification head with handcrafted features, making BERT architectures learn handcrafted features, and feeding embeddings of handcrafted features into BERT. Our experiments show that SVMs using BERT-learned features perform on a par with our best BERT translationese classifiers, while BERT using handcrafted features only performs at the level of feature-engineering-based classifiers. This shows that it is the features, and not the classifiers, that account for the substantial (up to 20 percentage points absolute accuracy) difference in performance.
2. We take first steps towards addressing (ii) by applying integrated gradients, an attribution-based approach, to the BERT models trained in various settings. Based on striking similarities in attributions between BERT trained from scratch and BERT pretrained on handcrafted features and fine-tuned on text data, together with comparable classification accuracies, we find evidence that the handcrafted features do not contribute any information beyond what BERT learns on its own. It is therefore likely that the handcrafted features are a (possibly partial) subset of the features learnt by BERT. Inspecting the most highly attributed tokens, we present evidence of 'Clever Hans' behaviour: at least part of BERT's high classification accuracy is due to names of places and countries, suggesting that part of the classification is topic- rather than translationese-based. Moreover, some top features suggest that there may be punctuation-based spurious correlations in the data.
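To make the attribution method concrete: the following is a minimal, self-contained sketch of integrated gradients on a hypothetical one-layer sigmoid classifier, standing in for BERT (which is far too large to reproduce here). The weight vector and input are synthetic; the Riemann-sum path integral follows the standard formulation of the method.

```python
import numpy as np

# Hypothetical stand-in classifier: F(x) = sigmoid(w . x), with a random weight vector.
rng = np.random.default_rng(0)
w = rng.normal(size=8)

def F(x):
    return 1.0 / (1.0 + np.exp(-w @ x))

def grad_F(x):
    # Analytic gradient of the sigmoid classifier w.r.t. the input.
    p = F(x)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, steps=200):
    # Midpoint Riemann-sum approximation of the path integral of the
    # gradient along the straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_F(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = rng.normal(size=8)
baseline = np.zeros(8)
attr = integrated_gradients(x, baseline)

# Completeness axiom: the attributions sum to F(x) - F(baseline).
print(abs(attr.sum() - (F(x) - F(baseline))) < 1e-3)
```

The completeness check at the end is what makes integrated gradients attractive for comparing models: each input dimension (here, a feature; for BERT, a token embedding) receives a share of the prediction difference against the baseline.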
2 Related Work
Combining learned and hand-crafted features.
Kaas et al. (2020), Prakash and Tayyar Madabushi (2020), and Lim and Tayyar Madabushi (2020) combine BERT-based and manual features in order to improve accuracy. Kazameini et al. (2020), Ray and Garain (2020), and Zhang and Yamana (2020) concatenate BERT pooled output embeddings with handcrafted feature vectors for classification, often using an SVM, where the handcrafted feature vector may be further encoded by a neural network or used as is. Our work differs in that we do not combine features from both models but swap them, in order to determine whether it is the features, the classifiers, or the combination of both that explains the performance difference between neural and feature-engineering-based models. Additionally, our approach allows us to examine whether or not representation learning learns features similar to the handcrafted ones.
Explainability for the feature-engineering approach to translationese classification.
To date, explainability in translationese research has mainly focused on quantifying handcrafted feature importance. Techniques include inspecting SVM feature weights (Avner et al., 2016; Pylypenko et al., 2021), correlation (Rubino et al., 2016), information gain (Ilisei et al., 2010), chi-square (Ilisei et al., 2010), decision trees or random forests (Rubino et al., 2016; Ilisei et al., 2010), ablating features and observing the change in accuracy (Baroni and Bernardini, 2005; Ilisei et al., 2010), and training separate classifiers on each individual feature (or feature set) and comparing accuracies (Volansky et al., 2015; Avner et al., 2016). For n-grams, the difference in frequencies between the original and translationese classes (Koppel and Ordan, 2011; van Halteren, 2008) and the contribution to the symmetrized Kullback-Leibler divergence between the classes (Kurokawa et al., 2009) have been used.
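The last of these measures can be illustrated concretely. Below is a minimal sketch (toy two-sentence corpora and add-epsilon smoothing are simplifying assumptions, not the cited setup) of scoring each token by its contribution to the symmetrized Kullback-Leibler divergence between the unigram distributions of the two classes.

```python
import math
from collections import Counter

def skl_contributions(corpus_a, corpus_b, eps=1e-9):
    """Per-token contribution to the symmetrized KL divergence
    KL(p||q) + KL(q||p) between the unigram distributions of two
    token lists. Each term (p - q) * log(p / q) is nonnegative,
    and the terms sum to the symmetrized divergence."""
    ca, cb = Counter(corpus_a), Counter(corpus_b)
    na, nb = sum(ca.values()), sum(cb.values())
    contrib = {}
    for tok in set(ca) | set(cb):
        p = ca[tok] / na + eps  # epsilon smoothing for unseen tokens
        q = cb[tok] / nb + eps
        contrib[tok] = (p - q) * math.log(p / q)
    return contrib

# Hypothetical toy "original" vs. "translationese" corpora.
orig = "the cat sat on the mat".split()
trans = "the the cat lay on a mat".split()
scores = skl_contributions(orig, trans)
top = max(scores, key=scores.get)  # token most responsible for the divergence
```

Tokens appearing with very different relative frequencies in the two classes (here, "sat", which occurs only in the first corpus) dominate the divergence and are read off as the most class-discriminative features.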
Explainability for the neural approach to translationese classification.
To date, explainability methods for neural networks have not been widely explored in translationese research. Pylypenko et al. (2021) quantify to what extent handcrafted features can explain the variance in the predictions of neural models, such as BERT, LSTMs, and a simplified Transformer, by training per-feature linear regression models to output the predicted probabilities of the neural models and computing the R² measure. They find that most of