Multilingual Auxiliary Tasks Training: Bridging the Gap between
Languages for Zero-Shot Transfer of Hate Speech Detection Models
Syrielle Montariol*, Arij Riabi*, Djamé Seddah
INRIA Paris, France
firstname.lastname@inria.fr
*These authors contributed equally.
Abstract
Zero-shot cross-lingual transfer learning has
been shown to be highly challenging for tasks
involving many linguistic specificities or
when a cultural gap is present between lan-
guages, such as in hate speech detection. In
this paper, we highlight this limitation for
hate speech detection in several domains and
languages using strict experimental settings.
Then, we propose to train on multilingual aux-
iliary tasks – sentiment analysis, named en-
tity recognition, and tasks relying on syntac-
tic information – to improve zero-shot trans-
fer of hate speech detection models across lan-
guages. We show how hate speech detection
models benefit from a cross-lingual knowledge
proxy brought by auxiliary-task fine-tuning
and highlight these tasks’ positive impact on
bridging the hate speech linguistic and cultural
gap between languages.
1 Introduction
Given the impact social media hate speech can
have on our society as a whole – leading to many
small-scale Overton window effects – the NLP
community has devoted considerable effort to
automatic hate speech detection using machine
learning-based approaches, and proposed differ-
ent benchmarks and datasets to evaluate their tech-
niques (Dinakar et al., 2011; Sood et al., 2012;
Waseem and Hovy, 2016; Davidson et al., 2017;
Fortuna and Nunes, 2018; Kennedy et al., 2020).
However, these systems are designed to be ef-
ficient at a given point in time for a specific type
of online content they were trained on. As hate
speech varies significantly diachronically (Florio
et al., 2020) and synchronically (Yin and Zubiaga,
2021), hate speech detection models need to be
constantly adapted to new contexts. For example,
as noted by Markov et al. (2021), the occurrence
of new hate speech domains and their associated
lexicons and expressions can be triggered by real-
world events, from local-scope incidents to world-
wide crises (e.g., hate speech towards Chinese
communities spiked in 2020 with the emergence
of the COVID-19 pandemic).
New annotated datasets are needed to
optimally capture all these domain-specific, target-
specific hate speech types. The possibility of creat-
ing and constantly updating exhaustively annotated
datasets, adapted to every possible language and
domain, is chimerical. Thus, the task of hate speech
detection is often faced with low-resource issues.
In this low-resource scenario for a given target
language and domain, if annotated data is available
in another language, the main option for most NLP
tasks is to perform zero-shot transfer using a mul-
tilingual language model (Conneau et al., 2020).
However, in our case, hate speech perception is
highly variable across languages and cultures; for
example, some slur expressions may be considered
inoffensive in one language, merely denoting an
informal register, but will be considered offensive,
if not hateful, in another (Nozza, 2021).
Despite the cross-lingual transfer paradigm being
extensively used in hate speech detection to cope
with the data scarcity issue (Basile and Rubagotti,
2018; van der Goot et al., 2018; Pamungkas and
Patti, 2019; Ranasinghe and Zampieri, 2020) or
even the use of models trained on a translation of
the initial training data (Rosa et al., 2021), this
strong hate speech cultural and linguistic variation
can lower the transferability of hate speech detec-
tion models across languages in a zero-shot setting.
To overcome this limitation, in the absence of
training data or efficient translation models for a
target language, the cultural and linguistic infor-
mation specific to this language needs to be found
elsewhere. In this paper, we propose to capture
this information by fine-tuning the language model
on resource-rich tasks in both the transfer’s source
and target language. Indeed, even though hate-
annotated datasets are not available in both lan-
guages, it is likely that similarly annotated data
in the source and target language exist for other
tasks. A language model jointly fine-tuned for this
other task in the two languages can learn some pat-
terns and knowledge, bridging the gap between the
languages, and helping the hate speech detection
model to be transferred between them.
In summary, our work focuses on zero-shot
cross-lingual multitask architectures where an-
notated hate speech data is available only for one
source language, but some annotated data for other
tasks can be accessed in both the source and target
languages. Using a multitask architecture (van der
Goot et al., 2021b) on top of a multilingual model,
we investigate the impact of auxiliary tasks oper-
ating at different linguistic levels of the sentence
(POS tagging, named entity recognition (NER),
dependency parsing, and sentiment analysis) on the trans-
fer effectiveness. Using Nozza (2021)’s original
set of languages and datasets (hate speech against
women and immigrants, from Twitter datasets in
English, Italian and Spanish), our main contribu-
tions are as follows.
• Building strictly comparable corpora across
languages (our comparable datasets are available at
https://github.com/ArijRB/Multilingual-Auxiliary-Tasks-Training-Bridging-the-Gap-between-Languages-for-Zero-Shot-Transfer-of-/),
leading to a thorough evaluation framework, we
highlight cases where zero-shot cross-lingual
transfer of hate speech detection models fails
and diagnose the effect of the choice of the
multilingual language model.
• We identify auxiliary tasks with a positive
impact on cross-lingual transfer when trained
jointly with hate speech detection: sentiment
analysis and NER. The impact of syntactic
tasks is more mixed.
• Using the HateCheck test suite (Röttger et al.,
2021, 2022), we identify which hate speech
classes of functionalities suffer the most from
cross-lingual transfer, highlighting the impact
of slurs; and which ones benefit from joint
training with multilingual auxiliary tasks.
2 Related Work
Intermediate task training.
In order to improve
the efficiency of a pre-trained language model for a
given task, this model can undergo preliminary
fine-tuning on an intermediate task before fine-
tuning again on the downstream task. This idea
was formalized as Supplementary Training on In-
termediate Labeled-data Tasks (STILT) by Phang
et al. (2018), who perform sequential task-to-task
pre-training. More recently, Pruksachatkun et al.
(2020) perform a survey of intermediate and target
task pairs to analyze the usefulness of this inter-
mediate fine-tuning, but only in a monolingual set-
ting. Phang et al. (2020) turn towards cross-lingual
STILT. They fine-tune a language model on nine
intermediate language-understanding tasks in En-
glish and apply it to a set of non-English target
tasks. They show that machine-translating interme-
diate task data for training or using a multilingual
language model does not improve the transfer com-
pared to English training data. However, to the best
of our knowledge, using intermediate task training
data in both the source and the target language for
transfer has not been tested in the literature.
Auxiliary tasks for hate speech detection.
Auxiliary task training for hate speech detection
has been done almost exclusively with the senti-
ment analysis task (Bauwelinck and Lefever, 2019;
Aroyehun and Gelbukh, 2021), and only in mono-
lingual scenarios. However, additional information
is sometimes injected into hate speech classifiers in
other ways. Gambino and Pirrone (2020), among
the best systems on the HaSpeeDe task of EVALITA
2020, use POS-tagged text as input to the classifi-
cation system, which is highly beneficial for Span-
ish and somewhat less so for German and English.
Furthermore, the effect of syntactic infor-
mation is also investigated by Narang and Brew
(2020), using classifiers based on the syntactic
structure of the text for abusive language detection.
Markov et al. (2021) evaluate the impact of manu-
ally extracted POS, stylometric and emotion-based
features on hate speech detection, showing that
the latter two are robust features for hate speech
detection across languages.
Zero-shot cross-lingual transfer for hate speech
detection.
Due to the lack of annotated data on
many languages and domains for hate speech detec-
tion, zero-shot cross-lingual transfer has received
considerable attention in the literature. Among the
most recent work, Pelicon et al. (2021) investigate
the impact of preliminarily training a classification
model on hate speech data in languages other than
the target language; they show that language mod-
els pre-trained on a small number of languages
benefit more from this intermediate training and
often outperform massively multilingual language
models.
To perform cross-lingual experiments, Glavaš et al.
(2020) create a dataset with aligned examples in
six different languages, avoiding the issue of hate
speech variation across languages that we tackle in
this paper. On their aligned test set, they show the
positive impact of intermediate masked language
model fine-tuning on abusive corpora in the target
language. Using aligned corpora allows the authors
to focus on the effect of the intermediate fine-tuning
without the noise of inter-language variability. In
contrast, we investigate the issue of the limited
transferability of hate speech detection models
across languages. Nozza (2021), upon which this
paper builds, demonstrates the limitation
of cross-lingual transfer for domain-specific hate
speech – in particular, hate speech towards women
– and explains it by showing examples of cultural
variation between languages. Some notable hate
speech vocabulary in one language may be used as
an intensifier in another language (for instance,
Nozza (2021) gives the example of the Spanish
word puta, often used as an intensifier without any
misogynistic connotation, while it translates to a
slang version of “prostitute” in English).
Stappen et al.
(2020) perform zero- and few-shot cross-lingual
transfer on some of the datasets we use in this
paper, with an attention-based classification model;
but contrary to us, they do not distinguish between
the hate speech targets.
3 The Bottleneck of Zero-shot
Cross-lingual Transfer
3.1 Hate speech corpora
We use the same hate speech datasets as Nozza
(2021), who relied on them to point out the lim-
itations of zero-shot cross-lingual transfer. The
corpora are in three languages: English (en), Span-
ish (es) and Italian (it); and two domains: hate
speech towards immigrants and hate speech to-
wards women. The corpora come from various
shared tasks. For English and Spanish, we use the
dataset from a shared task on hate speech against
immigrants and women on Twitter (HatEval). For
the Italian corpora, we use the automatic misog-
yny identification challenge (AMI) (Fersini et al.,
2018) for the women domain and the hate speech
detection shared task on Facebook and Twitter
(HaSpeeDe) (Bosco et al., 2018) for the immigrants
domain. Links to the resources are listed in Table 6
in Appendix A.
The hate speech detection task is a binary classi-
fication task where each dataset is annotated with
two labels: hateful and non-hateful. We train bi-
nary classification models on the train set of each
language and predict on the test set of each lan-
guage, investigating two settings: 1) monolingual,
i.e., training and testing on the same language and
domain; 2) zero-shot cross-lingual, i.e., training on
one language and testing on another. We evaluate
the models using macro-F1 as the metric.
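To make this protocol concrete, the snippet below sketches the metric and the transfer-failure criterion used in Section 3.2; the helper names are ours, and the example values are the m-BERT scores reported by Nozza (2021) in Table 1 (women domain, training on Spanish).

from sklearn.metrics import f1_score

def macro_f1(gold, pred):
    # Macro-averaged F1 over the two classes (hateful / non-hateful).
    return f1_score(gold, pred, average="macro")

def transfer_failed(mono_f1, zero_shot_f1, threshold=0.25):
    # A zero-shot transfer counts as "failed" when macro-F1 drops by more
    # than 25% relative to the monolingual score of the source (training)
    # language, the criterion behind the brown cells of Table 1.
    return (mono_f1 - zero_shot_f1) / mono_f1 > threshold

# Spanish monolingual: 83.9; Spanish -> Italian zero-shot: 33.7.
print(transfer_failed(0.839, 0.337))  # True: the transfer fails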
3.2 Original baseline results
The original results reported by Nozza (2021) can
be found in the first rows of Table 1. In the table,
we highlight in brown zero-shot cross-lingual cases
where the macro-F1 score drops by more than 25%
compared to the monolingual setting: these are
cases for which we consider that the cross-lingual
transfer failed. We observe the phenomenon that
raised the issue of zero-shot cross-lingual transfer:
in the women domain, the models trained on Span-
ish and Italian in a zero-shot setting have much
lower scores compared to the monolingual results;
4 out of the 6 cross-lingual cells are highlighted in
brown. One possible cause, as explained by Nozza
(2021), is the presence of language-specific offen-
sive interjections that lead the model to wrongly
classify text as hateful towards women.
On a side note, models trained and tested on the
English corpus on the immigrants domain have par-
ticularly low scores (macro-F1 of 36.8 in the mono-
lingual setting). This phenomenon was also ob-
served by Nozza (2021) and Stappen et al. (2020),
who attribute it to specific words and hashtags
used for scraping the tweets, which lead the model
to overfit, combined with a large discrepancy
between the train and test sets.
3.3 Experimental settings
Building comparable corpora.
We started this
work to investigate the failure of cross-lingual
transfer of hate speech detection models in the
women domain highlighted by Nozza (2021).
However, these experiments were not carried out
in comparable settings: the corpora do not have
the same size across languages and domains. Our
goal is to confirm these results under a strictly
comparable setting and a robust multi-seed
experimental framework.
Therefore, we build comparable corpora in each
language and domain to ensure the comparability
of the transfer settings. We reduce all datasets to
Model          Src lang | immigrants        | women
                        | en    es    it    | en    es    it
m-BERT         en       | 36.8  63.3  59.0  | 55.9  54.6  44.9
(Nozza, 2021)  es       | 59.6  63.0  68.3  | 55.8  83.9  33.7
               it       | 63.5  66.6  77.7  | 54.5  46.3  80.8

Comparable corpus size and new random split:

m-BERT         en       | 72.5  48.5  63.8  | 75.2  41.7  43.4
               es       | 59.4  80.9  58.5  | 54.5  76.9  40.5
               it       | 62.8  54.8  76.3  | 46.3  53.6  88.3
XLM-R          en       | 75.3  51.9  70.1  | 76.6  51.6  49.9
               es       | 62.0  83.4  65.4  | 63.4  77.8  46.9
               it       | 69.2  51.3  78.6  | 60.3  57.3  89.0
XLM-T          en       | 76.8  48.5  73.5  | 78.6  61.5  60.6
               es       | 65.9  84.2  60.7  | 72.5  80.3  51.9
               it       | 71.5  56.8  78.4  | 63.4  58.2  90.3

Table 1: Monolingual and cross-lingual hate speech detection macro-F1 scores on all corpora. All results except those from Nozza (2021) are macro-F1 (%) averaged over 5 runs, all trained for 20 epochs. Numbers in brown highlight cases where the loss in performance in the zero-shot cross-lingual case, compared to the monolingual case, is higher than 25%.
a total size of 2,591 tweets, the size of the small-
est one, sampling from each original split sepa-
rately; each train set has 1,618 tweets, each devel-
opment set 173, and each test set 800. We use
the Kolmogorov–Smirnov test to compare the sen-
tence length distribution (number of tokens) and
the percentage of hate speech between the sampled
and the original datasets, to make sure they stay
comparable. The sampling is done randomly until
the similarity conditions with the original dataset
are met. The original size for each dataset as well
as the sampling size for building the comparable
datasets and the percentage of hateful examples can
be found in Table 7 and Table 8 in Appendix A.
On top of this, before the sub-sampling of the
corpora, we merge the development, test and train
sets for each language and domain before per-
forming a new random split. This allows us to
overcome the train-test discrepancy observed in the
English-immigrants dataset we mentioned above.
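As an illustration, the sketch below implements such a comparability-constrained sub-sampling, assuming each split is a list of (tokens, label) pairs with label 1 for hateful tweets; the function name, significance level, and ratio tolerance are our own assumptions, not the authors' released code.

import random
from scipy.stats import ks_2samp

TRAIN_N, DEV_N, TEST_N = 1618, 173, 800  # per-split sizes used in the paper

def sample_comparable(split, n, alpha=0.05, ratio_tol=0.02, max_tries=1000):
    # Resample n examples until the sentence-length distribution (two-sample
    # Kolmogorov-Smirnov test) and the proportion of hateful examples stay
    # comparable to the original split.
    orig_lengths = [len(tokens) for tokens, _ in split]
    orig_ratio = sum(label for _, label in split) / len(split)
    for _ in range(max_tries):
        sample = random.sample(split, n)
        lengths = [len(tokens) for tokens, _ in sample]
        ratio = sum(label for _, label in sample) / n
        if (ks_2samp(orig_lengths, lengths).pvalue > alpha
                and abs(ratio - orig_ratio) <= ratio_tol):
            return sample
    raise RuntimeError("no comparable sample found within max_tries")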
Pre-processing.
We process the datasets by re-
placing all mentions and URLs with specific to-
kens, and segmenting the hashtags into words
using the Python package wordsegment. Given
the compositional nature of hashtags (a set of
concatenated words), hashtag segmentation is
frequently done as a pre-processing step in the
literature when handling tweets (e.g., Röttger
et al., 2021); it can improve tasks such as tweet
clustering (Gromann and Declerck, 2017).
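A minimal sketch of this pre-processing step, assuming raw tweet strings as input; the placeholder tokens are our choice, while the hashtag segmentation uses the wordsegment package named above.

import re
from wordsegment import load, segment

load()  # load wordsegment's word statistics once

def preprocess(tweet):
    tweet = re.sub(r"@\w+", "@USER", tweet)            # mask user mentions
    tweet = re.sub(r"https?://\S+", "HTTPURL", tweet)  # mask URLs
    # Split hashtags into their component words.
    return re.sub(r"#(\w+)", lambda m: " ".join(segment(m.group(1))), tweet)

print(preprocess("@someone check this #BuildTheWall https://t.co/xyz"))
# Expected: "@USER check this build the wall HTTPURL"
# (the exact segmentation depends on wordsegment's statistics)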
Models training.
For all our experiments, we
use the MACHAMP v0.2 framework (van der Goot
et al., 2021b; https://github.com/machamp-nlp/machamp,
under the MIT license), a multi-task toolkit based on Al-
lenNLP (Gardner et al.,2018). We keep most of the
default hyperparameters of MACHAMP for all ex-
periments, which the authors optimized on a wide
variety of tasks. We fine-tune a multilingual lan-
guage model on the hate speech detection task for
each of the six training corpora described in the pre-
vious section. We keep the best out of 20 epochs
for each run according to the macro-F1 score on
the development set.
Note that the new comparable test sets sampled
from the original corpora are relatively small (800
observations). To increase the robustness of the
results, we use five different seeds when fine-tuning
a language model on the hate speech detection task
and report the average macro-F1 over the five runs.
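To illustrate how such a run can be configured, below is a hedged sketch of a MaChAmp dataset configuration for the joint training described in the introduction: hate speech labels in the source language only, plus an auxiliary task (here, sentiment analysis) in both the source and target languages. File paths are placeholders, and the field names follow our reading of the MaChAmp v0.2 documentation; they should be checked against the repository.

import json

config = {
    "HATE_EN": {  # hate speech data, source language only
        "train_data_path": "data/hate/en.train.tsv",
        "validation_data_path": "data/hate/en.dev.tsv",
        "sent_idxs": [0],  # column holding the pre-processed tweet
        "tasks": {"hate": {"task_type": "classification", "column_idx": 1}},
    },
    "SENT_EN": {  # auxiliary task, source language
        "train_data_path": "data/sentiment/en.train.tsv",
        "validation_data_path": "data/sentiment/en.dev.tsv",
        "sent_idxs": [0],
        "tasks": {"sentiment": {"task_type": "classification", "column_idx": 1}},
    },
    "SENT_ES": {  # auxiliary task, target language: the cross-lingual proxy
        "train_data_path": "data/sentiment/es.train.tsv",
        "validation_data_path": "data/sentiment/es.dev.tsv",
        "sent_idxs": [0],
        "tasks": {"sentiment": {"task_type": "classification", "column_idx": 1}},
    },
}

with open("configs/hate_plus_aux.json", "w") as f:
    json.dump(config, f, indent=2)
# The hate-speech-only baseline of this section keeps just the HATE_EN entry.
# Training is then launched with MaChAmp's train.py pointing at this file;
# see the MaChAmp README for the exact command-line flags.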
Language Models.
We use two general-domain
large-scale multilingual language models: m-
BERT (Devlin et al., 2019), following Nozza (2021),
and XLM-R (Conneau et al., 2020). The former
is the multilingual version of BERT, trained on
Wikipedia content in 104 languages, with 100M
parameters. The latter has the same architecture
as RoBERTa (Liu et al., 2019) with 550M parame-
ters and is trained on the publicly available 2.5 TB
CommonCrawl Corpus, covering 100 languages.
Then, we experiment with XLM-T (Barbieri
et al., 2021), an off-the-shelf XLM-R model fine-
tuned on 200 million tweets (1,724 million tokens)
scraped between May 2018 and March 2020, in
more than 30 languages, including our three target
languages.
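For reference, a minimal sketch of loading these encoders with a binary classification head through the Hugging Face transformers library (the experiments themselves run through MaChAmp); the checkpoint identifiers are our assumptions and should be verified on the model hub. In particular, we map the 550M-parameter XLM-R to its large variant.

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Checkpoint names are our assumptions; verify them on the Hugging Face hub.
CHECKPOINTS = {
    "m-BERT": "bert-base-multilingual-cased",
    "XLM-R": "xlm-roberta-large",                    # 550M-parameter variant
    "XLM-T": "cardiffnlp/twitter-xlm-roberta-base",  # Barbieri et al. (2021)
}

name = CHECKPOINTS["XLM-T"]
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)  # hateful / non-hateful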
3.4 Setting a new baseline
We compare the scores for m-BERT from Nozza
(2021) to the scores obtained using our compara-
ble corpora, reported in Table 1. First, our experi-
ment with m-BERT on comparable corpora allows
us to highlight additional cases where zero-shot
cross-lingual transfer “fails” (macro-F1 dropping
by more than 25% compared to the monolingual
score) in the immigrants domain, which were not visible
in the previous study due to variations in training
corpus size. On top of this, with the new splits,