
transformer-based masked language models (MLMs) pre-trained on a variety of text data are suitable
for general-purpose use cases.
Creating a domain-specific bias in a pre-training corpus has previously yielded state-of-the-art results
(Gururangan et al., 2020). Thus, in this paper, we investigate the impact of hateful pre-training
on hate speech classification. Previous work has shown the positive impact of using Hateful
BERT for downstream hate speech identification tasks (Caselli et al., 2020; Sarkar et al., 2021).
However, it remains to be verified whether the improvements were indeed due to the hateful nature of the
pre-training corpus or were simply a side effect of adaptation to target-domain text. The past work on
high-resource languages is thus incomplete and does not provide sufficient evidence to analyze the
impact of hateful pre-training. To complete the analysis, we pre-train our models using both hateful
and non-hateful data from the target domain. Moreover, there is no previous work on hateful
pre-training in low-resource Indic languages. Our work also fills this gap for low-resource
languages.
While evaluating the impact of pre-training, we build some useful resources for Hindi and Marathi.
We introduce two new models, MahaTweetBERT and HindTweetBERT, pre-trained on 40 million
Marathi and Hindi tweets, respectively. We use these models along with MuRIL (Khanuja et al., 2021),
the state-of-the-art Indic multilingual BERT, to generate baseline results. To extract the most hateful
and least hateful tweets from these corpora of 40 million tweets, we classify the tweets using previous
state-of-the-art models and choose the tweets with the highest confidence (most hateful) and lowest
confidence (least hateful). We show that the selected data is indeed hateful by randomly choosing 2000
samples per language and labeling them manually. To determine whether hateful pre-training actually
has an impact, we compare the performance of models pre-trained on the most hateful, least hateful,
and random corpora against our baselines on downstream hate speech identification tasks. We show
that hateful pre-training is helpful when considered in isolation; however, non-hateful or random
pre-training is equally good. The improvement in performance with hateful pre-training could therefore
be a side effect of target-domain adaptation rather than a result of the hatefulness of the pre-training
corpus. The hateful models are termed MahaTweetBERT-Hateful and HindTweetBERT-Hateful.
The 40M tweet corpora are termed L3Cube-MahaTweetCorpus and HindTweetCorpus for Marathi
and Hindi, respectively. The datasets and models released as part of this work will be documented on
the MarathiNLP GitHub repository as well.
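The hateful/non-hateful split described above is driven purely by classifier confidence scores. The snippet below is a minimal sketch of this filtering step, assuming a fine-tuned hate speech classifier is available through the HuggingFace pipeline API; the model identifier, the positive label name ("HATE"), and the 10% cutoff are illustrative assumptions rather than the exact settings used in this work.

```python
# Minimal sketch of the confidence-based split of a tweet corpus into
# "most hateful" and "least hateful" subsets. The model id, the positive
# label name ("HATE"), and the 10% cutoff are illustrative assumptions.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="path/to/hate-speech-classifier",  # placeholder for a fine-tuned hate speech model
)

# One tweet per line in a plain-text file.
with open("tweets.txt", encoding="utf-8") as f:
    tweets = [line.strip() for line in f if line.strip()]

# Score every tweet and keep the probability assigned to the hateful class.
scored = []
for tweet, pred in zip(tweets, classifier(tweets, batch_size=64, truncation=True)):
    hate_prob = pred["score"] if pred["label"] == "HATE" else 1.0 - pred["score"]
    scored.append((hate_prob, tweet))

scored.sort(key=lambda pair: pair[0], reverse=True)
k = len(scored) // 10  # e.g. keep the top/bottom 10% by confidence
most_hateful = [tweet for _, tweet in scored[:k]]    # highest-confidence hateful tweets
least_hateful = [tweet for _, tweet in scored[-k:]]  # lowest-confidence (non-hateful) tweets
```

The highest-confidence predictions form the hateful pre-training corpus and the lowest-confidence ones the non-hateful corpus, with a random sample serving as the third, unbiased pre-training set.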
The main contributions of this work are as follows.
• We show that hateful BERT is not always desirable for hate speech detection tasks, and a
BERT model pre-trained on non-hateful in-domain data yields similar or better performance.
• We release pre-trained Twitter BERT models MahaTweetBERT and HindTweetBERT for
Marathi and Hindi. We also release MahaTweetBERT-Hateful and HindTweetBERT-Hateful,
the hateful versions of the corresponding models. These models are obtained by fine-tuning
the current state-of-the-art MahaBERT and HindBERT models on the corresponding language's
tweet data (40M sentences); a minimal sketch of this setup is shown after the list.
• We release gold-standard benchmark hate speech detection datasets HateEval-Mr and
HateEval-Hi, each with 2000 manually labeled tweets.
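As referenced in the second contribution, the following is a minimal sketch of how a base model such as MahaBERT could be further trained with the masked language modeling objective on tweet text using the HuggingFace Trainer. The checkpoint name, file paths, and hyperparameters are illustrative assumptions, not the exact configuration used for MahaTweetBERT or HindTweetBERT.

```python
# Minimal sketch of continued MLM pre-training on tweet text.
# The base checkpoint, data file, and hyperparameters are illustrative
# assumptions and may differ from the paper's actual setup.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "l3cube-pune/marathi-bert-v2"  # example Marathi BERT checkpoint; swap for a Hindi BERT for Hindi
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# One tweet per line; could be the full 40M corpus or a hateful/non-hateful subset.
dataset = load_dataset("text", data_files={"train": "tweets.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Standard BERT-style masking of 15% of the tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mahatweetbert",
        per_device_train_batch_size=64,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```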
2 Related Work
Pre-trained models have obtained remarkable results in many areas of NLP. Although these pre-trained
models are well suited for general tasks, they have limitations on domain-specific
tasks. To address this, numerous domain-specific models have been developed based on the BERT
architecture (Devlin et al., 2018). Domain-specific NLP models are pre-trained on in-domain data
that is unique to a specific category of text. For example, BioBERT (Lee et al., 2020) is a model
that is trained on large-scale biomedical corpora. It outperforms the previous state-of-the-art models