Hate Speech and Offensive Language Detection in Bengali
Mithun Das, Somnath Banerjee, Punyajoy Saha, Animesh Mukherjee
Indian Institute of Technology Kharagpur, West Bengal, India
mithundas@iitkgp.ac.in, som.iitkgpcse@kgpian.iitkgp.ac.in,
punyajoys@iitkgp.ac.in, animeshm@cse.iitkgp.ac.in
Abstract
Social media often serves as a breeding ground for various kinds of hateful and offensive content. Identifying such content on social media is crucial because of its adverse impact on people targeted on the basis of race, gender, or religion, and hence on an unprejudiced society. However, while there is extensive research on hate speech detection in English, there is a gap in hateful content detection for low-resource languages like Bengali. Besides, a current trend on social media is the use of Romanized Bengali for regular interactions. To overcome the limitations of existing research, in this study we develop an annotated dataset of 10K Bengali posts consisting of 5K actual and 5K Romanized Bengali tweets. We implement several baseline models for the classification of such hateful posts. We further explore interlingual transfer mechanisms to boost classification performance. Finally, we perform an in-depth error analysis by looking into the posts misclassified by the models. When training on the actual and Romanized datasets separately, we observe that XLM-RoBERTa performs the best. Further, we witness that with joint training and few-shot training, MuRIL outperforms the other models by interpreting the semantic expressions better. We make our code and dataset public for others [1].

[1] https://github.com/hate-alert/Bengali_Hate
1 Introduction
Social media websites like Twitter and Facebook have brought billions of people together and given them the opportunity to share their thoughts and opinions rapidly. On the one hand, this has facilitated communication and the growth of social networks; on the other, it has been exploited to propagate misinformation, violence, and hate speech (Mathew et al., 2019; Das et al., 2020) against users based on their gender, race, religion, or other characteristics. If such content is left unaddressed, it may result in widespread conflict and violence, raising concerns about the safety of human rights, the rule of law, and freedom of speech, all of which are crucial for the growth of an unprejudiced democratic society (Rizwan et al., 2020). Organizations such as Facebook have been blamed for serving as a forum for instigating anti-Muslim violence in Sri Lanka that resulted in the deaths of three individuals [2], and a UN report accused them of disseminating hate speech in a way that contributed significantly to the plausible genocide of the Rohingya population in Myanmar [3].
In order to reduce the dissemination of such harmful content, these platforms have developed certain guidelines [4] that their users ought to comply with. If these rules are not followed, the post can get deleted or the user's account might get suspended. To further diminish harmful content on their forums, these platforms also engage moderators (Newton, 2019) to manually review posts and keep the platform wholesome and people-friendly. However, this moderation strategy is constrained by the moderators' speed, their familiarity with jargon and evolving slang, and their ability to understand multilingual content. Moreover, given the sheer magnitude of streaming data, it is an ambitious endeavor to examine each post manually and filter out such harmful content. Hence, an automated technique for detecting hate speech and offensive language is extremely necessary.
It has already been witnessed that Facebook vigorously eliminated a considerable amount of malicious content from its platforms even before users reported it (Robertson, 2020). However, the hindrance is that these platforms can detect harmful content only in certain popular languages such as English, Spanish, etc. (Perrigo, 2019). So far, several investigations have been conducted to identify hate speech automatically, focusing mainly on the English language; therefore, an effort is required to detect and diminish such hateful content in low-resource languages.

[2] https://tinyurl.com/sriLankaRiots
[3] https://www.reuters.com/investigates/special-report/myanmar-facebook-hate
[4] https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
With more than 210 million speakers, Bengali is the seventh most widely spoken language [5], with around 100 million Bengali speakers in Bangladesh and 85 million in India. Apart from Bangladesh and India, Bengali is spoken in many countries, including the United Kingdom, the United States, and the Middle East [6]. Also, a current trend on social media platforms is that, apart from actual Bengali, people tend to write Bengali using the Latin script (English characters) and often use English phrases in the same conversation; for example, the Bengali sentence "আমি তোমাকে ভালোবাসি" may be typed as "ami tomake bhalobashi". This unique and informal communication dialect is called code-mixed Bengali or Roman Bengali. Code-mixing makes it easier for speakers to communicate with one another by providing a more comprehensive range of idioms and phrases. However, as emphasized by Chittaranjan et al. (2014), it has made the task of creating NLP tools more challenging. Along with these general challenges, the challenges specific to identifying hate speech in Roman Bengali include the absence of a hate speech dataset and the lack of benchmark models. Thus, there is a need to develop open, efficient datasets and models to detect hate speech in Bengali. Although a few studies have been conducted on developing Bengali hate speech datasets, most of them have been crawled from comments on Facebook pages, and all of them are in actual Bengali. Hence, there is a need to develop more benchmark datasets covering other popular platforms. To address these limitations, in this study we make the following contributions.
- First, we create a gold-standard dataset of 10K tweets, of which 5K are actual Bengali and 5K are Roman Bengali.
- Second, we implement several baseline models to automatically identify such hateful and offensive content in both actual and Roman Bengali tweets.
- Third, we explore several interlingual transfer mechanisms to boost classification performance.
- Finally, we perform an in-depth error analysis by looking into a sample of test instances misclassified by the models.

[5] https://www.berlitz.com/en-uy/blog/most-spoken-languages-world
[6] https://www.britannica.com/topic/Bengali-language
2 Related Work
Over the past few years, research on automated hate speech detection has evolved tremendously. Earlier efforts in developing resources for hate speech detection focused mainly on the English language (Waseem and Hovy, 2016; Davidson et al., 2017; Founta et al., 2018). Recently, in an effort to create multilingual hate speech datasets, several shared task competitions have been organized (HASOC (Mandl et al., 2019), OffensEval (Zampieri et al., 2019), TRAC (Kumar et al., 2020), etc.), and multiple datasets in languages such as Hindi (Modha et al., 2021), Danish (Sigurbergsson and Derczynski, 2020), Greek (Pitenis et al., 2020), Turkish (Çöltekin, 2020), Mexican Spanish (Aragón et al., 2019), etc. have been made public. There is also some work on detecting hate speech in actual Bengali. Ishmam et al. (Ishmam and Sharmin, 2019) collected and annotated 5K comments from Facebook into six classes: inciteful, hate speech, religious hatred, communal attack, religious comments, and political comments. However, the dataset is not publicly available. Karim et al. (Karim et al., 2021) provided a dataset of 8K hateful posts collected from multiple sources such as Facebook, news articles, blogs, etc. One of the problems with this dataset is that every comment belongs to one of the hate classes (personal, geopolitical, religious, and political); since there is no normal class, it cannot be used to build hate speech detection models that screen out hate speech. Romim et al. (2021) curated a dataset of 30K comments, making it one of the most extensive datasets for hateful statements. The authors achieved 87.5% accuracy on their test dataset using an SVM model. However, these datasets do not consider Roman Bengali posts, a prevalent communication medium on social media nowadays.
            Actual   Roman    Total
Hateful        825     510    1,335
Offensive    1,341   2,063    3,404
Normal       2,905   2,534    5,439
Total        5,071   5,107   10,178

Table 1: Dataset statistics of both actual and Roman tweets.

With regard to detection systems, earlier methods examined simple linguistic features such as character and word n-grams, POS tags, and tf-idf with traditional classifiers such as LR, SVM, decision trees, etc. (Davidson et al., 2017). With the development of larger datasets, researchers have shifted to data-hungry, complex models such as deep learning (Pitsilis et al., 2018; Zhang et al., 2018) and graph embedding techniques to enrich classifier performance.
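To make the classical baseline concrete, below is a minimal sketch of a tf-idf pipeline with a logistic regression classifier in the spirit of Davidson et al. (2017). It is not the authors' exact pipeline, and `load_bengali_tweets` is a hypothetical loader returning the tweet texts and their {hateful, offensive, normal} labels.

```python
# A minimal classical baseline sketch: word + character n-gram tf-idf
# features with logistic regression (not the paper's exact pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline, make_union

# Hypothetical loader: returns parallel lists of tweet texts and labels.
texts, labels = load_bengali_tweets()

features = make_union(
    TfidfVectorizer(analyzer="word", ngram_range=(1, 2)),     # word n-grams
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # char n-grams
)
model = make_pipeline(features, LogisticRegression(max_iter=1000))

X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels)
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))
```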
Recently, transformer-based (Vaswani et al., 2017) language models such as BERT (Devlin et al., 2019) and XLM-RoBERTa have become quite popular in several downstream tasks. It has already been observed that these transformer-based models outperform several earlier deep learning models (Mathew et al., 2021). Having observed their superior performance, we focus on building these models for our classification task.
Further, researchers have begun to explore few-shot classification. One of the most popular techniques for few-shot classification is transfer learning, where a model (pre-trained in a similar domain) is further fine-tuned on a few labeled samples in the target domain (Alyafeai et al., 2020). Keeping these experiments in mind, we also examine transfer learning capabilities between actual and Roman Bengali data, as sketched below.
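The sketch below illustrates this transfer-learning recipe with the Hugging Face transformers library: a multilingual checkpoint is fine-tuned on source-language data and then adapted on a few target-language samples. The checkpoint names are real, but the training loop, hyperparameters, and the `actual_*`/`roman_*` data variables are illustrative assumptions, not the authors' exact setup.

```python
# Illustrative transfer-learning sketch (assumptions, not the paper's
# exact setup): fine-tune a multilingual transformer, then adapt it on
# a few labeled samples from the other script.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "xlm-roberta-base"  # or "google/muril-base-cased"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=3)    # hateful / offensive / normal

def finetune(model, texts, labels, epochs=3, lr=2e-5):
    """Plain fine-tuning loop; `texts` are strings, `labels` are 0/1/2."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            batch = tok(text, truncation=True, max_length=128,
                        return_tensors="pt")
            loss = model(**batch, labels=torch.tensor([label])).loss
            loss.backward()
            opt.step()
            opt.zero_grad()
    return model

# Stage 1: train on actual-Bengali tweets (hypothetical variables).
# model = finetune(model, actual_texts, actual_labels)
# Stage 2: few-shot adaptation on a handful of Roman-Bengali tweets.
# model = finetune(model, roman_texts[:64], roman_labels[:64])
```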
3 Dataset Creation
In this section, we describe the data collection procedure, the annotation strategy we followed, and the statistics of the collected dataset.
3.1 Dataset collection and sampling
In this paper, we collect our dataset from Twitter. Although Hatebase.org maintains the most extensive collection of multilingual hateful words, it still lacks such a lexicon base for Bengali [7]. To sample Bengali (actual and Romanized) tweets for annotation, we create a lexicon of 74 abusive terms [8]. This lexicon consists of derogatory keywords/slurs targeting individuals or different protected communities. We also include words based on the names of the targeted communities; this choice is made in order to also capture hateful/offensive tweets that do not contain any abusive words. Using the Twitter API, we searched for tweets containing phrases from the lexicon, which resulted in a sample of 500K tweets for actual Bengali and 150K tweets for Roman Bengali. To evade problems related to user distribution bias, as highlighted by Arango et al. (2019), we keep a maximum of 75 tweets per user. We also use no more than 500 tweets per month to avoid event-specific tweets in our dataset. A sketch of this sampling procedure is given below.

[7] https://hatebase.org/
[8] https://tinyurl.com/bengaliHate
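The following minimal sketch captures the filtering logic described above; the record fields (`text`, `user_id`, `created_at`) are assumptions about the API output, not the authors' exact schema.

```python
# Sketch of the lexicon-based sampling with per-user and per-month caps.
from collections import Counter

MAX_PER_USER = 75    # reduces user-distribution bias (Arango et al., 2019)
MAX_PER_MONTH = 500  # avoids event-specific tweets

def sample_tweets(tweets, lexicon):
    """`tweets`: iterable of dicts from the API; `lexicon`: abusive terms."""
    user_counts, month_counts = Counter(), Counter()
    sampled = []
    for tw in tweets:
        # Keep only tweets that match at least one lexicon term.
        if not any(term in tw["text"] for term in lexicon):
            continue
        month = tw["created_at"][:7]  # e.g. "2021-04", assuming ISO dates
        if user_counts[tw["user_id"]] >= MAX_PER_USER:
            continue
        if month_counts[month] >= MAX_PER_MONTH:
            continue
        user_counts[tw["user_id"]] += 1
        month_counts[month] += 1
        sampled.append(tw)
    return sampled
```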
3.2 Annotation procedure
We employed four undergraduate students for our annotation task. All of them are Computer Science majors and native Bengali speakers. They were recruited voluntarily through departmental emails and compensated via an Amazon gift card. Two Ph.D. students led the annotation process as expert annotators; both had previous experience working with malicious content on social media. Each tweet in our dataset carries two kinds of annotations: first, whether the text is hate speech, offensive speech, or normal; second, the target communities mentioned in the text. This additional annotation of the target community can help us measure bias in the model. Table 3 lists the target groups we have considered, and the sketch below illustrates the resulting record structure.
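The following is an illustrative structure for a single annotated instance; the field names are our assumption for exposition, not the released dataset's exact schema.

```python
# Hypothetical record layout: each tweet carries a class label and the
# list of communities it targets (empty for normal tweets).
from dataclasses import dataclass, field
from typing import List

@dataclass
class AnnotatedTweet:
    text: str                    # the (actual or Roman) Bengali tweet
    label: str                   # "hateful" | "offensive" | "normal"
    targets: List[str] = field(default_factory=list)  # e.g. ["religion"]

example = AnnotatedTweet(text="...", label="normal", targets=[])
```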
Annotation guidelines: The annotation scheme stated below constitutes the main guidelines for the annotators, while a codebook ensured a common understanding of the label descriptions. We construct our codebook (which contains the annotation guidelines for identifying hateful and offensive tweets) based on the definitions summarized as follows.

- Hate speech: Hate speech is language used to express hatred toward a targeted individual or group, or intended to be derogatory, humiliating, or insulting to the members of a group, on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, caste, geographic location, or gender.
- Offensive: Offensive speech uses profanity or strongly impolite, rude, or vulgar language, expressed with fighting or hurtful words, to insult a targeted individual or group.
- Normal: This contains tweets that do not fall into the above two categories.
3.3 Dataset creation steps
As a first step for creating the dataset, we required a pilot gold-label dataset to instruct the annotators. Initially, the expert annotators annotated 100