PolyHope/ Balouchzahi et al.
•Subtask A - Binary Hope Speech Detection:
In this task, each tweet is identified as either Hope or Not Hope.
•Subtask B - Multiclass Hope Speech Detection:
In this task, each tweet is classified into one of the fine-grained hope categories (Generalized Hope,
Realistic Hope, and Unrealistic Hope) or as Not Hope.
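The relationship between the two subtasks can be illustrated with a minimal sketch. It assumes, based only on the description above, that the three fine-grained categories all count as Hope in the binary setting; the names `HopeLabel` and `to_binary` are hypothetical and are not part of the released dataset or any official tooling.

```python
from enum import Enum


class HopeLabel(Enum):
    """Subtask B labels as named in the task description (hypothetical type)."""
    NOT_HOPE = "Not Hope"
    GENERALIZED_HOPE = "Generalized Hope"
    REALISTIC_HOPE = "Realistic Hope"
    UNREALISTIC_HOPE = "Unrealistic Hope"


def to_binary(label: HopeLabel) -> str:
    """Collapse a Subtask B label into a Subtask A label.

    Assumption: every fine-grained hope category maps to "Hope".
    """
    return "Not Hope" if label is HopeLabel.NOT_HOPE else "Hope"


if __name__ == "__main__":
    # Show the binary label implied by each multiclass label.
    for label in HopeLabel:
        print(f"{label.value} -> {to_binary(label)}")
```

Under this assumption, a single multiclass annotation per tweet would be enough to derive the labels for both subtasks.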
The rest of the paper is organized as follows: hope is defined in detail in Section 2; existing hope speech detection
corpora, their limitations, and techniques used for hope speech detection are discussed in Section 3 (RELATED WORK);
the steps in dataset creation are presented in Section 4 (DATASET DEVELOPMENT); Sections 5 (BENCHMARKS) and
6 (RESULTS) describe the baselines and results, respectively, followed by the performance analysis of the baselines in
Section 7 (ERROR ANALYSIS). Finally, Section 8 (DISCUSSION) describes the dataset’s characteristics and limitations,
and we conclude the paper in Section 9 (CONCLUSION AND FUTURE WORKS).
2 Definitions
Hope has been studied in psychology through cognitive-based [20] and emotion-based [21] models. Snyder et
al. (1991) [20] described hope as a cognitive-based model and defined it in terms of a goal-setting framework,
where a person is motivated to remain engaged with a future outcome and can anticipate a way to reach that outcome.
Conversely, Averill et al. (2012) [21] described hope as an emotion-associated model that depends on the perceived
likelihood of achieving an outcome.
Diverse definitions of hope are reported in the literature. Hope is defined as an integral part of being human [12, 22]
and is usually regarded as future-oriented thinking [14]. It encourages people to turn their intentions into action and
to prevent despair and depression [9]. Verhaeghe et al. (2007) [11] describe hope as a psychological process of adapting
to an unfortunate or unexpected event or situation. Eaves et al. (2016) [17] regard hope as a dynamic and
multi-faceted mindset that can be considered a biological and supernatural medicine that directly impacts human health.
In another definition, Maretha (2021) [23] presents hope as personal feelings correlated with the mental activity of desire,
claiming that it is an encouragement accompanied by desire and that the resulting tendency takes the form of real and
unreal expectations. Snyder (2002) [24], more generally, describes hope as a desire for something to happen or to be
true, which is commonly associated with promise, potential, support, reassurance, suggestions, or inspiration during
periods of illness, anger, stress, loneliness, and depression.
Hence, we conclude that hope is “a future-oriented expectation, desire or wish towards a general or specific
event/outcome phenomenon that has a significant impact on human behavior, decision, and emotions.”
3 RELATED WORK
Hope is a partially subjective term that both psychologists and philosophers struggle to define [24]. Hope
analysis belongs to the growing body of work on social media text. However, most ongoing research on social
media focuses on controlling and eliminating harmful content, through tasks such as hate speech, abusive and offensive
language, misogyny, and false information detection, or on emotion analysis tasks [7]. Hope speech detection as a
Natural Language Processing (NLP) task was introduced by Chakravarthi et al. (2020) [7] and Palakodety et al. (2020) [8],
who proposed two multilingual corpora that classify each YouTube comment into Hope and Not Hope categories. Details of
these corpora are presented in Table 1. The existing hope speech corpora and their limitations, followed by the
techniques for hope speech detection, are described below:
3.1 Hope speech detection corpora
War-torn regions reveal a lot about the sentiments of people suffering and striving for peace. A comprehensive report [8]
on Kashmir (a disputed territory) revealed instances of hope speech in YouTube comments after the Pulwama terror attack
on February 14, 2019. Palakodety et al. (2020) [8] constructed a multilingual dataset of YouTube comments in English
and Hindi written in Roman and Devanagari scripts, respectively. They used a combination of polyglot embeddings
from FastText (100 dimensions), sentiment score, and n-grams (1-3) with Logistic Regression (LR) to achieve the best
macro-averaged F1-score of 78.51 (±2.24%). They modeled hope speech detection as a positive comment mining
task (positive/negative sentiments), which reflects a shallow understanding of hope as a subject. In reality, hope is a
broad phenomenon that involves various emotions.
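Because the score reported above is a macro-averaged F1, it may help to recall how that metric is computed: F1 is calculated separately for each class and the per-class values are averaged with equal weight, so a minority class such as Hope counts as much as the majority class. The sketch below is a plain-Python illustration under that standard definition; the function name `macro_f1` and the toy labels are hypothetical, and nothing here reproduces the evaluation code or data of the cited work.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("no labels to score")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy, made-up labels: one of the two Hope comments is missed.
    y_true = ["Hope", "Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope"]
    y_pred = ["Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope", "Not Hope"]
    print(round(macro_f1(y_true, y_pred), 4))  # 0.7778
```

In this toy case accuracy is 5/6 (about 0.83), while the macro-averaged F1 is 7/9 (about 0.78), since the missed Hope comment lowers the Hope-class F1 to 2/3.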
Chakravarthi et al. (2020) [7] provided the other starting point for hope speech detection on social media platforms by
developing the HopeEDI corpus from YouTube comments in English and Dravidian languages. Initially, the corpus consisted of