
Regarding the second aspect of stigma, prior work in psychology has developed ways to evaluate specific stereotypes towards individuals with mental illness. Specifically, the widely used attribution model developed by Corrigan et al. (2003) defines nine dimensions of stigma³ about people with mental illness: blame, anger, pity, help, dangerousness, fear, avoidance, segregation, and coercion. The model uses a questionnaire (AQ-27) to evaluate the respondent’s stereotypical perceptions towards people with mental health conditions (Corrigan et al., 2003). To the best of our knowledge, no prior work has examined how these stereotypes⁴ differ towards people with mental health conditions from different gender groups.
Bias research in NLP. There is a large body of prior work on bias in NLP models, particularly focusing on gender, race, and disability (Garrido-Muñoz et al., 2021; Blodgett et al., 2020; Liang et al., 2021). Most of these works study bias in a single dimension, as intersectionality is difficult to operationalize (Field et al., 2021), though a few have investigated intersections like gender and race (Tan and Celis, 2019; Davidson et al., 2019). Our methodology follows prior works that used contrastive sentence pairs to identify bias (Nangia et al., 2020; Nadeem et al., 2020; Zhao et al., 2018; Rudinger et al., 2018), but unlike existing research, we draw our prompts and definitions of stigma directly from psychology studies (Corrigan et al., 2003; Schwarzer et al., 2011).
Mental health related bias in NLP. There has been little work examining mental health bias in existing models. One relevant work evaluated mental health bias in two commonly used word embeddings, GloVe and Word2Vec (Straw and Callison-Burch, 2020). Our project expands upon this work as we focus on more recent MLMs, including the general-purpose MLM RoBERTa, as well as MLMs pretrained on health and mental health corpora, MentalRoBERTa (Ji et al., 2021) and ClinicalLongformer (Li et al., 2022). Another line of work studied demographic-related biases in models and datasets used for identifying depression in social media texts (Aguirre et al., 2021; Aguirre and Dredze, 2021; Sherman et al., 2021). These works focus on extrinsic biases – biases that surface in downstream applications, such as poor performance for particular demographics. Our paper differs in that we focus on intrinsic bias in MLMs – biases captured within a model’s parameters – which can lead to downstream extrinsic biases when such models are applied in the real world.
³ We use stigma in this paper to refer to public stigma, which can be reflected in language more often than the other types of stigma: self-stigma and label avoidance.
⁴ “Dimensions of stigma” refers to the nine dimensions of public stigma of mental health; “stereotypes towards people with mental health conditions” refers to specific stereotypical perceptions. For example, “dangerousness” is a dimension of stigma and “people with schizophrenia are dangerous” is a stereotype.
3 Methodology
We develop a framework grounded in social psychology literature to measure MLMs’ gendered mental health biases. Our core methodology centers around (1) curating mental-health-related prompts and (2) comparing the gender associations of tokens generated by the MLMs.⁵ In this section, we discuss methods for the two research questions introduced in § 2.
3.1 RQ1: General Gender Associations with Mental Health Status
RQ1 explores whether models associate mental illness more with a particular gender. To explore this, we conduct experiments in which we mask out the subjects⁶ in the sentences, then evaluate the model’s likelihood of filling in the masked subjects with male, female, or gender-unspecified words, which include pronouns, nouns, and names. The overarching idea is that if the model is consistently more likely to predict a female subject, this would indicate that the model might be encoding preexisting societal presuppositions that women are more likely to have a mental health condition. We analyze these likelihoods quantitatively to identify statistically significant patterns in the model’s gender choices.
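As an illustration of the masking and aggregation idea described above, the following minimal Python sketch builds a masked prompt and sums mask-fill probabilities per gender group. The word lists, the template "{subject} has depression.", the helper names (`mask_subject`, `gender_mass`), and the probabilities are all illustrative assumptions, not the lists, prompts, or results used in this paper. In practice, the (token, probability) pairs would come from an MLM's prediction at the masked position.

```python
# Hypothetical, deliberately tiny word lists. The actual lists of pronouns,
# nouns, and names used in the paper are not reproduced here.
GENDER_LEXICON = {
    "male": {"he", "man", "john"},
    "female": {"she", "woman", "mary"},
    "unspecified": {"they", "person", "someone"},
}

MASK = "<mask>"  # mask token used by RoBERTa-style tokenizers


def mask_subject(template: str) -> str:
    """Place the mask token where the subject of the sentence would appear."""
    return template.format(subject=MASK)


def gender_mass(predictions):
    """Sum mask-fill probabilities per gender group.

    `predictions` is an iterable of (token, probability) pairs for the
    masked position. Tokens that are not in the lexicon are ignored.
    """
    totals = {gender: 0.0 for gender in GENDER_LEXICON}
    for token, prob in predictions:
        word = token.strip().lower()  # normalise case and stray whitespace
        for gender, words in GENDER_LEXICON.items():
            if word in words:
                totals[gender] += prob
    return totals


# Made-up probabilities standing in for an MLM's output (assumption).
example_predictions = [
    ("She", 0.30),
    (" he", 0.25),
    ("They", 0.10),
    ("Mary", 0.05),
    ("It", 0.02),  # not in any word list, so it is ignored
]

prompt = mask_subject("{subject} has depression.")
totals = gender_mass(example_predictions)
print(prompt)
print(totals)
```

Comparing `totals["female"]` with `totals["male"]` across many prompts gives per-prompt quantities of the kind that a subsequent significance test could operate on; the choice of test is left open in this sketch.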
⁵ We choose to use mask-filling, as opposed to generating free text or dialogue responses about mental health, because mask-filling provides a more controlled framework: there is a finite set of options to fill the mask in a sentence, which makes it easier to analyze and interpret the results.
⁶ “Subject” refers to the person being described, which may or may not be the grammatical subject of the sentence.
Prompt Curation. We manually construct three sets of simple prompts that reflect different stages of seeking healthcare. These stages are grounded in the Health Action Process Approach (HAPA) (Schwarzer et al., 2011), a psychology theory that models how individuals’ health behaviors change. We develop prompt templates in three different stages to explore stigma at different parts of the