Social-Group-Agnostic Word Embedding Debiasing
via the Stereotype Content Model
Ali Omrani and Brendan Kennedy and Mohammad Atari and Morteza Dehghani
University of Southern California
{aomrani,btkenned,atari,mdehghan}@usc.edu
Abstract
Existing word embedding debiasing meth-
ods require social-group-specific word pairs
(e.g., “man”–“woman”) for each social at-
tribute (e.g., gender), which cannot be used
to mitigate bias for other social groups, mak-
ing these methods impractical or costly to in-
corporate understudied social groups in debi-
asing. We propose that the Stereotype Con-
tent Model (SCM), a theoretical framework de-
veloped in social psychology for understand-
ing the content of stereotypes, which struc-
tures stereotype content along two psycho-
logical dimensions — “warmth” and “compe-
tence” — can help debiasing efforts to become
social-group-agnostic by capturing the under-
lying connection between bias and stereo-
types. Using only pairs of terms for warmth
(e.g., “genuine”–“fake”) and competence (e.g.,
“smart”–“stupid”), we perform debiasing with
established methods and find that, across gen-
der, race, and age, SCM-based debiasing per-
forms comparably to group-specific debiasing.
1 Introduction
The societal impacts of Natural Language Process-
ing (NLP) have stimulated research on measuring
and mitigating the unintended social-group biases
encoded in language models (Hovy and Spruit,
2016). However, the majority of this important
line of work is atheoretical in nature and “fails to
engage critically with what constitutes ‘bias’ in the
first place” (Blodgett et al., 2020). Although there
is a multitude of approaches to word embedding
debiasing (Bolukbasi et al., 2016; Zhao et al., 2018;
Dev and Phillips, 2019), most of these approaches
rely on group-specific bias subspaces. Recently,
Gonen and Goldberg (2019) found that, for example,
gender bias in word embeddings is more systematic
than debiasing along the “gender” subspace alone
can address. Combining well-established models
of bias and stereotyping from social psychology
with word embedding debiasing efforts, we follow
Blodgett et al. (2020) in proposing a theory-driven
debiasing approach that does not rely on a particu-
lar social group such as gender or race. We show
that, by relying on a theoretical understanding of
social stereotypes to define a group-agnostic bias
subspace, word embeddings can be adequately de-
biased across multiple social attributes.
Group-specific debiasing, which debiases along
subspaces defined by social groups (e.g., gender or
race), is not only atheoretical but also unscalable.
For example, previous works’ excessive focus on
gender bias in word embeddings has driven the de-
velopment of resources for gender debiasing (e.g.,
equality sets for gender), but biases associated with
other social groups remain understudied. This is
due to the fact that resources developed for one
social group (e.g., gender) do not translate easily
to other groups. Furthermore, group-specific de-
biasing is limited in terms of generalizability: as
shown by Agarwal et al. (2019), stereotype con-
tent in word embeddings is deep-rooted, and thus
is not easily removed using explicit sets of group-
specific words. In contrast, a social-group-agnostic
approach would not have such restrictions.
Social-group bias mitigation, as a societal prob-
lem, can benefit from social psychological theories
to understand the underlying structure of language-
embedded biases rather than attending to ad hoc
surface patterns. The Stereotype Content Model
(SCM) (Fiske et al., 2002) is a theoretical framework
developed in social psychology to understand
the content and function of stereotypes in inter-
personal and intergroup interactions. The SCM
proposes that human stereotypes are captured by
two primary dimensions of warmth (e.g., trustwor-
thiness, friendliness) and competence (e.g., capabil-
ity, assertiveness). From a socio-functional, prag-
matic perspective, people’s perception of others’
intent (i.e., warmth) and capability to act upon
their intentions (i.e., competence) affect their
subsequent emotion and behavior (Cuddy et al., 2009).
arXiv:2210.05831v1 [cs.CL] 11 Oct 2022
Depending on historical processes, various social
groups may be located in different stereotypic quad-
rants (high vs. low on warmth and competence)
based on this two-dimensional model.
Here, we propose that SCM-based debiasing can
provide a theory-driven and scalable solution for
mitigating social-group biases in word embeddings.
In our experiments, we find that by debiasing with
respect to the subspace defined by warmth and
competence, our SCM-based approach performs
comparably with group-specific debiasing. Our ap-
proach fares well both in terms of bias reduction
and the preservation of embedding utility (i.e., the
preservation of semantic and syntactic information)
(Bolukbasi et al., 2016), while having the advantage
of being social-group-agnostic.
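As a rough illustration of what a subspace "defined by warmth and competence" could look like, the sketch below derives one direction per dimension from word pairs and orthonormalizes the two directions. It is a simplified sketch under stated assumptions, not the procedure used in this paper: the toy 3-dimensional vectors, the helper names (`pair_direction`, `orthonormal_basis`), and the use of averaged difference vectors with Gram-Schmidt are all illustrative choices (Bolukbasi et al. (2016), for instance, apply PCA to pair differences instead).

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    # Assumes u is non-zero; a zero vector would raise ZeroDivisionError.
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def pair_direction(emb, pairs):
    """Average the unit-length difference vectors of (high, low) word pairs."""
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for high, low in pairs:
        d = normalize(sub(emb[high], emb[low]))
        total = [t + x for t, x in zip(total, d)]
    return normalize(total)

def orthonormal_basis(directions):
    """Gram-Schmidt: turn the given directions into an orthonormal basis."""
    basis = []
    for d in directions:
        r = list(d)
        for b in basis:
            c = dot(r, b)
            r = [x - c * y for x, y in zip(r, b)]
        basis.append(normalize(r))
    return basis

# Toy embeddings invented for illustration (not real word vectors).
toy_emb = {
    "genuine": [0.9, 0.1, 0.2],
    "fake": [-0.8, 0.0, 0.2],
    "smart": [0.2, 0.9, 0.1],
    "stupid": [0.1, -0.9, 0.1],
}
warmth_pairs = [("genuine", "fake")]
competence_pairs = [("smart", "stupid")]

# Two orthonormal vectors spanning the illustrative warmth/competence subspace.
scm_basis = orthonormal_basis([
    pair_direction(toy_emb, warmth_pairs),
    pair_direction(toy_emb, competence_pairs),
])
```

The point of the sketch is that no group-specific terms appear anywhere in the construction; only warmth and competence pairs are needed.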
2 Background
2.1 Post hoc Word Embedding Debiasing
Our work builds on post hoc debiasing, removing
biases by modifying pre-trained word embeddings.
Most work we review focuses on gender-related
debiasing (e.g., Bolukbasi et al., 2016; Zhao et al.,
2018; Dev and Phillips, 2019); importantly, we
focus our work on other social categories as well,
bringing attention to these understudied groups.
Originally, Bolukbasi et al. (2016) proposed
Hard Debiasing (HD) for gender bias. HD re-
moves the gender component from inherently non-
gendered words and enforces an equidistance prop-
erty for inherently gendered word pairs (equality
sets). Two follow-ups to this work include Manzini
et al. (2019), which formulated a multiclass ver-
sion of HD for attributes such as race, and Dev and
Phillips (2019), which introduced Partial Projec-
tion, a method that does not require equality sets
and is more effective than HD in reducing bias.
Extending these approaches to other social attributes
is not trivial because a set of definitional word pairs
has to be curated for each social group, which is a
dynamic and context-dependent task because these
pairs depend on the historical moment.
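The "remove the component" step of HD can be sketched as an orthogonal projection: subtract from a word vector its projection onto the bias subspace. The snippet below is a minimal sketch, assuming the subspace is already given as an orthonormal basis. The function name `neutralize` and the toy vectors are illustrative, and the equalization of equality sets and any re-normalization of the result are omitted.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def neutralize(vec, basis):
    """Remove from vec its projection onto the span of an orthonormal basis."""
    out = list(vec)
    for b in basis:
        c = dot(vec, b)  # coefficient of vec along basis vector b
        out = [x - c * y for x, y in zip(out, b)]
    return out

# Toy example: a one-dimensional bias subspace along the first axis.
toy_basis = [[1.0, 0.0, 0.0]]
toy_word = [0.5, 0.2, 0.1]  # hypothetical vector for a non-gendered word
debiased = neutralize(toy_word, toy_basis)
```

After this step the word vector has no component inside the bias subspace, while its components orthogonal to the subspace are unchanged.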
Gonen and Goldberg (2019) provided evidence
that gender bias in word embeddings is deeper than
previously thought, and methods based on project-
ing words onto a “gender dimension” only hide
bias superficially. They showed that after debias-
ing, most words maintain their relative position in
the debiased subspace. In this work, we address
the shortcomings highlighted by Gonen and Gold-
berg and Agarwal et al. with a theory-driven bias
subspace, rather than algorithmic improvement.
2.2 Bias and the Stereotype Content Model
The bias found in language models is rooted in human
biases (Caliskan and Lewis, 2022); thus, to
alleviate such biases, we should ground our debiasing
approaches in social psychological theories
of stereotyping (Blodgett et al., 2020). The Stereotype
Content Model (SCM) (Fiske et al., 2002;
Cuddy et al., 2009) is a social psychological theory
positing that stereotyping of different social groups
can be captured along two orthogonal dimensions,
“warmth” and “competence.” The warmth dimension
of stereotypes has to do with people’s intentions
in interpersonal interactions, while the competence
dimension has to do with assessing others’
ability to act on those intentions. While there are
a number of other social psychological theories
capturing outgroup biases (e.g., Zou and Cheryan,
2017; Koch et al., 2016), SCM has been shown
to predict emotional and behavioral reactions to
societal outgroups.
2.3 The SCM and Language
SCM is a well-established theoretical framework
of stereotyping, and has begun to be applied in NLP.
Recently, Nicolas et al. (2021) developed dictionaries
to measure warmth and competence in textual
data. Each dictionary was initialized with a set of
seed words from the literature, which was then
expanded using WordNet (Miller, 1995) to increase
the coverage of stereotypes collected from a sample
of Americans. Fraser et al. (2021) showed that,
in word embeddings, SCM dictionaries capture the
group stereotypes documented in social psychological
research. Recently, Mostafazadeh Davani et al.
(2021) applied SCM dictionaries to quantify social
group stereotypes embedded in language, demonstrating
that patterns of prediction biases can be
explained using social groups’ warmth and competence
embedded in language.
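One simple way dictionary words could be used to locate a term along warmth and competence in an embedding space is sketched below. This is an illustrative assumption, not necessarily the procedure of the works cited above: each axis is taken as the difference between the mean vectors of high and low seed words, and a term is scored by cosine similarity to each axis. The seed words, the 2-dimensional toy vectors, and the names `axis` and `scm_scores` are all hypothetical.

```python
import math

def mean_vec(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def axis(emb, high_words, low_words):
    """Direction pointing from the mean of low words to the mean of high words."""
    hi = mean_vec([emb[w] for w in high_words])
    lo = mean_vec([emb[w] for w in low_words])
    return [a - b for a, b in zip(hi, lo)]

def scm_scores(emb, term, warmth_axis, competence_axis):
    """Return (warmth, competence) cosine scores for a term."""
    return cosine(emb[term], warmth_axis), cosine(emb[term], competence_axis)

# Toy 2-d embeddings invented for illustration (not real word vectors).
toy_emb = {
    "friendly": [0.9, 0.1], "hostile": [-0.9, 0.0],
    "capable": [0.1, 0.8], "inept": [0.0, -0.8],
    "group_a": [0.6, -0.5],  # placeholder for a social-group term
}
warmth_axis = axis(toy_emb, ["friendly"], ["hostile"])
competence_axis = axis(toy_emb, ["capable"], ["inept"])
warmth, competence = scm_scores(toy_emb, "group_a", warmth_axis, competence_axis)
```

In this toy setup the placeholder term scores positive on warmth and negative on competence, which would place it in the high-warmth, low-competence quadrant of the SCM.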
3 Methods & Evaluation
There are two components to each post hoc debiasing
approach: the Bias Subspace, which determines
the subspace over which the algorithms
operate, and the Algorithm, which is how the word
embeddings are modified with respect to the bias
subspace. In this section, we review the concept
of bias subspaces, established algorithms for de-
biasing, and how bias is quantified in word em-