Depending on historical processes, different social
groups may be located in different stereotypic quad-
rants (high or low on each of warmth and compe-
tence) of this two-dimensional model.
Here, we propose that SCM-based debiasing can
provide a theory-driven and scalable solution for
mitigating social-group biases in word embeddings.
In our experiments, we find that by debiasing with
respect to the subspace defined by warmth and
competence, our SCM-based approach performs
comparably to group-specific debiasing. Our ap-
proach fares well in terms of both bias reduction
and preservation of embedding utility, i.e., reten-
tion of semantic and syntactic information (Boluk-
basi et al., 2016), while having the advantage of
being social-group-agnostic.
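To make the operation concrete, the following is a minimal sketch of projecting embeddings off a warmth–competence subspace, assuming the two direction vectors have already been estimated (e.g., from SCM dictionary words); the names and setup are illustrative, not our exact implementation.

import numpy as np

def remove_subspace(word_vecs, directions):
    """Remove the component of each word vector lying in the
    subspace spanned by `directions` (here: warmth, competence).

    word_vecs:  (n_words, dim) embedding matrix.
    directions: (k, dim) array of direction vectors.
    """
    q, _ = np.linalg.qr(directions.T)   # (dim, k) orthonormal basis
    return word_vecs - word_vecs @ q @ q.T

# Illustrative usage, with warmth_dir / competence_dir estimated elsewhere:
# debiased = remove_subspace(E, np.stack([warmth_dir, competence_dir]))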
2 Background
2.1 Post hoc Word Embedding Debiasing
Our work builds on post hoc debiasing, removing
biases by modifying pre-trained word embeddings.
Most work we review focuses on gender-related
debiasing (e.g., Bolukbasi et al., 2016; Zhao et al.,
2018; Dev and Phillips, 2019); importantly, our
work also covers other social categories, bringing
attention to these understudied groups.
Originally, Bolukbasi et al. (2016) proposed
Hard Debiasing (HD) for gender bias. HD re-
moves the gender component from inherently non-
gendered words and enforces an equidistance prop-
erty for inherently gendered word pairs (equality
sets). Two follow-ups to this work include Manzini
et al. (2019), which formulated a multiclass ver-
sion of HD for attributes such as race, and Dev and
Phillips (2019), which introduced Partial Projec-
tion, a method that does not require equality sets
and is more effective than HD at reducing bias. Ex-
tending these approaches to other social attributes
is not trivial: a set of definitional word pairs must
be curated for each social group, a dynamic and
context-dependent task, since such pairs depend
on the historical moment.
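For concreteness, here is a minimal sketch of HD's two steps, Neutralize and Equalize, following the formulation in Bolukbasi et al. (2016); embeddings are assumed unit-normalized, g is a precomputed unit-norm bias direction, and the variable names are ours.

import numpy as np

def neutralize(w, g):
    """Remove the bias component from an inherently non-gendered
    word, then renormalize (embeddings assumed unit-norm)."""
    w_b = (w @ g) * g              # projection of w onto the bias direction
    v = w - w_b
    return v / np.linalg.norm(v)

def equalize(pair, g):
    """Re-embed an equality-set pair (e.g., he/she) so both words are
    equidistant from every neutralized word, differing only along g."""
    a, b = pair
    mu = (a + b) / 2               # midpoint of the pair
    mu_b = (mu @ g) * g            # midpoint's component along g
    nu = mu - mu_b                 # shared bias-free component
    scale = np.sqrt(max(0.0, 1.0 - nu @ nu))
    new = []
    for w in (a, b):
        w_b = (w @ g) * g
        direction = (w_b - mu_b) / np.linalg.norm(w_b - mu_b)
        new.append(nu + scale * direction)
    return tuple(new)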
Gonen and Goldberg (2019) provided evidence
that gender bias in word embeddings is deeper than
previously thought, and that methods based on pro-
jecting words onto a “gender dimension” only hide
bias superficially. They showed that after debias-
ing, most words maintain their relative position in
the debiased embedding space. In this work, we
address the shortcomings highlighted by Gonen
and Goldberg (2019) and by Agarwal et al. with a
theory-driven bias subspace rather than an algorith-
mic improvement.
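As an illustration of this diagnosis (the overlap metric below is our simplification, not Gonen and Goldberg's exact experiments), one can project out a bias direction and then measure how much of each word's original cosine neighborhood survives; high overlap indicates that the relative geometry, and hence recoverable bias, remains largely intact.

import numpy as np

def neighbor_overlap(before, after, k=10):
    """Average fraction of each word's k nearest cosine neighbors
    preserved after debiasing. `before` and `after` are
    (n_words, dim) embedding matrices over the same vocabulary."""
    def topk(mat):
        normed = mat / np.linalg.norm(mat, axis=1, keepdims=True)
        sims = normed @ normed.T
        np.fill_diagonal(sims, -np.inf)   # exclude self-neighbors
        return np.argsort(-sims, axis=1)[:, :k]
    nb_before, nb_after = topk(before), topk(after)
    return float(np.mean([len(set(x) & set(y)) / k
                          for x, y in zip(nb_before, nb_after)]))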
2.2 Bias and the Stereotype Content Model
The bias found in language models is rooted in hu-
man biases (Caliskan and Lewis, 2022); thus, to
alleviate such biases, we should ground our debi-
asing approaches in social psychological theories
of stereotyping (Blodgett et al., 2020). The Stereo-
type Content Model (SCM) (Fiske et al., 2002;
Cuddy et al., 2009) is a social psychological theory
positing that stereotyping of different social groups
can be captured along two orthogonal dimensions,
“warmth” and “competence.” The warmth dimen-
sion concerns perceptions of others’ intentions in
interpersonal interactions, while the competence
dimension concerns assessments of their ability to
act on those intentions. While there are
a number of other social psychological theories
capturing outgroup biases (e.g., Zou and Cheryan,
2017; Koch et al., 2016), SCM has been shown
to predict emotional and behavioral reactions to
societal outgroups.
2.3 The SCM and Language
SCM is a well-established theoretical framework
of stereotyping, and it has begun to be applied in
NLP. Recently, Nicolas et al. (2021) developed dic-
tionaries to measure warmth and competence in textual
data. Each dictionary was initialized with a set of
seed words from the literature, which was then
expanded using WordNet (Miller, 1995) to increase
coverage of the stereotypes collected from a sam-
ple of Americans. Fraser et al. (2021) showed that,
in word embeddings, SCM dictionaries capture the
group stereotypes documented in social psycholog-
ical research. Similarly, Mostafazadeh Davani et al.
(2021) applied SCM dictionaries to quantify social
group stereotypes embedded in language, demon-
strating that patterns of prediction bias can be ex-
plained by the warmth and competence associated
with social groups in language.
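As a sketch of how such dictionaries pair with embeddings (a simplified illustration of this style of measurement, not the exact procedure of Fraser et al., 2021): score a group term by its cosine similarity to the centroid of each dictionary's word vectors.

import numpy as np

def scm_scores(group_vec, warmth_words, competence_words, emb):
    """Warmth/competence scores for one group term: cosine similarity
    to the centroid of each SCM dictionary (e.g., the dictionaries of
    Nicolas et al., 2021). `emb` maps word -> vector."""
    def centroid(words):
        return np.mean([emb[w] for w in words if w in emb], axis=0)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return {"warmth": cos(group_vec, centroid(warmth_words)),
            "competence": cos(group_vec, centroid(competence_words))}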
3 Methods & Evaluation
There are two components to each post hoc de-
biasing approach: the Bias Subspace, which de-
termines the subspace over which the algorithms
operate, and the Algorithm, which is how the word
embeddings are modified with respect to the bias
subspace. In this section, we review the concept
of bias subspaces, established algorithms for de-
biasing, and how bias is quantified in word em-
beddings.
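For the first component, here is a minimal sketch of one standard construction, the PCA-over-definitional-pairs recipe of Bolukbasi et al. (2016); the pair list in the usage comment is illustrative.

import numpy as np

def bias_subspace(pairs, emb, k=1):
    """Top-k principal directions of centered definitional-pair
    vectors, as in Bolukbasi et al. (2016). `pairs` is a list of
    definitional word pairs; `emb` maps word -> vector."""
    diffs = []
    for a, b in pairs:
        center = (emb[a] + emb[b]) / 2
        diffs.append(emb[a] - center)   # each pair contributes two
        diffs.append(emb[b] - center)   # centered difference vectors
    _, _, vt = np.linalg.svd(np.stack(diffs), full_matrices=False)
    return vt[:k]                       # (k, dim) basis for the subspace

# Illustrative: B = bias_subspace([("he", "she"), ("man", "woman")], emb)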