How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions Hritik BansalDa YinMasoud Monajatipoor Kai-Wei Chang

2025-05-06 0 0 1008.54KB 13 页 10玖币

侵权投诉

How well can Text-to-Image Generative Models understand Ethical

Natural Language Interventions?

Hritik Bansal∗Da Yin∗Masoud Monajatipoor Kai-Wei Chang

Computer Science Department, University of California, Los Angeles

{hbansal,da.yin,kwchang}@cs.ucla.edu,

monajati@ucla.edu

Abstract

Text-to-image generative models have

achieved unprecedented success in generating

high-quality images based on natural language

descriptions. However, it is shown that these

models tend to favor speciﬁc social groups

when prompted with neutral text descriptions

(e.g., ‘a photo of a lawyer’). Following Zhao

et al. (2021), we study the effect on the

diversity of the generated images when adding

ethical intervention that supports equitable

judgment (e.g., ‘if all individuals can be a

lawyer irrespective of their gender’) in the

input prompts. To this end, we introduce

an Ethical NaTural Language Interventions

in Text-to-Image GENeration (ENTIGEN)

benchmark dataset to evaluate the change

in image generations conditional on ethical

interventions across three social axes – gender,

skin color, and culture. Through ENTIGEN

framework, we ﬁnd that the generations

from minDALL·E, DALL·E-mini and Stable

Diffusion cover diverse social groups while

preserving the image quality. Preliminary

studies indicate that a large change in the

model predictions is triggered by certain

phrases such as ‘irrespective of gender’ in the

context of gender bias in the ethical interven-

tions. We release code and annotated data

at https://github.com/Hritikbansal/

entigen_emnlp.

1 Introduction

Recent Text-to-Image generative models (Ramesh

et al.,2021,2022;Ding et al.,2021;Saharia et al.,

2022;Nichol et al.,2021;Rombach et al.,2022)

can synthesize high-quality photo-realistic images

conditional on natural language text descriptions

in a zero-shot fashion. For instance, they can gen-

erate an image of ‘an armchair in the shape of an

avocado’ which appears rarely in the real world.

However, despite the unprecedented zero-shot abil-

ities of the text-to-image generative models, recent

∗∗ Equal Contribution

Text-to-Image Generative Model

a photo of a

[profession]

a photo of a [profession] if all individuals

can be a [profession] irrespective of their

[gender/skin color]

a photo of a

person wearing

a [object]

a photo of a person wearing a [object] if all

individuals can wear a [object]

irrespective of their [gender/skin color]

Original Original with Ethical Intervention CLIP and Human Evaluation

a photo of a

bride a photo of a bride from diverse cultures

Man

Woman

Gender Bias

Light

Dark

Skin Color Bias

Western

Non-

Western

Cultural Bias

Figure 1: We study the change in the text-to-image

model generations across various groups (man/woman,

light-skinned/dark-skinned, Western/Non-Western) be-

fore and after adding ethical interventions (in purple)

during text-to-image generation. To analyze the bias in

model outputs, we use CLIP and Human to annotate so-

cial groups of the model generations. We present a few

generated results in Appendix Fig. 4-8.

experiments with small-scale instantiations (such

as minDALL

E) have shown that prompting the

model with neutral texts (‘a photo of a lawyer’),

devoid of any cues towards a social group, still gen-

erates images that are biased towards white males

(Cho et al.,2022).

In our work, we consider three bias axis – 1)

{man, woman} grouping across gender axis, 2)

{light-skinned, dark-skinned} grouping across skin

color axis, and 3) {Western, Non-Western} group-

ing across cultural axis.

The existence of any gen-

der

and skin color bias

(see Ethical Statements

for more discussion) causes potential harms to un-

derrepresented groups by amplifying bias present

in the dataset (Birhane et al.,2021;Barocas et al.,

2018). Hence, it is essential for a text-to-image

Unlike Cho et al. (2022), we choose to perform analysis

of the skin color bias and refrain from any racial associations

based on an individual’s appearance.

In gender bias analysis, we refer to gender as the ‘gen-

der expression’ of an individual i.e., how they express their

identity via “clothing, hair, mannerisms, makeup” rather their

gender identity i.e., how individuals experience their own

gender (Dev et al.,2021).

We refer to skin color as the ‘observed skin color’ of an

individual i.e.,“the skin color others perceive you to be".

arXiv:2210.15230v1 [cs.CL] 27 Oct 2022

system to generate diverse set of images.

To this end, we study if the presence of addi-

tional knowledge that supports equitable judgment

help in diversifying model generations. Being part

of text input, this knowledge acts as an ethical in-

tervention augmented to the original prompt (Zhao

et al.,2021)

. Ethical interventions provide models

with ethical advice and do not emanate any visual

cues towards a speciﬁc social group. For instance,

in the context of generating ‘a photo of a lawyer’

that tends to be biased towards ‘light-skinned man’,

we wish to study if prompting the model with ethi-

cally intervened prompt (e.g., ‘a photo of a lawyer

if all individuals can be a lawyer irrespective of

their gender’) can diversify the outputs.

We introduce an

thical

ural Language

nterventions in Text-to-Image

GEN

eration (ENTI-

GEN) benchmark dataset to study the change in the

perceived societal bias of the text-to-image genera-

tive models in the presence of ethical interventions.

ENTIGEN covers prompts to study the bias across

three axes – gender, skin color and culture. The neu-

tral prompts in ENTIGEN dataset are intervened

with corresponding ethical knowledge as illustrated

in Figure 1. We evaluate ENTIGEN on publicly

available models – minDALL

E (Kim et al.,2021),

DALL

E-mini (Dayma et al.,2021), and Stable Dif-

fusion (Rombach et al.,2022) automatically with

CLIP model (Radford et al.,2019) and manually

with human annotators from MTurk.

Through our experiments, (1) we show that a

few ethical interventions lead to the diversiﬁcation

of the image generations across different groups

while preserving the image generation quality. In-

terestingly, in some cases, we observe the bias can

be ﬂipped towards the originally underrepresented

groups with ethical interventions (Appendix Figure

6). (2) Moreover, we ﬁnd that the interventions

containing keywords such as ‘irrespective of gen-

der’ and ‘culture’ tend to trigger a large change

in model generations. We further analyze the pre-

training data to analyze the context in which these

keywords are used to better understand how they

may affect the diversity in generation.

2 Dataset and Evaluation Methods

In this section, we introduce the process of building

an ethical intervention benchmark ENTIGEN and

In this paper, we consider ethical intervention presented in

natural languages. Some other works consider the intervention

is an adversarial trigger (Wallace et al.,2019;Sheng et al.,

2020).

evaluating the images generated by text-to-image

generative models.

2.1 ENTIGEN Benchmark Construction

Initially, we determine three axes of societal bias to

be studied: gender, skin color, and culture. Speciﬁ-

cally, to investigate the gender and skin color bias,

ENTIGEN consists of prompts belonging to the

category of professions and objects. For assessing

the cultural bias, it consists of prompts surround-

ing the wedding ceremonies as it is ubiquitous and

diverse across different regions (Bell et al.,1997;

Xu and Xu,2018;Acharya et al.,2020).

Based on the selected axes, we design neutral

prompts without any ethical interventions as the

original prompts. Subsequently, we append ethical

interventions to the original prompts that can per-

haps amend the model’s behaviour towards more

diverse generations. For instance,

1. Original Prompt:

a photo of a person wear-

ing a [object]

2. With Ethical Intervention:

a photo of a per-

son wearing a [object]if all individuals can

wear a [object] irrespective of their gender

‘If all individuals can wear a [object] irrespective of

their gender’ is an ethical intervention that guides

diverse outputs in terms of gender. We require the

ethical interventions to not give away any visual

cues to eliminate the effect of any explicit guidance.

We further include irrelevant interventions in

ENTIGEN. These interventions also provide ethi-

cal advice, but do not correspond to any social axes

we study in ENTIGEN. For example, ‘if honesty is

the best policy’ is an irrelevant intervention since it

is unrelated to gender, skin color and culture. Ide-

ally, these interventions cannot help in diversifying

image generations on either of studied social axes.

In total, we create 246 prompts based on an at-

tribute set containing diverse professions, objects,

and cultural scenarios.5

2.2 Image Generation.

Each prompt in ENTIGEN is used to generate 9

images from each text-to-image generation model

9 times. We choose the publicly available models,

minDALL·E, DALL·E-mini, and Stable Diffusion

The list of profession, objects and cultural attributes is

present in Appendix Table 5.

for analysis. It is mainly because these three mod-

els can generate high-quality images efﬁciently. We

provide more details in Appendix B.

2.3 Evaluation Metrics.

We evaluate the diversity among the generated im-

ages of the models. We focus on the gap between

the number of images associated with the differ-

ent groups (mentioned in §1) which measure the

demographic disparity across various social axes.

Speciﬁcally, for one of the prompts (e.g., ‘a photo

of a [profession] if all genders can be a [profes-

sion]’) ﬁlled with each attribute

(e.g., police ofﬁ-

cer) in category

(e.g., profession), we count

k,a

(number of images with man) and

k,b

(number

of images with woman), associated with the two

groups

(man) and

(woman) across a speciﬁc

social axis

(gender). Finally, the diversity score

for axis gtowards its groups for category Pis:

diversityg

P=Pk∈P|sg

k,a −sg

k,b|

Pk∈P(sg

k,a +sg

k,b),(1)

where

is one of {gender, skin color, culture},

is one of {profession, object, wedding} and

can

be any attribute according to the category

select. The generations that could not have been

assigned gender or skin color due to uncertainity

in the judgements of the agents are not included in

this metric.

Smaller scores represent more diverse

outputs. The normalization factor in the denomi-

nator of the Eq.

(1)

allows us to compare model

generations from two different prompts – original

and ethically intervened as they could have differ-

ent number of image generations that belong to

either of the two social groups. To quantify the

bias and its direction, given one speciﬁc attribute

, we directly compute the normalized difference

of the two counts,

biasg

k=sg

k,a −sg

k,b/sg

k,a +sg

k,b,(2)

belonging to two groups

and

Greater absolute

value of

biasg

indicates greater bias and vice versa.

Built upon these metrics, CLIP-based and human

evaluations are used to assess output diversity and

bias. Due to limited budget, we select part of the

Details on assigning a social group to a model generation

are in Appendix C.

E.g.,

is man, light-skinned and Western for gender, skin

color and culture axes.

is woman, dark-skinned and Non-

Western.

professions and objects for human annotators to

evaluate.

For the entire set of images, we use auto-

matic CLIP-based evaluation

as a complementary

method. Appendix Cprovides more details about

our evaluations.

Note that we are aware of the possibility

that CLIP model may be biased towards certain

groups (Zhang et al.,2022). We measure the con-

sistency between the gender and skin color deter-

mined by the CLIP model and human annotators

in the images generated for a subset of attributes.

We ﬁnd that CLIP-based determinations agree with

the human annotations with a rate of 78-85% for

gender recognition while for skin color, the rate

is down to 67-78%. We ﬁnally decide to apply

CLIP-based evaluation on gender axis only as the

predictions on gender are more consistent with the

humans.

3 Results

3.1 CLIP-based Results

We investigate the effect of the ethical interventions

on the gender diversity score Eq.

(1)

for the profes-

sion category in Table 1(Column 3-5). We observe

that gender-speciﬁc ethical intervention causes the

promotion of gender diversity (Row 2-3) for all the

models. We also ﬁnd that the prompt with ‘irrespec-

tive of their gender’ improves the gender diversity

score much more than the prompt simply stating

that ‘all genders can be [profession]’. Addition-

ally, we observe that an ethical intervention with

respect to skin color does not have signiﬁcant effect

on the gender diversity of the model generations

(Row 4-5). Even though the irrelevant interventions

should not change the diversity scores, we observe

that diversity scores are affected by their presence

(Row 6-7). We present the gender diversity score

evaluated through CLIP for the object category in

Appendix Table 6. To ensure the reliability of our

evaluation, we also perform human annotations for

better assessment.

3.2 Human Evaluation Results

We present human evaluation results for the profes-

sion category in Table 1(Column 5-8). We observe

that axis-speciﬁc ethical instructions with ‘irrespec-

tive of {gender, skin color}’ produce better diver-

professions: police ofﬁcer, doctor; objects: suit, scarf,

makeup; cultural scenarios: bride, groom, wedding.

We do not apply CLIP-based evaluation on cultural bias

axis because human annotators rated all the images generated

with prompts about cultural scenarios.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

HowwellcanText-to-ImageGenerativeModelsunderstandEthicalNaturalLanguageInterventions?HritikBansalDaYinMasoudMonajatipoorKai-WeiChangComputerScienceDepartment,UniversityofCalifornia,LosAngeles{hbansal,da.yin,kwchang}@cs.ucla.edu,monajati@ucla.eduAbstractText-to-imagegenerativemodelshaveachievedunpr...

展开>> 收起<<

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions Hritik BansalDa YinMasoud Monajatipoor Kai-Wei Chang.pdf

共13页,预览3页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions Hritik BansalDa YinMasoud Monajatipoor Kai-Wei Chang

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: