
images as a grid (e.g., a 3×3 grid of n = 9 images); these images have the same prompt and hyperparameters but different seeds. We use Pillow (Clark, 2015) to split a collage into n individual images and assign each one the correct metadata and a unique filename. Finally, we compress all images in DIFFUSIONDB using lossless WebP (Google, 2010).
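To make the splitting step concrete, here is a minimal sketch using Pillow; the grid size, filenames, and the omission of metadata handling are simplifying assumptions, not the exact pipeline.

```python
# Minimal sketch: split an n = rows * cols collage into tiles and re-save
# them with lossless WebP. Filenames and grid size are assumptions.
from PIL import Image

def split_collage(collage_path, rows=3, cols=3):
    collage = Image.open(collage_path)
    tile_w, tile_h = collage.width // cols, collage.height // rows
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            yield collage.crop(box)

for i, tile in enumerate(split_collage("collage.png")):
    tile.save(f"image_{i}.webp", format="WEBP", lossless=True)
```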
2.3 Identifying NSFW Content
The Stable Diffusion Discord server prohibits generating NSFW images (StabilityAI, 2022a). Also, Stable Diffusion has a built-in NSFW filter that automatically blurs generated images if it detects NSFW content. However, we find DIFFUSIONDB still includes NSFW images that were not detected by the built-in filter or removed by server moderators. To help researchers filter these images, we apply state-of-the-art NSFW classifiers to compute NSFW scores for each prompt and image. Researchers can determine a suitable threshold to filter out potentially unsafe data for their tasks.
NSFW Prompts. We use a pre-trained multilingual toxicity prediction model to detect unsafe prompts (Hanu and Unitary team, 2020). This model outputs the probabilities that a sentence is toxic, obscene, a threat, an insult, an identity attack, and sexually explicit. We compute the text NSFW score by taking the maximum of the probabilities of being toxic and sexually explicit (Fig. 3 Top).
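A hedged sketch of this scoring step, assuming the Detoxify library's multilingual checkpoint and its usual label names (which may differ across versions):

```python
# Sketch of the prompt NSFW score using Detoxify (Hanu and Unitary team, 2020).
# The label names "toxicity" and "sexual_explicit" are assumed; verify them
# against the installed Detoxify version.
from detoxify import Detoxify

model = Detoxify("multilingual")

def prompt_nsfw_score(prompt: str) -> float:
    probs = model.predict(prompt)
    # Text NSFW score = max of the toxicity and sexual-explicitness probabilities.
    return max(probs["toxicity"], probs["sexual_explicit"])
```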
NSFW Images. We use a pre-trained EfficientNet classifier to detect images with sexual content (Schuhmann et al., 2022). This model predicts the probabilities of five image types: drawing, hentai, neutral, sexual, and porn. We compute the image NSFW score by summing the probabilities of hentai, sexual, and porn. We use a Laplacian convolution kernel with a threshold of 10 to detect images that have already been blurred by Stable Diffusion and assign them a score of 2.0 (Fig. 3 Bottom). As Stable Diffusion's blur effect is strong, our blurred image detector has high precision and recall (both 100% on 50k randomly sampled images).
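The following sketch combines the two signals, assuming the classifier's five class probabilities are already available and interpreting the threshold of 10 as a cutoff on the variance of the Laplacian response (an assumption, since the exact blur criterion is not spelled out above):

```python
# Sketch of the image NSFW score with a blur check. Treating the threshold
# of 10 as a Laplacian-variance cutoff is an assumption.
import cv2

def image_nsfw_score(image_path: str, class_probs: dict) -> float:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Images blurred by Stable Diffusion's safety filter have very little
    # high-frequency content, so the Laplacian response is near zero.
    if cv2.Laplacian(gray, cv2.CV_64F).var() < 10:
        return 2.0  # sentinel score for blurred images
    # Otherwise, sum the probabilities of the three unsafe classes.
    return class_probs["hentai"] + class_probs["sexual"] + class_probs["porn"]
```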
Fig. 3: To help researchers filter out potentially unsafe data in DIFFUSIONDB, we apply NSFW detectors to predict the probability that an image-prompt pair contains NSFW content. For images, a score of 2.0 indicates the image has been blurred by Stable Diffusion.

NSFW Detector Accuracy. To assess the accuracy of these two pre-trained state-of-the-art NSFW detectors, we randomly sample 5k images and 2k prompt texts, manually annotate them with two binary NSFW labels (one for the image and one for the prompt), and analyze the results. As the percentage of samples predicted as NSFW (score > 0.5) is small, we up-sample positive samples for annotation so that our annotation sample contains an equal number of positive and negative examples.
After annotation, we compute the precisions and recalls. Because we have up-sampled positive predictions, we adjust the recalls by multiplying the false negatives by a scalar that corrects for the sampling bias; the up-sampling does not affect the precisions. Finally, the precisions, recalls, and adjusted recalls are 0.3604, 0.9565, and 0.6661 for the prompt NSFW detector, and 0.315, 0.9722, and 0.3037 for the image NSFW detector. Our results suggest that both detectors are aggressive classifiers, favoring recall over precision. The lower adjusted recall of the prompt NSFW detector can be attributed to several potential factors, including the use of a fixed binary threshold and a potential discrepancy between the detector's definition of NSFW prompts and the one used in our annotation process.
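For concreteness, one way to realize this adjustment (our reading of the procedure above, not a formula given in it) is adjusted recall = TP / (TP + s · FN), with s = (N_neg / n_neg) / (N_pos / n_pos), where N_pos and N_neg count the predicted-positive and predicted-negative samples in the full dataset and n_pos and n_neg count them in the annotated sample. Because positives are up-sampled, s > 1, so the adjusted recall can only be lower than the raw recall.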
2.4 Organizing DIFFUSIONDB
We organize DIFFUSIONDB using a flexible file structure. We first give each image a unique filename using a Universally Unique Identifier (UUID, Version 4) (Leach et al., 2005). Then, we organize images into 14,000 sub-folders, each containing 1,000 images. Each sub-folder also includes a JSON file with 1,000 key-value pairs mapping each image name to its metadata; an example image-prompt pair is shown in Fig. 2. This modular file structure enables researchers to flexibly use a subset of DIFFUSIONDB.
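A hedged sketch of consuming one sub-folder at a time under this layout; the folder name, the JSON filename, and the assumption that the JSON file sits next to its images are illustrative, not guaranteed paths:

```python
# Read one DIFFUSIONDB sub-folder: 1,000 UUID-named WebP images plus a JSON
# file mapping each image name to its metadata. Paths are assumptions.
import json
from pathlib import Path
from PIL import Image

part_dir = Path("part-000001")              # one of the 14,000 sub-folders
with open(part_dir / "part-000001.json") as f:
    metadata = json.load(f)                 # {image name: metadata dict}

for image_name, image_meta in metadata.items():
    image = Image.open(part_dir / image_name)
    print(image.size, image_meta)           # prompt and hyperparameters live in image_meta
```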
We create a metadata table in Apache Parquet format (Apache, 2013) with 13 columns: unique image name, image path, prompt, seed, CFG scale, sampler, width, height, username hash, timestamp, image NSFW score, and prompt NSFW score.
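As a usage sketch, the metadata table can be loaded with pandas and thresholded on the two NSFW scores; the filename and the column names image_nsfw and prompt_nsfw are assumptions, so check the released schema first:

```python
# Load the Parquet metadata table and keep only likely-safe rows.
# "metadata.parquet", "image_nsfw", and "prompt_nsfw" are assumed names.
import pandas as pd

metadata = pd.read_parquet("metadata.parquet")

# Pick thresholds suited to the downstream task (Section 2.3); a score of
# 2.0 marks images already blurred by Stable Diffusion.
safe = metadata[(metadata["image_nsfw"] < 0.5) & (metadata["prompt_nsfw"] < 0.5)]
print(f"kept {len(safe)} of {len(metadata)} rows")
```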