
unique perspective to this end. Most existing stud-
ies on humor recognition formulate the problem
as a binary classification task and try to recognize
jokes via a set of linguistic features (Yang et al.,
2015; Purandare and Litman, 2006; Zhang and Liu,
2014). One of the common problems those works
face is the construction of negative instances, which
are often sampled from a different domain (e.g.,
news). In contrast, the CAH task does not suffer
from this problem.
Perhaps the closest setting to ours is humorous
fill-in-the-blank (Hossain et al., 2017; Garimella et al.,
2020), where users complete a joke however they
see fit. However, our setting is far more restricted:
players choose (rank) an answer from a small set of
options, enabling comparisons that would be hard
to test on other corpora.
From a humor-theory point of view, we believe
CAH serves as an interesting example of frame
blends and frame shifts (Hofstadter and Gabora,
1989; Coulson, 2001), where a speaker’s mental
model suddenly shifts to new situations, or two
distinct situations create a hybrid. CAH provides a
relatively clean setting to explore this phenomenon,
as the jokes are short, with simple syntax and nar-
rative structure.
To the best of our knowledge, CAH has only been
explored in the literature through pedagogical, ethical
or sociological lenses (e.g., Strmic-Pawl and
Wilson, 2016), not computational or linguistic
ones. We note the data contains offensive humor,
and should be used as training data only with great care.
However, we believe it is important to study offen-
sive humor too and understand its role in generating
and reinforcing social boundaries and inequalities.
2 Data
The dataset consists of games played on
the online CAH labs website,
https://lab.cardsagainsthumanity.com. The players played
the game voluntarily, for fun; they are not our anno-
tators or workers. In each round a user is presented
with a random prompt card and 10 potential punchline
cards, and chooses the funniest punchline.
The raw data had 298,955 past games (i.e., we did
not perform any additional experimentation ourselves).
There are 581 unique black prompt cards
and 2,128 white punchline cards, including cards
from the official CAH game and expansions, resulting
in 1,236,368 possible unique jokes (where a joke is
the result of filling in the blank of the prompt card
with a punchline). Each round is effectively unique
due to the large number of combinations. The data we
received from CAH did not include any demographic or
geographic characteristics, user identifiers or personally
identifiable information. 5% of games were skipped by
users and were excluded, as were a minority of prompts
that required picking more than one punchline. Data is
available upon request to CAH at
mail@cardsagainsthumanity.com.

Figure 1: Card counts. Log-scale histogram of prompt
and punchline card occurrence frequency (i.e., how
many times cards appeared). Prompts have a relatively
more uniform distribution, but both prompts and
punchline cards have a “tail” of rare cards. The spikes
of frequent cards are presumably due to cards from the
standard game, as opposed to experimental cards or
expansions.
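
The size of the joke space quoted above follows directly from the card counts. The short sketch below reproduces that arithmetic; the constant and function names are ours, and the 10% chance baseline is our own inference from the 10-punchline round format, assuming a player who picks uniformly at random.

```python
# Card counts as reported in the text.
NUM_PROMPTS = 581           # unique black prompt cards
NUM_PUNCHLINES = 2128       # unique white punchline cards
PUNCHLINES_PER_ROUND = 10   # punchline cards shown in each round


def possible_jokes(num_prompts: int, num_punchlines: int) -> int:
    """Number of distinct (prompt, punchline) pairs, i.e. single-blank jokes."""
    return num_prompts * num_punchlines


total_jokes = possible_jokes(NUM_PROMPTS, NUM_PUNCHLINES)

# Assumption (ours): a player choosing uniformly at random would select any
# given card out of the ten shown with this probability.
chance_pick_rate = 1 / PUNCHLINES_PER_ROUND

print(f"possible unique jokes: {total_jokes:,}")
print(f"chance pick rate per shown card: {chance_pick_rate:.0%}")
```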
2.1 Data analysis and observations
The frequency with which different prompts or punchlines
are presented to users is not uniformly distributed (Fig.
1). The odds of a punchline card being picked
and winning are also unevenly distributed; perhaps
unsurprisingly, some punchlines are funnier than
others (Fig. 2). The data is sparse: the number of potential
games is immense ($7.06 \times 10^{54}$). Viewed at
the level of unique jokes (prompt+punchline com-
bined), only 784,974 appear at least once across
the games, out of the 1.23M possible (60%), with
few repeats. If we consider only cases where we
have feedback (a “winning pick”), then we have
only 248,896 jokes with feedback, and of these
77% were picked only once, out of roughly 300,000 games.
A further 17% were picked only twice.
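
To make the sparsity figures concrete, here is a minimal sketch of how such counts could be tallied from round records. The record layout (prompt id, shown punchline ids, picked punchline id), the function name and the toy data are hypothetical and are not the format of the CAH data. The last lines recompute coverage from the reported totals, which comes to roughly 63%, in line with the rounded figure above.

```python
from collections import Counter


def joke_statistics(rounds):
    """Tally joke-level sparsity statistics.

    `rounds` is an iterable of (prompt_id, shown_punchline_ids, picked_id)
    tuples. This layout is a hypothetical stand-in for the real records.
    """
    seen = set()       # every (prompt, punchline) pair shown at least once
    picks = Counter()  # how often each (prompt, punchline) pair won
    for prompt, shown, picked in rounds:
        for punchline in shown:
            seen.add((prompt, punchline))
        picks[(prompt, picked)] += 1

    picked_once = sum(1 for count in picks.values() if count == 1)
    return {
        "unique_jokes_seen": len(seen),
        "unique_jokes_picked": len(picks),
        "share_picked_once": picked_once / len(picks) if picks else 0.0,
    }


# Toy data for illustration only (real rounds show 10 punchlines).
toy_rounds = [
    ("p1", ["a", "b", "c"], "a"),
    ("p1", ["a", "c", "d"], "a"),
    ("p2", ["a", "b", "e"], "b"),
]
stats = joke_statistics(toy_rounds)
print(stats)

# Coverage of the joke space, using the totals reported in the text.
coverage = 784_974 / 1_236_368
print(f"coverage of possible jokes: {coverage:.1%}")
```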
2.1.1 Popular punchlines
Across all games, all punchlines appeared at least
14 times, with $\mu = 1149$ and $\sigma = 334$. We considered
a punchline successful if its win rate is over 20%