
pw/aw          L   Context-Situated Pun for (hunts, deer)
hedges/edges   1   Why is the hunter so good at hunting deer? Because he hunts life on the hedges.
husky/husk     0   -
catch/catch    1   He hunts deer but the catch is that they rarely show up.
pine/pine      1   Hunting deer in the forest always makes him pine for the loss.
boar/bore      1   He is so mundane about hunting deer, but it is hardly a boar.
jerky/jerky    1   What do you call an erratic deer that is being hunted? Jerky.

Table 2: Example annotations from the CUP dataset. Labels L indicate whether the annotator was able to write a pun given the context and pun pair.
3 CUP Dataset
Motivation. The largest and most commonly used dataset in the pun generation community is the SemEval 2017 Task 7 dataset (Miller et al., 2017).[2] Under our setting of context-situated pun generation, we can utilize keywords from the puns themselves as context. However, the majority of pun pairs occur only once in the SemEval dataset, while one given context could have been compatible with many other pun pairs. For example, given the context (beauty school, class), the original pun in the SemEval dataset uses the homographic pun pair (makeup, makeup) and says: "If you miss a class at beauty school you'll need a makeup session." At the same time, a creative human can use the heterographic pun pair (dyed, die) to instead generate "I inhaled so much ash from the eye shadow palette at the beauty school class – I might have dyed a little inside." Because of this limitation of the SemEval dataset, we need a dataset that has a diverse set of pun pairs combined with given contexts. Furthermore, the dataset should be annotated to indicate whether a combination of context words and pun pair is suitable for making context-situated puns.
Data Preparation. We sample puns that contain both sense annotations and pun word annotations from SemEval 2017 Task 7. We show two examples of heterographic and homographic puns and their annotations from the SemEval dataset in Table 1. From this set, we sample from the 500 most frequent (pw, aw) pairs and randomly sample 100 unique context words C.[3] Combining the sampled pun pairs and context words, we construct 4,552 (C, pw, aw) instances for annotation.
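As a rough sketch of this construction step, the following Python snippet pairs the sampled pun pairs with sampled context words; all names here (e.g., build_candidate_instances, pun_pairs, context_words) are our own illustration, not the authors' released code.

```python
import random
from itertools import product

def build_candidate_instances(pun_pairs, context_words,
                              n_pairs=500, n_contexts=100, seed=0):
    """Pair sampled (pw, aw) pun pairs with sampled context words C.

    `pun_pairs` is assumed to be a list of (pw, aw) tuples ordered by
    frequency in SemEval 2017 Task 7; `context_words` is a pool of
    candidate context words. Both are illustrative inputs.
    """
    rng = random.Random(seed)
    top_pairs = pun_pairs[:n_pairs]                    # 500 most frequent pun pairs
    contexts = rng.sample(context_words, n_contexts)   # 100 random context words
    # The full cross product is much larger than the 4,552 instances the
    # paper reports, so some additional sampling/filtering (not specified
    # in this section) is applied on top of a pairing like this one.
    return [(c, pw, aw) for c, (pw, aw) in product(contexts, top_pairs)]
```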
Annotation. For our annotation task, we asked annotators to indicate whether they could come up with a pun, using the pun pair (pw, aw), that is situated in a given context C and supports both senses S_pw and S_aw. If an annotator indicated that they could create such a pun, we then asked them to write down the pun they came up with. We also asked annotators to rate how difficult it was to come up with the pun on a scale of 1 to 5, where 1 means very easy and 5 means very hard.[4] To aid in writing puns, we also provided four T5-generated puns as references.[5]

We deployed our annotation task on Amazon Mechanical Turk using a pool of 250 annotators with whom we have collaborated in the past and who had previously been identified as good annotators. Each HIT contained three (C, pw, aw) tuples, and we paid one US dollar per HIT.[6] To ensure dataset quality, we manually checked the annotations and accepted HITs from annotators who tended not to skip all the annotations (i.e., did not mark everything as "cannot come up with a pun"). After iterative communication and manual examination, we narrowed the pool down to three annotators whom we marked as highly creative to work on the annotation. To check inter-annotator agreement, we collected multiple annotations for 150 instances and measured agreement using Fleiss' kappa (Fleiss and Cohen, 1973), obtaining κ = 0.43, which suggests moderate agreement.
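For reference, agreement on such binary compatibility labels can be computed with statsmodels; the ratings array below is a made-up toy example, not the 150 doubly-annotated CUP instances.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Toy example: compatibility labels (0 = incompatible, 1 = compatible)
# from three annotators on five (C, pw, aw) instances.
# Shape: (n_instances, n_raters).
ratings = np.array([
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
])

# Convert per-rater labels into per-category counts, the input format
# expected by fleiss_kappa, then compute the agreement score.
counts, _ = aggregate_raters(ratings)
print(f"Fleiss' kappa: {fleiss_kappa(counts, method='fleiss'):.2f}")
```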
Statistics. After annotation, we ended up with 2,753 (C, pw, aw) tuples annotated as compatible and 1,798 annotated as incompatible. For the 2,753 compatible tuples, we additionally collected human-written puns from annotators. The number of puns we collected exceeds the number of puns in SemEval 2017 Task 7 that have annotated pun word and alternative word sense annotations (2,396 puns). The binary compatibility labels and human-written puns comprise our resulting dataset, CUP (Context SitUated Puns). Table 2 shows examples of annotations in CUP.
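To illustrate the shape of the resulting data, one CUP instance can be represented roughly as follows; the field names are our own and may differ from the released files.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CupInstance:
    """One annotated (C, pw, aw) tuple from CUP (illustrative schema)."""
    context: str                # context words C, e.g. "hunts, deer"
    pun_word: str               # pw
    alternative_word: str       # aw
    compatible: bool            # label L: could the annotator write a pun?
    pun: Optional[str] = None   # human-written pun, present when compatible

# Example mirroring the boar/bore row of Table 2.
example = CupInstance(
    context="hunts, deer",
    pun_word="boar",
    alternative_word="bore",
    compatible=True,
    pun="He is so mundane about hunting deer, but it is hardly a boar.",
)
```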
[2] https://alt.qcri.org/semeval2017/task7/. The data is released under the CC BY-NC 4.0 license (https://creativecommons.org/licenses/by-nc/4.0/legalcode).
[3] We sample a limited number of context words to keep the scale of data annotation feasible.
[4] Full annotation guidelines are in Appendix D.
[5] Annotators find it extremely hard to come up with puns from scratch; providing generated texts greatly eases the task.
[6] This translates to well over $15/hr.