
ment in the US, the same slang word has been used
to express more negative senses in the UK.
Recent work has quantified semantic variation
in non-standard language of online communities
using word and sense embedding models and dis-
covered that community characteristics (e.g., com-
munity size, network density) are relevant factors in
predicting the strength of this variation (Del Tredici
and Fernández, 2017; Lucy and Bamman, 2021).
However, it is not clear how slang senses vary
among different communities and what might be
the driving forces behind this variation.
As an initial step to model semantic variation
in slang, we focus on regional semantic variation
between the US and the UK by considering a re-
gional inference task illustrated in Figure 2: Given
an emerging slang sense (e.g., “An outstanding ex-
ample”) for a slang word (e.g., beast), infer which
region (e.g., US vs. UK) it might have originated
from based on its historical meanings and usages.
Our premise is that a model capturing the basic prin-
ciples of slang semantic variation should be able
to trace or infer the regional identities of emerging
slang meanings over time.
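
The task setup can be made concrete with a small data sketch. The code
below is only an illustration under stated assumptions: the `SenseEntry`
fields, the `history_before` helper, and the toy entries (including
their years) are hypothetical and are not taken from any dictionary.
It shows the one constraint the task imposes, namely that only senses
attested before the emerging sense may serve as evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseEntry:
    """One slang sense with region and time metadata (hypothetical schema)."""
    word: str
    definition: str
    region: str  # e.g., "US" or "UK"
    year: int    # year of first attested use (assumed field)


def history_before(entries, word, year):
    """Group the senses of `word` attested strictly before `year` by region."""
    history = {}
    for entry in entries:
        if entry.word == word and entry.year < year:
            history.setdefault(entry.region, []).append(entry)
    return history


# Toy entries invented for illustration only; the years are made up.
toy_entries = [
    SenseEntry("beast", "something admirable", "US", 1950),
    SenseEntry("beast", "a criminal", "UK", 1960),
    SenseEntry("beast", "an outstanding example", "US", 2000),
]

# Evidence available when the sense dated 2000 emerges.
history = history_before(toy_entries, "beast", 2000)
```

A model for the task would then map the emerging definition and this
per-region history to a predicted region.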
2 Theoretical Hypotheses
We consider two theoretical hypotheses for char-
acterizing regularity in slang semantic variation:
communicative need and semantic distinction.
Communicative need.
Prior work has suggested that slang may be driven
by culture-dependent communicative need (Sornig, 1981).
We refer to communicative need as how frequently a
meaning needs to be communicated or expressed.
Following recent work (e.g., Kemp and Regier
2012; Ryskina et al. 2020), we estimate communicative
need based on usage frequencies from Google Ngram¹
over the past two centuries.² In the
context of slang semantic variation, certain things
might be more frequently talked about in one re-
gion (or country) over another. As such, we might
expect these differential needs to drive meaning
differentiation in slang terms. For example, a US-
specific slang sense for beast describes the subway
line #2 of the New York City transit network, most
likely due to the specific need for communicating
that information in the US (as opposed to the UK).
¹ https://books.google.com/ngrams
² We acknowledge that experiment-based methods for
estimating need exist (see Karjus et al., 2021), but these
alternative methods are difficult to operationalize at scale
and in the naturalistic settings required for our analysis.
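
One way to picture a frequency-based estimate of communicative need is
sketched below. This is a minimal illustration, not the procedure used
in this work: the functions `need_score` and `infer_region_by_need`,
the ten-year window, the choice of content words, and the toy
frequency values are all assumptions. Each region is assumed to have
its own table of relative word frequencies indexed by (word, year).

```python
def need_score(freq_by_year, content_words, year, window=10):
    """Mean relative frequency of the content words over the
    `window` years preceding `year` (missing entries count as 0)."""
    total, count = 0.0, 0
    for word in content_words:
        for y in range(year - window, year):
            total += freq_by_year.get((word, y), 0.0)
            count += 1
    return total / count if count else 0.0


def infer_region_by_need(us_freq, uk_freq, content_words, year, window=10):
    """Predict the region whose corpus mentions the content words more often."""
    us = need_score(us_freq, content_words, year, window)
    uk = need_score(uk_freq, content_words, year, window)
    return "US" if us >= uk else "UK"


# Toy relative frequencies, invented for illustration only.
us_freq = {("subway", y): 2e-6 for y in range(1990, 2000)}
uk_freq = {("subway", y): 5e-7 for y in range(1990, 2000)}

predicted = infer_region_by_need(us_freq, uk_freq, ["subway"], 2000)
```

In this toy setting the word is more frequent in the US table, so the
sketch assigns a sense about a subway line to the US.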
Semantic distinction.
We also consider an alternative hypothesis, termed
semantic distinction, motivated by the social
functions of slang (cf. Labov, 1972; Hovy, 1990):
slang is language used to show and reinforce
group identity (Eble, 2012).
Under this view, slang senses may develop inde-
pendently in each region and form a semantically
cohesive set of meanings that reflect the cultural
identity of a region. As a result, emerging slang
senses are more likely to be in close semantic
proximity with historical slang senses from the same
region.³ For example, the slang word beast has formed
a cluster of senses in the US that describes some-
thing virtuous while senses in the UK often de-
scribe criminals. An emerging sense such as “An
outstanding example” would be considered more
likely to originate from the US due to its similarity
with the historical US senses of beast. Here we
operationalize semantic distinction by models of
semantic chaining from work on historical word
meaning extension (Ramiro et al., 2018; Habibi
et al., 2020), where each region develops a distinct
chain of related regional senses over history.
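
The chaining idea can be illustrated with a small exemplar-style
sketch. It is a simplified illustration under stated assumptions, not
the exact model evaluated in this work: sense embeddings are assumed
to be given as plain vectors, similarity is a Gaussian kernel with a
hypothetical width `h`, and the function names and toy vectors are
invented. Each region is scored by the mean similarity between the
emerging sense and that region's historical senses.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exemplar_score(query, exemplars, h=1.0):
    """Mean Gaussian-kernel similarity of `query` to a region's exemplars."""
    if not exemplars:
        return 0.0
    return sum(math.exp(-sq_dist(query, e) / h) for e in exemplars) / len(exemplars)


def infer_region_by_chaining(query, history, h=1.0):
    """Return the best-scoring region and the per-region scores."""
    scores = {region: exemplar_score(query, vecs, h)
              for region, vecs in history.items()}
    return max(scores, key=scores.get), scores


# Toy 2-d sense embeddings, invented for illustration only.
history_vecs = {
    "US": [[1.0, 0.0], [0.9, 0.1]],
    "UK": [[0.0, 1.0], [0.1, 0.9]],
}
emerging = [0.8, 0.2]

region, scores = infer_region_by_chaining(emerging, history_vecs)
```

Because the toy emerging sense lies closer to the US exemplars, the
sketch attributes it to the US, mirroring the beast example above.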
We evaluate these theories using slang sense
entries from Green’s Dictionary of Slang (GDoS,
Green, 2010) over the past two centuries.
Analysis of GDoS entries is appropriate because 1) a
more diverse set of topics is covered compared to
domain-specific slang found in online communities
(e.g., Reddit), and 2) the region and time metadata
associated with individual sense entries support a
diachronic analysis of slang semantic variation. To
preview our results, we show that both communica-
tive need and semantic distinction are relevant fac-
tors in predicting slang semantic variation, with an
exemplar-based chaining model offering the most
robust results overall. Meanwhile, the relative im-
portance of the two factors is time-dependent and
fluctuates over different periods of history.
3 Related Work
3.1 Variation in online language
Previous work in computational social science on
online social media has explored lexical variation
(Eisenstein et al., 2014; Nguyen et al., 2016) by
studying the differences in word choice among dif-
ferent online communities. It has also been shown
³ It is worth noting that communicative need and semantic
distinction may not be completely orthogonal. In fact,
differences in communicative need may drive semantic distinction.
However, we treat these hypotheses as alternatives
because they are motivated by different functions.