ning. The other one, more prominent, happens at much
smaller community sizes (∼200 people). It does not cor-
relate with any feature marked by Spanish law. Rather,
its proximity to Dunbar’s number (an empirical—yet de-
bated [13–15]—cognitive limit to the number of relation-
ships that animals can have [16–23]) suggests an organic
emergence of social complexity.
In Sec. II.A we describe the sociolinguistic dynamics
of interest to us and the empirical data available. In Sec.
II.B we introduce the mathematical model that we fit to
the data. The fitting procedure is explained in Sec. II.C.
We use the resulting model parameters to estimate the
rate at which the dynamics unfold in each geographical
region. In Sec. III.A we introduce our social complex-
ity spectrum. We explain how we define separations be-
tween potentially simple and potentially complex social
networks, and how we evaluate the goodness of these sep-
arations. Secs. III.A and III.B contain our main results.
Namely, we identify two outstanding scales at which
leaps, or buildups, of social complexity would happen according
to our criterion. Sec. III.C further explores the
spectrum of social complexity as a methodological tool.
Similarly to physics and optical spectroscopy, we observe
‘red-’ and ‘blue-shifts’ as the overall demographics have
changed over two decades. We illustrate how this can
help us refine the separation of simple and complex so-
cial networks. We wrap up the paper with a discussion of
our findings in Sec. IV. We further argue that the “com-
plexity spectrum” might be a powerful tool to uncover
scales of social relevance when applied to similar dynam-
ics.
II. METHODS
A. Sociolinguistic dynamics and data
We investigate the emergence of singular scales of so-
cial complexity by looking at how certain sociolinguis-
tic dynamics have unfolded at different speeds in regions
that, potentially, contain distinctly complex social networks.
The dynamics that we use as a proxy is the coexistence
of Castilian Spanish and Galician. Both these
tongues are Romance languages that coexist in the Autonomous
Community of Galicia, in north-western Spain.
Mutual intelligibility between them is high, allowing
broad bilingual communities and, potentially, a sustained
coexistence of the two. While Galician is
the vernacular, a shift towards Castilian Spanish has
been underway for centuries, and has been especially
accentuated during the 20th century.
The Galician Statistical Office (Instituto Galego de
Estatística, IGE) has tracked language use in different Galician
regions and across demographic groups. In periodic
polls, informants would self-assess their language use
as ‘only Galician’, ‘mostly Galician’, ‘same use of both
tongues’, ‘mostly Spanish’, and ‘only Spanish’. We took
informants at either extreme of this scale as monolingual
individuals of the corresponding language, and grouped
the central categories as bilinguals. We are interested in
the fractions of speakers in these groups.
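The grouping just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category labels are copied from the text, while the function name `speaker_fractions`, the group keys, and the sample responses are hypothetical and do not reproduce the IGE data format or any actual poll results.

```python
from collections import Counter

# Map each self-assessed category to one of three groups.
# The extremes are treated as monolinguals; the three central
# categories are pooled as bilinguals (as described in the text).
GROUP_OF = {
    "only Galician": "galician",
    "mostly Galician": "bilingual",
    "same use of both tongues": "bilingual",
    "mostly Spanish": "bilingual",
    "only Spanish": "spanish",
}

def speaker_fractions(responses):
    """Return the fraction of informants in each of the three groups."""
    counts = Counter(GROUP_OF[r] for r in responses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses")
    return {g: counts.get(g, 0) / total
            for g in ("galician", "bilingual", "spanish")}

# Made-up responses, for illustration only (not real poll data).
sample = (["only Galician"] * 3 + ["mostly Galician"] * 2
          + ["same use of both tongues"] * 2 + ["mostly Spanish"]
          + ["only Spanish"] * 2)
fractions = speaker_fractions(sample)
print(fractions)
```

By construction the three fractions add up to one, so only two of them are independent.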
IGE polls are stratified by age, which allows us to build
a time-series of fractions of speakers by projecting age
groups in apparent time [24–28] (meaning that the fractions
of speakers among people of a certain age become
estimators of the fractions of speakers at the time those people
were born). Additionally, data is split into 20 independent
Galician subregions, each of which is made up of
a collection of smaller counties—but data for individual
counties is not available. Hence, our empirical dataset of
sociolinguistic dynamics consists of 20 time series with
the fractions of monolingual Galician speakers, bilingual
speakers, and monolingual Spanish speakers (Fig. 2a-b).
We base our work on the IGE poll that allowed us to
estimate the fractions of speakers among those born in 2001 [29].
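As a rough illustration of the apparent-time projection, the sketch below converts age-stratified fractions from a single poll into a series indexed by birth year. It rests on assumptions that are not taken from the text: each age group is assigned the midpoint of its age range, and the poll year, age bins, and fractions are invented for the example. The function name `apparent_time_series` is hypothetical.

```python
def apparent_time_series(poll_year, age_groups):
    """Project age groups of one poll onto approximate birth years.

    age_groups: list of ((min_age, max_age), fractions) tuples, where
    fractions is any per-group record (e.g. a dict of speaker fractions).
    Returns a list of (birth_year, fractions) sorted by birth year.
    """
    series = []
    for (min_age, max_age), fractions in age_groups:
        mid_age = (min_age + max_age) / 2   # assumption: use the bin midpoint
        birth_year = poll_year - mid_age    # apparent-time estimate
        series.append((birth_year, fractions))
    return sorted(series, key=lambda item: item[0])

# Invented numbers, for illustration only (not IGE data).
groups = [
    ((10, 19), {"galician": 0.20, "bilingual": 0.50, "spanish": 0.30}),
    ((20, 29), {"galician": 0.30, "bilingual": 0.50, "spanish": 0.20}),
    ((30, 39), {"galician": 0.40, "bilingual": 0.45, "spanish": 0.15}),
]
series = apparent_time_series(2010, groups)
for year, fracs in series:
    print(year, fracs)
```

Older age groups map to earlier birth years, so a single stratified poll yields a time series without requiring repeated polls over decades.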
B. Mathematical model
Beginning in the early 90s [30, 31], a growing commu-
nity of mathematicians, physicists, ecologists, and com-
plexity researchers started using systems of differential
equations to model possible trajectories of speakers of co-
existing languages over time [32]. A turning point was the
work by Abrams and Strogatz [33], who fitted their equa-
tions to data from dozens of cohabiting tongues. This
inspired a wave of new models whose stability and dy-
namical classes were thoroughly analyzed [5, 34–44], and
which could on some occasions be tested against empirical
data [4, 28, 44–52].
Different authors would emphasize distinct ingredients
that might affect language coexistence, such as their spa-
tial distribution, or bilingualism (elements that the origi-
nal model by Abrams and Strogatz did not contemplate).
We use one such variation that includes bilingualism [34],
whose stability and dynamics have been studied in detail
[5, 39, 42, 43], and that has been fitted to data of differ-
ent cohabiting languages, including the Galician-Spanish
case [4, 28, 52]. Contrary to some other models with
bilingualism, the one that we use is compatible with either
the stable coexistence of both tongues or a takeover in which one
language drives the other to extinction.
Thus, the model is agnostic regarding the stability of the
coexisting couple, and the empirical data can constrain
model parameters towards either outcome.
The model consists of a system of coupled differential
equations that tracks the time-evolution of the fraction,
x, of monolinguals of language X (here, Galician); of the
fraction, b, of bilinguals; and of the fraction, y, of monolinguals
of language Y (here, Spanish). Population is
normalized such that x + y + b = 1, hence two equations