
interaction. In this work, the resulting payoff depends on
the agents’ actual strategies through a general matrix of
payoffs to account for the full space of two-player social
dilemmas with two strategies, cooperation and defection.
We generalize the results obtained in [32], by examin-
ing in detail the steady and metastable states and the
nature (continuous or discontinuous) of the phase tran-
sitions between absorbing, quasi-absorbing, and mixed
strategy states using the master and Fokker-Planck equa-
tions, for any system size and any effective temperature
describing how often a player makes irrational choices.
The work is organized as follows. In Section II we de-
fine the model (game dynamics, updating rule, and inter-
action network) and introduce the notation used through-
out this work. In Section III we study the most gen-
eral master equation describing the state of the system
and identify some steady-state solutions and discontinu-
ity points depending on the parameters of the system.
The mean-field case is thoroughly investigated in Section
IV by means of the corresponding Fokker-Planck equa-
tion, which we solve analytically by artificially removing
the singularities at the pure absorbing states of the sys-
tem. Monte-Carlo numerical simulations are provided
in Section V to corroborate the analytical predictions of
mean field, both for all-to-all interactions and for more
complex interaction networks. Finally, we summarize our
results in the Conclusion section.
II. MODEL DEFINITION
We consider a population of N agents playing a 2 × 2
game, where each agent can adopt a strategy of coop-
eration (C) or defection (D), that can be changed de-
pending on her performance, her neighbors’ performance,
and some degree of randomness. The population connec-
tivity is structured in a connected and undirected net-
work represented by the adjacency matrix A, such that
Aμ,ν = Aν,μ = 1 if nodes μ and ν are neighbors, while
Aμ,ν = Aν,μ = 0 otherwise. We denote by Σ the set of all
nodes and by Vσ = {ν ∈ Σ | Aσ,ν = 1} the set of neigh-
bors of a given node σ. The number of elements of Vσ is
the degree of σ, kσ = ∑ν Aσ,ν. Throughout this work, in
addition to the complete graph (CG) describing all-to-all
interactions, we will consider different graph-structured
populations ranging from random regular graphs (RR)
and Erdős–Rényi random graphs (ER) [40] to scale-free
networks (SF) generated with the Barabási–Albert model [41].
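These interaction topologies are standard and can be generated, for instance, with the networkx library. The snippet below is an illustrative sketch, not the authors' code; the parameter values (population size N, degree k, edge probability p, attachment parameter m) are placeholders of our choosing.

```python
# Sketch: generating the four network types discussed in the text with
# networkx. Parameter values are illustrative, not taken from the paper.
import networkx as nx

N = 1000          # population size
k = 4             # degree of the random regular graph
p = k / (N - 1)   # ER edge probability chosen so the mean degree is k
m = 2             # edges attached per new node in the BA model

CG = nx.complete_graph(N)            # all-to-all interactions
RR = nx.random_regular_graph(k, N)   # random regular graph
ER = nx.erdos_renyi_graph(N, p)      # Erdős–Rényi random graph
SF = nx.barabasi_albert_graph(N, m)  # Barabási–Albert scale-free network

# The adjacency matrix A (symmetric, entries 0/1) as used in the text:
A = nx.to_numpy_array(RR)
```

From A, the degrees kσ = ∑ν Aσ,ν are recovered as the row sums of the matrix.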
As any node σ ∈ Σ is always occupied by an agent, for
our discussion it is useful to use the Boolean variables cσ
and dσ, indicating if σ holds a cooperator or a defector,
respectively. Then, it is readily seen that cσ, dσ ∈ {0, 1},
cσ + dσ = 1, and cσ · dσ = 0. As a consequence, in order
to specify the state S of the system at a given time t, we
only need the set S = {cσ | σ ∈ Σ}.
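This Boolean bookkeeping maps directly onto arrays; a minimal sketch (the variable names are ours):

```python
# Sketch: representing the state S = {cσ} of an N-agent population.
import numpy as np

N = 10
rng = np.random.default_rng(42)

# cσ = 1 if node σ holds a cooperator, 0 otherwise; dσ follows from cσ.
c = rng.integers(0, 2, size=N)   # the state S of the system
d = 1 - c                        # defector indicator

# The identities cσ + dσ = 1 and cσ·dσ = 0 hold element-wise:
assert np.all(c + d == 1)
assert np.all(c * d == 0)
```

Since dσ is fully determined by cσ, only the array c needs to be stored, exactly as stated in the text.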
The dynamics, including the Monte Carlo simulations,
unfolds in several steps:
(i) First, the network A and an initial state S0 are
selected.
(ii) All agents play the game with their neighbors. The
resulting payoff of a dyadic interaction is given by
the payoff matrix:
             C   D
    M =  C ( R   S )
         D ( T   P ).       (1)
The values R, S, T, and P classically represent
the reward for mutual cooperation (R), the sucker’s
payoff (S), the temptation to defect (T), and the
punishment for mutual defection (P). This way,
the payoff gσ of an agent at node σ depends on
the parameters of the matrix M, her state, and the
state of her neighbors as
    gσ = cσ ∑ν∈Vσ (R cν + S dν) + dσ ∑ν∈Vσ (T cν + P dν).       (2)
(iii) After the play, an agent at σ and one of her neigh-
bors at ν are selected at random. The former copies
the strategy of the latter with a probability
    pσ,ν = 1 / [1 + exp(−Δgσ,ν / θ)],       (3)
where θ is a non-negative parameter playing the
role of an effective temperature (tuning the proba-
bility of an irrational choice) and

    Δgσ,ν = (gν − gσ) / [T max(kσ, kν)]       (4)

is a normalized payoff difference.
(iv) The time t and the state S of the system are up-
dated: t → t + t0, S → S′, where t0 is an arbitrary
unit of time.
(v) The steps (ii) to (iv) are repeated a desired number
of times.
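The update loop (ii)-(iv) is straightforward to implement. The sketch below is our own minimal NumPy version, not the authors' code; in particular, we read the normalization in Eq. (4) as T·max(kσ, kν), which is an assumption based on the text.

```python
# Sketch of one Monte Carlo update, following steps (ii)-(iv) of the text.
import numpy as np

def payoffs(c, A, R, S, T, P):
    """Eq. (2): gσ = cσ Σν (R cν + S dν) + dσ Σν (T cν + P dν)."""
    d = 1 - c
    nc = A @ c   # number of cooperating neighbors of each node
    nd = A @ d   # number of defecting neighbors of each node
    return c * (R * nc + S * nd) + d * (T * nc + P * nd)

def mc_step(c, A, k, R, S, T, P, theta, rng):
    """One update of a random agent by the Fermi rule, Eqs. (3)-(4)."""
    g = payoffs(c, A, R, S, T, P)
    sigma = rng.integers(len(c))             # random focal agent
    nu = rng.choice(np.flatnonzero(A[sigma]))  # random neighbor of sigma
    # Eq. (4), with the normalization read as T·max(kσ, kν) (assumption):
    dg = (g[nu] - g[sigma]) / (T * max(k[sigma], k[nu]))
    if theta == 0.0:
        p = 0.5 if dg == 0 else float(dg > 0)   # deterministic limit
    else:
        p = 1.0 / (1.0 + np.exp(-dg / theta))   # Eq. (3)
    if rng.random() < p:
        c[sigma] = c[nu]                        # copy the neighbor's strategy

# Example run: Prisoner's Dilemma payoffs (T > R > P > S, values ours)
# on a small complete graph.
R, S, T, P = 1.0, -0.5, 1.5, 0.0
N = 50
A = np.ones((N, N), dtype=int) - np.eye(N, dtype=int)
k = A.sum(axis=1)
rng = np.random.default_rng(1)
c = rng.integers(0, 2, size=N)
for _ in range(10 * N):
    mc_step(c, A, k, R, S, T, P, theta=0.1, rng=rng)
```

The same routine applies unchanged to any adjacency matrix A, so the complete graph above can be swapped for the RR, ER, or SF networks discussed in the text.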
For zero effective temperature (θ = 0) the copying
mechanism is (almost) deterministic: if Δgσ,ν > 0 then
node σ always copies the strategy of node ν (pσ,ν = 1),
while nothing changes when Δgσ,ν < 0 (pσ,ν = 0).
In the tie case Δgσ,ν = 0, the copying probability is
pσ,ν = 1/2. In this case (θ = 0) and for a very large
and well-mixed population, four different categories of
and well-mixed population, four different categories of
games have been extensively studied as a function of the
parameters R, S, T, and P: Harmony, Snowdrift, Stag
Hunt, and Prisoner’s Dilemma. The Harmony game rep-
resents a category of games satisfying R > S > P and
R > T > P where full cooperation is the only possible
stable outcome in a population [42], while in the Pris-
oner’s Dilemma, T > R > P > S, the evolutionarily stable
strategy is a whole population of defectors [43]. The
other two categories represent respectively the classes of