Tuning Rate of Strategy Revision in Population Games
Shinkyu Park
Abstract— We investigate a multi-agent decision problem in
population games where each agent in a population makes a
decision on strategy selection and revision to engage in repeated
games with others. The strategy revision is subject to time
delays which represent the time it takes for an agent revising
its strategy needs to spend before it can adopt a new strategy
and return back to the game. We discuss the effect of the time
delays on long-term behavior of the agents’ strategy revision. In
particular, when the time delays are large, the strategy revision
would exhibit oscillation and the agents spend substantial time
in “transitioning” between different strategies, which prevents
the agents from attaining the Nash equilibrium of the game. As
a main contribution of the paper, we propose an algorithm that
tunes the rate of the agents’ strategy revision and show such
tuning approach ensures convergence to the Nash equilibrium.
We validate our analytical results using simulations.
I. INTRODUCTION
Consider a multi-agent decision problem where each agent
selects a strategy to engage in repeated interactions with
other agents. Agents receive payoffs as a function of their
strategy profile – the distribution of their strategy selection –
and revise their strategy selection based on the payoffs. Such
problem setting is prevalent in many engineering applications
[1]–[10] ranging from demand response in smart grids to task
allocation in multi-robot systems research.
We adopt the population game formalism [11] to model the
process of the agents’ strategy revision and assess long-term
behavior of the revision process. Different from conventional
population game problems, in this work, we investigate the
scenario where the strategy revision is subject to time delays
for which every agent revising its strategy needs to spend a
fixed amount of time for transitioning to a new strategy. As
a case in point, in multi-robot task allocation applications
[8], [10], [12] where the strategy revision corresponds to
assignment of a new task to a robot, time delays in the
strategy revision reflect the requirement for each robot to
travel to a distant location to take on a new task.
When the agents’ strategy revision is subject to time
delays, as we discuss in the paper, their strategy profile
would exhibit oscillation implying that a portion of the
agent population persistently transitions between different
strategies. In the task allocation applications, this means that
the robots would spend substantial time on traveling between
distant locations which is a disadvantage since during the
traveling, they cannot carry out assigned tasks. The authors
of [8], [12] concretely discuss such phenomenon observed in
This work was supported by funding from King Abdullah University of
Science and Technology (KAUST).
Park is with the Electrical and Computer Engineering, King Abdullah
University of Science and Technology (KAUST), Thuwal 23955, Saudi
Arabia. shinkyu.park@kaust.edu.sa
their linear models, and empirical results presented in [10]
illustrate the oscillation of the strategy profile in population
game models.
As a main contribution, we propose an algorithm that
tunes the rate of the agents’ strategy revision to eliminate
the oscillation of the strategy profile, and prove that the
algorithm ensures convergence of the strategy profile to
the Nash equilibrium. Of relevance to this work are [13],
[14, Section 5.9] and references therein that investigate the
scenarios where payoff mechanisms underlying population
games are subject to time delays and hence the agents revise
their strategy selection based on time-delayed payoffs. Also,
[15] examines a relevant problem in which, when revising its
strategy, each agent needs to take on a sequence of sub-tasks
(sub-strategies) which would potentially cause time delays in
the agent’s strategy revision.
Our contributions are distinct from existing work in the
following aspects: (i) unlike [13], [14, Section 5.9] and
references therein, our problem formulation considers that
the agents’ strategy revision is subject to time delays, which
is different from the delays in payoff mechanisms, and (ii)
different from [8], [12], [15], our proposed approach ensures
convergence of the strategy profile to the Nash equilibrium,
which, in our problem formulation, requires that no agents
are in transition or taking on any sub-strategies. Below we
summarize the main results of the paper.
•We model and analyze the effect of time delays in the
strategy revision using the population game formalism
and discuss long-term behavior of the agents’ strategy
profile. In particular, our analysis explains how the rate
of the agents’ strategy revision affects the convergence
of the strategy profile.
•Based on the analysis, we propose an algorithm for tun-
ing the strategy revision rate. The algorithm judiciously
decreases the revision rate and ensures the convergence
of the strategy profile to the Nash equilibrium in con-
tractive population games. We illustrate our analytical
results using simulations.
The paper is organized as follows. In Section II, we intro-
duce the population game framework and strategy revision
model we adopt in this study. We also illustrate how the rate
of the agents’ strategy revision affects long-term behavior of
their strategy profile. In Section III, we propose an algorithm
that judiciously tunes the strategy revision rate and show that
the proposed algorithm ensures the convergence of the strat-
egy profile to the Nash equilibrium in the class of contractive
population games.1We illustrate our analytical results using
1See Definition 1 for its formal definition.
arXiv:2210.05472v2 [eess.SY] 10 Apr 2023