Tuning Rate of Strategy Revision in Population Games Shinkyu Park Abstract We investigate a multi-agent decision problem in

2025-04-24 1 0 533.01KB 8 页 10玖币

侵权投诉

Tuning Rate of Strategy Revision in Population Games

Shinkyu Park

Abstract— We investigate a multi-agent decision problem in

population games where each agent in a population makes a

decision on strategy selection and revision to engage in repeated

games with others. The strategy revision is subject to time

delays which represent the time it takes for an agent revising

its strategy needs to spend before it can adopt a new strategy

and return back to the game. We discuss the effect of the time

delays on long-term behavior of the agents’ strategy revision. In

particular, when the time delays are large, the strategy revision

would exhibit oscillation and the agents spend substantial time

in “transitioning” between different strategies, which prevents

the agents from attaining the Nash equilibrium of the game. As

a main contribution of the paper, we propose an algorithm that

tunes the rate of the agents’ strategy revision and show such

tuning approach ensures convergence to the Nash equilibrium.

We validate our analytical results using simulations.

I. INTRODUCTION

Consider a multi-agent decision problem where each agent

selects a strategy to engage in repeated interactions with

other agents. Agents receive payoffs as a function of their

strategy proﬁle – the distribution of their strategy selection –

and revise their strategy selection based on the payoffs. Such

problem setting is prevalent in many engineering applications

[1]–[10] ranging from demand response in smart grids to task

allocation in multi-robot systems research.

We adopt the population game formalism [11] to model the

process of the agents’ strategy revision and assess long-term

behavior of the revision process. Different from conventional

population game problems, in this work, we investigate the

scenario where the strategy revision is subject to time delays

for which every agent revising its strategy needs to spend a

ﬁxed amount of time for transitioning to a new strategy. As

a case in point, in multi-robot task allocation applications

[8], [10], [12] where the strategy revision corresponds to

assignment of a new task to a robot, time delays in the

strategy revision reﬂect the requirement for each robot to

travel to a distant location to take on a new task.

When the agents’ strategy revision is subject to time

delays, as we discuss in the paper, their strategy proﬁle

would exhibit oscillation implying that a portion of the

agent population persistently transitions between different

strategies. In the task allocation applications, this means that

the robots would spend substantial time on traveling between

distant locations which is a disadvantage since during the

traveling, they cannot carry out assigned tasks. The authors

of [8], [12] concretely discuss such phenomenon observed in

This work was supported by funding from King Abdullah University of

Science and Technology (KAUST).

Park is with the Electrical and Computer Engineering, King Abdullah

University of Science and Technology (KAUST), Thuwal 23955, Saudi

Arabia. shinkyu.park@kaust.edu.sa

their linear models, and empirical results presented in [10]

illustrate the oscillation of the strategy proﬁle in population

game models.

As a main contribution, we propose an algorithm that

tunes the rate of the agents’ strategy revision to eliminate

the oscillation of the strategy proﬁle, and prove that the

algorithm ensures convergence of the strategy proﬁle to

the Nash equilibrium. Of relevance to this work are [13],

[14, Section 5.9] and references therein that investigate the

scenarios where payoff mechanisms underlying population

games are subject to time delays and hence the agents revise

their strategy selection based on time-delayed payoffs. Also,

[15] examines a relevant problem in which, when revising its

strategy, each agent needs to take on a sequence of sub-tasks

(sub-strategies) which would potentially cause time delays in

the agent’s strategy revision.

Our contributions are distinct from existing work in the

following aspects: (i) unlike [13], [14, Section 5.9] and

references therein, our problem formulation considers that

the agents’ strategy revision is subject to time delays, which

is different from the delays in payoff mechanisms, and (ii)

different from [8], [12], [15], our proposed approach ensures

convergence of the strategy proﬁle to the Nash equilibrium,

which, in our problem formulation, requires that no agents

are in transition or taking on any sub-strategies. Below we

summarize the main results of the paper.

•We model and analyze the effect of time delays in the

strategy revision using the population game formalism

and discuss long-term behavior of the agents’ strategy

proﬁle. In particular, our analysis explains how the rate

of the agents’ strategy revision affects the convergence

of the strategy proﬁle.

•Based on the analysis, we propose an algorithm for tun-

ing the strategy revision rate. The algorithm judiciously

decreases the revision rate and ensures the convergence

of the strategy proﬁle to the Nash equilibrium in con-

tractive population games. We illustrate our analytical

results using simulations.

The paper is organized as follows. In Section II, we intro-

duce the population game framework and strategy revision

model we adopt in this study. We also illustrate how the rate

of the agents’ strategy revision affects long-term behavior of

their strategy proﬁle. In Section III, we propose an algorithm

that judiciously tunes the strategy revision rate and show that

the proposed algorithm ensures the convergence of the strat-

egy proﬁle to the Nash equilibrium in the class of contractive

population games.1We illustrate our analytical results using

1See Deﬁnition 1 for its formal deﬁnition.

arXiv:2210.05472v2 [eess.SY] 10 Apr 2023

simulations with a numerical example in Section IV. We

conclude the paper with discussions and future plans in

Section V.

Notation: We denote by Rnthe set of n-dimensional

real vectors and by Rn

+the set of (element-wise) non-

negative n-dimensional real vectors. Throughout the paper,

we adopt the Euclidean norm for matrices and vectors.

II. PROBLEM DESCRIPTION

We describe the foundation of the population game for-

malism [11] and explain how we adopt it in our multi-agent

decision problem to study the effect of time delays in strategy

revision.

A. Population Games and Evolutionary Dynamics

Consider that a population of decision-making agents are

engaged in repeated strategic interactions with one another.

Each agent has a ﬁnite set {1,· · · , n}of strategies available

and can select one strategy at a time. Based on its strategy

selection, the agent receives a payoff determined by a payoff

mechanism underlying the interactions.

Adopting the conventional notation and formalism, we de-

note the state of the population by x= (x1,· · · , xn)∈Rn

where xirepresents the portion of the population selecting

strategy iand xsums up to 1, i.e., Pn

i=1 xi= 1. A payoff

vector p= (p1,· · · , pn)∈Rnis determined by a continuous

function Fas p=F(x), where piis the payoff assigned to

the agents selecting strategy i. Assuming that there are a

large number of agents in the population, we can deﬁne the

set of viable population states as X={z∈Rn

+|Pn

i=1 zi=1}.

Contractive population games are deﬁned as follows.

Deﬁnition 1 (Contractive Population Game [16]): A

population game Fis called contractive if it holds that

(w−z)T(F(w)− F(z)) ≤0,∀w, z ∈X.(1)

Assumption 1: We assume that the differential map DF

of Fexists and is bounded: there is a positive constant BDF

satisfying kDF(z)k2≤BDF,∀z∈X.

Note that when DFexists, the requirement (1) can be cast

˜zTDF(z)˜z≤0,∀z∈X,˜z∈TX,(2)

where TXis the tangent space of X.

The Nash equilibrium of Fis deﬁned as follows.

Deﬁnition 2 (Nash Equilibrium): An element zNE in Xis

called the Nash equilibrium of a population game Fif it

holds that

(zNE −z)TF(zNE)≥0,∀z∈X.(3)

A population game Fcan have multiple Nash equilibria. We

denote by NE(F)the set of Nash equilibria. We adopt the

following example to illustrate our main results.2

2For a simple presentation of the paper, we use the RPS game for the

illustration purpose.

Example 1: Consider the Rock-Paper-Scissors (RPS)

game (n= 3) whose payoff function is given by

F(x) = 



−ax2+bx3

bx1−ax3

−ax1+bx2

,(4)

where a, b are positive constants. Note that (4) is a contrac-

tive game if b≥a, and the Nash equilibrium of Fis given

by (1/3,1/3,1/3) for any a, b > 0.

In our study, we consider that the agents are repeatedly

engaged in a same game and given an opportunity, they can

revise their strategy selection based on the Smith revision

protocol, originally presented in [9] and deﬁned as follows.3

ρij (p) = %[pj−pi]+:= (%(pj−pi)if pj≥pi

0otherwise ,(5)

where %is a positive constant satisfying4

j=1 %[pj−pi]+≤1and %[pj−pi]+≤1.(6)

According to (5), each agent switches its strategy selection i

to jwith probability %[pj−pi]+; otherwise, it stays with its

current strategy selection iwith probability 1−Pn

j=1 %[pj−

pi]+. Hence, higher the payoff associated with strategy j

more likely the agent selects it. The condition (6) is required

for such probabilistic strategy revision scheme to be well-

deﬁned.

The point in time at which the agents’ strategy revision

takes place is determined by identical and independent

Poisson processes with parameter λ. In particular, each agent

can revise its strategy selection at each jump time of an

associated Poisson process. Since the parameter λdetermines

the rate of the strategy revision, we refer to it as the strategy

revision rate.

Let x(t)=(x1(t),· · · , xn(t)) be the population state at

time instant tdriven by the revision protocol (5), the Poisson

processes, and payoffs determined by a population game F.

By same discussions presented in [11, Chapter 10], as the

population size tends to inﬁnity, the solution to the following

state equation approximates the population state x(t)with

arbitrary accuracy.

˙xi(t) = Vi(x(t), p(t))

:= λPn

j=1 xj(t)%[pi(t)−pj(t)]+

−xi(t)Pn

j=1 %[pj(t)−pi(t)]+(7)

Adopting the same naming convention as in [17], we refer

to (7) as Evolutionary Dynamics Model (EDM).

B. δ-Passivity Tools for Convergence Analysis

A focal research theme in population games is in estab-

lishing convergence of the population state x(t), governed

by (7), to the Nash equilibrium set NE(F)of an underlying

3Although the analysis we present in this paper can be extended to other

class of strategy revision protocols. However, for a concise presentation of

our main result, we adopt the Smith protocol.

4Such constant %exists since Fis a continuous function and Xis a

compact subset of Rn.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

TuningRateofStrategyRevisioninPopulationGamesShinkyuParkAbstractWeinvestigateamulti-agentdecisionprobleminpopulationgameswhereeachagentinapopulationmakesadecisiononstrategyselectionandrevisiontoengageinrepeatedgameswithothers.Thestrategyrevisionissubjecttotimedelayswhichrepresentthetimeittakesforan...

展开>> 收起<<

Tuning Rate of Strategy Revision in Population Games Shinkyu Park Abstract We investigate a multi-agent decision problem in.pdf

共8页,预览2页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Tuning Rate of Strategy Revision in Population Games Shinkyu Park Abstract We investigate a multi-agent decision problem in

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: