
26, 27, 28, 29, 30, 31], potential [32, 33, 34], and monotone (or, more generally, variationally stable)
games [35, 36, 37, 38, 39, 40, 41]. Interest in two-player zero-sum games, for example, stems in part
from emerging applications in machine learning, e.g., the training of GANs [42, 43].
A natural generalization of games are
pseudo-games
or
abstract economies
, rst introduced by
Arrow and Debreu [44] in the context of markets, in which the actions taken by each player aect not
only the other players’ payos, as in games, but also the other players’ strategy sets. In other words,
players (simultaneously) take actions that can restrict the other players’ strategy sets. Such a model
is not technically a game, giving rise to the pseudo-game terminology. Pseudo-games generalize
games, and hence are even more widely applicable. Some recently studied applications include
adversarial classication [45, 46], energy resource allocation [47, 48], environmental protection [49,
50], cloud computing [51, 52], adversarial classication, ride sharing services [53], transportation
[54], wireless and network communication [55, 56].
The solution concept par excellence for pseudo-games is the
generalized Nash equilibrium
(GNE)
[44, 57, 58], i.e., a strategy prole at which each player’s strategy is feasible and no player can
improve their payos by unilaterally deviating to another strategy in the strategy set determined by
the other players’ strategies. The aforementioned applications speak to the importance of developing
algorithms that can approximate GNEs eciently in the largest class of pseudo-games possible.
Work in this direction, on the
generalized Nash equilibrium problem (GNEP)
, is progressing;
see, for example, [59, 57, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77]. Nonetheless,
there are still few, if any [78], GNE-nding algorithms with computational guarantees, even for
restricted classes of pseudo-games.
Exploitability
, or the Nikoida-Isoda potential [79], is dened as the sum of the players’ payo-
maximizing unilateral deviations w.r.t. a given strategy prole. Minimizing exploitability is a logical
approach to computing GNE, especially in cases where exploitability is convex, such as in the
pseudo-games that result from replacing each player’s payo function in a monotone pseudo-game
by a second-order Taylor approximation [80]. When the players’ payos are also Lipschitz-smooth
and strongly concave (in their own action), this approximation can be quite good, because the error
in the approximation will be bounded by the Lipschitz-smoothness and strong concavity constant.
Minimizing exploitability can thus be an eective way of nding GNEs in pseudo-games; however, to
our knowledge, no convergence rates for this approach are known, and few methods have been proposed
for pseudo-games whose exploitability is non-convex.
In this paper, we develop ecient exploitability-minimization algorithms that converge to variational
equilibrium (VE), a renement of GNE, exactly in a large class of pseudo-games with jointly convex
constraints (which includes, but is not limited to, two-player zero-sum,
n
-player pairwise zero-sum,
and a large class of monotone and bilinear pseudo-games, as well as Cournot oligopoly games); and
approximately, in Lipschitz-smooth pseudo-games with jointly convex constraints, where exact
computation is intractable. Our algorithms apply not only to games with discrete action spaces
but also to games with continuous action spaces; previous exploitability-minimization algorithms
concern only nite games [81, 82].
The
cumulative regret
of two strategy proles is dened as the sum, across all players, of the
change in utility due to unilateral deviations from one strategy prole to the other. Our algorithms
and their analysis rely on the following key insights: assuming jointly convex constraints (i.e.,
symmetric and convex-valued joint constraint correspondences) the exploitability-minimization
problem is equivalent to solving a cumulative regret min-max optimization problem, which although
non-convex-concave in general can nonetheless be solved by
gradient descent ascent (GDA)
methods
1
[25, 85] using a parametric variant of Moreau’s envelope theorem that we introduce
(Theorem 4.1). We apply this logic to develop two algorithms for Lipschitz-smooth pseudo-games
with jointly convex constraints. To the best of our knowledge, this makes ours the rst exploitability-
minimization methods with convergence rate guarantees in this class of pseudo-games.
The rst (EDA; Algorithm 1) is an extragradient method that converges in average iterates to an
ε
-VE, and hence an
ε
-GNE, in
O(1
/ε)
iterations in pseudo-games with convex-concave cumulative
regret, where
ε
is the exploitability of the game (Theorem 3.4). This class of pseudo-games includes
zero-sum, potential, Cournot, a large class of monotone pseudo-games, and a class of bilinear
1
When the objective function is convex-concave, a saddle point is guaranteed to exist, in which case this
optimization problem can be solved using simultaneous-GDA-style algorithms (e.g., [21, 83, 84]).
2