
both. We call this general setting, where neither convexity with respect to θ nor concavity with respect to ω is assumed, the nonconvex-nonconcave setting. This setting presents some substantial challenges. First, min-max equilibria are not guaranteed to exist, i.e. for general objectives there may be no (θ⋆, ω⋆) satisfying (2); this happens even in very simple cases, e.g. when Θ = Ω = [0, 1] and f(θ, ω) = (θ − ω)². Second, it is NP-hard to determine whether a min-max equilibrium exists [DSZ21] and, as is easy to see, it is also NP-hard to compute Nash equilibria of the sequential-move game (which do exist under compactness of the constraint sets). For these reasons, the optimization literature has targeted the computation of local and/or approximate solutions in this setting [DP18, MR18, JNJ19, WZB19, DSZ21, MV21]. This is the approach we also take in this paper, targeting the computation of (ε, δ)-local min-max equilibria, which were proposed in [DSZ21].
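The nonexistence claim for f(θ, ω) = (θ − ω)² on [0, 1]² can be verified with a brute-force grid check; the following is a small numerical sketch (our illustration, not part of the paper):

```python
import numpy as np

# f(theta, omega) = (theta - omega)^2 on Theta = Omega = [0, 1].
grid = np.linspace(0.0, 1.0, 101)
F = (grid[:, None] - grid[None, :]) ** 2      # F[i, j] = f(theta_i, omega_j)

# A min-max equilibrium (theta*, omega*) would have to minimize f over theta
# (column-wise minimum) and maximize f over omega (row-wise maximum) at once.
col_min = F.min(axis=0)                       # best theta-response per omega: always 0
row_max = F.max(axis=1)                       # best omega-response per theta: always >= 1/4
tol = 0.1                                     # slack far looser than the 1/4 gap
is_eq = (F <= col_min[None, :] + tol) & (F >= row_max[:, None] - tol)
print(is_eq.sum())                            # 0: no point is even approximately an equilibrium
```

The check makes the obstruction concrete: minimizing over θ forces f to 0 (take θ = ω), while maximizing over ω forces f to at least 1/4, so no single point can satisfy both conditions.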
These are approximate and local Nash equilibria of the simultaneous-move game, defined as
feasible points (θ⋆, ω⋆) which satisfy a relaxed and local version of (2), namely:

f(θ⋆, ω⋆) < f(θ, ω⋆) + ε, for all θ ∈ Θ such that ‖θ − θ⋆‖ ≤ δ;  (3)
f(θ⋆, ω⋆) > f(θ⋆, ω) − ε, for all ω ∈ Ω such that ‖ω − ω⋆‖ ≤ δ.  (4)
Besides being a natural concept of local, approximate min-max equilibrium, an attractive feature of (ε, δ)-local min-max equilibria is that they are guaranteed to exist when f is Λ-smooth and the locality parameter, δ, is chosen small enough in terms of the smoothness, Λ, and the approximation parameter, ε, namely whenever δ ≤ √(2ε/Λ). Indeed, in this regime of parameters the (ε, δ)-local min-max equilibria are in correspondence with the approximate fixed points of the Projected Gradient Descent/Ascent dynamics. Thus, the existence of the former can be established by invoking Brouwer’s fixed point theorem to establish the existence of the latter (Theorem 5.1 of [DSZ20]).
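To make the definition concrete, consider again f(θ, ω) = (θ − ω)², which is Λ-smooth on [0, 1]² with Λ = 4 (the spectral norm of its constant Hessian). Any point with θ = ω is a fixed point of projected GDA, and conditions (3)–(4) can be checked directly in the regime δ ≤ √(2ε/Λ). The following sketch (our illustration, not from the paper) verifies this numerically at (θ⋆, ω⋆) = (0.5, 0.5):

```python
import numpy as np

f = lambda th, om: (th - om) ** 2
Lam = 4.0                               # smoothness: spectral norm of the (constant) Hessian
eps = 1e-2
delta = np.sqrt(2 * eps / Lam)          # the regime delta <= sqrt(2*eps/Lam)

th_s = om_s = 0.5                       # a projected-GDA fixed point (theta = omega)
nbhd = np.linspace(-delta, delta, 201)  # sample the delta-balls around theta*, omega*
# condition (3): no nearby theta improves the min-player by more than eps
ok3 = all(f(th_s, om_s) < f(np.clip(th_s + d, 0, 1), om_s) + eps for d in nbhd)
# condition (4): no nearby omega improves the max-player by more than eps
ok4 = all(f(th_s, om_s) > f(th_s, np.clip(om_s + d, 0, 1)) - eps for d in nbhd)
print(ok3, ok4)                         # True True
```

The key inequality is that any deviation of size at most δ changes f by at most δ² = 2ε/Λ = ε/2 < ε, so both local conditions hold even though no global min-max equilibrium exists.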
There are a number of existing approaches which would be natural to use to find a solution (θ⋆, ω⋆) satisfying (3) and (4), but all run into significant obstacles. First, the idea of averaging, which can be leveraged in the convex-concave setting to obtain provable guarantees for otherwise chaotic algorithms, such as online gradient descent, no longer works, as it critically uses Jensen’s inequality, which requires convexity/concavity. On the other hand, negative results abound for last-iterate convergence: [HMC21] show that a variety of zeroth-, first-, and second-order methods may converge to a limit cycle, even in simple settings. [VGFP19] study a particular class of nonconvex-nonconcave games and show that continuous-time gradient descent-ascent (GDA) exhibits recurrent behavior. Furthermore, common variants of gradient descent-ascent, such as optimistic GDA (OGDA) or extra-gradient (EG), may be unstable even in the proximity of local min-max equilibria, or converge to fixed points that are not local min-max equilibria [DP18, JNJ19].
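These last-iterate failures are easy to reproduce even on the bilinear toy objective f(x, y) = x·y (a standard illustration, not an example from this paper), where simultaneous GDA spirals away from the unique equilibrium (0, 0):

```python
import numpy as np

# Simultaneous GDA on f(x, y) = x*y: x descends, y ascends.
x, y, eta = 1.0, 1.0, 0.1
radii = [np.hypot(x, y)]                # distance of the iterate to the equilibrium (0, 0)
for _ in range(100):
    gx, gy = y, x                       # grad_x f = y, grad_y f = x
    x, y = x - eta * gx, y + eta * gy   # each step multiplies the radius by sqrt(1 + eta^2)
    radii.append(np.hypot(x, y))
print(radii[-1] > radii[0])             # True: the iterates spiral outward
```

A one-line computation explains the spiral: ‖(x − ηy, y + ηx)‖² = (1 + η²)‖(x, y)‖², so the distance to the equilibrium grows by a factor √(1 + η²) at every step, for any step size η > 0.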
While there do exist algorithms, such as Follow-The-Ridge proposed by [WZB19], which provably exhibit local convergence to a (relaxation of a) local min-max equilibrium, these algorithms do not enjoy global convergence guarantees, and no algorithm is known with guaranteed convergence to a local min-max equilibrium.
These negative theoretical results are consistent with the practical experience with min-maximization of nonconvex-nonconcave objectives, which is rife with frustration as well. A common experience is that the training dynamics of first-order methods are unstable, oscillatory or divergent, and the quality of the points encountered in the course of training can be poor; see e.g. [Goo16, MPPSD16, DISZ18, MGN18, DP18, MR18, MPP18, ADLH19]. In light of the failure of
essentially all known algorithms to guarantee convergence, even asymptotically, to local min-max