
mirror-prox method [19], dual-extrapolation method [24, 27], hybrid proximal extra-gradient (HPE)
method [18], and the extra-point method [7] are all based on the extra-gradient concept.
Another type of method, known as the optimistic gradient descent ascent (OGDA) method, was first
proposed by Popov [31]:
$$x^{k+1} := P_X\left[x^k - \alpha F(x^k) - \eta\left(F(x^k) - F(x^{k-1})\right)\right], \tag{7}$$
for some $\alpha, \eta > 0$, where $P_X$ denotes the projection operator onto $X$. Unlike the update in the extra-gradient method (4), which uses an extra step, OGDA requires only one update (one projection) per iteration and uses the information from the previous iterate $x^{k-1}$ instead. The optimal convergence of OGDA, in both the monotone and the strongly monotone VI settings, is established by Mokhtari et al. [15, 16].
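To make the update (7) concrete, the following Python sketch implements one possible realization; the operator `F`, the projection `project_X`, the step sizes `alpha` and `eta`, and the iteration budget are all placeholders for illustration and are not quantities specified here.

```python
import numpy as np

def ogda(F, project_X, x0, alpha=0.1, eta=0.1, num_iters=1000):
    """Optimistic gradient descent ascent, following update (7).

    F         : callable returning the operator value F(x) as a numpy array
    project_X : callable implementing the projection P_X onto the feasible set
    x0        : starting point; x^{-1} is initialized to x^0
    """
    x = np.asarray(x0, dtype=float)
    F_prev = F(x)                      # F(x^{k-1}), initialized with F(x^0)
    for _ in range(num_iters):
        Fx = F(x)                      # F(x^k)
        # single projection per iteration; the correction term reuses F(x^{k-1})
        x = project_X(x - alpha * Fx - eta * (Fx - F_prev))
        F_prev = Fx
    return x
```

On an unconstrained problem, `project_X` reduces to the identity map.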
The extra-point method proposed by Huang and Zhang [7] extends the concepts of the extra-gradient method, OGDA, Nesterov’s acceleration in optimization [21], and the “heavy-ball” method of Polyak [30], and combines them in a unifying update scheme. If the parameters associated with these different components satisfy a certain constraint set, the optimal iteration complexity is shown to be guaranteed. Another line of work studies variants of extra-gradient type methods [38, 11, 9] and proximal point methods [35, 12, 29] with the anchoring update, in which each iteration uses the initial iterate as a component of a convex combination. The iterates produced by these different methods are shown to converge [39] at the same rate as the optimal convergence rate (to the solution), and the iteration complexities are improved by constant factors compared to the vanilla extra-gradient method.
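To illustrate the anchoring idea, the sketch below pulls each update toward the initial iterate $x^0$ with weight $1/(k+2)$ inside an extra-gradient-type step; the exact weights, step sizes, and placement of the anchor differ across the methods cited above, so this is an assumed representative form rather than any one of them.

```python
import numpy as np

def anchored_extragradient(F, project_X, x0, alpha=0.1, num_iters=1000):
    """Illustrative anchored extra-gradient iteration.

    Each step forms a convex combination of the initial iterate x0 (the anchor)
    and the current iterate, with anchoring weight beta_k = 1/(k+2) decaying as
    the iteration count grows.
    """
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    for k in range(num_iters):
        beta = 1.0 / (k + 2)
        # extrapolation point: anchored combination followed by an operator step
        x_half = project_X(beta * x0 + (1 - beta) * x - alpha * F(x))
        # main update: same anchored combination, using F at the extrapolation point
        x = project_X(beta * x0 + (1 - beta) * x - alpha * F(x_half))
    return x
```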
The above methods are known as first-order methods. The lower bound on the iteration complexity of first-order methods applied to monotone VI is $\Omega\left(\frac{1}{\epsilon}\right)$, as established by Nemirovsky and Yudin [20], while for strongly monotone VI it is $\Omega\left(\kappa \ln \frac{1}{\epsilon}\right)$, as shown by Zhang et al. [40] in the context of strongly-convex-strongly-concave saddle-point problems. Methods such as the extra-gradient method, the mirror-prox method, the dual-extrapolation method [24, 27], HPE, OGDA, and the extra-point method have been proven to achieve these lower bounds, and are hence optimal.
The work of Taji et al. [34] is among the first to consider second-order methods for solving VI. A linearized VI subproblem with operator $F(x^k) + \nabla F(x^k)(x - x^k)$ is solved in each iteration, and the merit function
$$f(x) = \max_{x' \in X}\left\{\langle F(x), x - x'\rangle - \frac{\mu}{2}\|x - x'\|^2\right\}$$
is used to prove global convergence, together with local quadratic convergence.
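The inner maximization in this merit function admits a closed-form solution through a single projection (a standard property of regularized gap functions of this form), so $f$ can be evaluated explicitly at each iterate:
$$x'(x) = P_X\left(x - \frac{1}{\mu}F(x)\right), \qquad f(x) = \langle F(x),\, x - x'(x)\rangle - \frac{\mu}{2}\|x - x'(x)\|^2,$$
with $f(x) \geq 0$ for all $x \in X$, and $f(x) = 0$ for $x \in X$ exactly when $x$ solves the VI.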
However, no explicit iteration complexity was established for second-order methods until recently. Following the line of research in [34], Huang and Zhang [6] specifically consider the unconstrained strongly-convex-strongly-concave saddle-point problem and incorporate the idea of cubic regularization (originally proposed by Nesterov in the context of optimization [26]), proving a global iteration complexity of $O\left(\left(\kappa^2 + \frac{\kappa L_2}{\mu}\right)\ln\frac{1}{\epsilon}\right)$, where $L_2$ is the Lipschitz constant of the Hessian, in addition to local quadratic convergence.
Another line of research on second-order methods was started by Monteiro and Svaiter [17]. They propose the Newton Proximal Extragradient (NPE) method, which can be viewed as a special case of HPE with a large step size. In HPE, the first step approximately solves the proximal point update (3) (the resulting point is denoted $x^{k+0.5}$), while the second step is a regular extra-gradient step.
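A minimal sketch of this two-step structure for the unconstrained case (find $x$ with $F(x) = 0$) is given below. Using a single regularized Newton solve as the approximate proximal step is an assumption made here for concreteness, since the general framework only requires the first step to satisfy a relative-error criterion; the Jacobian routine `JF` and the step size `gamma` are placeholders.

```python
import numpy as np

def npe_style_step(F, JF, x, gamma):
    """One HPE/NPE-style iteration for an unconstrained VI (find x with F(x) = 0).

    F  : callable returning F(x) as a 1-D numpy array
    JF : callable returning the Jacobian of F at x as a 2-D numpy array
    """
    n = x.shape[0]
    # Step 1: approximate the proximal point update by one regularized Newton solve,
    #         i.e. solve F(x) + JF(x) d + (1/gamma) d = 0 for the displacement d.
    d = np.linalg.solve(JF(x) + (1.0 / gamma) * np.eye(n), -F(x))
    x_half = x + d                      # x^{k+0.5}
    # Step 2: regular extra-gradient step using the operator at the intermediate point
    x_next = x - gamma * F(x_half)
    return x_half, x_next
```

Roughly speaking, it is the per-iteration rule for selecting `gamma` that distinguishes NPE, which is where the step-size condition discussed next comes in.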
The “large step size” condition, which is key to guaranteeing a superior convergence rate, requires:
$$\frac{1}{\gamma_k} \geq \frac{\theta}{\|x^{k+0.5} - x^k\|} \tag{8}$$