An Approximation-Based Regularized Extra-Gradient Method for
Monotone Variational Inequalities
Kevin Huang∗, Shuzhong Zhang†
October 11, 2022
Abstract
In this paper, we propose a general extra-gradient scheme for solving monotone variational
inequalities (VI), referred to here as Approximation-based Regularized Extra-gradient method
(ARE). The first step of ARE solves a VI subproblem with an approximation operator satisfying
a $p$th-order Lipschitz bound with respect to the original mapping, further coupled with the
gradient of a $(p+1)$th-order regularization. The optimal global convergence is guaranteed by
including an additional extra-gradient step, while a $p$th-order superlinear local convergence is
shown to hold if the VI is strongly monotone. The proposed ARE is inclusive and general,
in the sense that a variety of solution methods can be formulated within this framework as
different manifestations of approximations, and their iteration complexities follow through
in a unified fashion. The ARE framework relates to the first-order methods, while opening
up possibilities for developing higher-order methods specifically for structured problems that
guarantee the optimal iteration complexity bounds.
Keywords: variational inequality, extra-gradient method, tensor method, composite operators.
1 Introduction
Let $X \subseteq \mathbb{R}^n$ be a convex set and let $F(x): \mathbb{R}^n \mapsto \mathbb{R}^n$ be a vector mapping. The following problem is known as the variational inequality problem (VI):
$$\text{Find } x^* \in X \text{ such that } F(x^*)^\top (x - x^*) \ge 0 \text{ for all } x \in X.$$
As a notation, we denote the solution set as
$$\mathrm{VI}_X(F(x)) := \{\, x^* \mid x^* \in X \text{ such that } F(x^*)^\top (x - x^*) \ge 0 \text{ for all } x \in X \,\}.$$
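For intuition (this illustration is ours, with hypothetical data, not taken from the paper): when $X = \mathbb{R}^n$ the condition reduces to $F(x^*) = 0$, and when $F = \nabla f$ it recovers the first-order optimality condition for minimizing $f$ over $X$. The minimal sketch below solves a small affine VI on a box by a projected fixed-point iteration and then verifies the defining inequality on a grid of test points:

```python
import numpy as np

# Hypothetical affine mapping F(x) = A x + b on the box X = [0, 1]^2;
# the symmetric part of A is positive definite, so F is (strongly) monotone.
A = np.array([[2.0, 1.0], [-1.0, 2.0]])
b = np.array([-1.0, 1.0])
F = lambda x: A @ x + b
proj = lambda x: np.clip(x, 0.0, 1.0)          # Euclidean projection onto X

# Solve the VI by the projected fixed-point iteration x <- P_X(x - t F(x)).
x = np.zeros(2)
for _ in range(2000):
    x = proj(x - 0.1 * F(x))

# Verify F(x*)^T (y - x*) >= 0 for sample points y in X.
ys = np.stack(np.meshgrid(np.linspace(0, 1, 21),
                          np.linspace(0, 1, 21)), axis=-1).reshape(-1, 2)
print(x, min(F(x) @ (y - x) for y in ys) >= -1e-6)   # [0.5, 0.] True
```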
We assume $\mathrm{VI}_X(F(x))$ is non-empty throughout this paper. The study of finite-dimensional VI
problems dates back to the 1960s, when the complementarity problem was developed to solve for various
equilibria, such as the economic equilibrium, the traffic equilibrium, and, in general, the Nash equilibrium.
∗Department of Industrial and Systems Engineering, University of Minnesota, huan1741@umn.edu
†Department of Industrial and Systems Engineering, University of Minnesota, zhangs@umn.edu
For a comprehensive study of the applications, theories and algorithms of VI, readers are referred
to the celebrated monograph by Facchinei and Pang [5].
In this paper, we are interested in a specific class of VI, where the operator $F$ is monotone:
$$\langle F(x) - F(y),\, x - y \rangle \;\ge\; \mu \|x - y\|^2, \quad \forall\, x, y \in X, \tag{1}$$
for some $\mu \ge 0$. If there exists some $\mu > 0$ such that (1) holds, the VI is referred to as strongly monotone,
and $\mathrm{VI}_X(F(x))$ is a singleton.
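As a quick illustration (again ours, not from the paper): an affine mapping $F(x) = Ax + b$ satisfies (1) with $\mu = \lambda_{\min}\big((A + A^\top)/2\big)$, since $\langle F(x) - F(y), x - y \rangle = (x-y)^\top A (x-y)$. A two-line numerical check for the mapping used above:

```python
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 2.0]])       # hypothetical example matrix
mu = np.linalg.eigvalsh((A + A.T) / 2).min()  # smallest eigenvalue of the symmetric part
print(mu)                                     # 2.0 > 0: F(x) = A x + b is strongly monotone
```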
The earliest methods developed to solve VI of this type are the projection method due to Sibony [33]:
$$x^{k+1} := \arg\min_{x \in X}\; \langle F(x^k), x - x^k \rangle + \frac{\gamma_k}{2} \|x - x^k\|^2, \tag{2}$$
and the proximal point method due to Martinet [14]:
$$x^{k+1} \in \mathrm{VI}_X\big( F(x) + \gamma_k (x - x^k) \big), \tag{3}$$
for a positive sequence $\{\gamma_k\}_{k \ge 0}$. These two methods form the basis of most, if not all, methods developed for
monotone VI in the research community thus far.
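Note that (2) is exactly a projected gradient step, $x^{k+1} = P_X\big(x^k - F(x^k)/\gamma_k\big)$. A minimal sketch of both updates on the affine example above; purely for illustration, the proximal step (3) is instantiated in its unconstrained closed form (i.e., with $X = \mathbb{R}^n$), while the projection step keeps the box:

```python
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 2.0]]); b = np.array([-1.0, 1.0])
F = lambda x: A @ x + b
proj = lambda x: np.clip(x, 0.0, 1.0)
gamma = 5.0

x_proj = x_prox = np.zeros(2)
for _ in range(200):
    # (2): a projected gradient step with step size 1/gamma.
    x_proj = proj(x_proj - F(x_proj) / gamma)
    # (3): proximal point step; for unconstrained affine F it solves
    #      (A + gamma I) x = gamma x^k - b exactly.
    x_prox = np.linalg.solve(A + gamma * np.eye(2), gamma * x_prox - b)

print(x_proj)   # -> [0.5, 0.]   (solution of the VI on the box)
print(x_prox)   # -> [0.6, -0.2] (root of F, the unconstrained VI solution)
```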
Korpelevich [10] first introduced an extra step in the update as follows:
$$\begin{aligned} x^{k+0.5} &:= \arg\min_{x \in X}\; \langle F(x^k), x - x^k \rangle + \frac{\gamma_k}{2} \|x - x^k\|^2, \\ x^{k+1} &:= \arg\min_{x \in X}\; \langle F(x^{k+0.5}), x - x^k \rangle + \frac{\gamma_k}{2} \|x - x^k\|^2. \end{aligned} \tag{4}$$
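In projection form, (4) reads $x^{k+0.5} = P_X\big(x^k - F(x^k)/\gamma_k\big)$ followed by $x^{k+1} = P_X\big(x^k - F(x^{k+0.5})/\gamma_k\big)$. A minimal sketch on a merely monotone example ($\mu = 0$, hypothetical data), where the extra step is what makes the iteration converge while the plain projection update (2) spirals away from the solution:

```python
import numpy as np

# A rotation field: monotone with mu = 0 (its symmetric part is zero).
A = np.array([[0.0, 1.0], [-1.0, 0.0]]); b = np.array([-0.5, 0.5])
F = lambda x: A @ x + b                    # unique root at x* = (0.5, 0.5)
proj = lambda x: np.clip(x, 0.0, 1.0)
gamma = 4.0

x = np.zeros(2)
for _ in range(5000):
    x_half = proj(x - F(x) / gamma)        # extrapolation step of (4)
    x = proj(x - F(x_half) / gamma)        # update step, using F(x_half)
print(x)                                   # -> [0.5, 0.5]
```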
The iteration complexity of the extra-gradient method (4) was later established by Tseng [37]. In
particular, if the operator is strongly monotone ($\mu > 0$), it is $O\left(\kappa \ln \frac{1}{\epsilon}\right)$ for an $\epsilon$-solution, where
$\kappa = \frac{L}{\mu}$ is the condition number of a Lipschitz continuous operator with constant $L$. This is a
significant improvement over the $O\left(\kappa^2 \ln \frac{1}{\epsilon}\right)$ complexity of the vanilla projection method (2), and it is in fact
optimal among first-order methods (i.e., those using only the information of $F(\cdot)$) applied to such a class of
problems (with the lower bound recently established by Zhang et al. [40]). Many algorithms developed
for monotone VI thereafter adopt this concept of an extra-step update and can be considered
variants of the extra-gradient method, such as the modified forward-backward method [36], the mirror-prox
method [19], the dual-extrapolation method [24, 27], the hybrid proximal extra-gradient method [18],
and the extra-point method [7].
To facilitate the discussion, let us first introduce a few terms that will be used throughout
the paper. The term “$p$th-order method” will be used following the convention in optimization.
In particular, by considering $F(x) = \nabla f(x)$ specifically as the gradient mapping of some function
$f(x)$, a first-order method in VI refers to one using only the information from the operator $F(\cdot)$, and
a $p$th-order method refers to one using the $(p-1)$th-order derivative of the operator, $\nabla^{p-1} F$. As
a result, the term “gradient” will also be used to refer to $F(\cdot)$, owing to the background of VI in
solving saddle-point and optimization models. In a $p$th-order method, the Lipschitz continuity of
$\nabla^{p-1} F(x)$ is assumed with constant $L_p$:
$$\|\nabla^{p-1} F(x) - \nabla^{p-1} F(y)\| \;\le\; L_p \|x - y\|. \tag{5}$$
In this paper, the proposed Approximation-based Regularized Extra-gradient method (ARE) can be
viewed as a generalization of the extra-gradient method (4) in some sense. While the intermediate
iterate $x^{k+0.5}$ in the extra-gradient method (4) is updated by a gradient projection step, in ARE it is
replaced by solving a VI subproblem:
$$x^{k+0.5} := \mathrm{VI}_X\big( \tilde{F}(x; x^k) + \gamma_k \|x - x^k\|^{p-1} (x - x^k) \big), \tag{6}$$
where $\tilde{F}(x; x^k)$ is an approximation mapping at $x^k$ that satisfies a $p$th-order Lipschitz bound with
respect to $F(x)$ (to be formally defined later), and $\|x - x^k\|^{p-1}(x - x^k)$ is the gradient mapping of
a $(p+1)$th-order regularization. Therefore, we refer to the update in (6) as a $(p+1)$th-order regularized
VI subproblem. A common choice of $\tilde{F}(x; x^k)$ is the Taylor approximation of $F(x)$ at $x^k$, namely,
$$\tilde{F}(x; x^k) := \sum_{i=0}^{p-1} \frac{1}{i!} \nabla^i F(x^k) [x - x^k]^i.$$
Such a choice of $\tilde{F}(x; x^k)$ not only recovers the extra-gradient method when $p = 1$, but also gives a
succinct update principle for higher-order methods when $p > 1$. However, the Taylor approximation
need not be the only motivation for ARE. We show that the key underlying condition is the
aforementioned $p$th-order Lipschitz bound; therefore, any approximation satisfying this condition
yields a valid method under the general framework of ARE. This not only generalizes
the existing methods but also opens up the possibility of developing methods different from those
in the literature, and we will discuss several such specific schemes in Section 6. By applying the
abstraction of “approximation” in ARE, a unifying and concise analysis is available to establish
iteration complexity bounds that can be readily specialized to any concrete approximation in various
methods, given the different problem structures at hand.
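To make (6) concrete, consider the unconstrained case $X = \mathbb{R}^n$ with $p = 2$ and the Taylor choice $\tilde{F}(x; x^k) = F(x^k) + \nabla F(x^k)(x - x^k)$. Writing $d = x - x^k$, the subproblem amounts to solving $F(x^k) + \nabla F(x^k) d + \gamma_k \|d\| d = 0$, which can be reduced to a one-dimensional search over $r = \|d\|$. The sketch below is only an illustration of this reduction, not the solution procedure prescribed in this paper; it assumes $\|d(r)\|$ is nonincreasing in $r$ (valid, e.g., for symmetric positive semidefinite Jacobians):

```python
import numpy as np

def are_subproblem_p2(Fk, Jk, gamma, r_hi=1e6, tol=1e-10):
    """Solve F_k + J_k d + gamma * ||d|| * d = 0 for d (unconstrained, p = 2).

    For a fixed radius r = ||d||, d(r) = -(J_k + gamma*r*I)^{-1} F_k; we then
    bisect on r until the consistency condition ||d(r)|| = r holds.
    Assumes ||d(r)|| is nonincreasing in r (true for symmetric PSD J_k).
    """
    n = len(Fk)
    d_of = lambda r: -np.linalg.solve(Jk + gamma * r * np.eye(n), Fk)
    lo, hi = 0.0, r_hi
    for _ in range(200):
        r = (lo + hi) / 2
        if np.linalg.norm(d_of(r)) > r:
            lo = r          # ||d(r)|| still above r: the crossing is to the right
        else:
            hi = r
        if hi - lo < tol * (1 + hi):
            break
    return d_of((lo + hi) / 2)

# Tiny usage example with hypothetical data: F(x) = A x + b, so the Jacobian is A.
A = np.array([[2.0, 1.0], [-1.0, 2.0]]); b = np.array([-1.0, 1.0])
xk = np.zeros(2)
d = are_subproblem_p2(A @ xk + b, A, gamma=1.0)
x_half = xk + d    # the intermediate iterate of (6); ARE then takes an
print(x_half)      # extra-gradient step using F(x_half), in the spirit of (4)
```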
The rest of the paper is organized as follows. Section 2 reviews relevant first-order and higher-
order methods for solving monotone VI. Section 3 formally presents ARE and analyzes the global
convergence for both monotone and strongly monotone cases. Section 4 continues the discussion
with strongly monotone VI and establishes local superlinear convergence. A modified ARE
algorithm is introduced to guarantee both global linear and local superlinear convergence. Section 5
is devoted to the discussion of solving the VI subproblem with the approximation operator under
special cases. In Section 6, we present several structured ARE schemes for operators of the
composite form $F(x) = H(x) + G(x)$ and discuss their connections to existing methods.
We further discuss two specialized approximation concepts, the outer approximation and the
inner approximation, for the general composite form $F(x) = H(G(x))$. Numerical results from
preliminary experiments are demonstrated in Section 7, and we conclude the paper in Section 8.
2 Literature Review
Historically, Martinet [14] first introduced the notion of the proximal point method, which was later
studied and popularized by Rockafellar [32]. Tseng [37] studied the linear convergence of the proximal
point method, the extra-gradient method, and the matrix-splitting method, provided that a certain error bound
is satisfied; a strongly monotone operator is an immediate example of such a case. As a
matter of fact, subsequently developed methods such as the modified forward-backward method [36], the
mirror-prox method [19], the dual-extrapolation method [24, 27], the hybrid proximal extra-gradient (HPE)
method [18], and the extra-point method [7] are all based on the concept of the extra-gradient.
Another type of method, known as the optimistic gradient descent ascent (OGDA) method, was first
proposed by Popov [31]:
$$x^{k+1} := P_X\big( x^k - \alpha F(x^k) - \eta (F(x^k) - F(x^{k-1})) \big), \tag{7}$$
for some positive $\alpha, \eta > 0$, where $P_X$ denotes the projection operator onto $X$. Unlike the update
in the extra-gradient method (4), which uses an extra step, OGDA requires only one update (one
projection) per iteration and instead uses the information from the previous iterate $x^{k-1}$. The
optimal convergence of OGDA, in both monotone and strongly monotone VI, is established by
Mokhtari et al. [15, 16]. The extra-point method proposed by Huang and Zhang [7] extends the
concepts of the extra-gradient method, OGDA, Nesterov’s acceleration in optimization [21], and
the “heavy-ball” method by Polyak [30], and combines them in a unifying update scheme. If the
parameters associated with these different components satisfy a certain constraint set, it is shown that
the optimal iteration complexity is guaranteed. There is another line of work that studies variants of
extra-gradient type methods [38, 11, 9] and proximal point methods [35, 12, 29] with the anchoring
update, where in each iteration the initial iterate is used as a component of a convex combination.
The iterates produced by these different methods are shown to converge [39], at the same rate
as the optimal convergence rate (to the solution), and the iteration complexities are improved by
constant factors compared to the vanilla extra-gradient method.
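Returning to (7): a minimal sketch of the OGDA update on the same rotation-field example used earlier (hypothetical data; the choice $\eta = \alpha$ below is a common instantiation, shown purely for illustration):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]]); b = np.array([-0.5, 0.5])
F = lambda x: A @ x + b
proj = lambda x: np.clip(x, 0.0, 1.0)
alpha = eta = 0.2                         # illustrative choice eta = alpha

x = np.zeros(2)
F_prev = F(x)
for _ in range(5000):
    F_cur = F(x)                          # the only new operator evaluation per iteration
    x = proj(x - alpha * F_cur - eta * (F_cur - F_prev))   # one projection, cf. (7)
    F_prev = F_cur
print(x)   # -> [0.5, 0.5], same solution as the extra-gradient run
```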
The above methods are known as first-order methods. The lower bound of the iteration
complexity for first-order methods applied to monotone VI is $\Omega\left(\frac{1}{\epsilon}\right)$, as established by Nemirovsky
and Yudin [20], while for strongly monotone VI it is $\Omega\left(\kappa \ln \frac{1}{\epsilon}\right)$, shown by Zhang et
al. [40] in the context of strongly-convex-strongly-concave saddle-point problems. Methods such
as the extra-gradient method, the mirror-prox method, the dual-extrapolation method [24, 27], HPE, OGDA,
and the extra-point method have been proven to achieve these lower bounds, and hence are optimal.
The work of Taji et al. [34] is among the first to consider second-order methods for solving VI. A
linearized VI subproblem with operator $F(x^k) + \nabla F(x^k)(x - x^k)$ is solved in each iteration, and the
merit function $f(x) = \max_{x' \in X} \langle F(x), x - x' \rangle - \frac{\mu}{2}\|x - x'\|^2$ is used to prove the global convergence, with
an additional local quadratic convergence. However, no explicit iteration complexity was established
for second-order methods until recently. Following the line of research in [34], Huang and Zhang
[6] specifically consider the unconstrained strongly-convex-strongly-concave saddle-point problem and
incorporate the idea of cubic regularization (originally proposed by Nesterov in the context of
optimization [26]), proving the global iteration complexity $O\left(\left(\kappa^2 + \frac{\kappa L_2}{\mu}\right) \ln \frac{1}{\epsilon}\right)$, where $L_2$ is the
Lipschitz constant of the Hessian information, in addition to the local quadratic convergence.
Another line of research on second-order methods was started by Monteiro and Svaiter [17]. They
propose a Newton Proximal Extragradient (NPE) method, which can be viewed as a special case
of HPE with a large step size. In HPE, the first step approximately solves the proximal point
update (3) (denoted $x^{k+0.5}$), while the second step is a regular extra-gradient step. The “large
step size” condition, which is key to guaranteeing a superior convergence rate, requires:
$$\frac{1}{\gamma_k} \;\ge\; \frac{\theta}{\|x^{k+0.5} - x^k\|} \tag{8}$$
for some constant $\theta > 0$. Note that since $x^{k+0.5}$ depends on $\gamma_k$, a certain procedure is required
to determine $x^{k+0.5}$ and $\gamma_k$ such that (8) also holds. Observing that the set of $\gamma_k$ satisfying
the condition is in fact a closed interval, they develop a bisection method that iteratively reduces
the range of $\gamma_k$ and solves for $x^{k+0.5}$ for each fixed $\gamma_k$ until the condition is satisfied. They show
that for monotone VI, NPE admits an $O\left(1/\epsilon^{2/3}\right)$ iteration complexity for the ergodic mean of $x^{k+0.5}$ over
$0 \le k \le N-1$, which is an improvement over the optimal first-order complexity $O\left(\frac{1}{\epsilon}\right)$.
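Schematically, this search couples the subproblem oracle with a scalar bisection. The sketch below is a simplified illustration: `solve_subproblem` is a hypothetical black-box oracle, and the two-sided thresholds merely stand in for the exact interval condition in [17]:

```python
import numpy as np

def find_gamma(xk, solve_subproblem, theta_lo=0.1, theta_hi=10.0,
               gamma_lo=1e-6, gamma_hi=1e6, max_iter=100):
    """Bisect on gamma so that the large-step quantity ||x_half - xk|| / gamma
    (cf. (8)) lands in [theta_lo, theta_hi].

    `solve_subproblem(xk, gamma)` is an assumed black-box oracle returning
    x^{k+0.5} for a fixed gamma; larger gamma pulls x_half toward xk, so the
    quantity below decreases in gamma.
    """
    for _ in range(max_iter):
        gamma = np.sqrt(gamma_lo * gamma_hi)      # bisect on a log scale
        x_half = solve_subproblem(xk, gamma)
        q = np.linalg.norm(x_half - xk) / gamma
        if q < theta_lo:
            gamma_hi = gamma                      # quantity too small: try smaller gamma
        elif q > theta_hi:
            gamma_lo = gamma                      # quantity too large: try larger gamma
        else:
            return gamma, x_half
    raise RuntimeError("bisection did not terminate")

# Demo with an exact proximal oracle for the affine mapping F(x) = A x + b.
A = np.array([[2.0, 1.0], [-1.0, 2.0]]); b = np.array([-1.0, 1.0])
oracle = lambda xk, g: np.linalg.solve(A + g * np.eye(2), g * xk - b)
print(find_gamma(np.zeros(2), oracle))
```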
. While
NPE can also be expressed in the form of second-order mirror-prox method, Bullins and Lai [3]
propose a “higher-order mirror-prox method”, extending the second-order mirror-prox method to
pth-order and establish O1/ 2
p+1 iteration complexity. They replace the linearization F(xk) +
F(xk)(xxk) with the Taylor approximation of F(xk),
p1
P
i=0
1
i!iF(xk)[xxk]i, together with an
higher-order constraint on γkand xk+0.5similar to (8). They also demonstrate an explicit procedure
to instantiate the proposed method in unconstrained problem with p= 2 and a bisection method
to search for xk+0.5and γk. In [28], Ostroukhov et al. further extend the higher-order mirror-
prox method to strongly monotone VI by incorporating the restart procedure, which yields global
iteration complexity OLp
µ2
p+1 ln 1
. The local quadratic convergence is then guaranteed by
incorporating CRN-SPP proposed in [6]. Nesterov in [22] proposes solving constrained convex
optimization with cubic regularized Newton method and extends the results to monotone VI with
cubic regularized Newton modification of the dual-extrapolation method [24]. The global iteration
complexity is shown to be O(1
) for monotone VI, with local quadratic convergence for strongly
monotone VI.
Recently, there have been new developments in higher-order methods for VI that are closely related to
the work in this paper. Jiang and Mokhtari [8] propose the Generalized Optimistic Method, which
is a general $p$th-order variant of OGDA. Instead of using $F(x^k)$ to approximate the proximal point
update direction $F(x^{k+1})$ with correction $F(x^k) - F(x^{k-1})$ as in OGDA (7), they propose to use a
general approximation $P(x^{k+1}; I_k)$ with correction $F(x^k) - P(x^k; I_{k-1})$, where $P(x; I_k)$ can contain
$p$th-order information and $I_k$ is the information up to the $k$th iteration. Unlike ARE, the Generalized
Optimistic Method does not require an additional projection step in the update, nor does it require
restarts to establish linear convergence for strongly monotone VI (see [28] and the discussion in
Section 3.2). However, it still requires a bisection subroutine to solve the higher-order
subproblem, similar to the higher-order mirror-prox method [3], while ARE does not. Adil
et al. [1] propose a $p$th-order method that improves upon the higher-order mirror-prox method
in [3]. The improvement comes from incorporating the gradient of a $(p+1)$th-order regularization
in the higher-order VI subproblem, which makes the bisection subroutine unnecessary, and the
global complexity is improved by a logarithmic factor. The special case of ARE where the Taylor
approximation is used as the approximation operator $\tilde{F}(x; x^k)$ in the $(p+1)$th-order regularized VI
subproblem (6) coincides with the method proposed in [1] for solving monotone VI. In this paper,
we further develop the global and local convergence for strongly monotone VI (see Sections 3.2 and
4) and discuss the possibilities beyond the Taylor approximation in the subproblem, which is one of the
key underlying motivations behind the ARE framework. Lin and Jordan [13] propose a $p$th-order
generalization of Nesterov’s dual extrapolation method [24], referred to as Perseus. Like ARE
and [1], Perseus does not require bisection subroutines, by virtue of solving the VI subproblem with
higher-order regularization.