
(Krizhevsky et al., 2009), (Brown & Sandholm, 2019) where extragradient-based methods were applied in regret matching, (Farina et al., 2019) for the application to counterfactual regret minimization, and (Anagnostides et al., 2022) where these methods were used for training agents to play poker.
Interestingly, until recently, the last-iterate convergence rates of EG and OG were unknown even when F is (maximally) monotone and Lipschitz. The first results in this direction were obtained by Golowich et al. (2020b;a) under additional assumptions (namely, the Jacobian of F being Lipschitz). Later, Gorbunov et al. (2022b;c) and Cai et al. (2022b) closed this question by proving the tight worst-case last-iterate convergence rates of these methods under monotonicity and Lipschitzness of F.
As some important motivating applications involve deep neural networks, the operator F under consideration is typically not monotone. However, for general non-monotone problems, approximating first-order locally optimal solutions can be intractable (Daskalakis et al., 2021; Diakonikolas et al., 2021). Thus, it is natural to consider assumptions of structured non-monotonicity. Recently, Diakonikolas et al. (2021) proposed to analyze EG under an assumption weaker than traditional monotonicity. In the sequel, this assumption is referred to as ρ-negative comonotonicity (with ρ ≥ 0). That is, for all x, y ∈ ℝᵈ, the operator F satisfies:

⟨F(x) − F(y), x − y⟩ ≥ −ρ∥F(x) − F(y)∥².  (1)
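For illustration, consider the hypothetical linear operator F(x) = cR(θ)x, where R(θ) is the planar rotation by an angle θ ∈ (90°, 180°) and c > 0. This F is not monotone (since cos θ < 0), yet it satisfies (1) with equality for ρ = −cos θ/c. The following minimal NumPy sketch (ours, for illustration only; not part of our formal results) verifies this on random pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, c = 2 * np.pi / 3, 0.5          # rotation by 120 degrees, scaled by c
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
F = lambda x: c * (R @ x)              # F(x) = c R(theta) x: not monotone
rho = -np.cos(theta) / c               # claimed comonotonicity parameter

worst = np.inf
for _ in range(10_000):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    lhs = np.dot(F(x) - F(y), x - y)               # <F(x) - F(y), x - y>
    rhs = -rho * np.linalg.norm(F(x) - F(y)) ** 2  # -rho ||F(x) - F(y)||^2
    worst = min(worst, lhs - rhs)

print(f"min of lhs - rhs over samples: {worst:.2e}")  # ~0: (1) is tight here
```

Indeed, ⟨F(x) − F(y), x − y⟩ = c cos θ ∥x − y∥² while ∥F(x) − F(y)∥² = c²∥x − y∥², so the inequality holds with equality for this operator.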
A number of works have followed the idea of Diakonikolas et al. (2021) and considered (1) as their working assumption; see, e.g., (Yoon & Ryu, 2021; Lee & Kim, 2021; Luo & Tran-Dinh, 2022; Cai et al., 2022a; Gorbunov et al., 2022a).
Albeit a reasonable first step toward understanding the behavior of algorithms for (IP) beyond monotone F, it remains unclear to what extent the ρ-negative comonotonicity assumption is general enough to capture complex non-monotone operators. This question is crucial for developing a clean optimization theory that can fully encompass ML applications involving neural networks.
To the best of our knowledge, ρ-(star-)negative comonotonicity is the weakest known assumption under which extragradient-type methods can be analyzed for solving (IP). The first part of this work is devoted to providing simple interpretations of this assumption. Then, we close the problem of studying the convergence rate of the PP method in this setting, the base ingredient underlying most algorithms for solving (IP) (which are traditionally interpreted as approximations to PP; see (Nemirovski, 2004)). That is, we provide upper and lower convergence bounds, as well as tight conditions on the stepsize, for PP under negative comonotonicity. We finally consider the last-iterate convergence of EG and OG and provide an almost complete picture in that case, listing the remaining open questions.
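As a concrete reference point for the reader, the PP update is the implicit step x_{k+1} = (Id + γF)⁻¹(x_k), i.e., x_{k+1} solves x_{k+1} + γF(x_{k+1}) = x_k. Below is a minimal NumPy sketch (ours, for illustration; the operator, stepsize, and iteration count are arbitrary choices, not tuned constants from our analysis) running PP on a linear operator F(x) = Ax, for which the resolvent is a linear solve:

```python
import numpy as np

# Toy linear instance F(x) = A x with a dominant skew part, so that the
# problem behaves like a saddle/game rather than a minimization problem.
A = np.array([[0.1, 1.0],
              [-1.0, 0.1]])
gamma = 0.5                          # PP stepsize (illustrative choice)
I = np.eye(2)

x = np.array([1.0, 1.0])             # starting point
for _ in range(50):
    # Implicit (proximal point) step: solve (I + gamma A) x_next = x
    x = np.linalg.solve(I + gamma * A, x)

print("||F(x)|| after 50 PP steps:", np.linalg.norm(A @ x))
```

The implicit step is what makes PP a useful idealized baseline: EG and OG can be viewed as explicit approximations of this resolvent.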
Before moving to the next sections, let us mention that many of our results were discovered using the performance estimation approach, coined by Drori & Teboulle (2014) and formalized by Taylor et al. (2017c;a). The operator version of the framework is due to Ryu et al. (2020). We used the framework through the packages PESTO (Taylor et al., 2017b) and PEPit (Goujaud et al., 2022), thereby providing a simple way to validate our results numerically.
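For readers who wish to reproduce such numerical validations, the snippet below sketches how a performance estimation problem is posed in PEPit, following the pattern of the package's documented examples. Note that the class and helper names used here (PEP, MonotoneOperator, proximal_step) reflect our recollection of one version of the PEPit API and may differ across versions:

```python
from PEPit import PEP
from PEPit.operators import MonotoneOperator
from PEPit.primitive_steps import proximal_step

gamma = 1.0                                     # PP stepsize (illustrative)

problem = PEP()
A = problem.declare_function(MonotoneOperator)  # operator class under study
xs = A.stationary_point()                       # a point xs with 0 in A(xs)
x0 = problem.set_initial_point()
problem.set_initial_condition((x0 - xs) ** 2 <= 1)

# One proximal point step: x1 = (Id + gamma * A)^{-1}(x0)
x1, _, _ = proximal_step(x0, A, gamma)

problem.set_performance_metric((x1 - xs) ** 2)
tau = problem.solve()                           # worst-case ||x1 - xs||^2
print(tau)
```

The solver returns the exact worst-case value of the chosen metric over all operators in the declared class, which is how such bounds can be cross-checked numerically.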
1.1. Preliminaries
In the context of (IP), we refer to F as being ρ-star-negative comonotone (ρ ≥ 0) – a relaxation² of (1) – if for all x ∈ ℝᵈ and x∗ being a solution to (IP), we have:

⟨F(x), x − x∗⟩ ≥ −ρ∥F(x)∥².  (2)
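To see why (2) relaxes (1), note that it is obtained by fixing one of the two points in (1) to be a solution x∗; in the single-valued case where F(x∗) = 0, the derivation is one line (spelled out here for completeness):

```latex
% Set y = x^* in (1) and use F(x^*) = 0:
\langle F(x) - F(x^*),\, x - x^* \rangle \ge -\rho \| F(x) - F(x^*) \|^2
\quad \Longrightarrow \quad
\langle F(x),\, x - x^* \rangle \ge -\rho \| F(x) \|^2 .
```

Thus every ρ-negative comonotone operator is ρ-star-negative comonotone, while the converse fails in general (see footnote 2).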
Furthermore, similarly to monotone operators (see (Bauschke et al., 2011) or (Ryu & Yin, 2020) for details), we assume that the mapping F is maximal, in the sense that its graph is not strictly contained in the graph of any other ρ-negative comonotone (resp., ρ-star-negative comonotone) operator, which ensures that the corresponding proximal operator used in the sequel is well-defined. Some examples of star-negative comonotone operators are given in (Pethick et al., 2022, Appendix C). Moreover, if F is star-monotone or quasi-strongly monotone (Loizou et al., 2021), then F is also star-negative comonotone. Examples of star-monotone/quasi-strongly monotone operators that are not monotone are given in (Loizou et al., 2021, Appendix A.6). Next, there are several studies of the eigenvalues of the Jacobian around the equilibrium of GAN games (Mescheder et al., 2018; Nagarajan & Kolter, 2017; Berard et al., 2019). These studies imply that the corresponding variational inequalities are locally quasi-strongly monotone.
Finally, when F is L-Lipschitz, it satisfies ⟨F(x), x − x∗⟩ ≥ −L∥x − x∗∥². If, in addition, ∥F(x)∥ ≥ η∥x − x∗∥ for some η > 0 (meaning that F does not change “too slowly”), then ⟨F(x), x − x∗⟩ ≥ −(L/η²)∥F(x)∥², i.e., condition (2) holds with ρ = L/η².
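For completeness, the two steps behind this claim are the Cauchy–Schwarz inequality combined with L-Lipschitzness (recalling F(x∗) = 0), followed by the assumed lower bound on ∥F(x)∥:

```latex
\langle F(x),\, x - x^* \rangle
  = \langle F(x) - F(x^*),\, x - x^* \rangle
  \ge -\| F(x) - F(x^*) \| \, \| x - x^* \|
  \ge -L \| x - x^* \|^2 ,
% and, since \| F(x) \| \ge \eta \| x - x^* \| gives
% \| x - x^* \|^2 \le \| F(x) \|^2 / \eta^2:
-L \| x - x^* \|^2 \ge -\frac{L}{\eta^2} \| F(x) \|^2 .
```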
For the analysis of EG and OG, we further assume F to be L-Lipschitz, meaning that for all x, y ∈ ℝᵈ:

∥F(x) − F(y)∥ ≤ L∥x − y∥.  (3)

Note that in this case, F is a single-valued mapping and (IP) transforms into a variational inequality:

find x∗ ∈ ℝᵈ such that F(x∗) = 0.  (VIP)
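To fix ideas, one EG step extrapolates along F and then updates the iterate using F evaluated at the extrapolated point; OG instead reuses the previously computed value of F to save one evaluation per iteration. Below is a minimal NumPy sketch of EG (ours, for illustration), applied to a toy linear instance of (VIP); the stepsize γ = 1/(2L) is a common textbook choice rather than a constant from our analysis:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])         # F(x) = A x: monotone (skew) but not strongly
F = lambda x: A @ x
L = np.linalg.norm(A, 2)            # Lipschitz constant of F (spectral norm)
gamma = 1 / (2 * L)                 # illustrative stepsize

x = np.array([1.0, 1.0])
for _ in range(200):
    x_half = x - gamma * F(x)       # extrapolation step
    x = x - gamma * F(x_half)       # update from x using F at the midpoint

print("||F(x)|| after 200 EG steps:", np.linalg.norm(F(x)))
```

On this instance, the plain iteration x ← x − γF(x) would spiral away from the solution, while EG converges; this is the classical motivation for the extrapolation step.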
1.2. Related Work
Last-iterate convergence rates in the monotone case.
Several recent theoretical advances focus on the last-iterate
²For an example of a star-negative comonotone operator that is not negative comonotone, we refer to (Daskalakis et al., 2020, Section 5.1) and (Diakonikolas et al., 2021, Section 2.2).