
ROBUST DATA-DRIVEN ACCELERATED MIRROR DESCENT
Hong Ye Tan⋆  Subhadip Mukherjee⋆†  Junqi Tang⋆
Andreas Hauptmann‡§  Carola-Bibiane Schönlieb⋆
⋆University of Cambridge, DAMTP, Cambridge CB3 0WA, U.K.
†Department of Computer Science, University of Bath, U.K.
‡University of Oulu, Research Unit of Mathematical Sciences, P.O.Box 8000, 90014 University of Oulu
§University College London, Department of Mathematics, 25 Gordon St, London WC1H 0AY, U.K.
ABSTRACT
Learning-to-optimize is an emerging framework that
leverages training data to speed up the solution of certain
optimization problems. One such approach is based on the
classical mirror descent algorithm, where the mirror map is
modelled using input-convex neural networks. In this work,
we extend this functional parameterization approach by intro-
ducing momentum into the iterations, based on the classical
accelerated mirror descent. Our approach combines short-
time accelerated convergence with stable long-time behavior.
We empirically demonstrate additional robustness with re-
spect to multiple parameters on denoising and deconvolution
experiments.
Index Terms—Mirror descent, accelerated optimization,
learning-to-optimize, input-convex neural networks
1. INTRODUCTION
Learning-to-optimize is a general framework of methods that
seek to minimize some objective functions quickly (in as few
iterations as possible). This has been proposed in various
settings such as unsupervised learning or using primal-dual
methods, with applications such as imaging inverse problems
and neural network training [1, 2, 3]. A recent work focuses
on introducing a learned functional parameterization into the
classical mirror descent (MD) algorithm, obtaining faster con-
vergence rates with approximate convergence guarantees [4].
In this work, we propose a learned convex optimization
approach based on the classical accelerated mirror descent
(AMD) scheme. Our aim is to minimize convex functions f
drawn from a function class F of qualitatively similar
problems, such as variational image denoising. We study and
demonstrate the convergence and robustness of our proposed
learned scheme, and compare it to existing learned and
classical optimization methods. Section 2 outlines the AMD
scheme, and Section 3 contains various experimental results
on the robustness of our proposed method.
1.1. Mirror Descent
Mirror descent is a generalization of gradient descent (GD),
first introduced by Nemirovsky and Yudin [5]. By exploiting
the geometry of the cost function, MD achieves competitive
convergence rate bounds for certain problems, including on-
line learning and tomographic reconstruction [6, 7]. We first
outline the method as presented by Beck and Teboulle [8].
Let Ψ be a convex function on a closed convex set
X ⊆ R^n, and let (R^n)^* denote the corresponding dual space
of R^n. We denote by R̄ = R ∪ {+∞} the extended real
line. Recall that the convex conjugate (or Fenchel con-
jugate) of Ψ, denoted Ψ^* : (R^n)^* → R̄, is given by
Ψ^*(y) = sup_{x∈X} (⟨y, x⟩ − Ψ(x)). Ψ induces a Bregman
divergence B_Ψ : X × X → R̄, defined as:
B_Ψ(x, y) = Ψ(x) − Ψ(y) − ⟨∇Ψ(y), x − y⟩.
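As a quick numerical illustration (not from the paper): with the mirror potential Ψ(x) = ½‖x‖², the Bregman divergence reduces to half the squared Euclidean distance. A minimal Python sketch, with the helper name `bregman` chosen here for illustration:

```python
import numpy as np

def bregman(psi, grad_psi, x, y):
    # B_Psi(x, y) = Psi(x) - Psi(y) - <grad Psi(y), x - y>
    return psi(x) - psi(y) - grad_psi(y) @ (x - y)

# Psi(x) = 0.5 * ||x||^2, so grad Psi(x) = x.
psi = lambda x: 0.5 * (x @ x)
grad_psi = lambda x: x

x = np.array([1.0, 2.0])
y = np.array([0.0, 1.0])
d = bregman(psi, grad_psi, x, y)  # equals 0.5 * ||x - y||^2 = 1.0
```

Note that B_Ψ is in general asymmetric in its two arguments, so it is a divergence rather than a metric.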
Definition 1 (Mirror potential). We say Ψ : X → R is
a mirror potential if it is continuously differentiable and
strongly convex. We call the gradient ∇Ψ : X → (R^n)^* a
mirror map.
If Ψ is a mirror potential, then the convex conjugate Ψ^* is
everywhere differentiable, and moreover satisfies ∇Ψ^* =
(∇Ψ)^{-1} [8, 9]. The standard MD update rule for minimizing
a convex f with initialization x_0 ∈ X is as follows [8]:
y_k = ∇Ψ(x_k) − t_k ∇f(x_k),    x_{k+1} = ∇Ψ^*(y_k).    (1)
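As an illustrative sketch of update rule (1) (a hand-crafted example assumed here, not the paper's learned scheme): taking the negative-entropy potential Ψ(x) = Σ_i (x_i log x_i − x_i) on the positive orthant gives ∇Ψ(x) = log x and ∇Ψ^*(y) = exp(y), so (1) becomes exponentiated gradient descent:

```python
import numpy as np

def mirror_descent(grad_f, x0, step, iters):
    # Update (1): y_k = grad Psi(x_k) - t_k * grad f(x_k); x_{k+1} = grad Psi*(y_k).
    # Here grad Psi = log and grad Psi* = exp (negative-entropy potential).
    x = x0.copy()
    for _ in range(iters):
        y = np.log(x) - step * grad_f(x)  # gradient step in the dual space
        x = np.exp(y)                     # map back via grad Psi* = (grad Psi)^{-1}
    return x

# Minimize f(x) = 0.5 * ||x - b||^2 over the positive orthant.
b = np.array([0.5, 2.0, 1.0])
grad_f = lambda x: x - b
x = mirror_descent(grad_f, np.ones(3), step=0.1, iters=500)  # iterates approach b
```

The iterates stay strictly positive by construction, illustrating how the mirror map encodes the constraint geometry without an explicit projection step.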
For convex Lipschitz f, MD attains an O(1/√k) convergence
rate. As with GD, algorithmic extensions such as
stochasticity, ergodicity, and acceleration are available for
MD [10, 11, 12, 13]. MD can additionally be interpreted as
natural GD, with a Riemannian metric induced by the mirror
map [14, 15]. While the inclusion of the mirror map makes
this method more flexible than GD, current applications are
limited by the requirement of a hand-crafted mirror map. For
efficient computation, such a mirror map is further restricted
by the requirement of a closed-form convex conjugate, which
is generally unavailable.
arXiv:2210.12238v2 [math.OC] 2 Jun 2023