Probabilistic Verification of Approximate
Algorithms with Unstructured Errors:
Application to Fully Inexact Generalized ADMM
Anis Hamadouche, Yun Wu, Andrew M. Wallace, and João F. C. Mota
Abstract—We analyse the convergence of an approximate, fully inexact, ADMM algorithm under additive, deterministic and probabilistic error models. We consider the generalized ADMM scheme that is derived from the generalized Lagrangian penalty with additive (smoothing) adaptive-metric quadratic proximal perturbations. We derive explicit deterministic and probabilistic convergence upper bounds for the lower-$C^2$ nonconvex case as well as the convex case under the Lipschitz continuity condition. We also present more practical conditions on the proximal errors under which convergence of the approximate ADMM to a suboptimal solution is guaranteed with high probability. We consider statistically and dynamically unstructured, conditionally mean-independent, bounded error sequences. We validate our results using both simulated and practical software and algorithmic computational perturbations. We apply the proposed algorithm to a synthetic LASSO problem and a robust regression problem with k-support norm regularization, and test our proposed bounds under different computational noise levels. Compared to classical convergence results, the adaptive probabilistic bounds are more accurate in predicting the distance from the optimal set and the parasitic residual error under different sources of inaccuracy.
Index Terms—Numerical Linear Algebra; Numerical Optimization; ADMM; Douglas–Rachford method; Approximate Computing.
I. INTRODUCTION
The Alternating Direction Method of Multipliers (ADMM) originally emerged from the early works of Peaceman and Rachford [2, 1, 3, 4], Glowinski and Marrocco [8], and Gabay and Mercier [10] on applying numerical procedures to solve partial differential equations arising in heat conduction and continuum mechanics problems. Owing to its simple implementation and its efficiency in solving large-scale optimization problems encountered in statistics and, more recently, in machine learning, ADMM has evolved into different forms and has been applied to more general nonconvex problems that cannot be solved using conventional methods. ADMM is also a popular method for online and distributed optimization [22]. ADMM can be viewed as a decomposition procedure that exploits the separability of the objective function to reduce the difficulty of the joint minimisation arising in the classical method of multipliers [5, 6, 7, 9], at the cost of an increased number of iterations [39].
Work supported by UK's EPSRC (EP/T026111/1, EP/S000631/1), and the MOD University Defence Research Collaboration.
Anis Hamadouche, Yun Wu, Andrew M. Wallace, and João F. C. Mota are with the School of Engineering & Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK (e-mail: {ah225,y.wu,a.m.wallace,j.mota}@hw.ac.uk).
However, this procedure yields efficient local solutions to subproblems that are coordinated via constraints to find the global solution [10, 8]. The subproblems are typically easier to solve than the original problem and, in fact, often admit closed-form solutions. For instance, in the following LASSO problem
in the following LASSO problem
minimize
xRn,zRmf(x) := kAx bk2
2+kzk1,
subject to xz= 0 ,
(1)
when the data matrix $A$ has some circulant structure, as is common in signal/image processing, the subproblems can be solved via the fast Fourier transform at insignificant computational cost [21, 26, 30]. However, when the ADMM subproblems do not possess closed-form solutions, the solution is obtained (or approximated) iteratively, which can increase the computational cost and incur additional truncation (computational) errors. This motivates the need for new computationally-aware variants of the classical ADMM with inexpensive auxiliary subproblems.
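To make the circulant-structure remark concrete, here is a minimal sketch (ours, not from the paper) of the resulting $x$-update: when $A$ is circulant, the matrix $2A^\top A + \rho I$ is diagonalized by the DFT, so the subproblem reduces to two FFTs instead of a dense solve. The first column `a` and the right-hand side `rhs` are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
a = rng.standard_normal(n)        # first column of the circulant matrix A
rhs = rng.standard_normal(n)      # right-hand side 2 A^T b + rho (z - v)
rho = 1.0

# A circulant A has eigenvalues fft(a) in the Fourier basis, so
# 2 A^T A + rho I has eigenvalues 2 |fft(a)|^2 + rho.
eig_A = np.fft.fft(a)
eig_M = 2.0 * np.abs(eig_A) ** 2 + rho

# Solve (2 A^T A + rho I) x = rhs with two FFTs and a pointwise division.
x = np.fft.ifft(np.fft.fft(rhs) / eig_M).real
```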
Another feature that makes the ADMM technique practically appealing is its robustness to poorly selected algorithm parameters; i.e., the method is guaranteed to converge independently of the selected parameter. For instance, if the objective functions are strongly convex and have Lipschitz-continuous gradients, then the iterates produced by the ADMM algorithm converge linearly to the optimum in a certain distance metric no matter which parameter value is selected [39, 32, 41, 38, 40, 43, 46, 62]. For an extensive review of ADMM, the interested reader is referred to the survey papers [22, 36, 33].
Before we delve into the details of composite optimization, the ADMM technique, and proximal calculus, let us first recall some basic definitions from optimization theory.
A. Background
Definition 1 (Convex and concave functions). A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\pm\infty\}$ is convex if
\[
f(\beta x + (1-\beta)y) \le \beta f(x) + (1-\beta) f(y), \quad \forall x, y \in \mathbb{R}^n, \tag{2}
\]
with $\beta \in (0,1)$. If $f$ is convex, then $-f$ is concave.
Definition 2 (Proper function). A function is proper if its value is never $-\infty$ and it is finite somewhere.
Definition 3 (Lower semicontinuous function). A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\pm\infty\}$ is lower semicontinuous at a point $y \in \mathbb{R}^n$ if and only if
\[
\liminf_{x \to y} f(x) \ge f(y). \tag{3}
\]
Definition 4 (Closed function). A convex and proper function is closed if and only if it is lower semicontinuous.
Definition 5 (Saddle point). Let $\mathcal{L}(x, y) : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R} \cup \{\pm\infty\}$ be a convex-concave function. The tuple $(x^\star, y^\star)$ is said to be a saddle point of $\mathcal{L}$ if
\[
\mathcal{L}(x^\star, y) \le \mathcal{L}(x^\star, y^\star) \le \mathcal{L}(x, y^\star) \tag{4}
\]
for all $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$.
Definition 6 (Subgradient). Let $w \in \partial f(x)$; then for all $y \in \mathbb{R}^n$ we have
\[
f(y) \ge f(x) + \langle w, y - x \rangle. \tag{5}
\]
Definition 7 ($\varepsilon$-subgradient). Let $w \in \partial_\varepsilon f(x)$; then for all $y \in \mathbb{R}^n$ we have
\[
f(y) \ge f(x) + \langle w, y - x \rangle - \varepsilon. \tag{6}
\]
Definition 8 (Proximal operator). For all $y \in \mathbb{R}^n$ we have
\[
\mathrm{prox}_u(y) := \arg\min_z \; u(z) + \tfrac{1}{2}\|z - y\|_2^2. \tag{7}
\]
Definition 9 (Approximate proximal operator).
\[
\mathrm{prox}^{\varepsilon}_u(y) := \Big\{ x \in \mathbb{R}^n : u(x) + \tfrac{1}{2}\|x - y\|_2^2 \le \varepsilon + \inf_z \; u(z) + \tfrac{1}{2}\|z - y\|_2^2 \Big\}. \tag{8}
\]
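As a concrete illustration of Definitions 8 and 9 (a sketch of ours, not from the paper): for $u = t\|\cdot\|_1$ the proximal operator (7) is the closed-form soft-thresholding map, and membership in the $\varepsilon$-approximate prox set (8) can be checked by comparing Moreau objective values. The tolerance and perturbation below are arbitrary.

```python
import numpy as np

def prox_l1(y, t=1.0):
    # Exact prox of u(z) = t * ||z||_1 (Definition 8): soft-thresholding.
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def in_approx_prox(x, y, t, eps):
    # Membership test for Definition 9: x lies in prox^eps_u(y) iff its
    # Moreau objective is within eps of the exact minimum value.
    moreau = lambda z: t * np.abs(z).sum() + 0.5 * np.sum((z - y) ** 2)
    return moreau(x) <= moreau(prox_l1(y, t)) + eps

y = np.array([1.5, -0.2, 0.7])
x_exact = prox_l1(y, t=0.5)
x_perturbed = x_exact + 1e-3   # e.g. a low-precision evaluation
print(in_approx_prox(x_perturbed, y, t=0.5, eps=1e-2))  # True
```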
Definition 10 (Lower-$C^2$ functions [19]). The function $f$ is lower-$C^2$ on an open set $V$ if for each $x \in V$ there is a neighbourhood $V_x$ of $x$ upon which a representation $f(x) = \max_{t \in T} f_t(x)$ holds for $x$ in $V_x$, where $T$ is a compact set and the functions $f_t$ are twice continuously differentiable jointly in both $x$ and $t$.
Definition 11 (Lower-$C^2$ functions [20]). The function $f$ is lower-$C^2$ on an open set $V$ if at any point $x$ in $V$, the sum of $f$ with a quadratic term is a convex function on an open neighbourhood $V'$ of $x$.
Definition 12 (Weighted norm). Given a PSD matrix $M$, we say that $\|x\|_M^2 = \langle x, Mx \rangle = \langle Mx, x \rangle$ is an $M$-weighted norm of $x$.
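Since the weighted norm of Definition 12 recurs throughout the generalized schemes below, here is a one-line sketch of it in code (ours, for illustration only):

```python
import numpy as np

def wnorm2(u, M):
    # ||u||_M^2 = <u, M u> (Definition 12); M is assumed symmetric PSD,
    # which is what makes <u, M u> = <M u, u> and the value nonnegative.
    return float(u @ (M @ u))
```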
B. Problem statement
The category of problems that ADMM solves is
\[
\underset{x\in\mathbb{R}^n,\; z\in\mathbb{R}^m}{\text{minimize}} \quad f(x,z) := g(x) + h(z), \qquad \text{subject to} \quad Ax + Bz = c, \tag{9}
\]
where $g$ and $h$ are lower-$C^2$, possibly convex and nondifferentiable functions, $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$.
The augmented Lagrangian for problem (9) is constructed as follows:
\[
\mathcal{L}_\rho(x, z, y) = g(x) + h(z) + y^\top (Ax + Bz - c) + \tfrac{\rho}{2}\|Ax + Bz - c\|_2^2, \tag{10}
\]
where $y$ is the Lagrange multiplier associated with the linear constraint in (9), and $\rho$ is a positive scalar parameter. The ADMM iteration is obtained by minimizing the augmented Lagrangian (10) with respect to the $x$ and $z$ variables, and updating the dual variable $y$ as follows:
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; \mathcal{L}_\rho(x, z^k, y^k), && \text{(11a)}\\
z^{k+1} &= \arg\min_z \; \mathcal{L}_\rho(x^{k+1}, z, y^k), && \text{(11b)}\\
y^{k+1} &= y^k + \rho(Ax^{k+1} + Bz^{k+1} - c). && \text{(11c)}
\end{aligned}
\]
The equivalent scaled proximal ADMM is given by
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2\lambda}\|Ax + Bz^k - c + v^k\|^2, && \text{(12a)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2\lambda}\|Ax^{k+1} + Bz - c + v^k\|^2, && \text{(12b)}\\
v^{k+1} &= v^k + (Ax^{k+1} + Bz^{k+1} - c), && \text{(12c)}
\end{aligned}
\]
where $v^k = (1/\rho)y^k$ and $\lambda = 1/\rho$.
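For reference, here is a minimal sketch (ours, not the paper's implementation) of iteration (12) specialized to the LASSO instance (1), where the constraint is $x - z = 0$: the $x$-update is a prefactorized linear solve and the $z$-update is the soft-thresholding prox of the $\ell_1$ norm. All parameter values are illustrative.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def scaled_admm_lasso(A, b, lam=1.0, iters=200):
    # Iteration (12) for problem (1): g(x) = ||Ax - b||_2^2, h(z) = ||z||_1,
    # constraint x - z = 0, scaled dual v = y / rho, and lam = 1 / rho.
    n = A.shape[1]
    rho = 1.0 / lam
    # (12a) is a strongly convex quadratic whose minimizer solves
    # (2 A^T A + rho I) x = 2 A^T b + rho (z - v); factor the matrix once.
    M = cho_factor(2.0 * A.T @ A + rho * np.eye(n))
    x = np.zeros(n); z = np.zeros(n); v = np.zeros(n)
    for _ in range(iters):
        x = cho_solve(M, 2.0 * A.T @ b + rho * (z - v))            # (12a)
        z = np.sign(x + v) * np.maximum(np.abs(x + v) - lam, 0.0)  # (12b)
        v = v + (x - z)                                            # (12c)
    return x, z, v
```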
Consider the case where both ADMM subproblems (12a) and (12b) are solved inexactly and/or using iterates with additive perturbations $r_x^k$ and $r_z^k$; i.e., $\hat{x}^k = x^k + r_x^k$ and $\hat{z}^k = z^k + r_z^k$, where $x^k$ and $z^k$ are the exact iterates. Classical convergence guarantees cannot be easily extended to the inexact case since they completely ignore the presence and propagation of computational errors during the iterations. Therefore, a more rigorous treatment of such errors is needed to verify the convergence, reliability and portability of ADMM (and its variants) across different computational environments.
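One way to emulate this additive error model in experiments (our sketch; the uniform bound `eps` is an assumption standing in for, e.g., reduced-precision arithmetic) is to corrupt each exact update before it is reused:

```python
import numpy as np

rng = np.random.default_rng(1)

def perturb(u, eps):
    # Additive model: u_hat = u + r with a bounded, zero-mean error,
    # ||r||_inf <= eps, drawn independently of the iterate. Zero-mean noise
    # independent of the past is one realization of the conditionally
    # mean-independent bounded error sequences considered in this paper.
    return u + rng.uniform(-eps, eps, size=u.shape)

# Inside the loop of scaled_admm_lasso above one would write, for instance,
#   x = perturb(cho_solve(M, 2.0 * A.T @ b + rho * (z - v)), eps)
#   z = perturb(np.sign(x + v) * np.maximum(np.abs(x + v) - lam, 0.0), eps)
# so that the errors r_x^k, r_z^k propagate through subsequent iterations.
```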
Before we analyse the convergence of the inexact ADMM, let us introduce some other variants of the original scheme and focus our analysis on the most generalized form.
C. WL-ADMM approximation
The approximate WL-ADMM is formulated by approximating the Euclidean norm by some positive definite (PD) $L_k$-weighted norm $\|\cdot\|^2_{L_k}$ in the $x$ and $z$ updates as follows:
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2\lambda}\|Ax + Bz^k - c + v^k\|^2_{L_k}, && \text{(13a)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2\lambda}\|Ax^{k+1} + Bz - c + v^k\|^2_{L_k}, && \text{(13b)}\\
v^{k+1} &= v^k + (Ax^{k+1} + Bz^{k+1} - c), && \text{(13c)}
\end{aligned}
\]
where $v^k = (1/\rho)y^k$ and $\lambda = 1/\rho$.
D. WLM-ADMM approximation
Adding quadratic perturbations to (13a) and (13b) yields the approximate WLM-ADMM:
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2\lambda}\|Ax + Bz^k - c + v^k\|^2_{L_k} + \tfrac{1}{2}\|x - x^k\|^2_{M_x^k}, && \text{(14a)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2\lambda}\|Ax^{k+1} + Bz - c + v^k\|^2_{L_k} + \tfrac{1}{2}\|z - z^k\|^2_{M_z^k}, && \text{(14b)}\\
v^{k+1} &= v^k + (Ax^{k+1} + Bz^{k+1} - c), && \text{(14c)}
\end{aligned}
\]
where $M_x^k$ and $M_z^k$ are positive definite matrices. Note that in the case of a convex function $f$, adding quadratic (proximal) terms makes the objective functions strongly convex, which improves the condition number of the problem being solved at the expense of yielding approximate solutions.
Completing the square, (14) can be written as
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2}\|\Lambda_{1k}^{1/2} x - \Lambda_{1k}^{-1/2}\Gamma_{1k}\|^2, && \text{(15a)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2}\|\Lambda_{2k}^{1/2} z - \Lambda_{2k}^{-1/2}\Gamma_{2k}\|^2, && \text{(15b)}\\
v^{k+1} &= v^k + (Ax^{k+1} + Bz^{k+1} - c), && \text{(15c)}
\end{aligned}
\]
or, in generalized-norm form, as
\[
\begin{aligned}
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2}\|x - \Lambda_{1k}^{-1}\Gamma_{1k}\|^2_{\Lambda_{1k}}, && \text{(16a)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2}\|z - \Lambda_{2k}^{-1}\Gamma_{2k}\|^2_{\Lambda_{2k}}, && \text{(16b)}\\
v^{k+1} &= v^k + (Ax^{k+1} + Bz^{k+1} - c), && \text{(16c)}
\end{aligned}
\]
where
\[
\begin{aligned}
\Lambda_{1k} &= \tfrac{1}{\lambda} A^\top L_k A + M_x^k, && \text{(17a)}\\
\Gamma_{1k} &= M_x^k x^k - \tfrac{1}{\lambda} A^\top L_k (Bz^k - c + v^k), && \text{(17b)}\\
\Lambda_{2k} &= \tfrac{1}{\lambda} B^\top L_k B + M_z^k, && \text{(17c)}\\
\Gamma_{2k} &= M_z^k z^k - \tfrac{1}{\lambda} B^\top L_k (Ax^{k+1} - c + v^k). && \text{(17d)}
\end{aligned}
\]
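To make (16)–(17) concrete, here is a sketch (ours; the function name and arguments are hypothetical) that assembles $\Lambda_{1k}$ and $\Gamma_{1k}$ and performs the $x$-update in the smooth case:

```python
import numpy as np

def wlm_x_update(A, B, c, Lk, Mx, xk, zk, vk, lam):
    # Assemble the weights of (17a)-(17b).
    Lam1 = (A.T @ Lk @ A) / lam + Mx                        # (17a)
    Gam1 = Mx @ xk - (A.T @ Lk @ (B @ zk - c + vk)) / lam   # (17b)
    # For g = 0, (16a) reduces to the unconstrained quadratic minimizer
    # x^{k+1} = Lambda_1k^{-1} Gamma_1k; for nonsmooth g, (16a) is a prox
    # of g in the Lambda_1k-weighted norm and needs a dedicated solver.
    return np.linalg.solve(Lam1, Gam1)
```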
E. Related work
A new ADMM variant of (11) was proposed in [34] that yields simple proximal evaluations of the objective functions according to the following scheme:
\[
\begin{aligned}
\zeta_x^k &= \tfrac{1}{\rho}(s_x^{k-1} - x^k) \in \partial g(x^k), && \text{(18a)}\\
e_x(x^k, y^k) &= x^k - P_X\big[x^k - \rho(\zeta_x^k - A^\top y^k)\big], && \text{(18b)}\\
s_x^k &= x^k + \rho\zeta_x^k - \gamma\alpha_k e_x(x^k, y^k), && \text{(18c)}\\
x^{k+1} &= \arg\min_x \; g(x) + \tfrac{1}{2\rho}\|x - s_x^k\|^2, && \text{(18d)}\\
\zeta_z^k &= \tfrac{1}{\rho}(s_z^{k-1} - z^k) \in \partial h(z^k), && \text{(18e)}\\
e_z(z^k, y^k) &= z^k - P_Z\big[z^k - \rho(\zeta_z^k - B^\top y^k)\big], && \text{(18f)}\\
s_z^k &= z^k + \rho\zeta_z^k - \gamma\alpha_k e_z(z^k, y^k), && \text{(18g)}\\
z^{k+1} &= \arg\min_z \; h(z) + \tfrac{1}{2\rho}\|z - s_z^k\|^2, && \text{(18h)}\\
y^{k+1} &= y^k + \rho(Ax^{k+1} + Bz^{k+1} - c), && \text{(18i)}
\end{aligned}
\]
where $P_X$ and $P_Z$ denote the orthogonal projections onto the subspaces $X$ and $Z$, respectively. The stepsize sequence $\{\alpha_k\}_{k>0}$ is chosen with convergence guarantees. Note that (18d) and (18h) are proximal evaluations of $g$ and $h$ at $s_x^k$ and $s_z^k$, respectively.
1) The inexact proximal ADMM: For the method to be implementable, numerical optimisation algorithms must be able to handle approximate solutions. In error-resilient applications involving optimization problems that do not require highly accurate solutions, more computationally friendly inexact (approximate) ADMM schemes can be used instead of the original scheme.
Motivated by Rockafellar's proximal method of multipliers, which involves an augmented Lagrangian with an additional quadratic proximal term, [13, 29, 16] proposed the inexact semi-proximal ADMM with added proximal terms as follows:
\[
\begin{aligned}
x^{k+1} &\approx \arg\min_x \; \mathcal{L}_\rho(x, z^k, y^k) + \tfrac{1}{2}\|x - x^k\|^2_{M_x^k}, && \text{(19a)}\\
z^{k+1} &\approx \arg\min_z \; \mathcal{L}_\rho(x^{k+1}, z, y^k) + \tfrac{1}{2}\|z - z^k\|^2_{M_z^k}, && \text{(19b)}\\
y^{k+1} &= y^k + \alpha\rho(Ax^{k+1} + Bz^{k+1} - c), && \text{(19c)}
\end{aligned}
\]
with Fortin and Glowinski's relaxation factor $\alpha \in (0, (1+\sqrt{5})/2)$ proposed by [29]. This algorithm is shown to preserve the good features of the primal proximal method of multipliers, with the additional advantage that it leads to a decoupling of the constraints. $M_x^k$ and $M_z^k$ are selected to be PSD matrices in [29], while in [16] they are selected to have some positive minimum spectral radius. Note that the inexact semi-proximal ADMM reduces to the classical ADMM when $M_x^k = M_z^k = 0$, and to the proximal ADMM [13] when both proximal matrices are PD and $\alpha = 1$. Global convergence of the semi-proximal ADMM was proved in [39], and an iteration complexity (convergence rate) of $O(1/k)$ was derived in [28]. Faster rates can be achieved with varying $\alpha$ and a better selection of $\rho$, $M_x^k$ and $M_z^k$ [39]. For instance, [16, 39, 29, 37] considered indefinite self-adaptive $M_x^k$ and $M_z^k$, and an optimal self-adaptive scheme is presented in [60]. $M_x$ and $M_z$ are typically