A RANDOMIZED OPERATOR SPLITTING SCHEME INSPIRED BY STOCHASTIC OPTIMIZATION METHODS MONIKA EISENMANN AND TONY STILLFJORD

2025-04-27 0 0 682.33KB 23 页 10玖币
侵权投诉
A RANDOMIZED OPERATOR SPLITTING SCHEME INSPIRED
BY STOCHASTIC OPTIMIZATION METHODS
MONIKA EISENMANN AND TONY STILLFJORD
Abstract. In this paper, we combine the operator splitting methodology for
abstract evolution equations with that of stochastic methods for large-scale
optimization problems. The combination results in a randomized splitting
scheme, which in a given time step does not necessarily use all the parts of
the split operator. This is in contrast to deterministic splitting schemes which
always use every part at least once, and often several times. As a result,
the computational cost can be significantly decreased in comparison to such
methods. We rigorously define a randomized operator splitting scheme in an
abstract setting and provide an error analysis where we prove that the temporal
convergence order of the scheme is at least 1/2. We illustrate the theory by
numerical experiments on both linear and quasilinear diffusion problems, using
a randomized domain decomposition approach. We conclude that choosing
the randomization in certain ways may improve the order to 1. This is as
accurate as applying e.g. backward (implicit) Euler to the full problem, without
splitting.
1. Introduction
The main objective of this paper is to combine two successful strategies from
the literature: the first being operator splitting schemes for evolution equations on
general, infinite dimensional frameworks and the second being stochastic optimiza-
tion methods. Operator splitting schemes are an established tool in the field of
numerical analysis of evolution equations and have a wide range of applications.
Stochastic optimization methods have proven to be efficient at solving large-scale
optimization problems, where it is infeasible to evaluate full gradients. They can
drastically decrease the computational cost in e.g. machine learning settings. The
link between these two seemingly disparate areas is that an iterative method applied
to an optimization problem can also be seen as a time-stepping method applied to a
gradient flow connected to the optimization problem. In particular, stochastic opti-
mization methods can then be interpreted as randomized operator splitting schemes
for such gradient flows. In this context, we introduce a general randomized splitting
method that can be applied directly to evolution equations, and provide a rigorous
convergence analysis.
2010 Mathematics Subject Classification. 65C99, 65M12, 90C15, 65M55.
Key words and phrases. Nonlinear evolution equations, operator splitting, stochastic optimiza-
tion, domain decomposition, randomized scheme.
This work was partially supported by the Crafoord foundation through the grant number
20220657 and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded
by the Knut and Alice Wallenberg Foundation. The computations were enabled by resources
provided by the Swedish National Infrastructure for Computing (SNIC) at LUNARC partially
funded by the Swedish Research Council through grant agreement no. 2018–05973.
1
arXiv:2210.05375v1 [math.NA] 11 Oct 2022
2 M. EISENMANN AND T. STILLFJORD
Abstract evolution equations of the type
(u0(t) + A(t)u(t) = f(t), t (0, T ],
u(0) = u0
are an important building block for modeling processes in physics, biology and
social sciences. Standard examples which appear in a variety of applications are
fluid flow problems, where we model how a flow evolves on a given domain over time,
compare [1,26] and [37, Section 1.3]. The operator A(t) can denote, for example, a
non-linear diffusion operator such as the p-Laplacian or a porous medium operator.
Deterministic operator splitting schemes as discussed in more detail in [16] are
a powerful tool for this type of equation. An example is given by a domain decom-
position scheme, where we split the domain into sub-domains. Instead of solving
one expensive problem on the entire domain, we deal with cheaper problems on the
sub-domains. This is particularly useful in modern computer architectures, as the
sub-problems may often be solved in parallel.
Moreover, evolution equations are tightly connected to unconstrained optimiza-
tion problems, because the solution of minuF(u) is a stationary point of the gra-
dient flow u0(t) = −∇F(u(t)). The latter is an evolution equation on an infinite
time horizon with A=−∇Fand f= 0. In the large-scale case, such optimiza-
tion problems benefit from stochastic optimization schemes. The most basic such
method, the stochastic gradient descent, was first introduced already in [32], but
since then it has been extended and generalized in many directions. See, e.g., the
review article [3] and the references therein.
Via the gradient flow interpretation, we can see these optimization methods
as time-stepping schemes where a randomly chosen sub-problem is considered in
each time step. In essence, it is therefore a randomized operator splitting scheme.
The difference between the works mentioned above and ours is that we apply these
stochastic optimization techniques to solve the evolution equation itself rather than
just finding its stationary state.
We consider nonlinear evolution equations in an abstract framework similar to
[7,10,11] where operators of a monotone type have been studied. Deterministic
splitting schemes for such equations has been considered in e.g. [14,15,17,29].
A particular kind of splitting schemes which is most closely related to our work,
domain decomposition methods, have been studied in [6,7,13,30,31]. In this
paper, we extend this framework of deterministic splitting schemes to a setting of
randomized methods.
Outside of the context of optimization, other kinds of randomized methods have
already proved themselves to be useful for solving evolution equations. Starting in
[34,35] explicit schemes for ordinary differential equations have been randomized.
This approach has been further extended in [2,4,18,22,24]. In [8], it has been
extended both to implicit methods and to partial differential equations and in [23]
to finite element approximations. While these works considered certain random-
izations in their schemes, they are conceptually different from our approach. Their
main idea is to approximate any appearing integrals through
Ztn
tn1
f(t) dtf(ξn) and Ztn
tn1
A(t)vdtA(ξn)v,
RANDOMIZED OPERATOR SPLITTING 3
where ξnis a random variable that takes on values in [tn1, tn]. This ansatz coin-
cides with a Monte Carlo integration idea. In this paper, we use a different approach
where we decompose the operator in a randomized fashion. More precisely, we ap-
proximate data
f=1
s
s
X
`=1
f`and A=1
s
s
X
`=1
A`
by
fB=1
|B|X
`B
f`and AB=1
|B|X
`B
A`
where the batch B⊂ {1, . . . , s}is chosen randomly. The stochastic approximations
fBand ABof the original data fand Aare cheaper to evaluate in applications.
This is less related to Monte Carlo integration and more similar to stochastic opti-
mization methods, compare [3,9]. Similar ideas have been considered in [19,20,28],
where a random batch method for interacting particle systems has been studied.
Moreover, very recently and during the preparation of this work, a similar approach
has also been applied to the optimal control of linear time invariant (LTI) dynamical
systems in [38]. While the convergence rate provided there is essentially the same
as what we establish in our main result Theorem 5.2, our setting is more general
and allows for nonlinear operators on infinite dimensional spaces rather than finite
dimensional matrices. We also consider the error of the time stepping method that
is used to approximate the solution to u0(t) + AB(t)u(t) = fB(t), while the error
bounds in [38] assume that this evolution equation is solved exactly.
This paper is organized as follows. In Section 2, we begin by explaining our
abstract framework. This includes both the precise assumptions that we make and
the definition of our time-stepping scheme. We give a more concrete application
of the abstract framework in Section 3. With the setting fixed, we first prove in
Section 4that the scheme and its solution are indeed well-defined. We prove the
convergence of the scheme in expectation in Section 5. These theoretical conver-
gence results are illustrated by numerical experiments with two-dimensional linear
and quasilinear nonlinear and linear diffusion problem in Section 6. Finally, we
collect some more technical auxiliary results in Appendix A.
2. Setting
In the following, we introduce a theoretical framework for the randomized oper-
ator splitting. This setting is similar to the one in [7].
Assumption 1. Let (H, (·,·)H,k·kH)be a real, separable Hilbert space and let
(V, k·kV)be a real, separable, reflexive Banach space, which is continuously and
densely embedded into H. Moreover, there exists a semi-norm |·|Von V.
Denoting the dual space of Vby Vand identifying the Hilbert space Hwith
its dual space, the spaces from Assumption 1form a Gelfand triple and fulfill, in
particular,
Vd
H
=Hd
V.
Assumption 2. Let the spaces Hand Vbe given as stated in Assumption 1.
Furthermore, for T(0,)as well as p[2,), let {A(t)}t[0,T ]be a family of
operators A(t): VVthat satisfy the following conditions:
4 M. EISENMANN AND T. STILLFJORD
(i) The mapping Av : [0, T ]Vgiven by t7→ A(t)vis continuous almost
everywhere in (0, T )for all vV.
(ii) The operator A(t) : VV,t[0, T ], is radially continuous, i.e., the
mapping s7→ hA(t)(v+sw), wiV×Vis continuous on [0,1] for all v, w V.
(iii) For κA[0,), the operator A(t) + κAI:VV,t[0, T ], fulfills
a monotonicity-type condition in the sense that there exists ηA[0,),
which does not depend on t, such that
hA(t)vA(t)w, v wiV×V+κAkvwk2
HηA|vw|p
V
is fulfilled for all v, w V.
(iv) The operator A(t) : VV,t[0, T ], is uniformly bounded such that
there exists βA[0,), which does not depend on t, with
kA(t)vkVβA1 + kvkp1
V
for all vV.
(v) The operator A(t) + κAI:VV,t[0, T ], fulfills a uniform semi-
coercivity condition such that there exist µA, λA[0,), which do not
depend on t, with
hA(t)v, viV×V+κAkvk2
H+λAµA|v|p
V
for all vV.
Assumption 3. The function fis an element of the Bochner space L2(0, T ;H),
and the initial value u0H, where His the Hilbert space from Assumption 1.
Assumption 13, are requirements on the problem that we want to solve. The
following Assumptions 45are needed to state the approximation scheme for the
given problem.
Assumption 4. Let (Ω,F,P)be a complete probability space and let {ξn}nN
be a family of mutually independent random variables. Further, let the filtration
{Fn}nNbe given by
F0:= σN ∈ F :P(N)=0
Fn:= σσξi:i∈ {1, . . . , n}∪ F0, n N,
where σdenotes the generated σ-algebra.
In the following, we denote the expectation with respect to the probability dis-
tribution of ξfor a random variable Xin the Bochner space L1(Ω; H) by Eξ[X].
Moreover, we abbreviate the total expectation by
En[X] = Eξ1[Eξ2[. . . Eξn[X]. . . ]].
We denote the space of H¨older continuous functions on [0, T ] with H¨older coeffi-
cient γ(0,1) and values in Hby Cγ([0, T ]; H). For notational convenience we
include the case γ= 1 and denote the space of Lipschitz continuous functions by
C1([0, T ]; H).
Assumption 5. Let Assumptions 14be fulfilled. Assume that for almost ev-
ery ω, there exists a real Banach space Vξ(ω)such that Vd
Vξ(ω)
d
H,
RANDOMIZED OPERATOR SPLITTING 5
TωVξ(ω)=Vand there exists a semi-norm | · |Vξ(ω). Moreover, the map-
ping from ω7→ Vξ(ω)is measurable in the sense that for every vHthe set
{ωΩ : vVξ(ω)}is an element of the complete generated σ-algebra
Fξ:= σσ(ξ)σN ∈ F :P(N)=0.
Further, let the family of operators {Aξ(ω)(t)}ω,t[0,T ]be such that for almost
every ω,{Aξ(ω)(t)}t[0,T ]fulfills Assumption 2with the spaces Vξ(ω),Hand
V
ξ(ω)and corresponding constants κξ(ω),ηξ(ω),βξ(ω),µξ(ω)and λξ(ω). Moreover,
the mapping Aξ(t)v: Vis Fξ-measurable and the equality Eξ[Aξ(t)v] = A(t)v
is fulfilled in Vfor vV. The mappings κξ, ηξ, µξ, βξ, λξ: [0,)are measur-
able and there exist κ, λ [0,)which fulfill κξκalmost surely and Eξλξλ.
Further, let the family {fξ(ω)}ωbe given such that fξ(ω)L2(0, T ;H). More-
over, the mapping fξ(t): Ω His Fξ-measurable and Eξ[fξ(t)] = f(t)is fulfilled
in Hfor almost all t(0, T ).
Under the setting explained in the above assumptions, we consider the initial
value problem
(u0(t) + A(t)u(t) = f(t) in V, t (0, T ],
u(0) = u0in H.
(2.1)
For a non-uniform temporal grid 0 = t0< t1<··· < tN=T, a step size hn=
tntn1,h= maxn∈{1,...,N}hn, and a family of random variables {fn}n∈{1,...,N }
such that fn: His Fξn-measurable, we consider the scheme
(UnUn1+hnAξn(tn)Un=hnfnin V
ξn, n ∈ {1, . . . , N},
U0=u0in H.
(2.2)
Note that Un: His a random variable and therefore some statements involving
it below only hold almost surely. Whenever there is no risk of misinterpretation,
we omit writing almost surely for the sake of brevity.
When proving that the scheme is well-defined and establishing an a priori bound,
it is sufficient to assume that {fξn}n∈{1,...,N}are integrable with respect to the
temporal parameter. In that case, we can choose for example
fn=1
hnZtn
tn1
fξn(t) dtin Halmost surely.(2.3)
In our error bounds, we assume more regularity for the functions {fξn}n∈{1,...,N }
and demand continuity with respect to the temporal parameter. In this case, we
may also use
fn=fξn(tn) in Halmost surely.(2.4)
We will focus on this second choice for the error bounds in Section 5.
3. Application: Domain decomposition
One main application that is allowed by our abstract framework is a domain
decomposition scheme for a nonlinear fluid flow problem. Domain decomposition
schemes are well-known for deterministic operator splittings. However, to the best
of our knowledge, it has not been studied in the context of a randomized operator
splitting scheme.
摘要:

ARANDOMIZEDOPERATORSPLITTINGSCHEMEINSPIREDBYSTOCHASTICOPTIMIZATIONMETHODSMONIKAEISENMANNANDTONYSTILLFJORDAbstract.Inthispaper,wecombinetheoperatorsplittingmethodologyforabstractevolutionequationswiththatofstochasticmethodsforlarge-scaleoptimizationproblems.Thecombinationresultsinarandomizedsplitting...

展开>> 收起<<
A RANDOMIZED OPERATOR SPLITTING SCHEME INSPIRED BY STOCHASTIC OPTIMIZATION METHODS MONIKA EISENMANN AND TONY STILLFJORD.pdf

共23页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:23 页 大小:682.33KB 格式:PDF 时间:2025-04-27

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 23
客服
关注