Certified machine learning: Rigorous a posteriori
error bounds for PDE defined PINNs
Birgit Hillebrecht and Benjamin Unger
Abstract—Prediction error quantification in machine learning
has been left out of most methodological investigations of neural
networks, for both purely data-driven and physics-informed
approaches. Beyond statistical investigations and generic results
on the approximation capabilities of neural networks, we present
a rigorous upper bound on the prediction error of physics-
informed neural networks. This bound can be computed without knowledge of the true solution, using only a priori available information about the characteristics of the underlying dynamical system governed by a partial differential equation. We apply this a posteriori error bound to four exemplary problems: the transport equation, the heat equation, the Navier-Stokes equation, and the Klein-Gordon equation.
Index Terms—Physics-informed neural network, machine
learning, certification, a posteriori error estimator, Navier-Stokes
I. INTRODUCTION
Physics-informed machine learning is applied to numerous
highly complex problems, such as turbulence and climate
modeling [7], [44], [45], model predictive control [3], [33],
and Hamiltonian system dynamics [16]. The systematic study
of physics-informed machine learning as a method, however,
remains open [30]. One central aspect is the quality of the prediction: how close is the approximation produced by a physics-informed neural network (PINN) [36] to the actual solution?
Beyond statistical evaluations, such as in [22], first steps
to answer this question were taken in [18] by establishing
rigorous error bounds for PINNs approximating the solution
of ordinary differential equations. We extend these results
by deriving guaranteed upper bounds on the prediction error
for problems modeled by linear partial differential equations
(PDEs). Our methodology is thereby applicable to, e.g., the
previously mentioned use cases of PINNs in computational
science and engineering. Similar to [18], the error bound can
be computed a posteriori without knowing the actual solution.
Since there are extensive studies on the convergence and ap-
proximation properties of (physics-informed) neural networks
[4], [11], [20], [39], it is clear from a theoretical point of view
that a suitable neural network (NN) can be found with the
desired accuracy. These a priori results and the associated error estimates [8], [19], [39] are fundamentally different from the computable a posteriori error bounds we present here. Moreover, we would like to stress that our analysis does not rely on statistical approaches or the Bayesian perspective but provides a rigorous error bound, which serves as a certificate for the NN.

Funding and affiliation: B. Hillebrecht acknowledges funding from the International Max Planck Research School for Intelligent Systems (IMPRS-IS). B. Unger acknowledges funding from the DFG under Germany's Excellence Strategy – EXC 2075 – 390740016 and the Ministry of Science, Research and the Arts Baden-Württemberg, 7542.2-9-47.10/36/2. Both authors are thankful for support by the Stuttgart Center for Simulation Science (SimTech). Both authors are with the Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart, Germany; e-mail: {birgit.hillebrecht,benjamin.unger}@simtech.uni-stuttgart.de; ORCID: 0000-0001-5361-0505, 0000-0003-4272-1079.
Closest in topic is [39], wherein a so-called 'a posteriori' error estimate is obtained based on norm inequalities. However, as stated in their conclusion, these estimates are not directly computable due to unknown constants. In contrast, the error estimator presented here is computable, which we demonstrate with several academic examples. Other earlier works [5], [31] also differ from the following work in fundamental aspects: the error bounds derived therein are either not guaranteed to hold, are restricted to a specific problem type, or require discretization steps.
At this point, we would like to emphasize that the results
presented in Section III are independent of PINNs as the
chosen methodology but can be applied to other surrogate
modeling techniques. Nevertheless, since PINNs use the norm
of the residual as a regularization term during the training
process, they appear as a natural candidate to apply our
residual-based error estimator.
Notation: We use bold notation for vectors and vector-valued functions and italic letters for operators; in particular, we use $I$ to denote the identity operator. The notations $\dot{u} = \partial_t u = \frac{\partial u}{\partial t}$ are used interchangeably to denote partial derivatives w.r.t. time. Similarly, we shorten derivatives w.r.t. spatial variables and use the conventional notation for the divergence $\nabla\cdot\boldsymbol{\varphi} = \operatorname{div}(\boldsymbol{\varphi})$, the gradient $\nabla u = \operatorname{grad}(u)$, and the Laplace operator $\Delta u = \operatorname{div}(\operatorname{grad}(u))$ w.r.t. spatial variables only. For normed spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and an operator $A\colon D(A)\subseteq X \to Y$, we define the induced operator norm $\|A\|_{X,Y} := \sup_{x\in D(A),\,\|x\|_X\neq 0} \|Ax\|_Y / \|x\|_X$. In case $X \subseteq Y$ and $\|\cdot\|_Y = \|\cdot\|_X$, we shorten the notation to $\|A\|_X$ and drop the subscript whenever the norm is clear from the context. For the spatial domain $\Omega \subseteq \mathbb{R}^d$, we denote the boundary by $\partial\Omega$ and define the size of $\Omega$ as $\|\Omega\| = \int_\Omega 1\,\mathrm{d}x$. The considered temporal domain is denoted by $\mathbb{T} = [0, t_f)$ with $t_f \in \mathbb{R}^+$. We use the conventional notation for spaces of $p$-integrable functions $L^p(\Omega)$, and Sobolev spaces $W^{k,p}(\Omega)$ and $H^k(\Omega) = W^{k,2}(\Omega)$.
II. PROBLEM DESCRIPTION
In the following, we aim at solving the boundary and initial value problem (BIVP): find $u\colon \mathbb{T}\times\Omega \to \mathbb{R}^n$ such that
$$\partial_t u = A u \;\;\text{in } \mathbb{T}\times\Omega, \qquad u = u_0 \;\;\text{in } \{t=0\}\times\Omega, \qquad B u = u_b \;\;\text{in } \mathbb{T}\times\partial\Omega. \quad (1)$$
The initial value $u_0$ lies in the domain of definition of the linear differential operator $A\colon D(A)\subseteq X \to X$, meaning $u_0 \in D(A)$. The bounded linear operator $B\colon D(A) \to U$ represents the type of the boundary condition and maps to the Banach space $U$, which contains the boundary condition $u_b$. As an example, for a Dirichlet boundary condition the choice $B = T I$ is suitable, wherein $T$ denotes the trace operator.
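To make the abstract setup concrete, one natural (though not the only) instantiation, sketched here for the heat equation with Dirichlet boundary conditions as considered later in Section V-A, with the specific choice of spaces being our illustrative assumption, is
$$X = L^2(\Omega), \quad A = \Delta \text{ with } D(A) = H^2(\Omega), \quad B = T \text{ (the boundary trace)}, \quad U = L^2(\partial\Omega),$$
so that (1) becomes the heat equation $\partial_t u = \Delta u$ with initial value $u_0$ and Dirichlet data $u_b$.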
With this setup given, the objective can be intuitively
formulated as follows.
Problem II.1. Given an approximate solution $\hat{u}$ of the BIVP (1), determined, for example, by approximating the system with a NN, find a computable certificate $\varepsilon\colon \mathbb{T} \to \mathbb{R}^+$ such that
$$\|\hat{u}(t,\cdot) - u(t,\cdot)\|_X \leq \varepsilon(t)$$
without knowing the true solution $u$.
In the following, we consider the BIVP (1) in the framework of semigroups and investigate mild solutions
$$u(t,\cdot) = S(t)\,u_0(\cdot)$$
defined by the semigroup of operators $\{S(t)\}_{t\geq 0}$ generated by $A$. To assert that the problem is well-defined and that a unique (mild) solution exists, we make several assumptions.
Firstly, for $B$ to be well-defined, we need to assert the existence of a trace operator $T$ via trace theorems, e.g., requiring $\Omega$ to be a Lipschitz domain [9] or $\partial\Omega$ to be $C^1$ [13, Sec. 5.5, Thm. 1]. By formulating the following assumption, we also include generic boundary conditions [37].
Assumption II.2. For a reflexive Banach space $X$, the linear (differential) operator $A' = A|_{\ker B}$ generates a strongly continuous one-parameter semigroup of bounded linear operators $\{S(t)\}_{t\geq 0}$ on $X$. The domain $D(A') = D(A) \cap \ker(B)$ is a linear subspace of $X$. The boundary operator $B$ is right invertible with right inverse $B_0$, and $A B_0$ is a bounded linear operator from $U$ to $X$.
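Continuing the heat-equation illustration from above (again an assumed example, not a general requirement), the semigroup part of Assumption II.2 is satisfied, for instance, by the Dirichlet Laplacian: for a sufficiently regular bounded domain $\Omega$, the operator $A' = \Delta|_{\ker T}$ with $D(A') = H^2(\Omega)\cap H_0^1(\Omega)$ generates a strongly continuous (in fact analytic) semigroup on $X = L^2(\Omega)$; the remaining conditions on $B$ depend on the chosen boundary spaces.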
A mild solution can then be defined by an abstract variation-
of-constants type formula with an extended solution space. To
make the presentation accessible to a large audience, we omit
the general functional analytic framework here (see [37] and
the references therein for the details). Furthermore, we drop
the prime on the operator $A'$ to simplify the notation.
Remark II.3. The results derived below can be generalized directly to systems modeled by inhomogeneous PDEs of the type
$$\partial_t u = A u + f$$
with $f(t) \in D(A)$ and $f$ independent of $u$. To simplify the notation, we continue all investigations without $f$.
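For orientation, in the setting of Remark II.3 (with the homogeneous boundary data folded into the semigroup), the variation-of-constants formula alluded to above takes the standard mild-solution form (see, e.g., [37])
$$u(t,\cdot) = S(t)\,u_0(\cdot) + \int_0^t S(t-s)\, f(s,\cdot)\,\mathrm{d}s,$$
which reduces to $u(t,\cdot) = S(t)u_0(\cdot)$ for $f \equiv 0$.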
Similar to [18], which considers a finite-dimensional setting, the NN approximation of the solution of the BIVP (1) is interpreted as the solution of a perturbed problem. Here we consider two types of perturbed problems: the first has the same boundary condition as the unperturbed problem but contains perturbation terms in the initial condition and the differential equation. This perturbed problem reads
$$\partial_t \hat{u} = A\hat{u} + R \;\;\text{in } \mathbb{T}\times\Omega, \qquad \hat{u} = u_0 + R_0 \;\;\text{in } \{t=0\}\times\Omega, \qquad B\hat{u} = u_b \;\;\text{in } \mathbb{T}\times\partial\Omega, \quad (2)$$
with perturbations $R_0 \in D(A)$ and $R\colon \mathbb{T} \to X$. For an approximate solution $\hat{u}$ given by a NN, we can determine the terms $R$, $R_0$ by computing the residuals
$$R_0 := \hat{u}(0,\cdot) - u_0 \;\;\text{in } \Omega, \qquad R := \partial_t \hat{u} - A\hat{u} \;\;\text{in } \mathbb{T}\times\Omega. \quad (3)$$
This calculation requires $\hat{u}$ to be differentiable in time, which constrains the choice of the activation function of the NN. Moreover, we assume the following for the BIVP (2).
Assumption II.4. The perturbed BIVP (2) is such that $R$ is Lipschitz continuous in $t \in \mathbb{T}$ and $u_0 + R_0 \in D(A)$.
Regarding the approximation by a NN, for most activation
functions, the Lipschitz condition is satisfied, and only the
second condition requires a more detailed investigation of the
corresponding function spaces.
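To illustrate how the residuals (3) can be evaluated in practice for a given network, the following sketch uses automatic differentiation for a PINN approximating the one-dimensional heat equation (so that $A\hat{u} = \partial_{xx}\hat{u}$). The architecture, the collocation sampling, and all names (e.g., residuals, u0_fun) are illustrative assumptions rather than the authors' implementation; a smooth activation (tanh) ensures the time differentiability required above.

import torch

# Hypothetical smooth PINN surrogate u_hat(t, x); tanh keeps u_hat differentiable in t and x.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def residuals(model, t, x, u0_fun):
    # Evaluate the residuals (3) for the 1D heat equation, i.e. A = d^2/dx^2:
    #   R  = d/dt u_hat - d^2/dx^2 u_hat  at the collocation points (t, x),
    #   R0 = u_hat(0, x) - u0(x)          at the spatial points x.
    t = t.clone().requires_grad_(True)
    x = x.clone().requires_grad_(True)
    u_hat = model(torch.stack([t, x], dim=-1)).squeeze(-1)

    # First derivatives via automatic differentiation.
    du_dt, du_dx = torch.autograd.grad(u_hat.sum(), (t, x), create_graph=True)
    # Second spatial derivative for the 1D Laplacian.
    d2u_dx2 = torch.autograd.grad(du_dx.sum(), x, create_graph=True)[0]

    R = du_dt - d2u_dx2                                            # PDE residual
    u_hat0 = model(torch.stack([torch.zeros_like(x), x], dim=-1)).squeeze(-1)
    R0 = u_hat0 - u0_fun(x)                                        # initial-condition residual
    return R, R0

# Example usage on random collocation points in T x Omega = [0, 1) x (0, 1).
t_pts = torch.rand(128)
x_pts = torch.rand(128)
R, R0 = residuals(model, t_pts, x_pts, u0_fun=lambda x: torch.sin(torch.pi * x))

The norms of R and R0 are exactly the quantities entering the a posteriori bound derived later, so this evaluation reuses the same automatic-differentiation machinery already employed in the PINN training loss.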
The second problem additionally involves a perturbation in the boundary condition:
$$\partial_t \hat{u} = A\hat{u} + R \;\;\text{in } \mathbb{T}\times\Omega, \qquad \hat{u} = u_0 + R_0 \;\;\text{in } \{t=0\}\times\Omega, \qquad B\hat{u} = u_b + R_b \;\;\text{in } \mathbb{T}\times\partial\Omega. \quad (4)$$
We consider this system as a boundary control system (BCS, cf. [41]) and use the notion of input-to-state stable (ISS) systems, see [40], to later derive rigorous error bounds.
Definition II.5. System (4) is said to be ISS w.r.t. the boundary term $u_b + R_b$ if there exist functions $\beta$, $\gamma$ and an operator $C$ such that
$$\|u(t,\cdot)\| \leq \beta\bigl(\|u(0,\cdot)\|, t\bigr) + \gamma\bigl(\|C(R_b(s) + u_b(s))\|_{L^\infty(0,t;\partial\Omega)}\bigr). \quad (5)$$
Here, $\gamma\colon \mathbb{R}^+ \to \mathbb{R}^+$ is in $\mathcal{K}$, which means that it is continuous and strictly increasing with $\gamma(0) = 0$. Similarly, $\beta\colon \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^+$ is in $\mathcal{KL}$, meaning $\beta(\cdot, t) \in \mathcal{K}$ for all $t \geq 0$ and $\beta(s, \cdot)$ is continuous and strictly decreasing to 0 for all $s > 0$ [40].
In the following, we consider an ISS BCS where we set $u_0 = 0$ and $u_b = 0$, such that (5) simplifies to the second term. As an example, for the heat equation (see Section V-A) we use $\gamma(s) = \tfrac{1}{3}s$. For the imposed Dirichlet boundary, the operator $C = \delta_{\partial\Omega}$ is the Dirac distribution on the boundary $\partial\Omega$ of the domain $\Omega$.
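As a brief justification of this simplification: since $\beta(\cdot,t) \in \mathcal{K}$ implies $\beta(0,t) = 0$, a vanishing initial contribution makes the first term in (5) drop out, leaving
$$\|u(t,\cdot)\| \leq \gamma\bigl(\|C\,R_b\|_{L^\infty(0,t;\partial\Omega)}\bigr),$$
so that, with the linear gain $\gamma$ used for the heat-equation example, the influence of the boundary perturbation $R_b$ grows at most linearly in its supremum over $[0,t]$.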
Remark II.6. We present the reasoning and theorems based on the notion of ISS; however, all results can be transferred easily to integral input-to-state stable (iISS) systems [21], [40].
Assumption II.7. The BCS (4) is ISS with respect to the boundary values $u_b + R_b$.