Certified machine learning Rigorous a posteriori error bounds for PDE defined PINNs
arXiv:2210.03426v1 [cs.LG] 7 Oct 2022
Certified machine learning: Rigorous a posteriori
error bounds for PDE defined PINNs
Birgit Hillebrecht∗ and Benjamin Unger†
Abstract—Prediction error quantification in machine learning
has been left out of most methodological investigations of neural
networks, for both purely data-driven and physics-informed
approaches. Beyond statistical investigations and generic results
on the approximation capabilities of neural networks, we present
a rigorous upper bound on the prediction error of physics-
informed neural networks. This bound can be calculated without
the knowledge of the true solution and only with a priori available
information about the characteristics of the underlying dynamical
system governed by a partial differential equation. We apply
this a posteriori error bound exemplarily to four problems: the
transport equation, the heat equation, the Navier-Stokes equation
and the Klein-Gordon equation.
Index Terms—Physics-informed neural network, machine
learning, certification, a posteriori error estimator, Navier-Stokes
I. INTRODUCTION
Physics-informed machine learning is applied to numerous
highly complex problems, such as turbulence and climate
modeling [7], [44], [45], model predictive control [3], [33],
and Hamiltonian system dynamics [16]. The systematic study
of physics-informed machine learning as a method, however,
remains open [30]. Part of this is the question about the quality
of the forecast: how close is the prediction of the physics-
informed neural network (PINN) [36] to the actual solution?
Beyond statistical evaluations, such as in [22], first steps
to answer this question were taken in [18] by establishing
rigorous error bounds for PINNs approximating the solution
of ordinary differential equations. We extend these results
by deriving guaranteed upper bounds on the prediction error
for problems modeled by linear partial differential equations
(PDEs). Our methodology is thereby applicable to, e.g., the
previously mentioned use cases of PINNs in computational
science and engineering. Similar to [18], the error bound can
be computed a posteriori without knowing the actual solution.
Since there are extensive studies on the convergence and ap-
proximation properties of (physics-informed) neural networks
[4], [11], [20], [39], it is clear from a theoretical point of view
that a suitable neural network (NN) can be found with the
desired accuracy. These a priori results and the associated error
estimates [8], [19], [39] are fundamentally different from
the computable a posteriori error bounds we present here.
Moreover, we would like to stress that our analysis does not rely on
statistical approaches or the Bayesian perspective but provides
a rigorous error bound, which serves as a certificate for
the NN.
B. Hillebrecht acknowledges funding from the International Max Planck
Research School for Intelligent Systems (IMPRS-IS). B. Unger acknowledges
funding from the DFG under Germany’s Excellence Strategy – EXC 2075
– 390740016 and The Ministry of Science, Research and the Arts Baden-
Württemberg, 7542.2-9-47.10/36/2. Both authors are thankful for support by
the Stuttgart Center for Simulation Science (SimTech).
∗,†: Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart,
Germany, {birgit.hillebrecht,benjamin.unger}@simtech.uni-stuttgart.de;
∗ORCID: 0000-0001-5361-0505 †ORCID: 0000-0003-4272-1079
Close in topic is [39], wherein a so-called 'a posteriori'
error estimate is obtained based on norm inequalities. However,
as stated in their conclusion, the estimates are not directly
computable due to unknown constants. In contrast, the error
estimator presented here is computable, which we demonstrate
with several academic examples. Other earlier works [5], [31]
also differ from the following work in fundamental aspects,
such as that the error bounds derived in them are either not
guaranteed to hold, are restricted to a certain problem type, or
require discretization steps.
At this point, we would like to emphasize that the results
presented in Section III are independent of PINNs as the
chosen methodology but can be applied to other surrogate
modeling techniques. Nevertheless, since PINNs use the norm
of the residual as a regularization term during the training
process, they appear as a natural candidate to apply our
residual-based error estimator.
Notation: We use bold notation for vectors and vector-valued
functions and italic letters for operators; in particular,
we use $I$ to denote the identity operator. The notations
$\dot{u} = \partial_t u = \frac{\partial u}{\partial t}$ are used
interchangeably to denote partial derivatives w.r.t. time.
Similarly, we shorten derivatives w.r.t. spatial variables and use
conventional notation for the divergence
$\nabla \cdot \phi = \mathrm{div}(\phi)$, the gradient
$\nabla u = \mathrm{grad}(u)$, and the Laplace operator
$\Delta u = \mathrm{div}(\mathrm{grad}(u))$ w.r.t. spatial variables
only. For normed spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$, and an
operator $A : D(A) \subseteq X \to Y$, we define the induced operator norm
$$\|A\|_{X,Y} := \sup_{x \in D(A),\ \|x\|_X \neq 0} \frac{\|Ax\|_Y}{\|x\|_X}.$$
In case $X \subseteq Y$ and $\|\cdot\|_Y = \|\cdot\|_X$, we shorten the
notation to $\|A\|_X$, and drop the subscript whenever the norm is clear
from the context. For the spatial domain $\Omega \subseteq \mathbb{R}^d$,
we denote the boundary by $\partial\Omega$ and define the size of $\Omega$
as $\|\Omega\| = \int_\Omega 1 \, dx$. The considered temporal domain is
denoted by $T = [0, t_f)$ with $t_f \in \mathbb{R}^+$. We use the
conventional notation for spaces of $p$-integrable functions
$L^p(\Omega)$, and Sobolev spaces $W^{k,p}(\Omega)$ and
$H^k(\Omega) = W^{k,2}(\Omega)$.
II. PROBLEM DESCRIPTION
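In the following, we aim at solving the boundary and initial
value problem (BIVP): find $u : T \times \Omega \to \mathbb{R}^n$ such that
$$\begin{aligned}
\partial_t u &= A u && \text{in } T \times \Omega, \\
u &= u_0 && \text{in } \{t = 0\} \times \Omega, \\
B u &= u_b && \text{in } T \times \partial\Omega.
\end{aligned} \tag{1}$$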
The initial value $u_0$ lies in the domain of definition of the
linear differential operator $A : D(A) \subseteq X \to X$, meaning
$u_0 \in D(A)$. The bounded linear operator $B : D(A) \to U$
represents the type of the boundary condition and maps to the
Banach space $U$, which contains the boundary condition $u_b$.
As an example, for the Dirichlet boundary condition, $B = \mathcal{T} I$
is suitable, wherein $\mathcal{T}$ denotes the trace operator.
With this setup given, the objective can be intuitively
formulated as follows.
Problem II.1. Given an approximate solution $\hat{u}$ for the BIVP
(1), determined, for example, by approximating the
system with a NN, find a computable certificate
$\varepsilon : T \to \mathbb{R}^+$ such that
$$\|\hat{u}(t, \cdot) - u(t, \cdot)\|_X \le \varepsilon(t)$$
without knowing the true solution $u$.
In the following, we consider the BIVP (1) in the framework
of semigroups and investigate mild solutions
$$u(t, \cdot) = S(t) u_0(\cdot)$$
defined by the semigroup of operators $\{S(t)\}_{t \ge 0}$ generated
by $A$. To assert that the problem is well-defined and that a
unique (mild) solution exists, we make numerous assumptions.
Firstly, for $B$ to be well-defined, we need to assert the existence
of a trace operator $\mathcal{T}$ via trace theorems, e.g., by requiring
$\Omega$ to be a Lipschitz domain [9] or $\partial\Omega$ to be $C^1$
[13, Sec. 5.5, Thm. 1]. By formulating the following assumption, we also
include generic boundary conditions [37].
Assumption II.2. For a reflexive Banach space $X$, the linear
(differential) operator $A' = A|_{\ker B}$ generates a strongly
continuous one-parameter semigroup of bounded linear operators
$\{S(t)\}_{t \ge 0}$ on $X$. The domain $D(A') = D(A) \cap \ker(B)$ is
a linear subspace of $X$. The boundary operator $B$ is right
invertible with the inverse $B_0$, and $A' B_0$ is a bounded linear
operator from $U$ to $X$.
A mild solution can then be defined by an abstract variation-
of-constants type formula with an extended solution space. To
make the presentation accessible to a large audience, we omit
the general functional analytic framework here (see [37] and
the references therein for the details). Furthermore, we drop
the prime on the operator $A'$ to simplify the notation.
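To make the semigroup picture concrete, the following minimal sketch applies $S(t)$ for one specific, assumed setting that is not taken from the text above: the heat equation $\partial_t u = \partial_{xx} u$ on $\Omega = (0, 1)$ with homogeneous Dirichlet data and $X = L^2(0, 1)$. In that case $S(t)$ acts on the sine coefficients of $u_0$ by multiplying the $k$-th coefficient with $e^{-k^2 \pi^2 t}$. The function names and the initial coefficients are hypothetical.

```python
import math

def heat_semigroup(t, coeffs):
    """Apply S(t) to u0 = sum_k coeffs[k-1] * sin(k*pi*x) on (0, 1)."""
    return [c * math.exp(-(k * math.pi) ** 2 * t)
            for k, c in enumerate(coeffs, start=1)]

def l2_norm(coeffs):
    """L2(0,1) norm of a sine series; each mode has squared norm 1/2."""
    return math.sqrt(0.5 * sum(c * c for c in coeffs))

u0 = [1.0, -0.5, 0.25]  # hypothetical sine coefficients of the initial value
t, s = 0.01, 0.02

# Semigroup property: S(t + s) u0 agrees with S(t) S(s) u0 up to rounding.
lhs = heat_semigroup(t + s, u0)
rhs = heat_semigroup(t, heat_semigroup(s, u0))
semigroup_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# Growth bound for this example: ||S(t) u0|| <= exp(-pi^2 t) ||u0||.
decay_ok = l2_norm(heat_semigroup(t, u0)) <= math.exp(-math.pi ** 2 * t) * l2_norm(u0)
```

For this example the mild solution $u(t, \cdot) = S(t) u_0$ is available in closed form, which is what makes it convenient for sanity checks; for general operators $A$ no such explicit representation is assumed.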
Remark II.3. The results derived below can be generalized
directly to systems modeled by inhomogeneous PDEs of type
$$\partial_t u = A u + f$$
with $f(t) \in D(A)$ and $f$ independent of $u$. To simplify
notation, we continue all investigations without $f$.
Similarly to [18] for a finite-dimensional setting, the NN
approximation of the solution of the BIVP (1) is interpreted
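as the solution of a perturbed problem. Here we consider two
types of perturbed problems. The first has the same boundary
condition as the unperturbed problem, but has perturbation
terms in the initial condition and the differential equation. The
perturbed problem reads
$$\begin{aligned}
\partial_t \hat{u} &= A \hat{u} + R && \text{in } T \times \Omega, \\
\hat{u} &= u_0 + R_0 && \text{in } \{t = 0\} \times \Omega, \\
B \hat{u} &= u_b && \text{in } T \times \partial\Omega,
\end{aligned} \tag{2}$$
with perturbations $R_0 \in D(A)$ and $R : T \to X$. For an
approximate solution $\hat{u}$ given by a NN, we can determine the
terms $R$, $R_0$ by computing the residuals
$$\begin{aligned}
R_0 &:= \hat{u}(0, \cdot) - u_0 && \text{in } \Omega, \\
R &:= \partial_t \hat{u} - A \hat{u} && \text{in } T \times \Omega.
\end{aligned} \tag{3}$$
This calculation requires $\hat{u}$ to be differentiable in time, which
constrains the choice of the activation function of the NN.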
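The residuals in (3) can be evaluated pointwise once the surrogate is available as a function. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes $A = \partial_{xx}$ on $\Omega = (0, 1)$ with $u_0(x) = \sin(\pi x)$, uses a hypothetical closed-form surrogate `u_hat` in place of a trained NN, and replaces automatic differentiation by central finite differences so that it runs with the standard library alone.

```python
import math

A_DECAY = 9.5  # hypothetical decay rate of the surrogate (the exact rate is pi**2)

def u_hat(t, x):
    """Hypothetical surrogate for u_t = u_xx on (0, 1), standing in for a NN."""
    return math.exp(-A_DECAY * t) * math.sin(math.pi * x)

def u0(x):
    """Initial value of the toy problem."""
    return math.sin(math.pi * x)

def residual_initial(x):
    """R0 = u_hat(0, .) - u0, cf. (3)."""
    return u_hat(0.0, x) - u0(x)

def residual_pde(t, x, h=1e-3):
    """R = d/dt u_hat - A u_hat with A = d^2/dx^2, via central differences."""
    dt = (u_hat(t + h, x) - u_hat(t - h, x)) / (2.0 * h)
    dxx = (u_hat(t, x + h) - 2.0 * u_hat(t, x) + u_hat(t, x - h)) / (h * h)
    return dt - dxx

t, x = 0.1, 0.3
r_numeric = residual_pde(t, x)
# Closed form for this particular surrogate: R = (pi^2 - A_DECAY) * u_hat.
r_exact = (math.pi ** 2 - A_DECAY) * u_hat(t, x)
```

With an actual NN, the derivatives would typically come from automatic differentiation rather than finite differences, which is why the surrogate has to be differentiable in time (and, for this $A$, twice differentiable in space).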
Also, we assume the following for the BIVP (2).
Assumption II.4. The perturbed BIVP (2) is such that $R$ is
Lipschitz continuous in $t \in T$ and $u_0 + R_0 \in D(A)$.
Regarding the approximation by a NN, for most activation
functions, the Lipschitz condition is satisfied, and only the
second condition requires a more detailed investigation of the
corresponding function spaces.
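The perturbation view already suggests how a certificate in the sense of Problem II.1 can arise. By linearity, subtracting (1) from (2) formally gives an equation for the error $e = \hat{u} - u$, namely $\partial_t e = A e + R$ with $e(0, \cdot) = R_0$ and $B e = 0$, so that $e(t) = S(t) R_0 + \int_0^t S(t - s) R(s) \, ds$ in the mild sense. If the semigroup satisfies a growth bound $\|S(t)\| \le M e^{\omega t}$, then $\|e(t)\| \le M e^{\omega t} \|R_0\| + \int_0^t M e^{\omega (t - s)} \|R(s)\| \, ds$. This is a standard semigroup estimate and is given here only as an illustration, not as a restatement of the theorems derived later. The sketch evaluates the right-hand side with the trapezoidal rule; the constants $M$, $\omega$, the function names, and the toy problem (heat equation on $(0, 1)$ with $M = 1$, $\omega = -\pi^2$) are assumptions. The quadrature itself introduces an error that a rigorous certificate would also have to account for.

```python
import math

def semigroup_bound(t, norm_r0, norm_r, M=1.0, omega=0.0, n=2000):
    """Trapezoidal evaluation of
    M*exp(omega*t)*||R0|| + int_0^t M*exp(omega*(t-s))*||R(s)|| ds."""
    if t == 0.0:
        return M * norm_r0
    ds = t / n
    integral = 0.0
    for i in range(n + 1):
        s = i * ds
        weight = 0.5 if i in (0, n) else 1.0
        integral += weight * M * math.exp(omega * (t - s)) * norm_r(s)
    return M * math.exp(omega * t) * norm_r0 + integral * ds

# Toy problem: u = exp(-pi^2 t) sin(pi x), surrogate u_hat = exp(-a t) sin(pi x).
a = 9.5  # hypothetical decay rate of the surrogate

def norm_r(s):
    """L2(0,1) norm of R(s) = (pi^2 - a) * exp(-a s) * sin(pi x)."""
    return abs(math.pi ** 2 - a) * math.exp(-a * s) / math.sqrt(2.0)

def true_error(t):
    """L2(0,1) norm of u_hat - u; known only because the toy problem is solvable."""
    return abs(math.exp(-a * t) - math.exp(-math.pi ** 2 * t)) / math.sqrt(2.0)

t = 0.2
eps = semigroup_bound(t, norm_r0=0.0, norm_r=norm_r, M=1.0, omega=-math.pi ** 2)
```

In this single-mode example the bound coincides with the true error up to quadrature error, so `eps` and `true_error(t)` agree closely; in general the estimate can be considerably more pessimistic.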
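The second problem additionally involves a perturbation in
the boundary condition:
$$\begin{aligned}
\partial_t \hat{u} &= A \hat{u} + R && \text{in } T \times \Omega, \\
\hat{u} &= u_0 + R_0 && \text{in } \{t = 0\} \times \Omega, \\
B \hat{u} &= u_b + R_b && \text{in } T \times \partial\Omega.
\end{aligned} \tag{4}$$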
We consider this system as a boundary control system (BCS,
cf. [41]) and use the notion of input-to-state stable (ISS)
systems, see [40], to later derive rigorous error bounds.
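Definition II.5. System (4) is said to be ISS w.r.t. the boundary
term $u_b + R_b$ if there exist functions $\beta$, $\gamma$ and an
operator $C$ such that
$$\|u(t, \cdot)\| \le \beta(\|u(0, \cdot)\|, t)
+ \gamma\big(\|C(R_b(s) + u_b(s))\|_{L^\infty(0,t;\Omega)}\big). \tag{5}$$
Here, $\gamma : \mathbb{R}^+ \to \mathbb{R}^+$ is in $\mathcal{K}$, which
means that it is continuous and strictly increasing with $\gamma(0) = 0$.
Similarly, $\beta : \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^+$ is
in $\mathcal{KL}$, meaning $\beta(\cdot, t) \in \mathcal{K}$ for all
$t \ge 0$ and $\beta(s, \cdot)$ is continuous and strictly decreasing to 0
for all $s > 0$ [40].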
In the following, we consider an ISS BCS, where we set
$u_0 = 0$ and $u_b = 0$, such that the right-hand side of (5) reduces
to its second term. As an example, for the heat equation (see Section V-A)
we use $\gamma(s) = \tfrac{1}{3} s$. For the imposed Dirichlet boundary, the
operator $C = \delta_{\partial\Omega}$ is the Dirac distribution on the
boundary of the domain $\Omega$.
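For $u_0 = 0$ and $u_b = 0$, the ISS estimate (5) bounds the state by the gain applied to the largest boundary perturbation seen so far. The following sketch illustrates this for sampled data. It reads the gain from the heat-equation example as $\gamma(s) = s/3$, which is the only reading compatible with $\gamma \in \mathcal{K}$, and it treats the $L^\infty$ norm as a maximum over sampled boundary values of $R_b$. Both the data layout and the sample values are hypothetical, and a maximum over samples only approximates the supremum from below, so the result is indicative rather than rigorous.

```python
def gamma(s):
    """Gain gamma(s) = s / 3, read from the heat-equation example in the text."""
    return s / 3.0

def iss_bound(boundary_residual_samples):
    """Running value of gamma(max_{s <= t} |R_b(s)|) for u0 = 0 and ub = 0.

    boundary_residual_samples[i] holds the sampled values of R_b on the
    boundary at the i-th time step (hypothetical data layout).
    """
    bounds, running_sup = [], 0.0
    for samples in boundary_residual_samples:
        running_sup = max(running_sup, max(abs(r) for r in samples))
        bounds.append(gamma(running_sup))
    return bounds

# Hypothetical boundary residuals at x = 0 and x = 1 for three time steps.
rb = [(0.003, -0.006), (0.001, 0.002), (-0.009, 0.004)]
bounds = iss_bound(rb)
```

Because the supremum is taken over the whole interval $(0, t)$, the resulting bound is non-decreasing in time even when the boundary residual itself becomes smaller again.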
Remark II.6. We present reasoning and theorems based on
the notion of ISS; however, all results can be transferred easily
to integral input-to-state stable (iISS) systems [21], [40].
Assumption II.7. The BCS (4) is ISS with respect to the
boundary values $u_b + R_b$.