arXiv:2210.03426v1 [cs.LG] 7 Oct 2022

Certified machine learning: Rigorous a posteriori error bounds for PDE defined PINNs

Birgit Hillebrecht∗ and Benjamin Unger†

Abstract—Prediction error quantification in machine learning has been left out of most methodological investigations of neural networks, for both purely data-driven and physics-informed approaches. Beyond statistical investigations and generic results on the approximation capabilities of neural networks, we present a rigorous upper bound on the prediction error of physics-informed neural networks. This bound can be calculated without knowledge of the true solution, using only a priori available information about the characteristics of the underlying dynamical system governed by a partial differential equation. We apply this a posteriori error bound to four example problems: the transport equation, the heat equation, the Navier-Stokes equation, and the Klein-Gordon equation.

Index Terms—Physics-informed neural network, machine learning, certification, a posteriori error estimator, Navier-Stokes

I. INTRODUCTION

Physics-informed machine learning is applied to numerous highly complex problems, such as turbulence and climate modeling [7], [44], [45], model predictive control [3], [33], and Hamiltonian system dynamics [16]. The systematic study of physics-informed machine learning as a method, however, remains open [30]. Part of this is the question about the quality of the forecast: how close is the prediction of the physics-informed neural network (PINN) [36] to the actual solution?

Beyond statistical evaluations, such as in [22], first steps to answer this question were taken in [18] by establishing rigorous error bounds for PINNs approximating the solution of ordinary differential equations. We extend these results by deriving guaranteed upper bounds on the prediction error for problems modeled by linear partial differential equations (PDEs). Our methodology is thereby applicable to, e.g., the previously mentioned use cases of PINNs in computational science and engineering. Similar to [18], the error bound can be computed a posteriori without knowing the actual solution.

Since there are extensive studies on the convergence and approximation properties of (physics-informed) neural networks [4], [11], [20], [39], it is clear from a theoretical point of view that a suitable neural network (NN) can be found with the desired accuracy. These a priori results and the associated error estimates [8], [19], [39] are fundamentally different from the computable a posteriori error bounds we present here. Moreover, we would like to stress that our analysis does not rely on statistical approaches or the Bayesian perspective but provides a rigorous error bound, which serves as a certificate for the NN.

B. Hillebrecht acknowledges funding from the International Max Planck Research School for Intelligent Systems (IMPRS-IS). B. Unger acknowledges funding from the DFG under Germany’s Excellence Strategy – EXC 2075 – 390740016 and The Ministry of Science, Research and the Arts Baden-Württemberg, 7542.2-9-47.10/36/2. Both authors are thankful for support by the Stuttgart Center for Simulation Science (SimTech).

∗,†: Stuttgart Center for Simulation Science, University of Stuttgart, Stuttgart, Germany, {birgit.hillebrecht,benjamin.unger}@simtech.uni-stuttgart.de; ∗ORCID: 0000-0001-5361-0505, †ORCID: 0000-0003-4272-1079

Closely related is [39], in which a so-called 'a posteriori' error estimate is obtained based on norm inequalities. However, as stated in its conclusion, the estimates are not directly computable due to unknown constants. In contrast, the error estimator presented here is computable, which we demonstrate with several academic examples. Other earlier works [5], [31] also differ from the present work in fundamental aspects: the error bounds derived in them are either not guaranteed to hold, are restricted to a certain problem type, or require discretization steps.

At this point, we would like to emphasize that the results presented in Section III do not depend on PINNs as the chosen methodology and can be applied to other surrogate modeling techniques as well. Nevertheless, since PINNs use the norm of the residual as a regularization term during the training process, they appear as a natural candidate for applying our residual-based error estimator.

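Notation: We use bold notation for vectors and vector-valued functions and italic letters for operators; in particular, we use $\mathcal{I}$ to denote the identity operator. The notations $\dot{\mathbf{u}} = \partial_t \mathbf{u} = \frac{\partial \mathbf{u}}{\partial t}$ are used interchangeably to denote partial derivatives w.r.t. time. Similarly, we shorten derivatives w.r.t. spatial variables and use conventional notation for the divergence $\nabla \cdot \boldsymbol{\varphi} = \operatorname{div}(\boldsymbol{\varphi})$, the gradient $\nabla u = \operatorname{grad}(u)$, and the Laplace operator $\Delta u = \operatorname{div}(\operatorname{grad}(u))$ w.r.t. spatial variables only. For normed spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and an operator $\mathcal{A}: D(\mathcal{A}) \subseteq X \to Y$, we define the induced operator norm

$$
\|\mathcal{A}\|_{X,Y} := \sup_{x \in D(\mathcal{A}),\ \|x\|_X \neq 0} \frac{\|\mathcal{A}x\|_Y}{\|x\|_X}.
$$

In case $X \subseteq Y$ and $\|\cdot\|_Y = \|\cdot\|_X$, we shorten the notation to $\|\mathcal{A}\|_X$, and drop the subscript whenever the norm is clear from the context. For the spatial domain $\Omega \subseteq \mathbb{R}^d$, we denote the boundary by $\partial\Omega$ and define the size of $\Omega$ as $\|\Omega\| = \int_\Omega 1 \, \mathrm{d}x$. The considered temporal domain is denoted by $\mathbb{T} = [0, t_f)$ with $t_f \in \mathbb{R}^+$. We use the conventional notation for spaces of $p$-integrable functions $L^p(\Omega)$, and Sobolev spaces $W^{k,p}(\Omega)$ and $H^k(\Omega) = W^{k,2}(\Omega)$.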

II. PROBLEM DESCRIPTION

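In the following, we aim at solving the boundary and initial value problem (BIVP): find $\mathbf{u}: \mathbb{T} \times \Omega \to \mathbb{R}^n$ such that

$$
\begin{cases}
\partial_t \mathbf{u} = \mathcal{A}\mathbf{u} & \text{in } \mathbb{T} \times \Omega, \\
\mathbf{u} = \mathbf{u}_0 & \text{in } \{t = 0\} \times \Omega, \\
\mathcal{B}\mathbf{u} = \mathbf{u}_b & \text{in } \mathbb{T} \times \partial\Omega.
\end{cases}
\tag{1}
$$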

The initial value $\mathbf{u}_0$ lies in the domain of definition of the linear differential operator $\mathcal{A}: D(\mathcal{A}) \subseteq X \to X$, meaning $\mathbf{u}_0 \in D(\mathcal{A})$. The bounded linear operator $\mathcal{B}: D(\mathcal{A}) \to U$ represents the type of the boundary condition and maps to the Banach space $U$, which contains the boundary condition $\mathbf{u}_b$. As an example, for the Dirichlet boundary condition, $\mathcal{B} = \mathcal{T}\mathcal{I}$ is suitable, wherein $\mathcal{T}$ denotes the trace operator.

With this setup given, the objective can be intuitively formulated as follows.

Problem II.1. Given an approximate solution $\hat{\mathbf{u}}$ for the BIVP (1), which is determined, for example, by approximating the system with a NN, find a computable certificate $\varepsilon: \mathbb{T} \to \mathbb{R}^+$ such that

$$
\|\hat{\mathbf{u}}(t, \cdot) - \mathbf{u}(t, \cdot)\|_X \leq \varepsilon(t)
$$

without knowing the true solution $\mathbf{u}$.

In the following, we consider the BIVP (1) in the framework of semigroups and investigate mild solutions

$$
\mathbf{u}(t, \cdot) = \mathcal{S}(t)\mathbf{u}_0(\cdot)
$$

defined by the semigroup of operators $\{\mathcal{S}(t)\}_{t \geq 0}$ generated by $\mathcal{A}$. To ensure that the problem is well-defined and that a unique (mild) solution exists, we make several assumptions. Firstly, for $\mathcal{B}$ to be well-defined, we need to ensure the existence of a trace operator $\mathcal{T}$ via trace theorems, e.g., by requiring $\Omega$ to be a Lipschitz domain [9] or $\partial\Omega$ to be $C^1$ [13, Sec. 5.5, Thm. 1]. By formulating the following assumption, we also include generic boundary conditions [37].

Assumption II.2. For a reflexive Banach space $X$, the linear (differential) operator $\mathcal{A}' = \mathcal{A}|_{\ker \mathcal{B}}$ generates a strongly continuous one-parameter semigroup of bounded linear operators $\{\mathcal{S}(t)\}_{t \geq 0}$ on $X$. The domain $D(\mathcal{A}') = D(\mathcal{A}) \cap \ker(\mathcal{B})$ is a linear subspace of $X$. The boundary operator $\mathcal{B}$ is right invertible with the inverse $\mathcal{B}_0$, and $\mathcal{A}'\mathcal{B}_0$ is a bounded linear operator from $U$ to $X$.

A mild solution can then be defined by an abstract variation-of-constants type formula with an extended solution space. To make the presentation accessible to a large audience, we omit the general functional analytic framework here (see [37] and the references therein for the details). Furthermore, we drop the prime on the operator $\mathcal{A}'$ to simplify the notation.
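To make the semigroup notion concrete, the following minimal sketch considers the one-dimensional heat equation $\partial_t u = \partial_{xx} u$ on $(0, \pi)$ with homogeneous Dirichlet boundary conditions, for which the semigroup acts diagonally on sine modes. The function names `heat_semigroup` and `evaluate` and the chosen coefficients are illustrative assumptions and not part of the framework above.

```python
import math


def heat_semigroup(t, coeffs):
    """Apply S(t) to u0(x) = sum_k coeffs[k-1] * sin(k x) on (0, pi).

    For the heat equation with homogeneous Dirichlet conditions, each sine
    mode decays independently: S(t) sin(k x) = exp(-k^2 t) sin(k x).
    """
    return [c * math.exp(-(k ** 2) * t) for k, c in enumerate(coeffs, start=1)]


def evaluate(coeffs, x):
    """Evaluate the sine expansion with the given coefficients at the point x."""
    return sum(c * math.sin(k * x) for k, c in enumerate(coeffs, start=1))


# Hypothetical initial value u0(x) = sin(x) + 0.5 sin(2x) + 0.25 sin(3x).
u0 = [1.0, 0.5, 0.25]
t, s = 0.3, 0.2

# Semigroup property: S(t + s) u0 = S(t) S(s) u0, and S(0) is the identity.
direct = heat_semigroup(t + s, u0)
composed = heat_semigroup(t, heat_semigroup(s, u0))

print("S(t+s) u0 :", direct)
print("S(t)S(s)u0:", composed)
print("mild solution at (t, x) = (0.3, 1.0):", evaluate(heat_semigroup(t, u0), 1.0))
```

In this special case the mild solution $\mathbf{u}(t, \cdot) = \mathcal{S}(t)\mathbf{u}_0$ is available in closed form; for general operators $\mathcal{A}$ only bounds on $\|\mathcal{S}(t)\|$ are typically at hand.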

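Remark II.3. The results derived below can be generalized directly to systems modeled by inhomogeneous PDEs of type

$$
\partial_t \mathbf{u} = \mathcal{A}\mathbf{u} + \mathbf{f}
$$

with $\mathbf{f}(t) \in D(\mathcal{A})$ and $\mathbf{f}$ independent of $\mathbf{u}$. To simplify notation, we continue all investigations without $\mathbf{f}$.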

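Similarly to [18] for a finite-dimensional setting, the NN approximation of the solution of the BIVP (1) is interpreted as the solution of a perturbed problem. Here we consider two types of perturbed problems. The first has the same boundary condition as the unperturbed problem, but has perturbation terms in the initial condition and the differential equation. The perturbed problem reads

$$
\begin{cases}
\partial_t \hat{\mathbf{u}} = \mathcal{A}\hat{\mathbf{u}} + \mathbf{R} & \text{in } \mathbb{T} \times \Omega, \\
\hat{\mathbf{u}} = \mathbf{u}_0 + \mathbf{R}_0 & \text{in } \{t = 0\} \times \Omega, \\
\mathcal{B}\hat{\mathbf{u}} = \mathbf{u}_b & \text{in } \mathbb{T} \times \partial\Omega,
\end{cases}
\tag{2}
$$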

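with perturbations $\mathbf{R}_0 \in D(\mathcal{A})$ and $\mathbf{R}: \mathbb{T} \to X$. For an approximate solution $\hat{\mathbf{u}}$ given by a NN, we can determine the terms $\mathbf{R}$, $\mathbf{R}_0$ by computing the residuals

$$
\begin{aligned}
\mathbf{R}_0 &:= \hat{\mathbf{u}}(0, \cdot) - \mathbf{u}_0 && \text{in } \Omega, \\
\mathbf{R} &:= \partial_t \hat{\mathbf{u}} - \mathcal{A}\hat{\mathbf{u}} && \text{in } \mathbb{T} \times \Omega.
\end{aligned}
\tag{3}
$$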
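The residuals in (3) can be evaluated for any differentiable surrogate. The following minimal sketch does so for the one-dimensional heat equation, i.e., $\mathcal{A} = \partial_{xx}$ on $\Omega = (0, \pi)$ with $\mathbf{u}_0 = \sin(x)$. The surrogate `u_hat` is a hypothetical closed-form stand-in for a trained NN, and the derivatives are approximated by central finite differences instead of the automatic differentiation one would normally use for a PINN; all names and step sizes are assumptions.

```python
import math


def u_hat(t, x):
    """Hypothetical surrogate: second-order Taylor expansion in time of exp(-t) sin(x)."""
    return (1.0 - t + 0.5 * t * t) * math.sin(x)


def u0(x):
    """Initial value of the BIVP."""
    return math.sin(x)


def residual_R0(x):
    """Initial residual R0 = u_hat(0, .) - u0 at the point x."""
    return u_hat(0.0, x) - u0(x)


def residual_R(t, x, h=1e-3):
    """PDE residual R = d/dt u_hat - d^2/dx^2 u_hat via central differences."""
    dt = (u_hat(t + h, x) - u_hat(t - h, x)) / (2.0 * h)
    dxx = (u_hat(t, x + h) - 2.0 * u_hat(t, x) + u_hat(t, x - h)) / (h * h)
    return dt - dxx


def l2_norm_R(t, n=200):
    """L2(0, pi) norm of R(t, .) using the composite trapezoidal rule on n cells."""
    dx = math.pi / n
    vals = [residual_R(t, i * dx) ** 2 for i in range(n + 1)]
    integral = dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return math.sqrt(integral)


print("R0 at x = 1.0:", residual_R0(1.0))
print("R at (t, x) = (0.5, 1.0):", residual_R(0.5, 1.0))
print("||R(0.5, .)||_L2:", l2_norm_R(0.5))
```

For this particular surrogate the residual is known analytically, $\mathbf{R}(t, x) = \tfrac{1}{2} t^2 \sin(x)$, which allows a sanity check of the numerical evaluation. Note that quadrature and finite differences introduce their own errors, so norms obtained this way are estimates unless those errors are controlled separately.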

This calculation requires $\hat{\mathbf{u}}$ to be differentiable in time, which constrains the choice of the activation function of the NN. Also, we assume the following for the BIVP (2).

Assumption II.4. The perturbed BIVP (2) is such that $\mathbf{R}$ is Lipschitz continuous in $t \in \mathbb{T}$ and $\mathbf{u}_0 + \mathbf{R}_0 \in D(\mathcal{A})$.

Regarding the approximation by a NN, for most activation functions, the Lipschitz condition is satisfied, and only the second condition requires a more detailed investigation of the corresponding function spaces.
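To indicate why the residuals control the prediction error, the following sketch records the equation satisfied by the error $\mathbf{e} := \hat{\mathbf{u}} - \mathbf{u}$, obtained by subtracting (1) from (2) and using linearity of $\mathcal{A}$ and $\mathcal{B}$. The growth constants $M \geq 1$ and $\omega \in \mathbb{R}$ with $\|\mathcal{S}(t)\| \leq M e^{\omega t}$ are assumed to be known; they are introduced here only for illustration. The estimate is the standard variation-of-constants argument and is meant as orientation, not as a replacement for the precise statements and hypotheses of Section III.

```latex
% Error equation: subtract (1) from (2), with e := \hat{u} - u.
\begin{aligned}
\partial_t \mathbf{e} &= \mathcal{A}\mathbf{e} + \mathbf{R}
    && \text{in } \mathbb{T} \times \Omega, \\
\mathbf{e} &= \mathbf{R}_0
    && \text{in } \{t = 0\} \times \Omega, \\
\mathcal{B}\mathbf{e} &= 0
    && \text{in } \mathbb{T} \times \partial\Omega.
\end{aligned}

% Mild solution of the error equation (variation of constants):
\mathbf{e}(t) = \mathcal{S}(t)\mathbf{R}_0
    + \int_0^t \mathcal{S}(t - s)\,\mathbf{R}(s)\,\mathrm{d}s .

% Resulting bound, assuming \|\mathcal{S}(t)\| \le M e^{\omega t}:
\|\mathbf{e}(t)\|_X \le M e^{\omega t}\,\|\mathbf{R}_0\|_X
    + M \int_0^t e^{\omega (t - s)}\,\|\mathbf{R}(s)\|_X\,\mathrm{d}s .
```

The right-hand side involves only the residuals from (3) and the semigroup constants, so it can be evaluated without access to the true solution $\mathbf{u}$.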

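The second problem additionally involves a perturbation in the boundary condition:

$$
\begin{cases}
\partial_t \hat{\mathbf{u}} = \mathcal{A}\hat{\mathbf{u}} + \mathbf{R} & \text{in } \mathbb{T} \times \Omega, \\
\hat{\mathbf{u}} = \mathbf{u}_0 + \mathbf{R}_0 & \text{in } \{t = 0\} \times \Omega, \\
\mathcal{B}\hat{\mathbf{u}} = \mathbf{u}_b + \mathbf{R}_b & \text{in } \mathbb{T} \times \partial\Omega.
\end{cases}
\tag{4}
$$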

We consider this system as a boundary control system (BCS, cf. [41]) and use the notion of input-to-state-stable (ISS) systems, see [40], to later derive rigorous error bounds.

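Definition II.5. System (4) is said to be ISS w.r.t. the boundary term $\mathbf{u}_b + \mathbf{R}_b$ if there exist functions $\beta$, $\gamma$ and an operator $\mathcal{C}$ such that

$$
\|\mathbf{u}(t, \cdot)\| \leq \beta(\|\mathbf{u}(0, \cdot)\|, t) + \gamma\left(\|\mathcal{C}(\mathbf{R}_b(s) + \mathbf{u}_b(s))\|_{L^\infty(0, t; \Omega)}\right).
\tag{5}
$$

Here, $\gamma: \mathbb{R}^+ \to \mathbb{R}^+$ is in $\mathcal{K}$, which means that it is continuous and strictly increasing with $\gamma(0) = 0$. Similarly, $\beta: \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^+$ is in $\mathcal{KL}$, meaning $\beta(\cdot, t) \in \mathcal{K}$ for all $t \geq 0$ and $\beta(s, \cdot)$ is continuous and strictly decreasing to 0 for all $s > 0$ [40].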

In the following, we consider an ISS BCS, where we set $\mathbf{u}_0 = 0$ and $\mathbf{u}_b = 0$, such that (5) simplifies to the second term. As an example, for the heat equation (see Section V-A) we use $\gamma(s) = \tfrac{1}{3}s$. For the imposed Dirichlet boundary, the operator $\mathcal{C} = \delta_{\partial\Omega}$ is the Dirac distribution on the boundary of the domain $\Omega$.
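As a small illustration of how the simplified ISS estimate is evaluated, the following sketch applies the gain quoted for the heat equation, read as $\gamma(s) = s/3$ (the reading consistent with $\gamma \in \mathcal{K}$), to sampled boundary residuals. The function names and the sample values are hypothetical. Since the supremum is taken over finitely many samples only, the result is an estimate of the right-hand side of (5), not a certified value.

```python
def gamma(s):
    """Gain function gamma(s) = s / 3 quoted for the heat equation example."""
    return s / 3.0


def iss_boundary_bound(boundary_residual_samples):
    """Evaluate gamma(sup |R_b|), the supremum taken over all sampled values.

    boundary_residual_samples holds one tuple per sampled time, containing
    the boundary residual R_b at each sampled boundary point.
    """
    sup_norm = max(abs(r) for samples in boundary_residual_samples for r in samples)
    return gamma(sup_norm)


# Hypothetical residuals at the two end points of a 1D interval, at three times.
samples = [(0.003, -0.001), (0.006, 0.002), (-0.0045, 0.0015)]
bound = iss_boundary_bound(samples)
print("estimated ISS contribution of the boundary residual:", bound)
```

With $\mathbf{u}_0 = 0$ and $\mathbf{u}_b = 0$ the $\beta$ term in (5) vanishes because $\beta(\cdot, t) \in \mathcal{K}$ implies $\beta(0, t) = 0$, which is why only the gain term is computed here.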

Remark II.6. We present reasoning and theorems based on the notion of ISS. However, all results can be transferred easily to integral input-to-state-stable (iISS) systems [21], [40].

Assumption II.7. The BCS (4) is ISS with respect to the boundary values $\mathbf{u}_b + \mathbf{R}_b$.
