A Deep Fourier Residual Method for solving PDEs using
Neural Networks
Jamie M. Taylor1, David Pardo2,3,4, and Ignacio Muga2,5
1Department of Quantitative Methods, CUNEF University, Madrid, Spain
2Basque Center for Applied Mathematics (BCAM), Bilbao, Bizkaia, Spain
3University of the Basque Country (UPV/EHU), Leioa, Spain
4Ikerbasque (Basque Foundation for Sciences), Bilbao, Spain
5Instituto de Matemáticas, Pontificia Universidad Católica de Valparaíso, Chile.
Abstract
When using Neural Networks as trial functions to numerically solve PDEs, a key choice to be
made is the loss function to be minimised, which should ideally correspond to a norm of the error.
In multiple problems, this error norm coincides with, or is equivalent to, the $H^{-1}$-norm of the
residual; however, it is often difficult to compute it accurately. This work assumes rectangular
domains and proposes the use of a Discrete Sine/Cosine Transform to accurately and efficiently
compute the $H^{-1}$ norm. The resulting Deep Fourier-based Residual (DFR) method efficiently
and accurately approximates solutions to PDEs. This is particularly useful when solutions lack
$H^2$ regularity and methods involving strong formulations of the PDE fail. We observe that the
$H^1$-error is highly correlated with the discretised loss during training, which permits accurate
error estimation via the loss.
1 Introduction
The use of Deep Learning techniques employing Neural Networks (NNs) has been successful in
solving a wide range of data-based problems across fields such as image processing, healthcare,
and autonomous cars [1, 2, 20, 23, 34, 43, 52, 54]. Recently, there has been a surge of interest
in the use of neural networks as function spaces that can be employed to obtain numerical
solutions of Partial Differential Equations (PDEs) [5, 9, 37, 40, 47, 48]. Owing to the universal
approximation theorem, and its variants in Sobolev spaces [16, 24, 25, 30], it is known that a
sufficiently wide or deep NN is able to approximate any given continuous function on a compact
domain with arbitrary accuracy, and thus NNs make suitable function spaces for solving PDEs.
The use of automatic differentiation (autodiff) [4] facilitates efficient numerical evaluation of
derivatives, which allows algorithmic differentiation of the neural network itself, as well as the
use of gradient-based optimisation techniques such as Stochastic Gradient Descent (SGD) [8]
and Adam [31] in order to minimise appropriate loss functions over a space of neural networks.
A quantitative version of the Universal Approximation Theorem [3] demonstrates that NNs
can approximate functions without suffering the curse of dimensionality, requiring far fewer degrees of
freedom to approximate functions with high-dimensional inputs than classical piecewise-linear
function spaces, making them an attractive function space for solving PDEs, in particular
in high-dimensional problems. Beyond solving single instances of PDEs, NNs have shown a
capacity to learn operators that solve families of parametrised PDEs, allowing rapid “online”
evaluation of solutions after an “offline” training of the network [13, 22, 32, 35, 36].
The flexibility of NNs to solve many classes of PDEs is owed to a general and simple framework,
whereby one chooses an appropriate architecture for the NN, a loss function whose minimiser
should be an exact solution of the PDE, and an optimisation procedure to attempt to
minimise the loss function. In this article, we focus on the choice of loss function when solving
PDEs with NN function spaces. Previous works have considered losses based on strong
[26, 44, 53] and weak [27, 28, 29] formulations of the PDE. However, the choice of a perfect loss
function is generally not obvious, as in practice solutions will only reach local minima, and the
loss and error may have distinct or unknown convergence rates as one approaches either a local
minimiser or a practically unattainable global minimiser.
Generally, a PDE operator can be described by a (possibly nonlinear) map $R: X \to Y$,
where $X, Y$ are Banach spaces. The PDE then takes the form
\[
R(u) = 0. \qquad (1)
\]
For example, Poisson's equation, $-\Delta u(x) = f(x)$, on a domain $\Omega$ with homogeneous Dirichlet
boundary conditions and $f \in L^2(\Omega)$ may be interpreted in strong form via the map
$R_s : H^2(\Omega) \cap H^1_0(\Omega) \to L^2(\Omega)$ given by
\[
R_s(u) = \Delta u + f, \qquad (2)
\]
or in weak form via the map $R_w : H^1_0(\Omega) \to H^{-1}(\Omega) := [H^1_0(\Omega)]'$ given by
\[
\langle R_w(u), v\rangle_{H^{-1}\times H^1_0} = \int_\Omega \nabla u(x)\cdot \nabla v(x) - f(x)v(x)\,\mathrm{d}x. \qquad (3)
\]
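The two formulations are consistent: if $u \in H^2(\Omega)\cap H^1_0(\Omega)$, integrating (3) by parts (the boundary term vanishes since $v \in H^1_0(\Omega)$) gives
\[
\langle R_w(u), v\rangle_{H^{-1}\times H^1_0} = \int_\Omega \big(-\Delta u(x) - f(x)\big)\,v(x)\,\mathrm{d}x = -\int_\Omega R_s(u)(x)\,v(x)\,\mathrm{d}x,
\]
so that $R_w(u) = 0$ in $H^{-1}(\Omega)$ precisely when $R_s(u) = 0$ in $L^2(\Omega)$.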
When employing NNs to numerically solve PDEs, a loss is often selected as the norm of the
PDE residual in $Y$, that is,
\[
\mathcal{L}(u) := \|R(u)\|_Y. \qquad (4)
\]
One is generally confronted with two issues within this framework. The first is that norms on
function spaces are generally given by integrals and thus a quadrature rule must be employed
in order to numerically approximate $\mathcal{L}(u)$. In contrast to polynomial-based function spaces, an
exact quadrature rule is generally unobtainable. Moreover, a poor choice of quadrature rule can
lead to a form of “overfitting” and poor approximation of solutions [46]. The second issue is
that in infinite-dimensional spaces not all norms are equivalent, and thus the choice of norm on
$Y$ can directly affect the convergence of the error during training. Ideally, we should employ
norms on $X$ and $Y$ which are compatible in the sense that the $X$-norm of the error is equivalent
to the $Y$-norm of the residual, leading to a residual minimisation method [9, 10, 15].
Related to this second issue, progress has been made in the direction of a priori and a
posteriori error estimates that allow estimation of the error via the loss [6, 7, 18, 38, 50, 51].
These works rely on coercivity-type estimates of the error in terms of the exact norms, as well as
control of quadrature and training errors.
Many PDEs can be expressed in weak form via (1) with $X = U$ a space of trial functions,
and $Y = V'$, where $V$ is the space of test functions. That is,
\[
\langle R(u), v\rangle_{V'\times V} = 0 \quad \forall v \in V. \qquad (5)
\]
We commonly consider cases where $R$ represents a linear and inhomogeneous PDE, and thus
may be expressed in the form
\[
\langle R(u), v\rangle_{V'\times V} = b(u, v) - f(v), \qquad (6)
\]
where $f \in V'$ and $b: U \times V \to \mathbb{R}$ is a bilinear form. It is clear that the PDE, in weak form, is
equivalent to the statement that $\|R(u)\| = 0$ for any norm on $V'$. The most natural norm is
the dual norm, induced by the norm on $V$, defined via
\[
\|f\|_{V'} = \sup_{v \in V\setminus\{0\}} \frac{|f(v)|}{\|v\|_V}. \qquad (7)
\]
The advantage of employing the dual norm on $V'$ is that, under certain assumptions that
we will outline in more detail in Section 3.1, one can relate the dual norm of the residual to the
norm of the error. Specifically,
\[
\frac{1}{M}\|R(u)\|_{V'} \le \|u - u^*\|_U \le \frac{1}{\gamma}\|R(u)\|_{V'},
\]
where $u$ is a candidate solution, $u^*$ is the exact solution, and $M, \gamma$ are positive, problem-dependent
constants. This allows $\|R(u)\|_{V'}$ to be used as an error estimator, without needing to know the exact solution.
In addition, if we can find a way to numerically approximate the dual norm, we can employ this
as a loss function to be minimised over a trial function space.
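As a concrete illustration, consider Poisson's equation above with $U = V = H^1_0(\Omega)$ equipped with the norm $\|v\|_V = \|\nabla v\|_{L^2(\Omega)}$. Then $\langle R_w(u), v\rangle = \int_\Omega \nabla(u - u^*)\cdot\nabla v\,\mathrm{d}x$, so
\[
\|R_w(u)\|_{V'} = \sup_{\|\nabla v\|_{L^2}=1}\int_\Omega \nabla(u - u^*)\cdot\nabla v\,\mathrm{d}x = \|\nabla(u - u^*)\|_{L^2(\Omega)},
\]
i.e. $M = \gamma = 1$ and the dual norm of the residual coincides exactly with the energy-norm error.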
We propose a Deep Fourier Residual (DFR) method to approximate the error of candidate
solutions of PDEs in $H^1$ via an approximation of the dual norm of the residual of the PDE
operator. The dual norm is then employed as a loss function to be minimised. The advantage
of such a method is that the resulting norm is equivalent to the $H^1$-error of the solutions for
certain well-posed problems.
We consider several numerical examples, comparing the DFR approach to other losses employed
to solve differential equations using NNs. Our numerical examples exhibit strong correlation
between the proposed loss and the $H^1$-error during the training process. For sufficiently
regular problems, our DFR method is qualitatively equivalent to existing methods in the literature
(Section 4.1.2) [27, 44]. However, in less regular problems, our method leads to significantly
more accurate solutions, both for an equation that admits a smooth solution with large gradients
(Section 4.1.3), and for an elliptic equation with discontinuous parameters (Section 4.1.4).
Indeed, methods based on the strong formulation of the PDE, such as PINNs [44], cannot be
implemented for such applications. The DFR method is shown to be advantageous both when
solutions admit $H^1\setminus H^2$ regularity, and in regular problems where the forcing term has a large
discrepancy between its $L^2$ and $H^{-1}$ norms. We then consider further numerical experiments
which demonstrate the DFR method's capability in a linear equation with a point source (Section 4.2.1),
a nonlinear ODE (Section 4.2.2), and a 2D linear problem (Section 4.2.3).
The DFR method is currently limited to rectangular domains where each face has either
a Dirichlet or a Neumann boundary condition. We rely on a Fourier-type representation of
the $H^{-1}$ norm that can be computed efficiently using the one-dimensional Discrete Cosine
Transform and Discrete Sine Transform (DCT/DST), which are based on the Fast Fourier
Transform (FFT), in each coordinate direction. Generally, an extension of our techniques to
PDEs on arbitrary domains $\Omega$ would require access to an orthonormal basis of $H^1(\Omega)$, which
may prove more costly to obtain than solving the PDE itself. Furthermore, the DST/DCT
takes advantage of the FFT, which allows an inexpensive evaluation of the loss and would not
be available in general domains. Possibilities for the extension of the DFR method to arbitrary
domains include methods analogous to embedded domain methods [19, 21, 33, 39, 41, 45, 49],
which embed domains with complex geometry into a simpler fictitious computational domain. It
is also possible to adapt ideas from Goal-Oriented adaptivity (e.g., [42]) to the proposed DFR
method, although this is postponed to future work.
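As a minimal illustration of the FFT-based transforms the method relies on (not the DFR loss itself, which is defined in Section 3.3), the following sketch uses SciPy's type-I DST to approximate the Fourier-sine coefficients of a function sampled on a uniform grid in $(0,1)$; the grid size and sample function are placeholder choices.

```python
import numpy as np
from scipy.fft import dst

# Approximate the Fourier-sine coefficients
#   c_k = \int_0^1 f(x) sin(k*pi*x) dx,  k = 1, ..., N,
# of a sampled function using the FFT-based type-I DST.
N = 511                                  # number of interior sample points
x = np.arange(1, N + 1) / (N + 1)        # uniform interior grid on (0, 1)
f = x * (1.0 - x)                        # placeholder sample values

# SciPy's type-I DST returns 2 * sum_n f(x_n) sin(k*pi*n/(N+1)), so dividing
# by 2(N+1) gives a trapezoid-rule approximation of c_k.
c = dst(f, type=1) / (2.0 * (N + 1))

# Check the first coefficient against the exact value 4/pi^3.
print(c[0], 4.0 / np.pi**3)
```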
The structure of the paper is as follows. In Section 2 we cover some preliminary concepts.
The theoretical groundwork for the definition of the DFR method is presented in Section 3, with
our proposed loss defined in Section 3.3. Section 4.1 contains numerical examples comparing our
proposed loss function with the VPINN and collocation losses, which are roughly equivalent in
regular problems, but we demonstrate that the DFR method greatly outperforms VPINNs
and PINNs when solutions are less regular. In Section 4.2 we consider further numerical experiments
that demonstrate the DFR method in equations with a point source, nonlinearities, and in 2D.
Finally, concluding remarks are made in Section 5.
2 Preliminaries
2.1 Neural Networks
Neural networks are functions expressed as compositions of more elementary functions. In the
simplest case of a fully connected feed-forward NN, an $M$-layer neural network is described by
$M$ layer functions, $L_i : \mathbb{R}^{N_i} \to \mathbb{R}^{N_{i+1}}$, of the form
\[
L_i(x) = \sigma_i(A_i x + b_i), \qquad (8)
\]
where $A_i$ is an $N_{i+1}\times N_i$ matrix, $b_i \in \mathbb{R}^{N_{i+1}}$, and $\sigma_i$ is an activation function that may depend
on the layer index $i$ and acts component-wise on vectors. A fully connected feed-forward neural
network is a function $\tilde{u}: \mathbb{R}^{N_1} \to \mathbb{R}^{N_{M+1}}$ defined by
\[
\tilde{u}(x) = L_M \circ L_{M-1} \circ \dots \circ L_1(x). \qquad (9)
\]
The final activation function $\sigma_M$ is taken to be the identity, $\sigma_M(x) = x$. The parameters $A_i, b_i$,
known as the weights and biases of the network, parametrise the neural network. Optimisation
over a neural network space with fixed architecture corresponds to identifying the optimal values
of these trainable parameters.
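As an illustration, a minimal NumPy sketch of the forward pass (8)-(9) might look as follows; the tanh activation, the random initialisation, and the layer widths are placeholder choices rather than those used later in the paper.

```python
import numpy as np

# Minimal sketch (not the authors' implementation) of the forward pass (8)-(9):
# a fully connected feed-forward network with tanh activations on the hidden
# layers and the identity on the final layer.
def init_layers(widths, seed=0):
    """widths = [N_1, ..., N_{M+1}]; returns the weights A_i and biases b_i."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((n_out, n_in)) / np.sqrt(n_in),
             np.zeros(n_out))
            for n_in, n_out in zip(widths[:-1], widths[1:])]

def u_tilde(params, x):
    """Evaluate the network (9) at a single input vector x."""
    for i, (A, b) in enumerate(params):
        x = A @ x + b
        if i < len(params) - 1:          # sigma_M is the identity
            x = np.tanh(x)
    return x

params = init_layers([1, 20, 20, 1])     # one input, two hidden layers, one output
print(u_tilde(params, np.array([0.5])))
```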
In the context of NNs for PDEs, we often need to impose Dirichlet boundary conditions on our
candidate solutions. In this work, we will do this by introducing a cutoff function. That is, if we
wish to consider functions $u: \Omega \to \mathbb{R}$ such that $u|_{\Gamma_D} = u_0$ on a subset of the
boundary $\Gamma_D \subset \partial\Omega$, we take $\tilde{u}$ to be of the form (9) and define
\[
u(x) = \varphi_1(x)\,\tilde{u}(x) + \varphi_2(x), \qquad (10)
\]
where $\varphi_1$ is a function satisfying $\varphi_1|_{\Gamma_D} = 0$ and $\varphi_1 > 0$ on $\bar{\Omega}\setminus\Gamma_D$, and $\varphi_2|_{\Gamma_D} = u_0$.
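For example, on $\Omega = (0,1)$ with $\Gamma_D = \{0, 1\}$ and $u_0 = 0$, one admissible (illustrative) choice is $\varphi_1(x) = x(1-x)$ and $\varphi_2 \equiv 0$, as in the following sketch.

```python
import numpy as np

# Minimal sketch of (10) on Omega = (0, 1) with Gamma_D = {0, 1} and u_0 = 0:
# phi_1(x) = x(1 - x) vanishes on Gamma_D and is positive inside, and
# phi_2 = 0 lifts the (here homogeneous) Dirichlet data. Both choices are
# illustrative; any functions satisfying the stated conditions would do.
def apply_dirichlet_bc(u_tilde_vals, x):
    phi1 = x * (1.0 - x)
    phi2 = 0.0
    return phi1 * u_tilde_vals + phi2

x = np.linspace(0.0, 1.0, 5)
u_tilde_vals = np.ones_like(x)               # stand-in for the network output
print(apply_dirichlet_bc(u_tilde_vals, x))   # vanishes at x = 0 and x = 1
```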
We include a schematic of this architecture in Figure 1.

[Figure 1: NN architecture. The input $x$ is passed through the trainable hidden layers to produce the network output $\tilde{u}(x)$, after which the boundary conditions are applied to obtain $u(x)$.]
2.2 PINN and VPINN losses
Whilst there are many discrete losses employed when solving PDEs via NNs, in this section we
outline two particular cases, the PINN (collocation) and the VPINN losses, which are based
on strong and weak formulations of the PDE, respectively. These methods will be used for
comparison in the numerical experiments of Section 4.1.
2.2.1 Collocation
We assume that the strong form of the residual can be represented in the form
\[
\mathcal{L}u(x) = 0 \quad (x \in \Omega), \qquad \mathcal{G}u(x) = 0 \quad (x \in \partial\Omega). \qquad (11)
\]
The collocation method considers discretisations of the $L^2$ norms of $\mathcal{L}u$ and $\mathcal{G}u$ as the loss
function to be minimised, according to an appropriate quadrature rule. Explicitly, we consider
the loss
\[
\mathcal{L}_{\mathrm{col}}(u) := \frac{1}{K_1}\sum_{i=1}^{K_1} \omega_i\,|\mathcal{L}u(x_i)|^2 + \frac{1}{K_2}\sum_{i=1}^{K_2} \omega_i^b\,\big|\mathcal{G}u(x_i^b)\big|^2, \qquad (12)
\]
where $(x_i)_{i=1}^{K_1}$ and $(\omega_i)_{i=1}^{K_1}$ are quadrature points in $\Omega$ and quadrature weights, respectively,
which may be taken via a Monte Carlo or a deterministic quadrature scheme. Similarly, $(x_i^b)_{i=1}^{K_2}$
and $(\omega_i^b)_{i=1}^{K_2}$ are quadrature points and weights on the boundary.
In PDEs with low regularity, the strong form of the PDE does not hold, and minimisers of
Eq. (12) will not accurately represent the PDE. Despite this limitation, the collocation (PINN)
method is one of the most attractive methods for regular problems, as it is simple to implement
using autodiff algorithms. Furthermore, by using Monte Carlo integration techniques, integrals
can be estimated in high dimensions without suffering from the curse of dimensionality.
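A minimal one-dimensional sketch of the loss (12) is given below for the model problem $-u''(x) = f(x)$ on $(0,1)$ with $u(0) = u(1) = 0$, uniform Monte Carlo weights $\omega_i = 1$, and a hand-written candidate whose derivatives are known in closed form; in an actual PINN implementation the derivatives of the network would be obtained via autodiff.

```python
import numpy as np

# Hypothetical 1D example of the collocation loss (12) for
#   -u''(x) = f(x) on (0, 1),  u(0) = u(1) = 0,
# evaluated at the candidate u(x) = x(1-x)/2, whose second derivative is -1.
f = lambda x: np.ones_like(x)            # forcing term, so this candidate is exact
u_xx = lambda x: -np.ones_like(x)        # second derivative of the candidate

rng = np.random.default_rng(0)
x_int = rng.uniform(0.0, 1.0, size=1000) # Monte Carlo points in Omega, weights 1
Lu = -u_xx(x_int) - f(x_int)             # strong-form interior residual Lu
Gu = np.array([0.0, 0.0])                # boundary residual: u(0), u(1)

loss_col = np.mean(np.abs(Lu) ** 2) + np.mean(np.abs(Gu) ** 2)
print(loss_col)                          # ~0, since the candidate solves the PDE
```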
2.2.2 VPINNs
VPINNs employ a loss that uses the weak formulation of the PDE. They correspond to a
Petrov-Galerkin method where the trial space is given by NNs. Given a set of test functions
$(v_k)_{k=1}^K$, a candidate solution $u$, and the residual $R(u) \in V'$ given in weak form, the loss is
defined as
\[
\mathcal{L}_{\mathrm{VP}}(u) = \sum_{k=1}^{K} \big|\langle R(u), v_k\rangle_{V'\times V}\big|^2. \qquad (13)
\]
In [27], this method was shown to be advantageous over the classical PINN method, both
in terms of accuracy and speed. A particular application within their work, relevant to this
manuscript, was to consider ODEs on $[0,1]$ with a NN architecture consisting of a single
hidden layer with sine activation function, and test functions $v_k(x) = \sin(k\pi x)$. For this implementation,
the authors were able to perform an exact quadrature to evaluate $\langle R(u), v_k\rangle_{V'\times V}$,
which was employed in their loss function. In other implementations within their article, Legendre
polynomials are considered as test functions. Whilst not directly commented on within their
work, in their implementation with sine test functions, the norm may be interpreted as a discretisation
of the $L^2$-norm of the strong form of the residual. As they consider the test functions
$(v_k)_{k=1}^K$ to form a subset of an orthonormal basis of $L^2$, if there exists a strong-form residual
$\mathcal{L}u \in L^2$ such that
\[
\langle R(u), v\rangle_{V'\times V} = \langle \mathcal{L}u, v\rangle_{L^2}
\]
for all $v \in H^1_0$, we observe that
\[
\sum_{k=1}^{K} \big|\langle R(u), v_k\rangle_{V'\times V}\big|^2 = \sum_{k=1}^{K} \langle \mathcal{L}u, v_k\rangle_{L^2}^2 \approx \|\mathcal{L}u\|_{L^2}^2. \qquad (14)
\]
In particular, for sufficiently regular problems, this implies that $\mathcal{L}_{\mathrm{VP}}$ and $\mathcal{L}_{\mathrm{col}}$ each correspond
to distinct discretisations of the same loss, i.e., the $L^2$-norm of the strong-form residual. The
significant difference, however, is that the discretisation (13) is always well defined, even if the
residual cannot be represented by an $L^2$ function, and we will observe the consequences of this
distinction in Section 4.1.4, employing sine-based test functions, as in [27].
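A minimal sketch of the loss (13) for the same model problem $-u'' = f$ on $(0,1)$, with sine test functions as in [27] but a simple midpoint quadrature in place of their exact quadrature, is given below; the candidate and its derivative are hand-written placeholders rather than a network.

```python
import numpy as np

# Minimal sketch of the VPINN loss (13) for -u'' = f on (0,1) with homogeneous
# Dirichlet data and test functions v_k(x) = sin(k*pi*x). Here
#   <R(u), v_k> = \int_0^1 u'(x) v_k'(x) - f(x) v_k(x) dx
# is approximated by a midpoint rule; the candidate's derivative is given in
# closed form (autodiff would supply it for a network candidate).
K, Q = 10, 2000                          # number of test functions / quadrature points
x = (np.arange(Q) + 0.5) / Q             # midpoint quadrature nodes, weight 1/Q
f = lambda x: np.ones_like(x)
u_x = lambda x: 0.5 - x                  # derivative of the candidate u = x(1-x)/2

k = np.arange(1, K + 1)[:, None]         # test-function indices, shape (K, 1)
v_x = k * np.pi * np.cos(k * np.pi * x)  # v_k'(x), shape (K, Q)
v = np.sin(k * np.pi * x)                # v_k(x),  shape (K, Q)

residuals = ((u_x(x) * v_x - f(x) * v) / Q).sum(axis=1)   # <R(u), v_k>
loss_vp = np.sum(np.abs(residuals) ** 2)
print(loss_vp)                           # ~0: the candidate solves the PDE
```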