MAgNet: Mesh Agnostic Neural PDE Solver
Oussama Boussif
Mila - Québec AI Institute
DIRO, Université de Montréal
oussama.boussif@mila.quebec
Dan Assouline
Mila - Québec AI Institute
DIRO, Université de Montréal
dan.assouline@mila.quebec
Loubna Benabbou
Université du Québec à Rimouski
Loubna_Benabbou@uqar.ca
Yoshua Bengio
Mila - Québec AI Institute
DIRO, Université de Montréal
yoshua.bengio@mila.quebec
Abstract
The computational complexity of classical numerical methods for solving Partial
Differential Equations (PDE) scales significantly as the resolution increases. As
an important example, climate predictions require fine spatio-temporal resolutions
to resolve all turbulent scales in the fluid simulations. This makes the task of
accurately resolving these scales computationally out of reach even with modern
supercomputers. As a result, current numerical modelers solve PDEs on grids
that are too coarse (3km to 200km on each side), which hinders the accuracy and
usefulness of the predictions. In this paper, we leverage the recent advances in
Implicit Neural Representations (INR) to design a novel architecture that predicts
the spatially continuous solution of a PDE given a spatial position query. By
augmenting coordinate-based architectures with Graph Neural Networks (GNN),
we enable zero-shot generalization to new non-uniform meshes and long-term
predictions up to 250 frames ahead that are physically consistent. Our Mesh
Agnostic Neural PDE Solver (MAgNet) is able to make accurate predictions across
a variety of PDE simulation datasets and compares favorably with existing baselines.
Moreover, MAgNet generalizes well to different meshes and to resolutions up to four times those it was trained on.²
1 Introduction
Partial Differential Equations (PDEs) describe the continuous evolution of multiple variables, e.g.
over time and/or space. They arise everywhere in physics, from quantum mechanics to heat transfer
and have several engineering applications in fluid and solid mechanics. However, most PDEs can’t
be solved analytically, so it is necessary to resort to numerical methods. Since the introduction of
computers, many numerical approximations were implemented, and new fields emerged such as
Computational Fluid Mechanics (CFD) (Richardson and Lynch, 2007). The most famous numerical
approximation scheme is the Finite Element Method (FEM) (Courant, 1943; Hrennikoff, 1941). In the
FEM, the PDE is discretized along with its domain, and the problem is transformed into solving a set
of matrix equations. However, the computational complexity scales significantly with the resolution.
For climate predictions, the required mesh resolution can be quite high if the desired error is to be reached, which renders classical solvers impractical.
CIFAR Senior Fellow
²Code and dataset can be found at: https://github.com/jaggbow/magnet
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.05495v1 [cs.LG] 11 Oct 2022
In this paper, we propose to learn the continuous solutions for spatio-temporal PDEs. Previous
methods focused on either generating fixed resolution predictions or generating arbitrary resolution
solutions on a fixed grid (Li et al., 2021; Wang et al., 2020). PDE models based on Multi-Layer
Perceptrons (MLPs) can generate solutions at any point of the domain (Dissanayake and Phan-Thien,
1994; Lagaris et al., 1998; Raissi et al., 2017a). However, without imposing a physics-motivated loss
that constrains the predictions to follow the smoothness bias resulting from the PDE, MLPs become
less competitive than CNN-based approaches especially when the PDE solutions have high-frequency
information (Rahaman et al., 2018).
We leverage the recent advances in Implicit Neural Representations (Tancik et al., 2020; Chen et al., 2020; Jiang et al., 2020) and propose a general-purpose model that can not only learn solutions to a
PDE with a resolution it was trained on, but it can also perform zero-shot super-resolution on irregular
meshes. The added advantage is that we propose a general framework where we can make predictions
given any spatial position query for both grid-based architectures like CNNs and graph-based ones
able to handle sensors and predictions at arbitrary spatial positions.
Contributions
Our main contributions are in the context of machine learning for approximately
but efficiently solving PDEs and can be summarized as follows:
- We propose a framework that enables grid-based and graph-based architectures to generate continuous-space PDE solutions given a spatial query at any position.
- We show experimentally that this approach can generalize to resolutions up to four times those seen during training in zero-shot super-resolution tasks.
2 Related Works
Current solvers can require a lot of computations to generate solutions on a fine spatio-temporal
grid. For example, climate predictions typically use General Circulation Models (GCM) to make
forecasts that span several decades over the whole planet (Phillips, 1956). These GCMs use PDEs to
model the climate in the atmosphere-ocean-land system and to solve these PDEs, classical numerical
solvers are used. However, the quality of predictions is bottlenecked by the grid resolution that is in
turn constrained by the available amount of computing power. Deep learning has recently emerged
as an alternative to these classical solvers in hopes of generating data-driven predictions faster and
making approximations that do not just rely on lower resolution grids but also on the statistical
regularities that underlie the family of PDEs being considered. Using deep learning also makes it
possible to combine the information in actual sensor data with the physical assumptions embedded in
the classical PDEs. All of this would enable practitioners to increase the actual resolution further for
the same computational budget, which in turn improves the quality of the predictions.
Machine Learning for PDE solving:
Dissanayake and Phan-Thien (1994) published one of the
first papers on PDE solving using neural networks. They parameterized the solutions to the Poisson
and heat transfer equations using an MLP and studied the evolution of the error with the mesh size.
Lagaris et al. (1998) used MLPs for solving PDEs and ordinary differential equations. They wrote
the solution as a sum of two components where the first term satisfies boundary conditions and is
not learnable, and the second is parameterized with an MLP and trained to satisfy the equations. In
Raissi et al. (2017a) the authors also parameterized the solution to a PDE using an MLP that takes
coordinates as input. With the help of automatic differentiation, they calculate the PDE residual and
use its MSE loss along with an MSE loss on the boundary conditions. In follow-up work, Raissi et al.
(2017b) also learn the parameters of the PDE (e.g. Reynolds number for Navier-Stokes equations).
The recently introduced Neural Operators framework (Kovachki et al., 2021; Li et al., 2020b,a)
attempts to learn operators between spaces of functions. Li et al. (2021) use "Fourier Layers" to
learn the solution to a PDE by framing the problem as learning an operator from the space of initial
conditions to the space of the PDE solutions. Their model can learn the solution to PDEs that lie on a
uniform grid while maintaining their performance in the zero-shot super-resolution setting. In the
same spirit, Jiang et al. (2020) developed a model based on Implicit Neural Representations called
"MeshFreeFlowNet" where they upsample existing PDE solutions to a higher resolution. They use
3D low-resolution space-time tensors as inputs to a 3DUnet in order to generate a feature map. Next,
some points are sampled uniformly from the corresponding high-resolution tensors and fed to an
MLP called ImNet (Chen and Zhang, 2018). They train their model using a PDE residual loss and
[Figure 1: architecture diagram showing the encoder, nearest-neighbors interpolation, decoder MLP, and forecasting (GNN) modules operating on the parent mesh, parent mesh embedding, and spatial queries.]

Figure 1: We illustrate the "Encode-Interpolate-Forecast" framework of MAgNet. The parent mesh is fed to the encoder to generate the parent mesh embedding. Next, we estimate the values at the spatial queries using the interpolation module, which uses features from both the parent mesh points and the parent mesh embedding points closest to these queries. Finally, the parent mesh observations and the interpolated values at the spatial queries are gathered as nodes forming a new graph using nearest neighbors, and the PDE solution is forecast for all nodes (therefore all spatial locations) into the future using the forecasting module.
are able to predict the flow field at any spatio-temporal coordinate. Their approach is closest to the
one we propose here. The main difference is that we perform super-resolution on the spatial queries
and forecast the solution to a PDE instead of only doing super-resolution on the existing sequence.
Brandstetter et al. (2022) use the message-passing paradigm (Gilmer et al., 2017; Watters et al., 2017; Sanchez-Gonzalez et al., 2020) to solve 1D PDEs. They are able to beat state-of-the-art
Fourier Neural Operators (Li et al., 2021) and classical WENO5 solvers while introducing the
"pushforward trick" that allows them to generate better long-term rollouts. Moreover, they present an
added advantage over existing methods since they can learn PDE solutions on any mesh. However,
they are not able to generalize to different resolutions, which is a crucial capability of our method.
Most machine learning approaches require data from a simulator in order to learn the required PDE
solutions and that can be expensive depending on the PDE and the resolution. Wandel et al. (2020)
alleviate that requirement by using a PDE loss.
Machine Learning for Turbulence Modeling:
Recent years have seen a surge in machine learning-based models for turbulence. Since it is expensive to resolve all relevant scales,
some methods were developed that only solve large scales explicitly and separately model sub-grid
scales (SGS). Recently, Novati et al. (2021) used multi-agent reinforcement learning to learn the
dissipation coefficient of the Smagorinsky SGS model (Smagorinsky, 1963) using as reward the
recovery of the statistical properties of Direct Numerical Simulations (DNS). Rasp et al. (2018) used
MLPs to represent sub-grid processes in clouds and replace previous parametrization models in a
global general circulation model. In the same fashion, Park and Choi (2021) used MLPs to learn
DNS sub-grid scale (SGS) stresses using as input filtered flow variables in a turbulent channel flow.
Brenowitz and Bretherton (2018) use MLPs to predict the apparent sources of heat and moisture using coarse-grained data, and use a multi-step loss to optimize their model. Wang et al. (2020) used
one-layer CNNs to learn the spatial filter in LES methods and the temporal filter in RANS as well
as the turbulent terms. A UNet (Ronneberger et al., 2015) is then used as a decoder to get the flow
velocity. de Bezenac et al. (2017) predict future frames by deforming the input sequence according to
the advection-diffusion equation and apply it to Sea-Surface Temperature forecasting.
Stachenfeld et al. (2021) use the "encode-process-decode" (Sanchez-Gonzalez et al., 2018, 2020)
paradigm along with dilated convolutional networks to capture turbulent dynamics seen in high-
resolution solutions only by training on low spatial and temporal resolutions. Their approach beats
existing neural PDE solvers in addition to the state-of-the-art Athena++ engine (Stone et al., 2020).
We take inspiration from this approach but replace the process module with an interpolation module, allowing the model to capture spatial correlations between known points and new query points.
3 Methodology
We present the developed framework, which leverages recent advances in Implicit Neural Representations (INR) (Jiang et al., 2020; Sitzmann et al., 2020; Chen et al., 2020; Tancik et al., 2020) and draws inspiration from mesh-free methods for PDE solving. We first give a mathematical definition of a PDE. Next, we present the proposed "MAgNet" and derive two variants: a grid-based architecture and a graph-based one.
3.1 Preliminaries
We define a PDE as follows, using $D^k$ to denote $k$-th order derivatives:

Definition 3.1 (Evans (2010)). Let $U$ denote an open subset of $\mathbb{R}^n$ and $k \geq 1$ an integer. An expression of the form
$$L(D^k u(x), D^{k-1} u(x), \dots, u(x), x) = 0, \quad x \in U \tag{1}$$
is called a $k$-th order system of PDEs, where
$$L : \mathbb{R}^{m n^k} \times \mathbb{R}^{m n^{k-1}} \times \cdots \times \mathbb{R}^{mn} \times \mathbb{R}^m \times U \to \mathbb{R}^m$$
is given and $u : U \to \mathbb{R}^m$, $u = (u^1, \dots, u^m)$, is the unknown function to be characterized.
In this paper, we are interested in spatio-temporal PDEs. In this class of PDEs, the domain is $U = [0, +\infty) \times \mathcal{S}$ (time $\times$ space) where $\mathcal{S} \subset \mathbb{R}^n$, $n \geq 1$, and, with $D^k$ indicating differentiation w.r.t. $x$, any such PDE can be formulated as:
$$\begin{cases} \dfrac{\partial u}{\partial t} = L(D^k u(x), \dots, u(x), x, t), & t \geq 0,\ x \in \mathcal{S} \\ u(0, x) = g(x), & x \in \mathcal{S} \\ B u = 0, & t \geq 0,\ x \in \partial\mathcal{S} \end{cases} \tag{2}$$
where $\partial\mathcal{S}$ is the boundary of $\mathcal{S}$, $B$ is a non-linear operator enforcing boundary conditions on $u$, and $g : \mathcal{S} \to \mathbb{R}^m$ represents the initial condition constraints for the solution $u$.
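As a concrete illustrative instance of formulation (2) (our example, not one from the paper), the 1D heat equation with Dirichlet boundary conditions fits this template with $L$ reduced to a diffusion term:

```latex
% 1D heat equation on S = [0, 1], written as an instance of Eq. (2):
\frac{\partial u}{\partial t} = \nu \, \frac{\partial^2 u}{\partial x^2},
    \qquad t \geq 0,\; x \in [0, 1] \\
u(0, x) = g(x) = \sin(\pi x), \qquad x \in [0, 1] \\
B u = u\big|_{x \in \{0, 1\}} = 0
    \qquad \text{(Dirichlet boundary operator } B\text{)}
```

Here $\nu > 0$ is the diffusion coefficient and the initial condition $g$ is chosen arbitrarily for illustration.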
Numerical PDE simulation has seen a great deal of innovation, especially in industrial applications and research where its use is paramount. Mesh-based methods like the FEM numerically compute the PDE solution on a predefined mesh. However, when regions of the PDE domain present large discontinuities, the mesh needs to be modified and given many more points around those regions in order to obtain acceptable approximations. Mesh-based methods typically solve this problem by re-meshing, in what is called Adaptive Mesh Refinement (Berger and Oliger, 1984; Berger and Colella, 1989). However, this process can be quite expensive, which is why mesh-free methods have become an attractive option that sidesteps these limitations.
3.2 MAgNet: Mesh-Agnostic Neural PDE Solver
3.2.1 "Encode-Interpolate-Forecast" framework
Let $\{x_1, x_2, \dots, x_T\} \in \mathbb{R}^{C \times N}$ denote a sequence of $T$ frames representing the ground-truth data coming from a PDE simulator or real-world observations. $C$ denotes the number of physical channels, that is, the number of physical variables involved in the PDE, and $N$ is the number of points in the mesh. These frames are defined on the same mesh; that is, the mesh does not change in time. We call that mesh the parent mesh and denote its normalized coordinates of dimensionality $n$ by $\{p_i\}_{1 \leq i \leq N} \in [-1, 1]^n$. Let $\{c_i\}_{1 \leq i \leq M} \in [-1, 1]^n$ denote a set of $M$ coordinates representing the spatial queries. The task is to predict the solution for subsequent time steps both at: (i) all coordinates of the parent mesh $\{p_i\}_{1 \leq i \leq N}$, and (ii) the coordinates of the spatial queries $\{c_i\}_{1 \leq i \leq M}$. At test time, the model can be queried at any spatially continuous coordinate within the PDE domain to provide an estimate of the PDE solution there.

To perform the prediction, we first estimate the PDE solutions at the spatial queries for the first $T$ frames and then use those estimates to forecast the PDE solutions at subsequent timesteps at the query locations. We do this in three stages (see Figure 1):
1. Encoding: The encoder takes as input the given PDE solution $\{x_t\}_{1 \leq t \leq T}$ at each point of the parent mesh $\{p_i\}_{1 \leq i \leq N}$ and generates a state representation of the original frames, which we refer to as embeddings and denote $\{z_t\}_{1 \leq t \leq T}$. This representation will be used in the interpolation step to find the PDE solution at the spatial queries $\{c_i\}_{1 \leq i \leq M}$. Note that in this encoding step we can either generate one embedding for each frame, giving $T$ embeddings, or summarize all the information in the $T$ frames into a single embedding. We explain the methodology using $T$ embeddings, as the time dimension is easier to grasp in this formulation, but the implementation uses a single summarized embedding, as mentioned in Section 3.2.2. We also note that the embedded mesh remains the same, i.e. we do not change it by upsampling or downsampling.
2. Interpolation: We follow the same approach as Jiang et al. (2020) and Chen et al. (2020) by performing the interpolation in feature space. Note that if we generate one representation that summarizes all $T$ frames, then $z_t = z$ for $t = 1, \dots, T$. Let $\{t_k\}_{1 \leq k \leq T}$ denote the timesteps at which the $x_t$ are generated. For each spatial query $c_i$, let $\mathcal{N}(c_i)$ denote the nearest points $p_j$ in the parent mesh. We generate an interpolation of the features $z_k[c_i]$ at coordinates $c_i$ and timestep $t_k$ as follows:
$$\forall k \in \{1, \dots, T\},\ \forall i \in \{1, \dots, M\}: \quad z_k[c_i] = \frac{\sum_{p_j \in \mathcal{N}(c_i)} w_j \, g_\theta(x_k[p_j], z_k[p_j], c_i - p_j, t_k)}{\sum_{p_j \in \mathcal{N}(c_i)} w_j} \tag{3}$$
where $z_k[p_j]$ and $x_k[p_j]$ denote the embedding and the input frame at position $p_j$ and time $t_k$, respectively. The interpolation weights $w_j$ are positive and sum to one, chosen such that points closer to the spatial query contribute more to the interpolated feature than points farther away, and $g_\theta$ is an MLP. To get the PDE solution $x_k[c_i]$ at coordinate $c_i$, we use a decoder $d_\theta$, which is also an MLP here: $x_k[c_i] = d_\theta(z_k[c_i])$. In practice, we choose $2^n$ neighbors, where $n$ is the dimensionality of the coordinates.
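The feature interpolation of Eq. (3) can be sketched as follows. This is a minimal NumPy sketch under our own naming, not the authors' implementation: `mlp_g` stands in for the learned network $g_\theta$, and we use inverse-distance weights, one common choice satisfying the positivity and normalization constraints stated above.

```python
import numpy as np

def interpolate_query(query, mesh_coords, x_frame, z_frame, t_k, mlp_g, n_neighbors):
    """Estimate the feature z_k[c_i] at one spatial query (sketch of Eq. 3).

    query:       (n,)   coordinates c_i of the spatial query
    mesh_coords: (N, n) parent-mesh coordinates p_j
    x_frame:     (N, C) PDE solution x_k on the parent mesh
    z_frame:     (N, D) encoder embedding z_k on the parent mesh
    mlp_g:       callable standing in for the learned MLP g_theta
    """
    # Nearest neighbors N(c_i) in the parent mesh (2^n of them in the paper).
    dists = np.linalg.norm(mesh_coords - query, axis=1)
    nbrs = np.argsort(dists)[:n_neighbors]

    # Inverse-distance weights: positive; normalized by the denominator below.
    # (The specific weighting scheme is our assumption.)
    w = 1.0 / (dists[nbrs] + 1e-8)

    # g_theta consumes the solution, the embedding, the offset c_i - p_j, and t_k.
    feats = np.stack([
        mlp_g(x_frame[j], z_frame[j], query - mesh_coords[j], t_k) for j in nbrs
    ])
    return (w[:, None] * feats).sum(axis=0) / w.sum()
```

With a dummy `mlp_g` that simply returns the embedding, the query feature reduces to an inverse-distance average of the neighboring embeddings, which makes the normalization in Eq. (3) easy to check.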
3. Forecasting: Now that we have generated the PDE solution at the spatial queries $c_i$ for all the past frames, we forecast the PDE solution at future time points at both the spatial queries and the parent mesh coordinates. Let $G$ denote the Nearest-Neighbors Graph (NNG) whose nodes are all $N$ locations in the parent mesh (at original coordinates $\{p_i\}_{1 \leq i \leq N}$) as well as all $M$ query points (at locations $\{c_i\}_{1 \leq i \leq M}$), with edges that include only the nearest neighbors of each node among the $N + M - 1$ others. This corresponds to a new mesh represented by the graph $G$. Let $\{c'_i\}_{1 \leq i \leq M+N}$ denote the corresponding new coordinates. We generate the PDE solution for subsequent time steps on this graph auto-regressively using a decoder $\Delta_\theta$ as follows:
$$x_{k+1}[c'_i] = x_k[c'_i] + (t_{k+1} - t_k) \, \Delta_\theta(x_k[c'_i], \dots, x_1[c'_i]), \quad k = T, T+1, \dots \tag{4}$$
We train MAgNet using two losses:

- Interpolation loss: ensures that the interpolated points match the ground truth:
$$\mathcal{L}_{\text{interpolation}} = \frac{1}{T \times M} \sum_{i=1}^{M} \sum_{k=1}^{T} \|\hat{x}_k[c_i] - x_k[c_i]\|_1 \tag{5}$$
where $\hat{x}_k[c_i]$ denotes the interpolated values generated by the model at the spatial queries.

- Forecasting loss: ensures that the model's predictions into the future are accurate. If $H$ is the prediction horizon, the loss is:
$$\mathcal{L}_{\text{forecasting}} = \frac{1}{H \times (M+N)} \sum_{i=1}^{M+N} \sum_{k=1}^{H} \|\hat{x}_{k+T}[c'_i] - x_{k+T}[c'_i]\|_1 \tag{6}$$
where $\hat{x}_{k+T}[c'_i]$ denotes the forecast values generated by the model on the graph $G$, which combines the spatial queries and the parent mesh.

The final loss is then: $\mathcal{L} = \mathcal{L}_{\text{forecasting}} + \mathcal{L}_{\text{interpolation}}$.
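The two L1 losses of Eqs. (5) and (6) can be sketched as follows, assuming predictions and targets are stacked into dense arrays (variable names and the per-channel reduction are our assumptions):

```python
import numpy as np

def magnet_loss(interp_pred, interp_true, fc_pred, fc_true):
    """Combined loss L = L_forecasting + L_interpolation (sketch of Eqs. 5-6).

    interp_pred / interp_true: (T, M, C)     interpolated vs. true query values
    fc_pred / fc_true:         (H, M + N, C) forecast vs. true values on graph G
    """
    # ||.||_1 over the C channels, then average over the T x M (resp.
    # H x (M + N)) time-point pairs, matching the normalizations in the paper.
    l_interp = np.abs(interp_pred - interp_true).sum(axis=-1).mean()
    l_forecast = np.abs(fc_pred - fc_true).sum(axis=-1).mean()
    return l_forecast + l_interp
```

Summing the two terms with equal weight mirrors the final loss above; in practice one could also weight them, but the paper states a plain sum.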
3.2.2 Implementation Details
In the previous section, we described the general MAgNet framework. In this section, we present how
we build the inputs to MAgNet as well as the architectural choices for the encoding, interpolation and
forecasting modules and suggest two main architectures: MAgNet[CNN] and MAgNet[GNN].