Neural Conservation Laws:
A Divergence-Free Perspective
Jack Richter-Powell
Vector Institute
jack.richter-powell@mcgill.ca
Yaron Lipman
Meta AI
ylipman@meta.com
Ricky T. Q. Chen
Meta AI
rtqichen@meta.com
Abstract
We investigate the parameterization of deep neural networks that by design satisfy
the continuity equation, a fundamental conservation law. This is enabled by
the observation that any solution of the continuity equation can be represented
as a divergence-free vector field. We hence propose building divergence-free
neural networks through the concept of differential forms, and with the aid of
automatic differentiation, realize two practical constructions. As a result, we can
parameterize pairs of densities and vector fields that always exactly satisfy the
continuity equation, foregoing the need for extra penalty methods or expensive
numerical simulation. Furthermore, we prove these models are universal and so can
be used to represent any divergence-free vector field. Finally, we experimentally
validate our approaches by computing neural network-based solutions to fluid
equations, solving for the Hodge decomposition, and learning dynamical optimal
transport maps.
1 Introduction
Modern successes in deep learning are often attributed to the expressiveness of black-box neural
networks. These models are known to be universal function approximators [Hornik et al.,1989]—but
this flexibility comes at a cost. In contrast to other parametric function approximators such as Finite
Elements [Schroeder and Lube,2017], it is hard to bake exact constraints into neural networks.
Existing approaches often resort to penalty methods to approximately satisfy constraints—but these
increase the cost of training and can produce inaccuracies in downstream applications when the
constraints are not exactly satisfied. For the same reason, theoretical analysis of soft-constrained
models also becomes more difficult. On the other hand, enforcing hard constraints on the architecture
can be challenging, and even once enforced, it is often unclear whether the model remains sufficiently
expressive within the constrained function class.
In this work, we discuss an approach to directly bake two constraints into deep neural networks: (i) having a divergence of zero, and (ii) exactly satisfying the continuity equation. One of our key insights is that the former directly leads to the latter, so the first portion of the paper focuses on divergence-free vector fields. These represent a special class of vector fields which have widespread use in the physical sciences. In computational fluid dynamics, divergence-free vector fields are used to model incompressible fluid interactions formalized by the Euler or Navier-Stokes equations. In R^3 we know that the curl of a vector field has a divergence of zero, which has seen many uses in graphics simulations (e.g., Eisenberger et al. [2018]). Perhaps less well-known is that lurking behind this fact is the generalization of divergence and curl through differential forms [Cartan, 1899], and the powerful identity d² = 0. We first explore this generalization, then discuss two constructions derived from it for parameterizing divergence-free vector fields. While this approach has been partially discussed previously [Barbarosie, 2011, Kelliher, 2021], it is not extensively known and to the best of our knowledge has not been explored by the machine learning community.
Figure 1: Divergence-free vector fields v: R^d → R^d can be constructed from an antisymmetric matrix field A: R^d → R^{d×d} (eq. 4) or an arbitrary vector field b: R^d → R^d (eq. 7, A = J_b − J_b^T). J_b represents the Jacobian matrix of b, and A_1 and A_2 are the first and second rows of A. Color denotes divergence.
Concretely, we derive two approaches—visualized in Figure 1—to transform sufficiently smooth neural networks into divergence-free neural networks, made efficient by automatic differentiation and recent advances in vectorized and composable transformations [Bradbury et al., 2018, Horace He, 2021]. Furthermore, both approaches can theoretically represent any divergence-free vector field.
We combine these new modeling tools with the observation that solutions of the continuity equation—a partial differential equation describing the evolution of a density under a flow—can be characterized jointly as a divergence-free vector field. As a result, we can parameterize neural networks that, by design, always satisfy the continuity equation, which we coin Neural Conservation Laws (NCL). While prior works either resorted to penalizing errors [Raissi et al., 2019] or numerically simulating the density given a flow [Chen et al., 2018], baking this constraint directly into the model allows us to forego extra penalty terms and expensive numerical simulations.
2 Constructing divergence-free vector fields
We will use the notation of differential forms in R^n for deriving the divergence-free and universality properties of our vector field constructions. We provide a concise overview of differential forms in Appendix A. For a more extensive introduction see e.g., Do Carmo [1998], Morita [2001]. Without this formalism, it is difficult to show either of these two properties; however, readers who wish to skip the derivations can jump to the matrix formulations in equation 4 and equation 7.
Let us denote A^k(R^n) as the space of differential k-forms, d: A^k(R^n) → A^{k+1}(R^n) as the exterior derivative, and ⋆: A^k(R^n) → A^{n−k}(R^n) as the Hodge star operator that maps each k-form to an (n−k)-form. Identifying a vector field v: R^n → R^n as a 1-form, v = Σ_{i=1}^n v_i dx^i, we can express the divergence div(v) as the composition ⋆d⋆v.
To parameterize a divergence-free vector field v: R^n → R^n, we note that by the fundamental property of the exterior derivative, taking an arbitrary (n−2)-form µ ∈ A^{n−2}(R^n) we have that

0 = d²µ = d(dµ)    (1)

and since ⋆ is its own inverse up to a sign, it follows that

v = ⋆dµ    (2)

is divergence free. We write the parameterization v = ⋆dµ explicitly in coordinates. Since a basis for A^{n−2}(R^n) can be chosen to be ⋆(dx^i ∧ dx^j), we can write µ = ½ Σ_{i,j=1}^n µ_ij ⋆(dx^i ∧ dx^j), where µ_ji = −µ_ij. Now a direct calculation shows that, up to a constant sign (see Appendix A.1),

⋆dµ = Σ_{i=1}^n Σ_{j=1}^n (∂µ_ij/∂x_j) dx^i.    (3)
This formula is suggestive of a simple matrix formulation: if we let A: R^n → R^{n×n} be the anti-symmetric matrix-valued function where A_ij = µ_ij, then the divergence-free vector field v = ⋆dµ can be written as taking the row-wise divergence of A, i.e.,

v = (div(A_1), …, div(A_n))^T.    (4)
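For readers who skip the differential-forms derivation, the divergence-free property of equation 4 can also be checked by a direct coordinate computation, assuming only that A is antisymmetric and twice continuously differentiable:

div(v) = Σ_{i=1}^n ∂v_i/∂x_i = Σ_{i,j=1}^n ∂²A_ij/∂x_i∂x_j = −Σ_{i,j=1}^n ∂²A_ji/∂x_i∂x_j = −Σ_{i,j=1}^n ∂²A_ij/∂x_j∂x_i = −div(v),

where the third equality uses A_ij = −A_ji, the fourth relabels the summation indices, and the last uses the symmetry of mixed partial derivatives; hence div(v) = 0.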
However, this requires parameterizing O(n²) functions. A more compact representation, which starts from a vector field, can also be derived. The idea behind this second construction is to model µ instead as µ = δν, where ν ∈ A^{n−1}(R^n). Putting this together with equation 2 we get that

v = ⋆dδν    (5)

is a divergence-free vector field. To provide equation 5 in matrix formulation we first write ν ∈ A^{n−1}(R^n) in the (n−1)-form basis, i.e., ν = Σ_{i=1}^n ν_i ⋆dx^i. Then a direct calculation provides, up to a constant sign,

δν = ½ Σ_{i,j=1}^n (∂ν_i/∂x_j − ∂ν_j/∂x_i) ⋆(dx^i ∧ dx^j).    (6)

So, given an arbitrary vector field b: R^n → R^n, we can construct A as

A = J_b − J_b^T,    (7)

where J_b denotes the Jacobian of b.
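To make the construction concrete, below is a minimal sketch in JAX of the vector-field construction, combining equations 4 and 7. The names (b_net, divergence_free_field) and the toy tanh network are illustrative assumptions, not the paper's released implementation.

```python
import jax
import jax.numpy as jnp

def b_net(params, x):
    # Toy smooth vector field b: R^n -> R^n standing in for a neural network.
    W, c = params
    return jnp.tanh(W @ x + c)

def divergence_free_field(params, x):
    # A(x) = J_b(x) - J_b(x)^T is antisymmetric (eq. 7) ...
    def A(y):
        J = jax.jacfwd(lambda z: b_net(params, z))(y)
        return J - J.T
    # ... and v_i(x) = div(A_i)(x) = sum_j dA_ij/dx_j (eq. 4).
    dA = jax.jacfwd(A)(x)             # dA[i, j, k] = dA_ij / dx_k
    return jnp.einsum('ijj->i', dA)   # row-wise divergence of A

n = 4
key_w, key_x = jax.random.split(jax.random.PRNGKey(0))
params = (jax.random.normal(key_w, (n, n)), jnp.zeros(n))
x = jax.random.normal(key_x, (n,))
v = divergence_free_field(params, x)
# Sanity check: div(v) should vanish up to floating-point error.
div_v = jnp.trace(jax.jacfwd(lambda y: divergence_free_field(params, y))(x))
```

The same row-wise divergence applies to the matrix-field construction once an antisymmetric A is available (for instance A = M − M^T for an unconstrained matrix-valued network M, one simple choice).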
To summarize, we have two constructions for divergence-free vector fields v:
Matrix-field (equations 2 and 4): v is represented using an anti-symmetric matrix field A.
Vector-field (equations 5 and 4+7): v is represented using a vector field b.
Figure 2: Compute times.

As we show in the next section, these two approaches are maximally expressive (i.e., universal), meaning they can approximate arbitrary smooth divergence-free vector fields. However, empirically these two constructions can exhibit different practical trade-offs. The matrix-field construction has a computational advantage as it requires one less Jacobian computation, though it requires more components—O(n²) vs. O(n)—to represent than the vector-field construction. This generally isn't a concern as all components can be computed in parallel. However, the vector-field construction can make it easy to bake in additional constraints; an example of this is discussed in Section 7.1, where a non-negativity constraint is imposed for modeling continuous-time probability density functions. In Figure 2 we show wallclock times for evaluating the divergence-free vector field based on our two constructions. Both exhibit quadratic scaling (in function evaluations) with the number of dimensions due to the row-wise divergence in equation 4, while the vector-field construction has an extra Jacobian computation.
2.1 Universality
Both the matrix-field and vector-field representations are universal, i.e. they can model arbitrary divergence-free vector fields. The main tool in the proof is the Hodge decomposition theorem [Morita, 2001, Berger, 2003]. For simplicity we will be working on a periodic piece of R^n, namely the torus T^n = [−M, M]^n/∼, where ∼ means identifying opposite edges of the n-dimensional cube, and M > 0 is arbitrary. Vector fields with compact support can always be encapsulated in T^n with M > 0 sufficiently large. As T^n is locally identical to R^n, all the previous definitions and constructions hold.

Theorem 2.1. The matrix and vector-field representations are universal in T^n, possibly only missing a constant vector field.

A formal proof of this result is in Appendix B.1.
3 Neural Conservation Laws
We now discuss a key aspect of our work, which is parameterizing solutions of scalar conservation
laws. Conservation laws are a focal point of mathematical physics and have seen applications in
machine learning, with the most well known examples being conserved scalar quantities often referred
to as density, energy, or mass. Formally, a conservation law can be expressed as a first-order PDE
written in divergence form as ∂ρ/∂t + div(j) = 0, where j is known as the flux, and the divergence is taken over spatial variables. In the case of the continuity equation, there is a velocity field u which describes the flow and the flux is equal to the density times the velocity field:

∂ρ/∂t + div(ρu) = 0    (8)
where ρ: R^n → R_+ and u: R^n → R^n. One can then interpret the equation to mean that u transports ρ(0, ·) to ρ(t, ·) continuously—without teleporting, creating or destroying mass. Such an interpretation plays a key role in physics simulations as well as the dynamic formulation of optimal transport [Benamou and Brenier, 2000]. In machine learning, the continuity equation appears in continuous normalizing flows [Chen et al., 2018]—which have also been used to approximate solutions to dynamical optimal transport [Finlay et al., 2020, Tong et al., 2020, Onken et al., 2021]. These, however, only model the velocity u and rely on numerical simulation to solve for the density ρ, which can be costly and time-consuming.
Instead, we observe that equation 8 can be expressed as a divergence-free vector field by augmenting the spatial dimensions with the time variable, resulting in a vector field v that takes as input (t, x) and outputs (ρ, ρu). Then equation 8 is equivalent to

div(v) = div((ρ, ρu)) = ∂ρ/∂t + div(ρu) = 0    (9)

where the divergence operator is now taken with respect to the joint system (t, x), i.e., ∂/∂t + Σ_{i=1}^n ∂/∂x_i.
We thus propose modeling solutions of conservation laws by parameterizing the divergence-free vector field v. Specifically, we parameterize a divergence-free vector field v and set v_1 = ρ and v_{2:n+1} = ρu, allowing us to recover the velocity field as u = v_{2:n+1}/ρ, assuming ρ ≠ 0. This allows us to enforce the continuity equation at an architecture level. Compared to simulation-based modeling approaches, we completely forego such computationally expensive simulation procedures. Code for our experiments is available at https://github.com/facebookresearch/neural-conservation-law.
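As a concrete illustration, the following sketch (in JAX, with illustrative names and a toy two-layer stand-in network; the released code linked above may be organized differently) builds the time-augmented divergence-free field from Section 2 and reads off ρ and u. It ignores the non-negativity constraint on ρ mentioned above, which a real density model would additionally impose.

```python
import jax
import jax.numpy as jnp

def b_net(params, tx):
    # Smooth stand-in network b: R^(n+1) -> R^(n+1) on the joint input (t, x).
    W1, W2 = params
    return W2 @ jnp.tanh(W1 @ tx)

def ncl_field(params, tx):
    # Divergence-free v over (t, x) via A = J_b - J_b^T and row-wise divergence.
    def A(y):
        J = jax.jacfwd(lambda z: b_net(params, z))(y)
        return J - J.T
    return jnp.einsum('ijj->i', jax.jacfwd(A)(tx))

def density_and_velocity(params, t, x):
    v = ncl_field(params, jnp.concatenate([jnp.atleast_1d(t), x]))
    rho = v[0]           # density: rho = v_1
    u = v[1:] / rho      # velocity: u = v_{2:n+1} / rho, assuming rho != 0
    return rho, u

n, h = 3, 16
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(k1, (h, n + 1)), jax.random.normal(k2, (n + 1, h)))
rho, u = density_and_velocity(params, jnp.array(0.5), jax.random.normal(k3, (n,)))
# By construction, d(rho)/dt + div(rho u) = 0 holds exactly, so no penalty
# term or numerical simulation of the density is needed.
```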
4 Related Works
Baking in constraints in deep learning
Existing approaches to enforcing constraints in deep
neural networks can induce constraints on the derivatives, such as convexity [Amos et al.,2017] or
Lipschitz continuity [Miyato et al.,2018]. More complicated formulations involve solving numerical
problems such as using solutions of convex optimization problems [Amos and Kolter,2017], solutions
of fixed-points iterations [Bai et al.,2019], and solutions of ordinary differential equations [Chen
et al.,2018]. These models can help provide more efficient approaches or alternative algorithms for
handling downstream applications such as constructing flexible density models [Chen et al.,2019,
Lu et al.,2021] or approximating optimal transport paths [Tong et al.,2020,Makkuva et al.,2020,
Onken et al.,2021]. However, in many cases, there is a need to solve a numerical problem in which
the solution may only be approximated up to some numerical accuracy; for instance, the need to
compute the density flowing through a vector field under the continuity equation [Chen et al.,2018].
Applications of differential forms
Differential forms and more generally, differential geometry,
have been previously applied in manifold learning—see e.g. Arvanitidis et al. [2021] and Bronstein
et al. [2017] for an in-depth overview. Most of the applications thus far have been restricted to 2 or 3
dimensions—either using identities like
div curl = 0
in 3D for fluid simulations [Rao et al.,2020],
or for learning geometric invariances in 2D images or 3D space [Gerken et al.,2021,Li et al.,2021].
Conservation Laws in Machine Learning
[Sturm and Wexler,2022] previously explored discrete
analogs of conservation laws by conserving mass via a balancing operation in the last layer of a neural
network. [Müller,2022] utilizes a wonderful application of Noether’s theorem to model conservation
laws by enforcing symmetries in a Lagrangian represented by a neural network.
5 Neural approximations to PDE solutions
As a demonstration of our method, we apply it to neural-based PDE simulations of fluid dynamics.
First, we apply it to modelling inviscid fluid flow in the open ball B ⊂ R^3 with free-slip boundary conditions, then to a 2d example on the flat torus T^2, but with more complex initial conditions. While these are toy examples, they demonstrate the value of our method in comparison to existing approaches—namely that we can exactly satisfy the continuity equation and preserve exact mass.
The Euler equations of incompressible flow
The incompressible Euler equations [Feynman et al., 1989] form an idealized model of inviscid fluid flow, governed by the system of partial differential equations¹

∂ρ/∂t + div(ρu) = 0,    ∂u/∂t + u·∇u = −∇p/ρ,    div(u) = 0    (10)

in three unknowns: the fluid velocity u(t, x) ∈ R^3, the pressure p(t, x), and the fluid density ρ(t, x).
While the fluid velocity and density are usually given at t = 0, the initial pressure is not required. Typically, on a bounded domain Ω ⊂ R^n, these are supplemented by the free-slip boundary condition and initial conditions

u·n = 0 on ∂Ω,    u(0, x) = u_0 and ρ(0, x) = ρ_0 on Ω.    (11)
The density
ρ
plays a critical role since in addition to being a conserved quantity, it influences the
dynamics of the fluid evolution over time. In numerical simulations, satisfying the continuity equation
as closely as possible is desirable since the equations in (10) are coupled. Error in the density feeds
into error in the velocity and then back into the density over time. In the finite element literature,
a great deal of effort has been put towards developing conservative schemes that preserve mass
(or energy in the more general compressible case)—see Guermond and Quartapelle [2000] and the
introduction of Almgren et al. [1998] for an overview. But since the application of physics informed
neural networks (PINNs) to fluid problems is much newer, conservative constraints have only been
incorporated as penalty terms into the loss [Mao et al.,2020,Jin et al.,2021].
5.1 Physics informed neural networks
Physics Informed Neural Networks (PINNs; Raissi et al. [2019, 2017]) have recently received renewed attention as an application of deep neural networks. While using neural networks as approximate solutions to PDE had been previously explored (e.g., in Lagaris et al. [1998]), modern advances in automatic differentiation algorithms have made the application to much more complex problems feasible [Raissi et al., 2019]. The “physics” in the name is derived from the incorporation of physical terms into the loss function, which consists of adding the squared residual norm of a PDE. For example, to train a neural network φ = [ρ, p, u] to satisfy the Euler equations, the standard choice of loss to fit to is
L_F = ‖∂u/∂t + u·∇u + ∇p/ρ‖²,    L_div = ‖div(u)‖²,    L_I = ‖u(0,·) − u_0(·)‖² + ‖ρ(0,·) − ρ_0(·)‖²,

L_Cont = ‖∂ρ/∂t + div(ρu)‖²,    L_G = ‖u·n‖²,    L_total = γ · [L_F, L_I, L_div, L_Cont, L_G],

where γ = (γ_F, γ_I, γ_div, γ_Cont, γ_G) denotes suitable coefficients (hyperparameters). The loss term L_G ensures fluid does not pass through boundaries, when they are present. Similar approaches were taken in [Mao et al., 2020] and [Jagtap et al., 2020] for related equations. While schemes of this nature are very easy to implement, they have the drawback that since PDE terms are only penalized and not strictly enforced, one cannot make guarantees as to the properties of the solution.
To showcase the ability of our method to model conservation laws, we will parameterize the density and vector field as v = [ρ, ρu], as detailed in Section 3. This means we can omit the term L_Cont, as described in Section 5.1, from the training loss. The divergence penalty L_div remains when modeling incompressible fluids, since u is not necessarily itself divergence-free—it is v = [ρ, ρu] which is divergence-free. In order to stabilize training, we can modify the loss terms L_F, L_G, L_I to avoid division by ρ. This is detailed in Appendix B.2.
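For concreteness, here is a hedged sketch of the interior loss terms when ρ and ρu come from the NCL parameterization; note the absence of L_Cont. The networks, names, and plain (unstabilized) residuals are illustrative assumptions, not the released implementation or the Appendix B.2 variants.

```python
import jax
import jax.numpy as jnp

def b_net(params, tx):
    # Toy stand-in network for the NCL field over (t, x).
    W1, W2 = params
    return W2 @ jnp.tanh(W1 @ tx)

def ncl_rho_u(params, t, x):
    # Density and velocity from the divergence-free NCL field, as in Section 3.
    def field(tx):
        def A(y):
            J = jax.jacfwd(lambda z: b_net(params, z))(y)
            return J - J.T
        return jnp.einsum('ijj->i', jax.jacfwd(A)(tx))
    v = field(jnp.concatenate([jnp.atleast_1d(t), x]))
    return v[0], v[1:] / v[0]    # rho, u (assuming rho != 0)

def p_net(params_p, t, x):
    # Separate scalar pressure network (toy stand-in).
    return jnp.sum(jnp.tanh(params_p @ jnp.concatenate([jnp.atleast_1d(t), x])))

def interior_loss(params, params_p, t, x):
    # L_Cont is omitted entirely: the continuity equation holds by construction.
    rho, u = ncl_rho_u(params, t, x)
    du_dt = jax.jacfwd(lambda s: ncl_rho_u(params, s, x)[1])(t)
    Du = jax.jacfwd(lambda y: ncl_rho_u(params, t, y)[1])(x)   # Du[i, j] = du_i/dx_j
    grad_p = jax.grad(lambda y: p_net(params_p, t, y))(x)
    res_F = du_dt + Du @ u + grad_p / rho   # momentum residual (L_F term)
    res_div = jnp.trace(Du)                 # incompressibility residual (L_div term)
    return jnp.sum(res_F ** 2) + res_div ** 2

n, h = 3, 16
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
params = (jax.random.normal(k1, (h, n + 1)), jax.random.normal(k2, (n + 1, h)))
params_p = jax.random.normal(k3, (8, n + 1))
loss = interior_loss(params, params_p, jnp.array(0.3), jax.random.normal(k4, (n,)))
# Initial- and boundary-condition terms L_I and L_G are added as in Section 5.1.
```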
5.2 Incompressible variable density inside the 3D unit ball
We first construct a simple example within a bounded domain: specifically, we will consider the Euler equations inside B(0, 1) ⊂ R^3, with the initial conditions

ρ(0, x) = 3/2 − ‖x‖²,    u(0, x) = (2, x_0 − 1, 1/2).    (12)
¹The convective derivative appearing in equation 10, u·∇u(x) = lim_{h→0} [u(x + h u(x)) − u(x)]/h = [Du](u), is also often written as (u·∇)u.