Neural Conservation Laws:
A Divergence-Free Perspective
Jack Richter-Powell
Vector Institute
jack.richter-powell@mcgill.ca
Yaron Lipman
Meta AI
ylipman@meta.com
Ricky T. Q. Chen
Meta AI
rtqichen@meta.com
Abstract
We investigate the parameterization of deep neural networks that by design satisfy
the continuity equation, a fundamental conservation law. This is enabled by
the observation that any solution of the continuity equation can be represented
as a divergence-free vector field. We hence propose building divergence-free
neural networks through the concept of differential forms, and with the aid of
automatic differentiation, realize two practical constructions. As a result, we can
parameterize pairs of densities and vector fields that always exactly satisfy the
continuity equation, foregoing the need for extra penalty methods or expensive
numerical simulation. Furthermore, we prove these models are universal and so can
be used to represent any divergence-free vector field. Finally, we experimentally
validate our approaches by computing neural network-based solutions to fluid
equations, solving for the Hodge decomposition, and learning dynamical optimal
transport maps.
1 Introduction
Modern successes in deep learning are often attributed to the expressiveness of black-box neural
networks. These models are known to be universal function approximators [Hornik et al.,1989]—but
this flexibility comes at a cost. In contrast to other parametric function approximators such as Finite
Elements [Schroeder and Lube,2017], it is hard to bake exact constraints into neural networks.
Existing approaches often resort to penalty methods to approximately satisfy constraints—but these
increase the cost of training and can produce inaccuracies in downstream applications when the
constraints are not exactly satisfied. For the same reason, theoretical analysis of soft-constrained
models also becomes more difficult. On the other hand, enforcing hard constraints on the architecture
can be challenging, and even once enforced, it is often unclear whether the model remains sufficiently
expressive within the constrained function class.
In this work, we discuss an approach to directly bake two constraints into deep neural networks: (i) having a divergence of zero, and (ii) exactly satisfying the continuity equation. One of our key insights is that the former directly leads to the latter, so the first portion of the paper focuses on divergence-free vector fields. These represent a special class of vector fields which have widespread use in the physical sciences. In computational fluid dynamics, divergence-free vector fields are used to model incompressible fluid interactions formalized by the Euler or Navier-Stokes equations. In R^3 we know that the curl of a vector field has a divergence of zero, which has seen many uses in graphics simulations (e.g., Eisenberger et al. [2018]). Perhaps less well-known is that lurking behind this fact is the generalization of divergence and curl through differential forms [Cartan, 1899], and the powerful identity d² = 0. We first explore this generalization, then discuss two constructions derived from it for parameterizing divergence-free vector fields. While this approach has been partially discussed previously [Barbarosie, 2011, Kelliher, 2021], it is not extensively known and to the best of our knowledge has not been explored by the machine learning community.
Figure 1: Divergence-free vector fields v: R^d → R^d can be constructed from an antisymmetric matrix field A: R^d → R^{d×d} (eq. 4) or an arbitrary vector field b: R^d → R^d (eq. 7, A = J_b − J_b^T). J_b represents the Jacobian matrix of b, and A_1 and A_2 are the first and second rows of A. Color denotes divergence.
Concretely, we derive two approaches—visualized in Figure 1—to transform sufficiently smooth neural networks into divergence-free neural networks, made efficient by automatic differentiation and recent advances in vectorized and composable transformations [Bradbury et al., 2018, Horace He, 2021]. Furthermore, both approaches can theoretically represent any divergence-free vector field.
We combine these new modeling tools with the observation that solutions of the continuity equation—a partial differential equation describing the evolution of a density under a flow—can be characterized jointly as a divergence-free vector field. As a result, we can parameterize neural networks that, by design, always satisfy the continuity equation, which we coin Neural Conservation Laws (NCL). While prior works either resorted to penalizing errors [Raissi et al., 2019] or numerically simulating the density given a flow [Chen et al., 2018], baking this constraint directly into the model allows us to forego extra penalty terms and expensive numerical simulations.
2 Constructing divergence-free vector fields
We will use the notation of differential forms in R^n for deriving the divergence-free and universality properties of our vector field constructions. We provide a concise overview of differential forms in Appendix A. For a more extensive introduction see e.g., Do Carmo [1998], Morita [2001]. Without this formalism, it is difficult to show either of these two properties; however, readers who wish to skip the derivations can jump to the matrix formulations in equation 4 and equation 7.
Let us denote A^k(R^n) as the space of differential k-forms, d: A^k(R^n) → A^{k+1}(R^n) as the exterior derivative, and ⋆: A^k(R^n) → A^{n−k}(R^n) as the Hodge star operator that maps each k-form to an (n−k)-form. Identifying a vector field v: R^n → R^n as a 1-form, v = Σ_{i=1}^n v_i dx^i, we can express the divergence div(v) as the composition ⋆d⋆v.
To parameterize a divergence-free vector field v: R^n → R^n, we note that by the fundamental property of the exterior derivative, taking an arbitrary (n−2)-form µ ∈ A^{n−2}(R^n) we have that

0 = d²µ = d(dµ)    (1)

and since ⋆ is its own inverse up to a sign, it follows that

v = ⋆dµ    (2)

is divergence free. We write the parameterization v = ⋆dµ explicitly in coordinates. Since a basis for A^{n−2}(R^n) can be chosen to be ⋆(dx^i ∧ dx^j), we can write µ = ½ Σ_{i,j=1}^n µ_ij ⋆(dx^i ∧ dx^j), where µ_ji = −µ_ij. Now a direct calculation shows that, up to a constant sign (see Appendix A.1),

⋆dµ = Σ_{i=1}^n Σ_{j=1}^n (∂µ_ij/∂x_j) dx^i.    (3)
This formula is suggestive of a simple matrix formulation: if we let A: R^n → R^{n×n} be the anti-symmetric matrix-valued function where A_ij = µ_ij, then the divergence-free vector field v = ⋆dµ can be written as taking the row-wise divergence of A, i.e.,

v = (div(A_1), …, div(A_n))^T.    (4)
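For readers who skip the differential-forms derivation, the divergence-free property of equation 4 can also be checked by a direct coordinate computation, assuming only that A is antisymmetric and twice continuously differentiable:

div(v) = Σ_{i=1}^n ∂v_i/∂x_i = Σ_{i,j=1}^n ∂²A_ij/∂x_i∂x_j = −Σ_{i,j=1}^n ∂²A_ji/∂x_i∂x_j = −Σ_{i,j=1}^n ∂²A_ij/∂x_j∂x_i = −div(v),

where the third equality uses A_ij = −A_ji, the fourth relabels the summation indices, and the last uses the symmetry of mixed partial derivatives; hence div(v) = 0.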
However, this requires parameterizing O(n²) functions. A more compact representation, which starts from a vector field, can also be derived. The idea behind this second construction is to model µ instead as µ = δν, where ν ∈ A^{n−1}(R^n). Putting this together with equation 2 we get that

v = ⋆dδν    (5)

is a divergence-free vector field. To provide equation 5 in matrix formulation we first write ν ∈ A^{n−1}(R^n) in the (n−1)-form basis, i.e., ν = Σ_{i=1}^n ν_i ⋆dx^i. Then a direct calculation provides, up to a constant sign,

δν = ½ Σ_{i,j=1}^n (∂ν_i/∂x_j − ∂ν_j/∂x_i) ⋆(dx^i ∧ dx^j).    (6)

So, given an arbitrary vector field b: R^n → R^n, we can construct A as

A = J_b − J_b^T,    (7)

where J_b denotes the Jacobian of b.
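To make the construction concrete, below is a minimal sketch in JAX of the vector-field construction, combining equations 4 and 7. The names (b_net, divergence_free_field) and the toy tanh network are illustrative assumptions, not the paper's released implementation.

```python
import jax
import jax.numpy as jnp

def b_net(params, x):
    # Toy smooth vector field b: R^n -> R^n standing in for a neural network.
    W, c = params
    return jnp.tanh(W @ x + c)

def divergence_free_field(params, x):
    # A(x) = J_b(x) - J_b(x)^T is antisymmetric (eq. 7) ...
    def A(y):
        J = jax.jacfwd(lambda z: b_net(params, z))(y)
        return J - J.T
    # ... and v_i(x) = div(A_i)(x) = sum_j dA_ij/dx_j (eq. 4).
    dA = jax.jacfwd(A)(x)             # dA[i, j, k] = dA_ij / dx_k
    return jnp.einsum('ijj->i', dA)   # row-wise divergence of A

n = 4
key_w, key_x = jax.random.split(jax.random.PRNGKey(0))
params = (jax.random.normal(key_w, (n, n)), jnp.zeros(n))
x = jax.random.normal(key_x, (n,))
v = divergence_free_field(params, x)
# Sanity check: div(v) should vanish up to floating-point error.
div_v = jnp.trace(jax.jacfwd(lambda y: divergence_free_field(params, y))(x))
```

The same row-wise divergence applies to the matrix-field construction once an antisymmetric A is available (for instance A = M − M^T for an unconstrained matrix-valued network M, one simple choice).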
To summarize, we have two constructions for divergence-free vector fields v:
Matrix-field (equations 2 and 4): v is represented using an anti-symmetric matrix field A.
Vector-field (equations 5 and 4+7): v is represented using a vector field b.
Figure 2: Compute times.

As we show in the next section, these two approaches are maximally expressive (i.e., universal), meaning they can approximate arbitrary smooth divergence-free vector fields. However, empirically these two constructions can exhibit different practical trade-offs. The matrix-field construction has a computational advantage as it requires one less Jacobian computation, though it requires more components—O(n²) vs. O(n)—to represent than the vector-field construction. This generally isn't a concern as all components can be computed in parallel. However, the vector-field construction can make it easy to bake in additional constraints; an example of this is discussed in Section 7.1, where a non-negativity constraint is imposed for modeling continuous-time probability density functions. In Figure 2 we show wallclock times for evaluating the divergence-free vector field based on our two constructions. Both exhibit quadratic scaling (in function evaluations) with the number of dimensions due to the row-wise divergence in equation 4, while the vector-field construction has an extra Jacobian computation.
2.1 Universality
Both the matrix-field and vector-field representations are universal, i.e. they can model arbitrary divergence-free vector fields. The main tool in the proof is the Hodge decomposition theorem [Morita, 2001, Berger, 2003]. For simplicity we will be working on a periodic piece of R^n, namely the torus T^n = [−M, M]^n/∼, where ∼ means identifying opposite edges of the n-dimensional cube, and M > 0 is arbitrary. Vector fields with compact support can always be encapsulated in T^n with M > 0 sufficiently large. As T^n is locally identical to R^n, all the previous definitions and constructions hold.

Theorem 2.1. The matrix and vector-field representations are universal in T^n, possibly only missing a constant vector field.

A formal proof of this result is in Appendix B.1.
3 Neural Conservation Laws
We now discuss a key aspect of our work, which is parameterizing solutions of scalar conservation
laws. Conservation laws are a focal point of mathematical physics and have seen applications in
machine learning, with the most well known examples being conserved scalar quantities often referred
to as density, energy, or mass. Formally, a conservation law can be expressed as a first-order PDE
written in divergence form as ∂ρ/∂t + div(j) = 0, where j is known as the flux, and the divergence is taken over spatial variables. In the case of the continuity equation, there is a velocity field u which describes the flow and the flux is equal to the density times the velocity field:

∂ρ/∂t + div(ρu) = 0    (8)
where ρ: R^n → R_+ and u: R^n → R^n. One can then interpret the equation to mean that u transports ρ(0, ·) to ρ(t, ·) continuously—without teleporting, creating or destroying mass. Such an interpretation plays a key role in physics simulations as well as the dynamic formulation of optimal transport [Benamou and Brenier, 2000]. In machine learning, the continuity equation appears in continuous normalizing flows [Chen et al., 2018]—which have also been used to approximate solutions to dynamical optimal transport [Finlay et al., 2020, Tong et al., 2020, Onken et al., 2021]. These, however, only model the velocity u and rely on numerical simulation to solve for the density ρ, which can be costly and time-consuming.
Instead, we observe that equation 8 can be expressed as a divergence-free vector field by augmenting the spatial dimensions with the time variable, resulting in a vector field v that takes as input (t, x) and outputs (ρ, ρu). Then equation 8 is equivalent to

div(v) = div((ρ, ρu)) = ∂ρ/∂t + div(ρu) = 0    (9)

where the divergence operator is now taken with respect to the joint system (t, x), i.e., ∂/∂t + Σ_{i=1}^n ∂/∂x_i.
We thus propose modeling solutions of conservation laws by parameterizing the divergence-free vector field v. Specifically, we parameterize a divergence-free vector field v and set v_1 = ρ and v_{2:n+1} = ρu, allowing us to recover the velocity field as u = v_{2:n+1}/ρ, assuming ρ ≠ 0. This allows us to enforce the continuity equation at an architecture level. Compared to simulation-based modeling approaches, we completely forego such computationally expensive simulation procedures. Code for our experiments is available at https://github.com/facebookresearch/neural-conservation-law.
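As a concrete illustration, the following sketch (in JAX, with illustrative names and a toy two-layer stand-in network; the released code linked above may be organized differently) builds the time-augmented divergence-free field from Section 2 and reads off ρ and u. It ignores the non-negativity constraint on ρ mentioned above, which a real density model would additionally impose.

```python
import jax
import jax.numpy as jnp

def b_net(params, tx):
    # Smooth stand-in network b: R^(n+1) -> R^(n+1) on the joint input (t, x).
    W1, W2 = params
    return W2 @ jnp.tanh(W1 @ tx)

def ncl_field(params, tx):
    # Divergence-free v over (t, x) via A = J_b - J_b^T and row-wise divergence.
    def A(y):
        J = jax.jacfwd(lambda z: b_net(params, z))(y)
        return J - J.T
    return jnp.einsum('ijj->i', jax.jacfwd(A)(tx))

def density_and_velocity(params, t, x):
    v = ncl_field(params, jnp.concatenate([jnp.atleast_1d(t), x]))
    rho = v[0]           # density: rho = v_1
    u = v[1:] / rho      # velocity: u = v_{2:n+1} / rho, assuming rho != 0
    return rho, u

n, h = 3, 16
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(k1, (h, n + 1)), jax.random.normal(k2, (n + 1, h)))
rho, u = density_and_velocity(params, jnp.array(0.5), jax.random.normal(k3, (n,)))
# By construction, d(rho)/dt + div(rho u) = 0 holds exactly, so no penalty
# term or numerical simulation of the density is needed.
```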
4 Related Works
Baking in constraints in deep learning
Existing approaches to enforcing constraints in deep
neural networks can induce constraints on the derivatives, such as convexity [Amos et al.,2017] or
Lipschitz continuity [Miyato et al.,2018]. More complicated formulations involve solving numerical
problems such as using solutions of convex optimization problems [Amos and Kolter,2017], solutions
of fixed-points iterations [Bai et al.,2019], and solutions of ordinary differential equations [Chen
et al.,2018]. These models can help provide more efficient approaches or alternative algorithms for
handling downstream applications such as constructing flexible density models [Chen et al.,2019,
Lu et al.,2021] or approximating optimal transport paths [Tong et al.,2020,Makkuva et al.,2020,
Onken et al.,2021]. However, in many cases, there is a need to solve a numerical problem in which
the solution may only be approximated up to some numerical accuracy; for instance, the need to
compute the density flowing through a vector field under the continuity equation [Chen et al.,2018].
Applications of differential forms
Differential forms and more generally, differential geometry,
have been previously applied in manifold learning—see e.g. Arvanitidis et al. [2021] and Bronstein
et al. [2017] for an in-depth overview. Most of the applications thus far have been restricted to 2 or 3
dimensions—either using identities like
div curl = 0
in 3D for fluid simulations [Rao et al.,2020],
or for learning geometric invariances in 2D images or 3D space [Gerken et al.,2021,Li et al.,2021].
Conservation Laws in Machine Learning
[Sturm and Wexler,2022] previously explored discrete
analogs of conservation laws by conserving mass via a balancing operation in the last layer of a neural
network. [Müller,2022] utilizes a wonderful application of Noether’s theorem to model conservation
laws by enforcing symmetries in a Lagrangian represented by a neural network.
5 Neural approximations to PDE solutions
As a demonstration of our method, we apply it to neural-based PDE simulations of fluid dynamics.
First, we apply it to modelling inviscid fluid flow in the open ball B ⊂ R^3 with free-slip boundary conditions, then to a 2d example on the flat torus T^2, but with more complex initial conditions. While these are toy examples, they demonstrate the value of our method in comparison to existing approaches—namely that we can exactly satisfy the continuity equation and preserve exact mass.
The Euler equations of incompressible flow
The incompressible Euler equations [Feynman et al., 1989] form an idealized model of inviscid fluid flow, governed by the system of partial differential equations¹

∂ρ/∂t + div(ρu) = 0,    ∂u/∂t + u·∇u = −∇p/ρ,    div(u) = 0    (10)

in three unknowns: the fluid velocity u(t, x) ∈ R^3, the pressure p(t, x), and the fluid density ρ(t, x).
While the fluid velocity and density are usually given at t = 0, the initial pressure is not required. Typically, on a bounded domain Ω ⊂ R^n, these are supplemented by the free-slip boundary condition and initial conditions

u·n = 0 on ∂Ω,    u(0, x) = u_0 and ρ(0, x) = ρ_0 on Ω.    (11)
The density
ρ
plays a critical role since in addition to being a conserved quantity, it influences the
dynamics of the fluid evolution over time. In numerical simulations, satisfying the continuity equation
as closely as possible is desirable since the equations in (10) are coupled. Error in the density feeds
into error in the velocity and then back into the density over time. In the finite element literature,
a great deal of effort has been put towards developing conservative schemes that preserve mass
(or energy in the more general compressible case)—see Guermond and Quartapelle [2000] and the
introduction of Almgren et al. [1998] for an overview. But since the application of physics informed
neural networks (PINNs) to fluid problems is much newer, conservative constraints have only been
incorporated as penalty terms into the loss [Mao et al.,2020,Jin et al.,2021].
5.1 Physics informed neural networks
Physics Informed Neural Networks (PINNs; Raissi et al. [2019, 2017]) have recently received renewed attention as an application of deep neural networks. While using neural networks as approximate solutions to PDE had been previously explored (e.g., in Lagaris et al. [1998]), modern advances in automatic differentiation algorithms have made the application to much more complex problems feasible [Raissi et al., 2019]. The “physics” in the name is derived from the incorporation of physical terms into the loss function, which consists of adding the squared residual norm of a PDE. For example, to train a neural network φ = [ρ, p, u] to satisfy the Euler equations, the standard choice of loss to fit to is
L_F = ‖∂u/∂t + u·∇u + ∇p/ρ‖²,    L_div = ‖div(u)‖²,    L_I = ‖u(0,·) − u_0(·)‖² + ‖ρ(0,·) − ρ_0(·)‖²,

L_Cont = ‖∂ρ/∂t + div(ρu)‖²,    L_G = ‖u·n‖²,    L_total = γ · [L_F, L_I, L_div, L_Cont, L_G],

where γ = (γ_F, γ_I, γ_div, γ_Cont, γ_G) denotes suitable coefficients (hyperparameters). The loss term L_G ensures fluid does not pass through boundaries, when they are present. Similar approaches were taken in [Mao et al., 2020] and [Jagtap et al., 2020] for related equations. While schemes of this nature are very easy to implement, they have the drawback that since PDE terms are only penalized and not strictly enforced, one cannot make guarantees as to the properties of the solution.
To showcase the ability of our method to model conservation laws, we will parameterize the density and vector field as v = [ρ, ρu], as detailed in Section 3. This means we can omit the term L_Cont, as described in Section 5.1, from the training loss. The divergence penalty L_div remains when modeling incompressible fluids, since u is not necessarily itself divergence-free—it is v = [ρ, ρu] which is divergence-free. In order to stabilize training, we can modify the loss terms L_F, L_G, L_I to avoid division by ρ. This is detailed in Appendix B.2.
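For concreteness, here is a hedged sketch of the interior loss terms when ρ and ρu come from the NCL parameterization; note the absence of L_Cont. The networks, names, and plain (unstabilized) residuals are illustrative assumptions, not the released implementation or the Appendix B.2 variants.

```python
import jax
import jax.numpy as jnp

def b_net(params, tx):
    # Toy stand-in network for the NCL field over (t, x).
    W1, W2 = params
    return W2 @ jnp.tanh(W1 @ tx)

def ncl_rho_u(params, t, x):
    # Density and velocity from the divergence-free NCL field, as in Section 3.
    def field(tx):
        def A(y):
            J = jax.jacfwd(lambda z: b_net(params, z))(y)
            return J - J.T
        return jnp.einsum('ijj->i', jax.jacfwd(A)(tx))
    v = field(jnp.concatenate([jnp.atleast_1d(t), x]))
    return v[0], v[1:] / v[0]    # rho, u (assuming rho != 0)

def p_net(params_p, t, x):
    # Separate scalar pressure network (toy stand-in).
    return jnp.sum(jnp.tanh(params_p @ jnp.concatenate([jnp.atleast_1d(t), x])))

def interior_loss(params, params_p, t, x):
    # L_Cont is omitted entirely: the continuity equation holds by construction.
    rho, u = ncl_rho_u(params, t, x)
    du_dt = jax.jacfwd(lambda s: ncl_rho_u(params, s, x)[1])(t)
    Du = jax.jacfwd(lambda y: ncl_rho_u(params, t, y)[1])(x)   # Du[i, j] = du_i/dx_j
    grad_p = jax.grad(lambda y: p_net(params_p, t, y))(x)
    res_F = du_dt + Du @ u + grad_p / rho   # momentum residual (L_F term)
    res_div = jnp.trace(Du)                 # incompressibility residual (L_div term)
    return jnp.sum(res_F ** 2) + res_div ** 2

n, h = 3, 16
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
params = (jax.random.normal(k1, (h, n + 1)), jax.random.normal(k2, (n + 1, h)))
params_p = jax.random.normal(k3, (8, n + 1))
loss = interior_loss(params, params_p, jnp.array(0.3), jax.random.normal(k4, (n,)))
# Initial- and boundary-condition terms L_I and L_G are added as in Section 5.1.
```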
5.2 Incompressible variable density inside the 3D unit ball
We first construct a simple example within a bounded domain: specifically, we will consider the Euler equations inside B(0, 1) ⊂ R^3, with the initial conditions

ρ(0, x) = 3/2 − ‖x‖²,    u(0, x) = (2, x_0 − 1, 1/2).    (12)
¹The convective derivative appearing in equation 10, u·∇u(x) = lim_{h→0} [u(x + h u(x)) − u(x)]/h = [Du](u), is also often written as (u·∇)u.