Data-driven discovery of non-Newtonian
astronomy via learning non-Euclidean Hamiltonian
Oswin So
Massachusetts Institute of Technology, USA
oswinso@mit.edu
Gongjie Li
Georgia Institute of Technology, USA
gongjie.li@physics.gatech.edu
Evangelos A. Theodorou
Georgia Institute of Technology, USA
evangelos.theodorou@gatech.edu
Molei Tao
Georgia Institute of Technology, USA
mtao@gatech.edu
Abstract
Incorporating the Hamiltonian structure of physical dynamics into deep learning models provides a powerful way to improve interpretability and prediction accuracy. While previous works are mostly limited to Euclidean spaces, their extension to Lie group manifolds is needed when rotations form a key component of the dynamics, such as the higher-order physics beyond simple point-mass dynamics for $N$-body celestial interactions. Moreover, the multiscale nature of these processes presents a challenge to existing methods, as a long time horizon is required. By leveraging a symplectic Lie-group-manifold-preserving integrator, we present a method for data-driven discovery of non-Newtonian astronomy. Preliminary results show the importance of both these properties in training stability and prediction accuracy.
1 Introduction
Figure 1: One planet's orbit around a star: the rigid-body correction results in a precession, i.e., a slow rotation of the orbital axis. Our method, 'Lie T2', learns $V$ from data and predicts a trajectory that matches the ground truth with the rigid-body potential.
Deep Neural Networks (DNNs) have been demonstrated to be effective tools for learning dynamical systems from data. One important class of systems to be learned have dynamics described by physical laws, whose structure can be exploited by learning the Hamiltonian of the system instead of the vector field [1, 2]. An appropriately learned Hamiltonian can endow the learned system with properties such as superior long-horizon prediction accuracy [3] and applicability to chaotic systems [3-5].

To learn continuous dynamics from discrete data, one important step is to bridge the continuous and discrete times. Seminal work initially approximated the time derivative via finite differences and then matched it with a learned (Hamiltonian) vector field [1, 2]. Recent efforts avoid the inaccuracy of finite differences by numerically integrating the learned vector field. Especially relevant here is SRNN [6], which uses a symplectic integrator to ensure the learned dynamics is symplectic (a necessity for Hamiltonian systems). Although SRNN only demonstrated learning separable Hamiltonians, a breakthrough in symplectic integration of arbitrary Hamiltonians [7] was used to extend SRNN [8]. Further efforts on improving the time-integration error have also been made [9-11]. Meanwhile, alternative approaches based on learning a symplectic map instead of the Hamiltonian have also demonstrated efficacy [3, 12], although these approaches have not been extended to non-Euclidean problems.
Work was done while Oswin was at Georgia Tech.
Preprint. Under review.
arXiv:2210.00090v1 [cs.LG] 30 Sep 2022
In fact, one relatively under-explored area is learning Hamiltonian dynamics on manifolds like the Lie group manifold family.² One important member of this family is $SO(n)$, which describes isometries in $\mathbb{R}^n$ and is important for, e.g., dynamical astronomy. The evolution of celestial bodies corresponds to a mechanical system, and the 2- and 3-body problems have been a staple problem in works on learning Hamiltonians (e.g., [1, 3, 6, 15]); however, the Newtonian (point-mass) gravity considered there is already well understood. Practical problems in planetary dynamics are complicated by higher-order physics such as planet spin-orbit interaction, tidal dissipation, and general relativistic corrections. While it is unclear what would be a perfect scientific model for these effects, planetary rotation is a necessary component to account for spin-orbit interaction and tidal forcing, creating an $SO(3)^N$ component of the configuration space. To learn these physics from data, we need to learn on the Lie group.
Rigid-body dynamics also play important roles in other applications such as robotics. In a seminal work [16], Hamiltonian dynamics on $SO(3)$ are used to learn rigid-body dynamics for a quadrotor. In that work, a Runge-Kutta 4 integrator is used; consequently, the method is applicable only to short time horizons (see Sec. 3 and the last paragraph of Sec. 2).
For our problem of learning non-Newtonian astronomy, the time horizon has to be long. Hence, we take a different approach by leveraging a Lie-group-preserving symplectic integrator. Structure-preserving integration of dynamical systems on manifolds has been extensively studied in the literature, for example for Lie groups [17-21] and, more broadly, geometric integration [22-25].
In summary, we propose a deep learning methodology for performing data-driven discovery of non-Newtonian astronomy. By leveraging a symplectic Lie-group-manifold-preserving integrator, we show how a non-Euclidean Hamiltonian can be learned for accurate prediction of non-Newtonian effects. Moreover, we provide insights that show the importance of both symplecticity and exact preservation of the Lie-group manifold for training stability.
2 Method
Given observations of a dynamically evolving system, our goal is to learn the physics that governs its evolution from the data. Denote by $(q_{k,l}, R_{k,l}, p_{k,l}, \Pi_{k,l})_{k=1}^{K},\; l \in [L]$, a dataset of snapshots of $L$ continuous-time trajectories of a system with $N$ interacting rigid bodies. That is,
$$(q_{k,l}, R_{k,l}, p_{k,l}, \Pi_{k,l}) = \big(q_l(k\Delta t),\, R_l(k\Delta t),\, p_l(k\Delta t),\, \Pi_l(k\Delta t)\big),$$
where $\big(q_l(t), R_l(t), p_l(t), \Pi_l(t)\big)$ is a solution of some latent Hamiltonian ODE to be learned, corresponding to mechanical dynamics on $T^*SE(3)^N$. Here $\Delta t$ is a (possibly large) observation timestep, $R \in SO(3)^N$ is the rotational configuration of the $N$ rigid bodies, and $\Pi \in \mathfrak{so}(3)^N$ denotes each body's angular momentum in its respective body frame.
Importantly, since the configuration space $Q = SE(3)^N$ is not flat, the mechanical dynamics are not given by $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ for some Hamiltonian $H$ that depends on the generalized coordinates $q \in Q$ and generalized momentum $p \in T^*_q Q$. Instead, the equations of motion can be derived via either Lagrange multipliers [20, 26] or a Lie group variational principle [20, 27], which gives
$$\dot q_i = p_i / m_i, \tag{1a}$$
$$\dot p_i = -\frac{\partial V}{\partial q_i} + F_{p_i}, \tag{1b}$$
$$\dot R_i = R_i \,\widehat{J_i^{-1} \Pi_i}, \tag{1c}$$
$$\dot \Pi_i = \Pi_i \times J_i^{-1} \Pi_i - \left( R_i^\top \frac{\partial V}{\partial R_i} - \frac{\partial V}{\partial R_i}^{\!\top} R_i \right)^{\!\vee} + F_{\Pi_i}, \tag{1d}$$
assuming a physical Hamiltonian
$$H(q, R, p, \Pi) = \sum_{i=1}^{N} \frac{1}{2}\, p_i^\top p_i / m_i + \sum_i \frac{1}{2}\, \Pi_i^\top J_i^{-1} \Pi_i + V(q, R)$$
that sums the total (translational and rotational) kinetic energy and the interaction potential $V$, where $m_i, J_i$ denote the mass and inertia tensor of the $i$th body, and $F_p, F_\Pi$ are forcing terms that model nonconservative forces. $\Pi_i \in \mathbb{R}^3$ is a vector, $\widehat{\cdot}$ is the map from $\mathbb{R}^3$ to $\mathrm{Skew}(3)$, and $(\cdot)^\vee$ is its inverse (see [20] for more details). By learning the potential $V$, the external forcing $F_p$, and the torque $F_\Pi$, we can learn the physics of the system.
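The hat and vee maps used throughout Eq. (1) are simple to implement; below is a minimal NumPy sketch (the function names `hat`, `vee`, and `rotational_rate` are our own, not the paper's):

```python
import numpy as np

def hat(v):
    """Map a vector in R^3 to its skew-symmetric matrix, so that hat(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def vee(S):
    """Inverse of hat: recover the vector from a skew-symmetric matrix."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

# Example: the rotational kinematics of Eq. (1c) for one body,
# dR_i/dt = R_i @ hat(J_i^{-1} Pi_i).
def rotational_rate(R, Pi, J):
    return R @ hat(np.linalg.solve(J, Pi))
```

The identity `hat(v) @ w == cross(v, w)` is what lets Eq. (1c)-(1d) be written without explicit cross products.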
2.1 Machine Learning Challenges Posed by Dynamical Astronomy
We study this setup because it helps answer scientific questions like: what physics governs the motions
of celestial bodies, such as planets in a planetary system? The leading order physics is of course
already well known, namely these bodies can be approximated by point masses that are interacting
²We note extensions to include holonomic constraints in [13] and to handle contact in [14].
Figure 2: Inputs are fed through a recurrent layer with Lie T2; prediction error is used as a loss on $\theta$. (The diagram shows snapshots $(q_k, R_k, p_k, \Pi_k)$ passed through repeated $\varphi^{\theta,\text{Lie T2}}_h$ blocks, SRNN-style, whose predictions $(\hat q_k, \hat R_k, \hat p_k, \hat \Pi_k)$ enter the loss $\mathcal{L}(\theta)$.)
through a $1/r$ gravitational potential. However, planets are not point masses, and their rotations matter because they shape planetary climates [28, 29] and even feed back to their orbits [30]. This already starts to alter $V$ even if one only considers classical gravity. For example, the gravitational potential $V$ for interacting bodies of finite sizes should be $V(q, R) = \sum_{i<j} V_{i,j}$, where
$$V_{i,j}(q, R) = \int_{B_i}\!\int_{B_j} \frac{G\,\rho(x_i)\,\rho(x_j)}{\|q_i + R_i x_i - q_j - R_j x_j\|}\, dx_i\, dx_j = \underbrace{\frac{G m_i m_j}{\|q_i - q_j\|}}_{V_{i,j,\text{point}}} + \underbrace{O\!\left(\frac{1}{\|q_i - q_j\|^2}\right)}_{V_{i,j,\text{resid}}}. \tag{2}$$
Working with the full potential is complicated since $B_i$ is not known and the integral is not analytically known. Can we directly learn $V_{\text{resid}}$ from time-series data?
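One concrete reading of Eq. (2): the point-mass part is cheap and known in closed form, so only the residual needs to be modeled. A minimal sketch (the residual callable here is a placeholder for a learned network, not the paper's architecture):

```python
import numpy as np

def v_point(q, masses, G=1.0):
    """Sum of pairwise point-mass terms G m_i m_j / |q_i - q_j|, following the
    sign convention of Eq. (2). q: (N, 3) positions; masses: (N,)."""
    n = len(masses)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += G * masses[i] * masses[j] / np.linalg.norm(q[i] - q[j])
    return total

def v_total(q, R, masses, v_resid_theta):
    """Full potential: known point-mass part plus a learned residual V_resid(q, R)."""
    return v_point(q, masses) + v_resid_theta(q, R)

# Usage with a zero (untrained) residual: two unit masses at distance 2, G = 1.
v0 = v_total(np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]),
             np.stack([np.eye(3)] * 2),
             np.array([1.0, 1.0]),
             lambda q, R: 0.0)
```

Note the residual depends on both $q$ and $R$, since finite-size corrections are orientation-dependent.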
Classical gravity (i.e., Newtonian physics) is not the only driver of planetary motion: tidal forces and general relativity (GR) matter too. The former provides a dissipation mechanism and plays critical roles in altering planetary orbits [31, 32]; the latter needs little explanation and has been demonstrated by, e.g., Mercury's precessing orbit [33]. Tidal forces depend on celestial bodies' rotations [34] and are thus a function of both $q$ and $R$. GR's effects cannot be fully characterized with classical coordinates $q, R, p, \Pi$, but post-Newtonian approximations based purely on these coordinates are popular [35]. Can we learn both purely from data if we did not have theories for either?
In addition to the scientific questions, there are also significant machine learning challenges:
Multiscale dynamics. The rigid-body correction ($V_{\text{resid}}$), tidal force, and GR correction are all much smaller forces compared to point-mass gravity. Consequently, their effects do not manifest until long times. Thus, one challenge in learning them is that the dynamical system exhibits different behaviors over multiple timescales. It is reasonable to require long time-series data for the small effects to be learned; meanwhile, when observations are expensive to make, the observation time step $\Delta t$ can be much longer than the smallest timescales. Can we still learn the physics in this case? We will leverage a symplectic integrator and its mild growth of error over long times [7, 26] to provide a positive answer.
Respecting the Lie group manifold. However, even having a symplectic integrator is not enough, because the position variable of the latent dynamics (i.e., the truth) stays on $SE(3)^N$. If the integrated solution falls off this manifold such that $R^\top R = I$ no longer holds, it is not only incorrect but likely misleading for the learning of $V(q, R)$. Popular integrators such as forward Euler, Runge-Kutta 4 (RK4), and Leapfrog [1, 6, 36] unfortunately do not maintain the manifold structure.
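This failure mode is easy to see on a single rotating body: an explicit-Euler update of $\dot R = R\hat\omega$ drifts off $SO(3)$, while updating with the exact matrix exponential (Rodrigues' formula) stays on it to machine precision. A small illustrative sketch, not the paper's experiment:

```python
import numpy as np

def hat(v):
    """Map a vector in R^3 to its skew-symmetric matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def expm_so3(w):
    """Exact rotation exp(hat(w)) via Rodrigues' formula; always lands on SO(3)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

# Integrate dR/dt = R hat(omega) for a constant body-frame angular velocity.
omega = np.array([0.3, -0.2, 0.5])
h, n_steps = 0.1, 500
R_euler, R_lie = np.eye(3), np.eye(3)
for _ in range(n_steps):
    R_euler = R_euler + h * (R_euler @ hat(omega))  # explicit Euler: drifts off SO(3)
    R_lie = R_lie @ expm_so3(h * omega)             # exponential update: stays on SO(3)

def manifold_error(R):
    return np.linalg.norm(R.T @ R - np.eye(3))
```

After 500 steps the Euler iterate's orthogonality error $\|R^\top R - I\|$ is macroscopic, while the exponential update's stays at floating-point roundoff, mirroring the top panel of Fig. 3.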
2.2 Learning with Lie Symplectic RNNs
Our method can be viewed as a Lie-group generalization of the seminal work of SRNN [6], in which a good integrator, here one that is both symplectic and Lie-group preserving, is employed as a recurrent block.
Lie T2: A Symplectic Lie-Group-Preserving Integrator. To construct an integrator that achieves both properties, we borrow from [20] the idea of Lie-group- and symplecticity-preserving splitting, and split our Hamiltonian as $H = H_{\text{KE}} + H_{\text{PE}} + H_{\text{asym}}$, which contains the axially symmetric kinetic energy, the potential energy, and the asymmetric kinetic-energy correction terms. This enables computing the exact integrators $\varphi^{[\text{KE}]}_t$, $\varphi^{[\text{PE}]}_t$, and $\varphi^{[\text{asym}]}_t$ (see App. B for details). We then construct a 2nd-order symplectic integrator, Lie T2, by applying the Strang composition scheme. To account for non-conservative forces, the corresponding non-conservative momentum update $\varphi^{[\text{force}]}$, which updates $(p, \Pi)$ using the forcing $F(q, R, p, \Pi)$, is inserted in the middle of the composition [20]. This gives $\varphi^{[\text{Lie T2}]}_h$ for stepsize $h$ as
$$\varphi^{\text{Lie T2}}_h := \varphi^{[\text{KE}]}_{h/2} \circ \varphi^{[\text{PE}]}_{h/2} \circ \varphi^{[\text{asym}]}_{h/2} \circ \varphi^{[\text{force}]}_{h} \circ \varphi^{[\text{asym}]}_{h/2} \circ \varphi^{[\text{PE}]}_{h/2} \circ \varphi^{[\text{KE}]}_{h/2}. \tag{3}$$
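The Strang pattern of Eq. (3), half-steps forward, a middle map, then the same half-steps mirrored, can be written generically. The flow maps below (a harmonic oscillator's exact drift and kick, with an identity middle map) are illustrative placeholders, not the paper's $\varphi^{[\text{KE}]}$, $\varphi^{[\text{PE}]}$, $\varphi^{[\text{asym}]}$, $\varphi^{[\text{force}]}$:

```python
def strang_step(h, half_flows, mid_flow, state):
    """One step of the Strang composition in Eq. (3): apply each half-step flow
    for h/2, the middle map for h, then the half-step flows in reverse order."""
    for f in half_flows:
        state = f(h / 2, state)
    state = mid_flow(h, state)
    for f in reversed(half_flows):
        state = f(h / 2, state)
    return state

# Illustrative exact sub-flows for H = p^2/2 + q^2/2.
drift = lambda h, s: (s[0] + h * s[1], s[1])   # flow of the kinetic part
kick = lambda h, s: (s[0], s[1] - h * s[0])    # flow of the potential part
identity = lambda h, s: s                      # stand-in for the force update

state = (1.0, 0.0)
for _ in range(1000):
    state = strang_step(0.1, [drift, kick], identity, state)
energy = 0.5 * (state[0] ** 2 + state[1] ** 2)  # stays near 0.5: the composition is symplectic
```

Because each sub-flow is exact (hence symplectic) and the composition is palindromic, the result is a 2nd-order symplectic method; here it reduces to the familiar Verlet scheme.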
A Recurrent Architecture for Nonlinear Regression. Given the simplicity of $V_{\text{point}}$, we assume it is known, and learn $V_{\text{resid}}$ and $F$ with multi-layer perceptrons (MLPs) $V^\theta_{\text{resid}}$ and $F^\theta$, without assuming any pairwise structure (see App. C for discussion). We then use $\varphi^{\theta,\text{Lie T2}}$ to integrate the dynamics forward, where $\theta$ denotes the dependence on the networks. However, when the temporal spacing between observations $\Delta t$ is large, using a single $\varphi^{\theta,\text{Lie T2}}_{\Delta t}$ will result in large errors for the fast-timescale dynamics. Instead, we compose $\varphi^{\theta,\text{Lie T2}}_h$ $H$ times as $(\hat q_{k+1,l}, \hat p_{k+1,l}) = \varphi^{\theta,\text{Lie T2}}_h \circ \cdots \circ \varphi^{\theta,\text{Lie T2}}_h (q_{k,l}, p_{k,l})$, where $H = \Delta t / h \in \mathbb{Z}$ determines the integration stepsize $h$. We perform training by minimizing the following empirical loss over random minibatches of size $N_b$:
$$\mathcal{L}(\theta) := \frac{1}{N_b K} \sum_{l=1}^{N_b} \sum_{k=1}^{K} \left\{ \left\| q_{k,l} - \hat q^\theta_{k,l} \right\|_2^2 + \left\| p_{k,l} - \hat p^\theta_{k,l} \right\|_2^2 \right\}. \tag{4}$$
Note that we do not assume access to the true derivatives $\dot q_{k,l}$ and $\dot p_{k,l}$ used in the loss function of some works [1, 37, 38]. Our training process is summarized in Fig. 2 (see App. C for details).
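The rollout and loss above can be sketched as follows; `step` is a stand-in for $\varphi^{\theta,\text{Lie T2}}_h$, the rotational terms of Eq. (4) are omitted for brevity, and the exact-oscillator flow used in the usage example is purely illustrative:

```python
import numpy as np

def rollout_loss(step, h, H, batch):
    """Empirical loss of Eq. (4), restricted to (q, p) for brevity.

    step:  one-step integrator (h, state) -> state.
    batch: list of trajectories, each a list of (q, p) snapshots spaced
           Delta t = H * h apart in time.
    """
    loss, count = 0.0, 0
    for traj in batch:
        state = traj[0]
        for q_true, p_true in traj[1:]:
            for _ in range(H):  # H sub-steps bridge one observation gap
                state = step(h, state)
            q_hat, p_hat = state
            loss += np.sum((q_hat - q_true) ** 2) + np.sum((p_hat - p_true) ** 2)
            count += 1
    return loss / count

# Usage: with data generated by the integrator itself, the loss vanishes.
rot = lambda h, s: (s[0] * np.cos(h) + s[1] * np.sin(h),
                    -s[0] * np.sin(h) + s[1] * np.cos(h))  # exact oscillator flow
traj = [(np.array([1.0]), np.array([0.0]))]
for _ in range(3):
    s = traj[-1]
    for _ in range(4):
        s = rot(0.05, s)
    traj.append(s)
loss = rollout_loss(rot, 0.05, 4, [traj])
```

Note that the predicted state is carried across observation gaps rather than re-initialized from data, which is what makes the computation recurrent.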
Benefit. Learning accurate $V^\theta, F^\theta$ requires accurate numerical simulation, which also leads to a trainable model. Without preservation of the manifold structure, training can take 'shortcuts' outside the manifold that seemingly match the data but completely mislead the learning. Symplecticity also plays a vital role in controlling the long-time integration error: under reasonable conditions, a $p$th-order symplectic integrator has a linear $O(\Delta t\, h^p)$ error bound, whereas a $p$th-order nonsymplectic one has an exponential $O(e^{C \Delta t} h^p)$ error bound [7, 26]. While these bounds do not matter for small $\Delta t$, they are significant for multiscale problems where $\Delta t$ is macroscopic but $h$ is microscopic. Consequently, improving the error of a nonsymplectic integrator by reducing $h$ makes the RNN exponentially deep; this often renders training difficult [39] and is not desirable.
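The gap between the two error bounds is easy to reproduce on a toy Hamiltonian: for the harmonic oscillator, explicit Euler's energy error grows exponentially with the number of steps, while a symplectic leapfrog's stays bounded. An illustrative sketch, not the paper's experiment:

```python
def energy(q, p):
    """Hamiltonian of the unit harmonic oscillator, H = (q^2 + p^2) / 2."""
    return 0.5 * (q * q + p * p)

def euler_step(h, q, p):
    """Explicit Euler: non-symplectic; each step inflates the orbit by sqrt(1 + h^2)."""
    return q + h * p, p - h * q

def leapfrog_step(h, q, p):
    """Kick-drift-kick leapfrog: symplectic, energy error remains bounded."""
    p = p - 0.5 * h * q
    q = q + h * p
    p = p - 0.5 * h * q
    return q, p

h, n_steps = 0.1, 2000
q_e, p_e = 1.0, 0.0
q_s, p_s = 1.0, 0.0
for _ in range(n_steps):
    q_e, p_e = euler_step(h, q_e, p_e)
    q_s, p_s = leapfrog_step(h, q_s, p_s)

err_euler = abs(energy(q_e, p_e) - 0.5)      # enormous after 2000 steps
err_leapfrog = abs(energy(q_s, p_s) - 0.5)   # remains small
```

This is the one-dimensional analogue of the Hamiltonian-error comparison in the bottom panel of Fig. 3.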
3 Results
Figure 3: Results on TRAPPIST-1 with short data separation, comparing Euler, RK4, Verlet, Lie RK2, Lie RK4, and Lie T2. Top: $SO(3)$ manifold error $\|R^\top R - I\|_2$ over integrator steps. Bottom: Hamiltonian error over the integrated trajectory. Only Lie T2 achieves low errors in both metrics.
We aim to answer two questions. Q1: Can we learn multiscale physics? Q2: How important are symplecticity (S) and Lie-group preservation (L) for learning? The closest baseline for our problem is the work of [16], which learns short-timescale rigid-body Hamiltonian dynamics for robotics. Placed in our framework, their work corresponds to using RK4 for the recurrent block, which is neither S nor L. Therefore, to investigate Q2, we vary the choice of integrator in our framework as follows. Normal: explicit Euler, RK4. S: Verlet. L: Lie RK2 (CF2) and Lie RK4 (CF4) [21]. We leave the precise details to Apps. C and D.
Toy Two-Body Problem. We consider an illustrative two-body problem to demonstrate the effects of $V_{\text{rigid}}$. In Fig. 1, 'Point' and 'Rigid' denote exact solutions for a point-mass and a rigid-body potential, and 'Lie T2' the prediction of our method based on a $V$ learned from data. Compared to 'Point', 'Rigid' induces an apsidal precession (rotation of the orbital axis) due to spin-orbit couplings. Our method successfully predicts this interaction and matches the trajectory of 'Rigid'.
We next test our method by learning the dynamics of the TRAPPIST-1 system [40], which consists of seven Earth-sized planets and is notable for its potential habitability for terrestrial life forms.
TRAPPIST-1, Large $\Delta t$. To answer Q1, we choose a large data timestep $\Delta t = 2.4 \times 10^{-3}$ yr. The closest planet has an orbital period of about $2\Delta t$ ($4.1 \times 10^{-3}$ yr), while the rigid-body correction, tidal force, and GR correction act on much longer scales. Only Lie T2 successfully trains. All other methods diverge during training despite attempts at stabilization with techniques such as normalization (LayerNorm [41], GroupNorm [42]). Reducing $h$ improves integration accuracy, but increases the RNN depth and makes training more unstable. We compare with the solution for the point-mass potential only ('No Correction'). Our method reduces the error by up to two orders of magnitude in measures of trajectory error and potential gradients (Table 1). See App. E for column definitions.
TRAPPIST-1, Small $\Delta t$. To gain more insight into Q2, we shrink $\Delta t$ until almost all methods converge, and consider only conservative forces (i.e., no tidal force or GR). The mean errors in the predicted trajectory and in derivatives of the learned potential $V$ after 500 integrator steps are shown in Table 2. Both S methods achieve small errors in position-related terms. Verlet has a large rotational error since it does not integrate on the rotation manifold. L methods achieve lower rotational errors but are worse elsewhere. Lie T2, being both S and L, achieves the lowest error on both fronts.