Action functional gradient descent algorithm for estimating escape paths in stochastic
chemical reaction networks
Praful Gagrani1,2, Eric Smith1,3,4,5,6
1Department of Physics, University of Wisconsin–Madison, Madison, WI 53706, USA
2Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
3Department of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
4Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo 152-8550, Japan
5Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
6Ronin Institute, 127 Haddon Place, Montclair, NJ 07043, USA
(Dated: February 7, 2023)
We first derive the Hamilton-Jacobi theory underlying continuous-time Markov processes, and
then use the construction to develop a variational algorithm for estimating escape (least improbable
or first passage) paths for a generic stochastic chemical reaction network that exhibits multiple fixed
points. The design of our algorithm is such that it is independent of the underlying dimensionality of
the system, the discretization control parameters are updated towards the continuum limit, and there
is an easy-to-calculate measure for the correctness of its solution. We consider several applications
of the algorithm and verify them against computationally expensive means such as the shooting
method and stochastic simulation. While we employ theoretical techniques from mathematical
physics, numerical optimization and chemical reaction network theory, we hope that our work finds
practical applications with an interdisciplinary audience including chemists, biologists, optimal
control theorists and game theorists.
Contents
I. Introduction 2
II. From master equation to Hamilton-Jacobi
equation for CRN 2
A. Hamilton-Jacobi equation and
Non-Equilibrium Potential (NEP) for
stochastic dynamics 3
B. Hamilton-Jacobi theory for stochastic
Chemical Reaction Networks (CRN) 5
C. Switching dynamics for a multistable
stochastic CRN 8
III. Action Functional Gradient Descent
(AFGD) algorithm 9
A. Formulation as a MinMax problem 10
B. Algorithm and pseudocode 11
1. Main idea 11
2. Noise and discretization 11
3. Verification protocol 13
4. Algorithm in a nutshell 14
C. Performance costs and relation to other
methods 14
IV. Application and results 14
A. Selkov model 16
B. N-D Schlögl model 16
V. Discussion 21
Acknowledgements 22
A. Experimental applications of AFGD 22
B. Rederiving previous results in the
Hamilton-Jacobi formalism 23
C. Non-equilibrium action, action functional
and its first and second variational
derivatives 25
1. Deriving stochastic and quantum dynamics
from the non-equilibrium action (NEA) 25
2. Derivation of the path integral formula 27
3. Optimality condition: Hamilton’s equations of
motion and their relation to the
Hamilton-Jacobi equation 28
4. Second variational derivative of action
functional: Onsager-Machlup action and
convexity of instanton 29
D. Details of AFGD 31
1. Notation 31
2. Initial condition 31
3. Lift curve to trajectory 32
4. Functional gradient - obtaining and filtering 34
5. Filtering and resampling routines 34
a. Filtering 34
b. Resampling 35
6. Pick step size 35
7. Updating cutoff frequency and increasing
sample points during descent 36
a. Updating cutoff frequency 36
b. Annealing or increasing sample points 37
8. Integrating Hamilton’s equations of motion
and convergence criteria 38
References 38
arXiv:2210.15419v2 [cond-mat.stat-mech] 6 Feb 2023
I. INTRODUCTION
Stochastic modelling has played an increasingly central
role in science since the advent of high speed computing.
There are, however, many categories of events that are
vital to the organization of long-term dynamics for which
the rate diminishes exponentially with system size (‘rare’
events) and computer simulations become unaffordable.
A typical approach in these scenarios is to employ importance sampling in the stochastic simulation to efficiently simulate the rare event of interest (see [1–3]).
In this work, for the particular case of stochastic chemical
reaction networks, we provide a deterministic alternative
for estimating the likelihood of rare events.
Stochastic nonlinear dynamical processes exhibiting
multistability, or multiple coexisting attractors, have
found several scientific applications ranging from mod-
elling climate change to population biology and the origin
of life [4–7], and are an active subject of interdisciplinary
research. A question of critical importance for any practical application of such a system is: how often does it transition out of a stable attractor, and what are the least-improbable paths by which such a transition occurs? In the literature, the dynamics that arise from the system's transitioning from one stable attractor to another and the optimal path of transition are referred to as the 'switching dynamics' and the 'escape path', respectively.
Many phenomena in biology, chemistry, physics or en-
gineering can be modelled as a chemical reaction network
(CRN). It is also well known that CRNs can exhibit a
host of dynamics, including limit cycles and multistabil-
ity [8]. Perhaps less well known is the role of Hamilton-Jacobi ray theory underlying stochastic CRNs, and the interpretation of escape paths as particular characteristic curves that arise as solutions of the associated Hamilton-Jacobi equation. In our work, we review the
necessary formalism and employ techniques from math-
ematical physics and numerical optimization to estimate
the escape paths for a multistable CRN. More precisely,
we first recognize that the escape paths are variational
(locally least-action) solutions to the action functional,
and subsequently use the functional to perform a gradi-
ent descent that converges on the desired path.
It must be noted that neither the recognition of the escape path as a variational solution nor the use of the action functional to perform gradient descent is original to this work (for instance, see [9]); however, we differ from earlier work in a few ways. In the past, the use of Hamilton-Jacobi theory to find the escape path for a CRN has only been made via the 'shooting method', which relies on integrating Hamilton's equations of motion, rather than the functional gradient descent approach that we present here (see [10]). On the other hand, while the action functional has been used for gradient descent, to the best of our knowledge the form of the Hamiltonian assumed has always been separable in momentum and position (see [11, 12]). This form of the Hamiltonian, together with other constraints on its functional form, makes it amenable to finding an analytic form for the associated Lagrangian, which drastically simplifies the gradient descent. While this assumption is typical for a Hamiltonian describing mechanical energy in physics, it is far too restrictive for a generic Hamiltonian in chemistry, as can be seen below:
$$H_{\mathrm{ME}}(p, q) = K(p) + U(q),$$
$$H_{\mathrm{CRN}}(p, q) = \sum_{y_\alpha, y_\beta} \left( e^{(y_\beta - y_\alpha) \cdot p} - 1 \right) k_{y_\alpha y_\beta}\, q^{y_\alpha}, \qquad (1)$$
where p and q are the momentum and position coordinates, K(p) and U(q) are the kinetic and potential energy, and $H_{\mathrm{ME}}$ and $H_{\mathrm{CRN}}$ (Eq. 16) are the Hamiltonians for mechanical energy and chemical reaction networks,
respectively. In our algorithm, however, we start from
the Hamiltonian, numerically solve for the Lagrangian
at each iteration and use it to calculate the descent di-
rection. While our algorithm is designed for stochastic
CRNs, it is amenable to generalization for other classes
of stochastic Hamiltonian dynamical systems.
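To make Eq. 1 and this step concrete, below is a minimal Python sketch (the reference implementation accompanying the paper is in MATLAB [13]; this toy is ours and purely illustrative) that evaluates $H_{\mathrm{CRN}}$ for a hypothetical one-species network and recovers the Lagrangian at a given (q, q̇) by numerically maximizing p · q̇ − H(p, q) over p. The reaction list, rate constants and query point are placeholders, not values from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical one-species network (illustrative rates): 0 -> X and 2X -> X.
# Each reaction is (y_alpha, y_beta, rate constant), with complexes as stoichiometric vectors.
reactions = [
    (np.array([0.0]), np.array([1.0]), 1.0),   # 0 -> X
    (np.array([2.0]), np.array([1.0]), 0.5),   # 2X -> X
]

def H_crn(p, q):
    """CRN Hamiltonian of Eq. (1): sum over reactions of (e^{(y_b - y_a).p} - 1) * k * q^{y_a}."""
    p, q = np.atleast_1d(p), np.atleast_1d(q)
    return sum((np.exp((y_b - y_a) @ p) - 1.0) * k * np.prod(q ** y_a)
               for y_a, y_b, k in reactions)

def lagrangian(q, qdot):
    """Numerical Legendre transform L(q, qdot) = sup_p [p.qdot - H(p, q)].
    Since H is convex in p, the local maximum found here is the global one."""
    qdot = np.atleast_1d(qdot)
    obj = lambda p: -(p @ qdot - H_crn(p, q))
    res = minimize(obj, x0=np.zeros_like(qdot), method="BFGS")
    return -res.fun, res.x          # Lagrangian value and the maximizing momentum

# Example: cost rate of moving against the deterministic drift at q = 1.2.
L_val, p_star = lagrangian(q=np.array([1.2]), qdot=np.array([-0.3]))
```

The same two ingredients, a Hamiltonian written down directly from the network and a Lagrangian obtained numerically rather than analytically, are what the descent direction of Section III is built from.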
The layout of the paper is as follows. In Section II,
we derive the Hamilton-Jacobi theory for stochastic pro-
cesses starting from a master equation and apply it to
stochastic CRN. We show that the Hamilton-Jacobi the-
ory of the non-equilibrium potential (NEP) arises natu-
rally as a result of a variational principle applied to the
action functional. In Section III, we propose an action
functional gradient descent (AFGD) algorithm that finds
the variational solution in the space of paths constrained
at end points for a given Hamiltonian value constraint.
We also explain how the algorithm can be used to find
least-improbable escape paths out of a stable attractor
and assign a value to the NEP along them. The details
of the implementation are provided in Appendix D, and
a MATLAB implementation is made available at [13].
In Section IV, we demonstrate the applications of the
AFGD algorithm on several CRNs. We first consider the
Selkov model, and compare the result of the algorithm
against the escape trajectory found by the ‘shooting-
method' (Figure 9). We then define a class of high-dimensional birth-death models, namely the 'N-Schlögl model', and compare the results of the algorithm against a stochastic simulation for the 2-Schlögl model (Figure 4). We also use the algorithm on the six-dimensional 6-Schlögl model,
and compare the result against the integration of Hamil-
ton’s equations of motion (Figure 21). Finally, in Section
V, we conclude with a discussion of our contribution and
potential avenues of future research.
II. FROM MASTER EQUATION TO
HAMILTON-JACOBI EQUATION FOR CRN
The application of Hamilton-Jacobi theory to chemical
reaction networks (CRNs) has a long history [10, 14]. In
this section, we review the necessary formulation needed
to understand the switching dynamics of stochastic chem-
ical reaction networks (Section II C) and to devise a vari-
ational algorithm for predicting the transitions (Section
III). In Section II A we derive the non-equilibrium poten-
tial for continuous time Markov population processes and
explain its role in estimating the probability of stochas-
tic events. In Section II B, we derive the Hamiltonian for
a CRN and prove that the NEP is a Lyapunov function
along the deterministic trajectories under mass-action ki-
netics. In Appendix A, we investigate the relevance our
work might have to a stochastic modelling practitioner
by posing a general practical problem, giving an overall
picture of the solution and explaining where our algo-
rithm fits in. In Appendix B, we employ the Hamilton-
Jacobi formalism to recover the ACK theorem and NEP
for complex-balanced systems. In Appendix C, we de-
fine a ‘non-equilibrium action’, of which both the master
equation and Schrödinger equation can be seen as a vari-
ational solution, derive the path integral formula and ac-
tion functional for stochastic population processes, and
calculate the first and second variational derivatives of
the action functional. There are many resources that
provide an alternative treatment of the same subject, for
instance see [15, 16] and references therein.
A. Hamilton-Jacobi equation and Non-Equilibrium
Potential (NEP) for stochastic dynamics
A continuous-time Markovian population process is specified by its master equation,
$$\partial_t\, \rho(n, t) = \sum_{n'} T_{n n'}\, \rho(n', t),$$
$$\sum_n \rho(n, t) = 1 \ \text{ for all } t, \qquad \sum_n T_{n n'} = 0 \ \text{ for all } n', \qquad (2)$$
where t is the time variable, n is a D-dimensional discrete vector in the positive integer lattice $\mathbb{Z}^D_{\geq 0}$ denoting a position in the state space (to simplify notation we drop the arrow in $\vec{n}$), and ρ(n, t) is a time-evolving probability distribution function (PDF) over the state space. T is referred to as a transition operator, and in this paper we only consider time-independent transition operators.
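As a concrete (and entirely hypothetical) illustration of Eq. 2, the Python sketch below assembles the transition operator T for a one-species birth-death process on a truncated state space, checks that its columns sum to zero, and propagates a distribution forward in time; the rates and the truncation level are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.linalg import expm

b, d, N = 5.0, 1.0, 60                  # illustrative birth rate, per-capita death rate, truncation
T = np.zeros((N + 1, N + 1))            # transition operator on states n = 0, ..., N
for n in range(N + 1):
    if n < N:                           # birth 0 -> X: n -> n + 1 at rate b
        T[n + 1, n] += b
        T[n, n] -= b
    if n > 0:                           # death X -> 0: n -> n - 1 at rate d * n
        T[n - 1, n] += d * n
        T[n, n] -= d * n

assert np.allclose(T.sum(axis=0), 0.0)  # third line of Eq. 2: every column sums to zero

rho0 = np.zeros(N + 1)
rho0[0] = 1.0                           # all probability mass at n = 0 initially
rho_t = expm(T * 2.0) @ rho0            # rho(n, t = 2), solving d rho / dt = T rho
assert np.isclose(rho_t.sum(), 1.0)     # normalization (second line of Eq. 2) is preserved
```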
For what follows, it is useful to recast Eq. 2 into the following form and define a Hamiltonian operator $\hat{H}$ that acts on the PDF ρ,
$$\partial_t\, \rho(n, t) = \hat{H}\!\left(\frac{\partial}{\partial n}, n\right) \rho(n, t), \qquad (3)$$
where ∂/∂n is the infinitesimal-shift operator¹.
¹ A function in x can be shifted by y by the application of a shift operator given by $e^{y\,\partial/\partial x}$, as can be seen by $f(x+y) = \sum_{n=0}^{\infty} \frac{y^n}{n!} \frac{\partial^n f(x)}{\partial x^n} = e^{y\,\partial/\partial x} f(x)$.
The resemblance of Eq. 3 to the Schrödinger equation
is not a mere coincidence, and we show in Section C 1
how they can both be derived from a common variational problem of extremizing what is termed the 'non-equilibrium action' (NEA) in [17]. It should come
as no surprise then that we can employ the same ma-
chinery developed in mathematical physics for quantum
mechanics to derive results about stochastic dynamics.
This line of reasoning has a long history and we refer the
readers to [6, 18–20] for a ‘second-quantization’ treat-
ment via abstract linear algebra.
We can integrate Eq. 3 starting from a PDF $\rho(n_0, 0)$ at time 0 to get a distribution $\rho(n_T, T)$ at time T, indexed by $n_0$ and $n_T$ respectively,
$$\rho(n_T, T) = e^{\int_0^T \hat{H}\, dt}\, \rho(n_0, 0) = \int [dn] \int [dp]\, e^{-\mathcal{A}[n(t), p(t)]}\, \rho(n_0, 0), \qquad (4)$$
where $\mathcal{A}$ is the action functional,
$$\mathcal{A}[n(t), p(t)] = \int_0^T \left( p \cdot \frac{dn}{dt} - H(p, n) \right) dt, \qquad (5)$$
and p is a momentum variable canonically conjugate to the position variable n. The second line in Eq. 4 is the path-integral formula, a proof of which can be found in Section C 2, and [dn][dp] is the path-integral measure. Note that in Eq. 5, the Hamiltonian function H(p, n) has the same functional form as the Hamiltonian operator $\hat{H}$ except that the operator ∂/∂n is replaced by the variable p.
In the first line of Eq. 4, the exponentiation of the Hamiltonian operator is to be understood as a time-ordered matrix product over small intervals dt. Its role is to time-evolve the initial distribution ρ(0) and accumulate probability through all possible chains of states until time T so as to obtain a new distribution ρ(T).
In the second line, upon taking appropriate limits signifying a continuous time parameter, we obtain another description of the same process where we sum over all the paths starting at $n_0$ at time 0 and ending at $n_T$ at time T, weighted with an appropriate measure. The negative log of the measure of a particular path is given by the action functional (in other words, $\mathcal{A}$ takes as input a path and returns its log-improbability), and integrating under the path-integral measure amounts to summing over all paths. For a didactic introduction to the topic and its application to stochastic dynamics, we refer the reader to [19].
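The first line of Eq. 4 can be sanity-checked numerically on the small birth-death example above: a time-ordered product of first-order factors over small intervals dt converges to the exact matrix exponential. The sketch reuses the generator T and initial distribution rho0 from the earlier snippet; the step size and horizon are arbitrary.

```python
# Time-ordered product over small intervals dt versus direct exponentiation (first line of Eq. 4).
dt, t_final = 1e-4, 2.0
steps = int(round(t_final / dt))
one_step = np.eye(N + 1) + T * dt                      # first-order propagator for one interval
rho_product = np.linalg.matrix_power(one_step, steps) @ rho0
rho_exact = expm(T * t_final) @ rho0
assert np.allclose(rho_product, rho_exact, atol=1e-3)  # agreement up to O(dt) discretization error
```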
It is useful to pass from a discrete to a continuous state space by descaling n with a scale factor V and considering the large-V limit. In order to align our presentation
here with the mathematical physics literature, we denote the continuous variable by q = n/V and refer to the continuous state space as configuration space. There are several interpretations the variables can take; the one of interest to us is where n is a population vector (thus n ≥ 0), q is the concentration vector and V is a scale of the total population. Following Eq. 4, we can write the time evolution of a distribution in the transformed coordinate q as
$$\rho(q, T) = \int [dq] \int [dp]\, e^{-V \mathcal{A}[q(t), p(t)]}\, \rho(q(0), 0),$$
$$\mathcal{A}[q(t), p(t)] = \int_0^T \left( p \cdot \frac{dq}{dt} - H(p, q) \right) dt, \qquad (6)$$
where $\mathcal{A}$ is the action functional in the new coordinates and has a different Hamiltonian H(p, q), which relates to H(p, n) by²
$$V H(p, q) \sim H(p, qV) = H(p, n). \qquad (7)$$
² We say $f(V) \sim g(V)$ (read as f(V) is asymptotic to g(V)) if $\lim_{V \to \infty} f(V)/g(V) = 1$.
It must be noted that Eq. 7 is an additional constraint
on the form of the Hamiltonian, but is one that is satis-
fied by the CRN Hamiltonian which we are interested in
for the scope of this paper. Moreover, if the asymptotic equality were an exact equality, the condition would often be referred to by saying 'the Hamiltonian is extensive in the position coordinate', or 'the Hamiltonian is a homogeneous function of degree one in the position coordinate'.
We will return to this point in the next subsection, once
we write the explicit Hamiltonian for a CRN.
Together the position and momentum coordinates (q, p) constitute a point in the phase space [21]. In order to evaluate the distribution at time T using Eq. 6, in principle we need to sum over all the phase space paths that end at q at time T. Evaluating the full sum, however, is not necessary to obtain the leading large-V asymptotics. Through Laplace's or saddle-point method, for large V the path integral is dominated by the saddle point of the functional and the distribution at time T can be approximated by
$$\rho(q, T) \approx N(V)\, e^{-V \mathcal{A}[q(t), p(t)]}\, \rho(q(0), 0), \quad \text{where } (q(t), p(t)) \text{ is such that}$$
$$\delta \mathcal{A}[q(t), p(t)] = 0 \ \text{ and } \ q(T) = q, \qquad (8)$$
and N(V) is a normalization factor expounded upon in the subsequent paragraph. The constraint in the last line specifies the stationarity or optimality condition on the saddle-point path (q(t), p(t)), also referred to as the path of stationary action, and we henceforth refer to it simply as the optimal path (for a MinMax formulation of the saddle point, see Section III A). We require
here that the Hamiltonian function is convex in the mo-
mentum variable p, which we show is the case for CRN
Hamiltonians with mass-action kinetics in Section II B.
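For the toy one-species Hamiltonian defined in the earlier snippet, this convexity can be spot-checked numerically with a second finite difference in the momentum (for a multi-species network one would check the full Hessian in p instead); the sampling ranges below are arbitrary.

```python
rng = np.random.default_rng(0)
for _ in range(100):
    p = rng.normal(size=1)                   # random momentum
    q = rng.uniform(0.1, 3.0, size=1)        # random positive concentration
    h = 1e-4
    # Central second difference of H_crn in p; nonnegative if H is convex in p.
    d2 = (H_crn(p + h, q) - 2.0 * H_crn(p, q) + H_crn(p - h, q)) / h**2
    assert d2 >= -1e-8
```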
There are two caveats to Eq. 8 that we now point out. First, the normalization factor N(V) in the first line depends not only on the scale factor V but also on the optimal path. There is an in-principle method, although costly in practice, to obtain N(V) from the action functional itself, outlined in [22], Ch. 7 of [23] and Ch. 4 of [6] (for an application, see [24]). Second, since the path integral is a sum over all paths, there are several stationary paths starting from different q(0) that reach q at T, and one must perform a sum over all of them. This becomes increasingly relevant in the limit T → ∞, where the system can bounce back and forth multiple times between a q(0) and q (see [6, 23]). There is, how-
ever, a two-fold resolution to this problem. First, the normalization factor N(V) is sub-exponential in V, and can be well approximated by unity for large V (also see the introduction to Section III). Second, the initial distribution ρ(q, 0) is typically taken to be peaked at a given value, thus yielding negligible contribution from every q(0) that is not near the peak of the initial condition. Thus we only pick the least-improbable direct paths that start from near the peak of the initial distribution and reach q in Eq. 8.
In the first line of Eq. 8, the only free variables are the configuration q and the final time T, which together characterize a time-evolving distribution. The optimality condition picks a path (q(t), p(t)), and it is only the probability mass at q(0) in the initial distribution at time 0, i.e. ρ(q(0), 0), weighted appropriately by the action functional, that contributes to the final distribution at configuration q and time T. The weight factor $e^{-V \mathcal{A}[q, p]}$ accounts for the fact that not all the probability mass from q(0) goes to q in time T, and in this sense takes the interpretation of a conditional probability.
Since the conditional probability term is only in the exponential, it is useful to write the probability distribution itself as an exponential function, $\rho(q, t) = e^{-V S(q, t)}$. Here S(q, t) is the action function (not to be confused with $\mathcal{A}$, which is the action functional) and its time evolution is given as follows:
$$S(q, T) = \mathcal{A}[q(t), p(t)] + S(q(0), 0) = \int_0^T \left( p \cdot \frac{dq}{dt} - H(p, q) \right) dt + S(q(0), 0), \qquad (9)$$
where in the second line we expand the action functional using Eq. 6. Thus the action function evolves by the optimal value of the action functional, and since it is a
function, we can also consider its total differential:
$$S(q, T) = \int_0^T dS + S(q(0), 0) = \int_0^T \left( \frac{\partial S(q, t)}{\partial q} \cdot \frac{dq}{dt} + \frac{\partial S(q, t)}{\partial t} \right) dt + S(q(0), 0). \qquad (10)$$
Comparing equations 9 and 10, we get
$$\frac{\partial S(q, t)}{\partial q} = p, \qquad \frac{\partial S(q, t)}{\partial t} = -H(p, q), \qquad \frac{\partial S(q, t)}{\partial t} = -H\!\left(\frac{\partial S}{\partial q}, q\right), \qquad (11)$$
where the first two lines are obtained by comparison and the last line is obtained by substituting the first line in the second line. The non-linear PDE in the last equation is called the Hamilton-Jacobi equation. For an introduction to the subject and its history, we refer the readers to [21, 25] (what we call the action function is also called Hamilton's principal function). We briefly remark that the definite integral of the action functional along the optimal path, $\int_0^T dS$, also acts as a divergence function in information geometry (see [20, 26]) and is denoted as $S(q(T) \| q(0))$, yielding the relation $S(q_T) = S(q_T \| q_0) + S(q_0)$, as mentioned in Eq. A2.
The role of momentum for stochastic systems is best understood through the Hamilton-Jacobi equation. The first line in Eq. 11 shows that momentum is the gradient of the action function, which in turn is the descaled negative log-probability of the time-evolving distribution given some initial conditions. Using this, we see that p(0) must be the gradient of the action function S at q(0), and that an optimal path in phase space corresponds to a contour in the probability distribution arising by scaling and exponentiating the action.
The third line in Eq. 2 ensures that the transition operator always has a cokernel, namely the row vector consisting of all ones, [1, . . . , 1]. The Perron-Frobenius theorem for stochastic matrices proves that the operator also has a unique right kernel, referred to as the stationary distribution, which we denote here by π. π thus satisfies $\sum_n T_{n' n}\, \pi(n, t) = 0$, or equivalently $\hat{H} \pi(n, t) = 0$. From Eqs. 2 and 3 one can see that the stationary distribution is indeed time independent, as $\partial \pi(n, t)/\partial t = 0$, thus justifying its name.
The descaled log-improbability or action function of the stationary distribution is commonly referred to as the non-equilibrium potential (NEP) and denoted by $\mathcal{V}$ (see [27]),
$$\pi(n) \approx e^{-V \mathcal{V}(q)}, \qquad (12)$$
where the normalization constant has been set to unity following the discussion below Eq. 8. As the action of a stationary distribution, it must satisfy
$$\frac{\partial \mathcal{V}}{\partial t} = 0, \qquad H\!\left(\frac{\partial \mathcal{V}}{\partial q}, q\right) = 0, \qquad (13)$$
where the second line is the Hamilton-Jacobi equation
from Eq. 11 that must be satisfied by the NEP.
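For small systems, Eq. 12 can be checked directly: compute π as the right kernel of the (truncated) transition operator and read off the NEP as the descaled negative log-probability. Below is a minimal sketch reusing the hypothetical birth-death generator T from the earlier snippet, with an arbitrary illustrative scale factor.

```python
from scipy.linalg import null_space

pi = null_space(T)[:, 0]                         # right kernel of T (one-dimensional here)
pi = np.abs(pi) / np.abs(pi).sum()               # normalize to a probability vector

V_scale = 20.0                                   # illustrative volume / scale factor
q_grid = np.arange(N + 1) / V_scale              # concentration coordinate q = n / V
nep = -np.log(np.maximum(pi, 1e-300)) / V_scale  # NEP estimate: -(1/V) log pi(n), cf. Eq. 12
```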
In the remainder of this section, we will develop and
demonstrate the applications of Hamilton-Jacobi theory
for CRNs. Finding the NEP $\mathcal{V}$ for a general CRN is of importance for getting any numerical estimates of its behavior, and it is the task that we will concern ourselves with in this paper starting from Section III. As explained
above, to calculate the difference of the action function
or NEP between two points we first need to find the op-
timal path connecting them, which is precisely what the
Action Functional Gradient Descent (AFGD) algorithm
is designed to do.
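To convey the idea in code, and only the idea, the sketch below discretizes a path with pinned endpoints, evaluates the action using the numerically obtained Lagrangian from the earlier snippet, and descends along a finite-difference gradient. This deliberately naive routine is not the AFGD algorithm of Section III (which uses a MinMax formulation, filtering of the functional gradient and adaptive discretization); the endpoints, horizon and step sizes are arbitrary illustrative choices.

```python
def seg_action(a, b, dt):
    """Action of one segment: L(q, qdot) * dt, with qdot from a finite difference."""
    return lagrangian(a, (b - a) / dt)[0] * dt

def naive_descent(q_start, q_end, n_pts=12, t_total=8.0, n_iter=60, lr=5e-3, eps=1e-4):
    """Finite-difference gradient descent on the discretized action; endpoints stay fixed."""
    dt = t_total / (n_pts - 1)
    path = np.linspace(q_start, q_end, n_pts)          # straight-line initial guess
    for _ in range(n_iter):
        grad = np.zeros_like(path)
        for k in range(1, n_pts - 1):                  # only interior points move
            base = seg_action(path[k - 1], path[k], dt) + seg_action(path[k], path[k + 1], dt)
            for i in range(path.shape[1]):
                bumped = path[k].copy()
                bumped[i] += eps
                pert = seg_action(path[k - 1], bumped, dt) + seg_action(bumped, path[k + 1], dt)
                grad[k, i] = (pert - base) / eps       # only the two adjacent segments change
        path -= lr * grad
    total = sum(seg_action(path[k], path[k + 1], dt) for k in range(n_pts - 1))
    return path, total

# Example: an approximate least-action path between two hypothetical concentrations.
path, action_val = naive_descent(np.array([0.8]), np.array([2.0]))
```

In practice such a pointwise descent is slow and noisy, and it fixes the transit time rather than the Hamiltonian value; the AFGD algorithm addresses both issues, as detailed in Section III and Appendix D.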
B. Hamilton-Jacobi theory for stochastic Chemical
Reaction Networks (CRN)
In this subsection, we start by introducing the mathematical formulation of CRNs and deriving their Hamiltonian function in the concentration coordinate. Next, we show
that the Hamiltonian function is convex in the momen-
tum coordinate, a necessary condition for the existence of
optimal paths, and write the equations that the optimal
paths must satisfy. Then we classify the optimal paths
constituting the stationary distribution into two cate-
gories, namely relaxation paths and escape paths, and
show that the former yield the deterministic mass-action kinetics (MAK) and the latter yield the non-equilibrium
potential (NEP). Finally we show that, for a general CRN
with multiple fixed points, the NEP is always a Lyapunov
function with respect to MAK.
A CRN is defined by the triple $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$, where $\mathcal{S}, \mathcal{C}, \mathcal{R}$ are the sets of species, complexes and reactions, respectively:
$$\mathcal{S} = \{S_1, \ldots, S_i, \ldots, S_{|\mathcal{S}|}\},$$
$$\mathcal{C} = \{y_1, \ldots, y_\alpha, \ldots, y_{|\mathcal{C}|} : y_\alpha \in \mathbb{N}^{|\mathcal{S}|}\},$$
$$\mathcal{R} = \{y_\alpha \xrightarrow{\ k_{y_\alpha y_\beta}\ } y_\beta : k_{y_\alpha y_\beta} \geq 0\},$$
where Roman letters (i, j) and Greek letters (α, β) are used to denote species and complex indices, respectively. A complex is a multi-set of species, and is denoted by the column vector $y_\alpha$ representing the stoichiometry of the multi-set. The state of a CRN is characterized by a population vector $n \in \mathbb{Z}^{|\mathcal{S}|}_{\geq 0}$, and the time evolution of a probability distribution over the state space is given by (for a rigorous microphysical derivation, see [28])