Action functional gradient descent algorithm for estimating escape paths in stochastic
chemical reaction networks
Praful Gagrani1,2, Eric Smith1,3,4,5,6
1Department of Physics, University of Wisconsin–Madison, Madison, WI 53706, USA
2Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
3Department of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
4Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo 152-8550, Japan
5Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
6Ronin Institute, 127 Haddon Place, Montclair, NJ 07043, USA
(Dated: February 7, 2023)
We first derive the Hamilton-Jacobi theory underlying continuous-time Markov processes, and
then use the construction to develop a variational algorithm for estimating escape (least improbable
or first passage) paths for a generic stochastic chemical reaction network that exhibits multiple fixed
points. The design of our algorithm is such that it is independent of the underlying dimensionality of
the system, the discretization control parameters are updated towards the continuum limit, and there
is an easy-to-calculate measure for the correctness of its solution. We consider several applications
of the algorithm and verify them against computationally expensive means such as the shooting
method and stochastic simulation. While we employ theoretical techniques from mathematical
physics, numerical optimization and chemical reaction network theory, we hope that our work finds
practical applications with an interdisciplinary audience including chemists, biologists, optimal
control theorists and game theorists.
Contents
I. Introduction 2
II. From master equation to Hamilton-Jacobi
equation for CRN 2
A. Hamilton-Jacobi equation and
Non-Equilibrium Potential (NEP) for
stochastic dynamics 3
B. Hamilton-Jacobi theory for stochastic
Chemical Reaction Networks (CRN) 5
C. Switching dynamics for a multistable
stochastic CRN 8
III. Action Functional Gradient Descent
(AFGD) algorithm 9
A. Formulation as a MinMax problem 10
B. Algorithm and pseudocode 11
1. Main idea 11
2. Noise and discretization 11
3. Verification protocol 13
4. Algorithm in a nutshell 14
C. Performance costs and relation to other
methods 14
IV. Application and results 14
A. Selkov model 16
B. N-D Schlögl model 16
V. Discussion 21
Acknowledgements 22
A. Experimental applications of AFGD 22
B. Rederiving previous results in the
Hamilton-Jacobi formalism 23
C. Non-equilibrium action, action functional
and its first and second variational
derivatives 25
1. Deriving stochastic and quantum dynamics
from the non-equilibrium action (NEA) 25
2. Derivation of the path integral formula 27
3. Optimality condition: Hamilton’s equations of
motion and their relation to the
Hamilton-Jacobi equation 28
4. Second variational derivative of action
functional: Onsager-Machlup action and
convexity of instanton 29
D. Details of AFGD 31
1. Notation 31
2. Initial condition 31
3. Lift curve to trajectory 32
4. Functional gradient - obtaining and filtering 34
5. Filtering and resampling routines 34
a. Filtering 34
b. Resampling 35
6. Pick step size 35
7. Updating cutoff frequency and increasing
sample points during descent 36
a. Updating cutoff frequency 36
b. Annealing or increasing sample points 37
8. Integrating Hamilton’s equations of motion
and convergence criteria 38
References 38
arXiv:2210.15419v2 [cond-mat.stat-mech] 6 Feb 2023
I. INTRODUCTION
Stochastic modelling has played an increasingly central
role in science since the advent of high speed computing.
There are, however, many categories of events that are
vital to the organization of long-term dynamics for which
the rate diminishes exponentially with system size (‘rare’
events) and computer simulations become unaffordable.
A typical approach in these scenarios is to employ importance sampling in the stochastic simulation to efficiently simulate the rare event of interest (see [1–3]).
In this work, for the particular case of stochastic chemical
reaction networks, we provide a deterministic alternative
for estimating the likelihood of rare events.
Stochastic nonlinear dynamical processes exhibiting
multistability, or multiple coexisting attractors, have
found several scientific applications ranging from mod-
elling climate change to population biology and the origin
of life [4–7], and are an active subject of interdisciplinary
research. A question of critical importance for any practical application of such a system is: how often does it transition out of a stable attractor, and what are the least-improbable paths by which such a transition occurs? In the literature, the dynamics that arise from the system's transitioning from one stable attractor to another and the optimal path of transition are referred to as the 'switching dynamics' and the 'escape path', respectively.
Many phenomena in biology, chemistry, physics or en-
gineering can be modelled as a chemical reaction network
(CRN). It is also well known that CRNs can exhibit a
host of dynamics, including limit cycles and multistabil-
ity [8]. Perhaps less well known is the role of Hamilton-Jacobi ray theory underlying stochastic CRNs, and the interpretation of escape paths as particular characteristic curves that arise as solutions of the associated Hamilton-Jacobi equation. In our work, we review the
necessary formalism and employ techniques from math-
ematical physics and numerical optimization to estimate
the escape paths for a multistable CRN. More precisely,
we first recognize that the escape paths are variational
(locally least-action) solutions to the action functional,
and subsequently use the functional to perform a gradi-
ent descent that converges on the desired path.
It must be noted that neither the recognition of the escape path as a variational solution nor the use of the action functional to perform gradient descent is original to this work (for instance, see [9]); however, we differ from earlier work in a few ways. In the past, the use of Hamilton-Jacobi theory to find the escape path for a CRN has only been made via the 'shooting method', which relies on integrating Hamilton's equations of motion, rather than the functional gradient descent approach that we present here (see [10]). On the other hand, while the action functional has been used for gradient descent, to the best of our knowledge the form of the Hamiltonian assumed has always been separable in momentum and position (see [11, 12]). This form of the Hamiltonian, together with other constraints on its functional form, makes it amenable to finding an analytic form for the associated Lagrangian, which drastically simplifies the gradient descent. While this assumption is typical for a Hamiltonian describing mechanical energy in physics, it is far too restrictive for a generic Hamiltonian in chemistry, as can be seen below:
$$H_{\mathrm{ME}}(p, q) = K(p) + U(q),$$
$$H_{\mathrm{CRN}}(p, q) = \sum_{y_\alpha, y_\beta} \left( e^{(y_\beta - y_\alpha) \cdot p} - 1 \right) k_{y_\alpha y_\beta}\, q^{y_\alpha}, \qquad (1)$$
where p and q are the momentum and position coordinates, K(p) and U(q) are the kinetic and potential energy, and $H_{\mathrm{ME}}$ and $H_{\mathrm{CRN}}$ (Eq. 16) are the Hamiltonians for mechanical energy and chemical reaction networks,
respectively. In our algorithm, however, we start from
the Hamiltonian, numerically solve for the Lagrangian
at each iteration and use it to calculate the descent di-
rection. While our algorithm is designed for stochastic
CRNs, it is amenable to generalization for other classes
of stochastic Hamiltonian dynamical systems.
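To make Eq. 1 and this step concrete, below is a minimal Python sketch (the reference implementation accompanying the paper is in MATLAB [13]; this toy is ours and purely illustrative) that evaluates $H_{\mathrm{CRN}}$ for a hypothetical one-species network and recovers the Lagrangian at a given (q, q̇) by numerically maximizing p · q̇ − H(p, q) over p. The reaction list, rate constants and query point are placeholders, not values from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical one-species network (illustrative rates): 0 -> X and 2X -> X.
# Each reaction is (y_alpha, y_beta, rate constant), with complexes as stoichiometric vectors.
reactions = [
    (np.array([0.0]), np.array([1.0]), 1.0),   # 0 -> X
    (np.array([2.0]), np.array([1.0]), 0.5),   # 2X -> X
]

def H_crn(p, q):
    """CRN Hamiltonian of Eq. (1): sum over reactions of (e^{(y_b - y_a).p} - 1) * k * q^{y_a}."""
    p, q = np.atleast_1d(p), np.atleast_1d(q)
    return sum((np.exp((y_b - y_a) @ p) - 1.0) * k * np.prod(q ** y_a)
               for y_a, y_b, k in reactions)

def lagrangian(q, qdot):
    """Numerical Legendre transform L(q, qdot) = sup_p [p.qdot - H(p, q)].
    Since H is convex in p, the local maximum found here is the global one."""
    qdot = np.atleast_1d(qdot)
    obj = lambda p: -(p @ qdot - H_crn(p, q))
    res = minimize(obj, x0=np.zeros_like(qdot), method="BFGS")
    return -res.fun, res.x          # Lagrangian value and the maximizing momentum

# Example: cost rate of moving against the deterministic drift at q = 1.2.
L_val, p_star = lagrangian(q=np.array([1.2]), qdot=np.array([-0.3]))
```

The same two ingredients, a Hamiltonian written down directly from the network and a Lagrangian obtained numerically rather than analytically, are what the descent direction of Section III is built from.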
The layout of the paper is as follows. In Section II,
we derive the Hamilton-Jacobi theory for stochastic pro-
cesses starting from a master equation and apply it to
stochastic CRN. We show that the Hamilton-Jacobi the-
ory of the non-equilibrium potential (NEP) arises natu-
rally as a result of a variational principle applied to the
action functional. In Section III, we propose an action
functional gradient descent (AFGD) algorithm that finds
the variational solution in the space of paths constrained
at end points for a given Hamiltonian value constraint.
We also explain how the algorithm can be used to find
least-improbable escape paths out of a stable attractor
and assign a value to the NEP along them. The details
of the implementation are provided in Appendix D, and
a MATLAB implementation is made available at [13].
In Section IV, we demonstrate the applications of the
AFGD algorithm on several CRNs. We first consider the
Selkov model, and compare the result of the algorithm
against the escape trajectory found by the ‘shooting-
method' (Figure 9). We then define a class of high-dimensional birth-death models, namely the 'N-Schlögl model', and compare the results of the algorithm against a stochastic simulation for the 2-Schlögl model (Figure 4). We also use the algorithm on the six-dimensional 6-Schlögl model,
and compare the result against the integration of Hamil-
ton’s equations of motion (Figure 21). Finally, in Section
V, we conclude with a discussion of our contribution and
potential avenues of future research.
II. FROM MASTER EQUATION TO
HAMILTON-JACOBI EQUATION FOR CRN
The application of Hamilton-Jacobi theory to chemical
reaction networks (CRNs) has a long history [10, 14]. In
this section, we review the necessary formulation needed
to understand the switching dynamics of stochastic chem-
ical reaction networks (Section II C) and to devise a vari-
ational algorithm for predicting the transitions (Section
III). In Section II A we derive the non-equilibrium poten-
tial for continuous time Markov population processes and
explain its role in estimating the probability of stochas-
tic events. In Section II B, we derive the Hamiltonian for
a CRN and prove that the NEP is a Lyapunov function
along the deterministic trajectories under mass-action ki-
netics. In Appendix A, we investigate the relevance our
work might have to a stochastic modelling practitioner
by posing a general practical problem, giving an overall
picture of the solution and explaining where our algo-
rithm fits in. In Appendix B, we employ the Hamilton-
Jacobi formalism to recover the ACK theorem and NEP
for complex-balanced systems. In Appendix C, we de-
fine a ‘non-equilibrium action’, of which both the master
equation and Schrödinger equation can be seen as a vari-
ational solution, derive the path integral formula and ac-
tion functional for stochastic population processes, and
calculate the first and second variational derivatives of
the action functional. There are many resources that
provide an alternative treatment of the same subject, for
instance see [15, 16] and references therein.
A. Hamilton-Jacobi equation and Non-Equilibrium
Potential (NEP) for stochastic dynamics
A continuous-time Markovian population process is specified by its master equation,
$$\partial_t\, \rho(n, t) = \sum_{n'} T_{n n'}\, \rho(n', t),$$
$$\sum_n \rho(n, t) = 1 \ \text{ for all } t, \qquad \sum_n T_{n n'} = 0 \ \text{ for all } n', \qquad (2)$$
where t is the time variable, n is a D-dimensional discrete vector in the positive integer lattice $\mathbb{Z}^D_{\geq 0}$ denoting a position in the state space (to simplify notation we drop the arrow in $\vec{n}$), and ρ(n, t) is a time-evolving probability distribution function (PDF) over the state space. T is referred to as a transition operator, and in this paper we only consider time-independent transition operators.
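As a concrete (and entirely hypothetical) illustration of Eq. 2, the Python sketch below assembles the transition operator T for a one-species birth-death process on a truncated state space, checks that its columns sum to zero, and propagates a distribution forward in time; the rates and the truncation level are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.linalg import expm

b, d, N = 5.0, 1.0, 60                  # illustrative birth rate, per-capita death rate, truncation
T = np.zeros((N + 1, N + 1))            # transition operator on states n = 0, ..., N
for n in range(N + 1):
    if n < N:                           # birth 0 -> X: n -> n + 1 at rate b
        T[n + 1, n] += b
        T[n, n] -= b
    if n > 0:                           # death X -> 0: n -> n - 1 at rate d * n
        T[n - 1, n] += d * n
        T[n, n] -= d * n

assert np.allclose(T.sum(axis=0), 0.0)  # third line of Eq. 2: every column sums to zero

rho0 = np.zeros(N + 1)
rho0[0] = 1.0                           # all probability mass at n = 0 initially
rho_t = expm(T * 2.0) @ rho0            # rho(n, t = 2), solving d rho / dt = T rho
assert np.isclose(rho_t.sum(), 1.0)     # normalization (second line of Eq. 2) is preserved
```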
For what follows, it is useful to recast Eq. 2 into the following form and define a Hamiltonian operator $\hat{H}$ that acts on the PDF ρ,
$$\partial_t\, \rho(n, t) = \hat{H}\!\left(\frac{\partial}{\partial n}, n\right) \rho(n, t), \qquad (3)$$
where ∂/∂n is the infinitesimal-shift operator¹.
¹ A function in x can be shifted by y by the application of a shift operator given by $e^{y\,\partial/\partial x}$, as can be seen by $f(x+y) = \sum_{n=0}^{\infty} \frac{y^n}{n!} \frac{\partial^n f(x)}{\partial x^n} = e^{y\,\partial/\partial x} f(x)$.
The resemblance of Eq. 3 to the Schrödinger equation
is not a mere coincidence, and we show in Section C 1
how they can both be derived from a common variational problem of extremizing what is termed the 'non-equilibrium action' (NEA) in [17]. It should come
as no surprise then that we can employ the same ma-
chinery developed in mathematical physics for quantum
mechanics to derive results about stochastic dynamics.
This line of reasoning has a long history and we refer the
readers to [6, 18–20] for a ‘second-quantization’ treat-
ment via abstract linear algebra.
We can integrate Eq. 3 starting from a PDF $\rho(n_0, 0)$ at time 0 to get a distribution $\rho(n_T, T)$ at time T, indexed by $n_0$ and $n_T$ respectively,
$$\rho(n_T, T) = e^{\int_0^T \hat{H}\, dt}\, \rho(n_0, 0) = \int [dn] \int [dp]\, e^{-\mathcal{A}[n(t), p(t)]}\, \rho(n_0, 0), \qquad (4)$$
where $\mathcal{A}$ is the action functional,
$$\mathcal{A}[n(t), p(t)] = \int_0^T \left( p \cdot \frac{dn}{dt} - H(p, n) \right) dt, \qquad (5)$$
and p is a momentum variable canonically conjugate to the position variable n. The second line in Eq. 4 is the path-integral formula, a proof of which can be found in Section C 2, and [dn][dp] is the path-integral measure. Note that in Eq. 5, the Hamiltonian function H(p, n) has the same functional form as the Hamiltonian operator $\hat{H}$ except that the operator ∂/∂n is replaced by the variable p.
In the first line of Eq. 4, the exponentiation of the Hamiltonian operator is to be understood as a time-ordered matrix product over small intervals dt. Its role is to time-evolve the initial distribution ρ(0) and accumulate probability through all possible chains of states until time T so as to obtain a new distribution ρ(T).
In the second line, upon taking appropriate limits signifying a continuous time parameter, we obtain another description of the same process where we sum over all the paths starting at $n_0$ at time 0 and ending at $n_T$ at time T, weighted with an appropriate measure. The negative log of the measure of a particular path is given by the action functional (in other words, $\mathcal{A}$ takes as input a path and returns its log-improbability), and integrating under the path-integral measure amounts to summing over all paths. For a didactic introduction to the topic and its application to stochastic dynamics, we refer the reader to [19].
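The first line of Eq. 4 can be sanity-checked numerically on the small birth-death example above: a time-ordered product of first-order factors over small intervals dt converges to the exact matrix exponential. The sketch reuses the generator T and initial distribution rho0 from the earlier snippet; the step size and horizon are arbitrary.

```python
# Time-ordered product over small intervals dt versus direct exponentiation (first line of Eq. 4).
dt, t_final = 1e-4, 2.0
steps = int(round(t_final / dt))
one_step = np.eye(N + 1) + T * dt                      # first-order propagator for one interval
rho_product = np.linalg.matrix_power(one_step, steps) @ rho0
rho_exact = expm(T * t_final) @ rho0
assert np.allclose(rho_product, rho_exact, atol=1e-3)  # agreement up to O(dt) discretization error
```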
It is useful to pass from a discrete to a continuous state space by descaling n with a scale factor V and considering the large-V limit. In order to align our presentation
here with the mathematical physics literature, we denote the continuous variable by q = n/V and refer to the continuous state space as configuration space. There are several interpretations the variables can take; the one of interest to us is where n is a population vector (thus n ≥ 0), q is the concentration vector and V is a scale of the total population. Following Eq. 4, we can write the time evolution of a distribution in the transformed coordinate q as
$$\rho(q, T) = \int [dq] \int [dp]\, e^{-V \mathcal{A}[q(t), p(t)]}\, \rho(q(0), 0),$$
$$\mathcal{A}[q(t), p(t)] = \int_0^T \left( p \cdot \frac{dq}{dt} - H(p, q) \right) dt, \qquad (6)$$
where $\mathcal{A}$ is the action functional in the new coordinates and has a different Hamiltonian H(p, q), which relates to H(p, n) by²
$$V H(p, q) \sim H(p, qV) = H(p, n). \qquad (7)$$
² We say $f(V) \sim g(V)$ (read as f(V) is asymptotic to g(V)) if $\lim_{V \to \infty} f(V)/g(V) = 1$.
It must be noted that Eq. 7 is an additional constraint
on the form of the Hamiltonian, but is one that is satis-
fied by the CRN Hamiltonian which we are interested in
for the scope of this paper. Moreover, if the asymptotic equality were an exact equality, the condition would often be referred to by saying 'the Hamiltonian is extensive in the position coordinate', or 'the Hamiltonian is a homogeneous function of degree one in the position coordinate'.
We will return to this point in the next subsection, once
we write the explicit Hamiltonian for a CRN.
Together the position and momentum coordinates (q, p) constitute a point in the phase space [21]. In order to evaluate the distribution at time T using Eq. 6, in principle we need to sum over all the phase space paths that end at q at time T. Evaluating the full sum, however, is not necessary to obtain the leading large-V asymptotics. Through Laplace's or saddle-point method, for large V the path integral is dominated by the saddle point of the functional and the distribution at time T can be approximated by
$$\rho(q, T) \approx N(V)\, e^{-V \mathcal{A}[q(t), p(t)]}\, \rho(q(0), 0), \quad \text{where } (q(t), p(t)) \text{ is such that}$$
$$\delta \mathcal{A}[q(t), p(t)] = 0 \ \text{ and } \ q(T) = q, \qquad (8)$$
and N(V) is a normalization factor expounded upon in the subsequent paragraph. The constraint in the last line specifies the stationarity or optimality condition on the saddle-point path (q(t), p(t)), also referred to as the path of stationary action, and we henceforth refer to it simply as the optimal path (for a MinMax formulation of the saddle point, see Section III A). We require
here that the Hamiltonian function is convex in the mo-
mentum variable p, which we show is the case for CRN
Hamiltonians with mass-action kinetics in Section II B.
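For the toy one-species Hamiltonian defined in the earlier snippet, this convexity can be spot-checked numerically with a second finite difference in the momentum (for a multi-species network one would check the full Hessian in p instead); the sampling ranges below are arbitrary.

```python
rng = np.random.default_rng(0)
for _ in range(100):
    p = rng.normal(size=1)                   # random momentum
    q = rng.uniform(0.1, 3.0, size=1)        # random positive concentration
    h = 1e-4
    # Central second difference of H_crn in p; nonnegative if H is convex in p.
    d2 = (H_crn(p + h, q) - 2.0 * H_crn(p, q) + H_crn(p - h, q)) / h**2
    assert d2 >= -1e-8
```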
There are two caveats to Eq. 8 that we now point out. First, the normalization factor N(V) in the first line depends not only on the scale factor V but also on the optimal path. There is an in-principle method, although costly in practice, to obtain N(V) from the action functional itself, outlined in [22], Ch. 7 of [23] and Ch. 4 of [6] (for an application, see [24]). Second, since the path integral is a sum over all paths, there are several stationary paths starting from different q(0) that reach q at T, and one must perform a sum over all of them. This becomes increasingly relevant in the limit T → ∞, where the system can bounce back and forth multiple times between a q(0) and q (see [6, 23]). There is, how-
ever, a two-fold resolution to this problem. First, the normalization factor N(V) is sub-exponential in V, and can be well approximated by unity for large V (also see the introduction to Section III). Second, the initial distribution ρ(q, 0) is typically taken to be peaked at a given value, thus yielding negligible contribution from every q(0) that is not near the peak of the initial condition. Thus we only pick the least-improbable direct paths that start from near the peak of the initial distribution and reach q in Eq. 8.
In the first line of Eq. 8, the only free variables are the configuration q and the final time T, which together characterize a time-evolving distribution. The optimality condition picks a path (q(t), p(t)), and it is only the probability mass at q(0) in the initial distribution at time 0, i.e. ρ(q(0), 0), weighted appropriately by the action functional, that contributes to the final distribution at configuration q and time T. The weight factor $e^{-V \mathcal{A}[q, p]}$ accounts for the fact that not all the probability mass from q(0) goes to q in time T, and in this sense takes the interpretation of a conditional probability.
Since the conditional probability term is only in the exponential, it is useful to write the probability distribution itself as an exponential function, $\rho(q, t) = e^{-V S(q, t)}$. Here S(q, t) is the action function (not to be confused with $\mathcal{A}$, which is the action functional) and its time evolution is given as follows:
$$S(q, T) = \mathcal{A}[q(t), p(t)] + S(q(0), 0) = \int_0^T \left( p \cdot \frac{dq}{dt} - H(p, q) \right) dt + S(q(0), 0), \qquad (9)$$
where in the second line we expand the action functional using Eq. 6. Thus the action function evolves by the optimal value of the action functional, and since it is a
function, we can also consider its total differential:
$$S(q, T) = \int_0^T dS + S(q(0), 0) = \int_0^T \left( \frac{\partial S(q, t)}{\partial q} \cdot \frac{dq}{dt} + \frac{\partial S(q, t)}{\partial t} \right) dt + S(q(0), 0). \qquad (10)$$
Comparing equations 9 and 10, we get
$$\frac{\partial S(q, t)}{\partial q} = p, \qquad \frac{\partial S(q, t)}{\partial t} = -H(p, q), \qquad \frac{\partial S(q, t)}{\partial t} = -H\!\left(\frac{\partial S}{\partial q}, q\right), \qquad (11)$$
where the first two lines are obtained by comparison and the last line is obtained by substituting the first line in the second line. The non-linear PDE in the last equation is called the Hamilton-Jacobi equation. For an introduction to the subject and its history, we refer the readers to [21, 25] (what we call the action function is also called Hamilton's principal function). We briefly remark that the definite integral of the action functional along the optimal path, $\int_0^T dS$, also acts as a divergence function in information geometry (see [20, 26]) and is denoted as $S(q(T) \| q(0))$, yielding the relation $S(q_T) = S(q_T \| q_0) + S(q_0)$, as mentioned in Eq. A2.
The role of momentum for stochastic systems is best understood through the Hamilton-Jacobi equation. The first line in Eq. 11 shows that momentum is the gradient of the action function, which in turn is the descaled negative log-probability of the time-evolving distribution given some initial conditions. Using this, we see that p(0) must be the gradient of the action function S at q(0), and that an optimal path in phase space corresponds to a contour in the probability distribution arising by scaling and exponentiating the action.
The third line in Eq. 2 ensures that the transition operator always has a cokernel, namely the row vector consisting of all ones, [1, . . . , 1]. The Perron-Frobenius theorem for stochastic matrices proves that the operator also has a unique right kernel, referred to as the stationary distribution, which we denote here by π. π thus satisfies $\sum_n T_{n' n}\, \pi(n, t) = 0$, or equivalently $\hat{H} \pi(n, t) = 0$. From Eqs. 2 and 3 one can see that the stationary distribution is indeed time independent, as $\partial \pi(n, t)/\partial t = 0$, thus justifying its name.
The descaled log-improbability or action function of the stationary distribution is commonly referred to as the non-equilibrium potential (NEP) and denoted by $\mathcal{V}$ (see [27]),
$$\pi(n) \approx e^{-V \mathcal{V}(q)}, \qquad (12)$$
where the normalization constant has been set to unity following the discussion below Eq. 8. As the action of a stationary distribution, it must satisfy
$$\frac{\partial \mathcal{V}}{\partial t} = 0, \qquad H\!\left(\frac{\partial \mathcal{V}}{\partial q}, q\right) = 0, \qquad (13)$$
where the second line is the Hamilton-Jacobi equation
from Eq. 11 that must be satisfied by the NEP.
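For small systems, Eq. 12 can be checked directly: compute π as the right kernel of the (truncated) transition operator and read off the NEP as the descaled negative log-probability. Below is a minimal sketch reusing the hypothetical birth-death generator T from the earlier snippet, with an arbitrary illustrative scale factor.

```python
from scipy.linalg import null_space

pi = null_space(T)[:, 0]                         # right kernel of T (one-dimensional here)
pi = np.abs(pi) / np.abs(pi).sum()               # normalize to a probability vector

V_scale = 20.0                                   # illustrative volume / scale factor
q_grid = np.arange(N + 1) / V_scale              # concentration coordinate q = n / V
nep = -np.log(np.maximum(pi, 1e-300)) / V_scale  # NEP estimate: -(1/V) log pi(n), cf. Eq. 12
```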
In the remainder of this section, we will develop and
demonstrate the applications of Hamilton-Jacobi theory
for CRNs. Finding the NEP $\mathcal{V}$ for a general CRN is of importance for getting any numerical estimates of its behavior, and it is the task that we will concern ourselves with in this paper starting from Section III. As explained
above, to calculate the difference of the action function
or NEP between two points we first need to find the op-
timal path connecting them, which is precisely what the
Action Functional Gradient Descent (AFGD) algorithm
is designed to do.
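To convey the idea in code, and only the idea, the sketch below discretizes a path with pinned endpoints, evaluates the action using the numerically obtained Lagrangian from the earlier snippet, and descends along a finite-difference gradient. This deliberately naive routine is not the AFGD algorithm of Section III (which uses a MinMax formulation, filtering of the functional gradient and adaptive discretization); the endpoints, horizon and step sizes are arbitrary illustrative choices.

```python
def seg_action(a, b, dt):
    """Action of one segment: L(q, qdot) * dt, with qdot from a finite difference."""
    return lagrangian(a, (b - a) / dt)[0] * dt

def naive_descent(q_start, q_end, n_pts=12, t_total=8.0, n_iter=60, lr=5e-3, eps=1e-4):
    """Finite-difference gradient descent on the discretized action; endpoints stay fixed."""
    dt = t_total / (n_pts - 1)
    path = np.linspace(q_start, q_end, n_pts)          # straight-line initial guess
    for _ in range(n_iter):
        grad = np.zeros_like(path)
        for k in range(1, n_pts - 1):                  # only interior points move
            base = seg_action(path[k - 1], path[k], dt) + seg_action(path[k], path[k + 1], dt)
            for i in range(path.shape[1]):
                bumped = path[k].copy()
                bumped[i] += eps
                pert = seg_action(path[k - 1], bumped, dt) + seg_action(bumped, path[k + 1], dt)
                grad[k, i] = (pert - base) / eps       # only the two adjacent segments change
        path -= lr * grad
    total = sum(seg_action(path[k], path[k + 1], dt) for k in range(n_pts - 1))
    return path, total

# Example: an approximate least-action path between two hypothetical concentrations.
path, action_val = naive_descent(np.array([0.8]), np.array([2.0]))
```

In practice such a pointwise descent is slow and noisy, and it fixes the transit time rather than the Hamiltonian value; the AFGD algorithm addresses both issues, as detailed in Section III and Appendix D.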
B. Hamilton-Jacobi theory for stochastic Chemical
Reaction Networks (CRN)
In this subsection, we start by introducing the mathematical formulation of CRNs and deriving their Hamiltonian function in the concentration coordinate. Next, we show
that the Hamiltonian function is convex in the momen-
tum coordinate, a necessary condition for the existence of
optimal paths, and write the equations that the optimal
paths must satisfy. Then we classify the optimal paths
constituting the stationary distribution into two cate-
gories, namely relaxation paths and escape paths, and
show that the former yield the deterministic mass-action kinetics (MAK) and the latter yield the non-equilibrium
potential (NEP). Finally we show that, for a general CRN
with multiple fixed points, the NEP is always a Lyapunov
function with respect to MAK.
A CRN is defined by the triple $\{\mathcal{S}, \mathcal{C}, \mathcal{R}\}$, where $\mathcal{S}, \mathcal{C}, \mathcal{R}$ are the sets of species, complexes and reactions, respectively:
$$\mathcal{S} = \{S_1, \ldots, S_i, \ldots, S_{|\mathcal{S}|}\},$$
$$\mathcal{C} = \{y_1, \ldots, y_\alpha, \ldots, y_{|\mathcal{C}|} : y_\alpha \in \mathbb{N}^{|\mathcal{S}|}\},$$
$$\mathcal{R} = \{y_\alpha \xrightarrow{\ k_{y_\alpha y_\beta}\ } y_\beta : k_{y_\alpha y_\beta} \geq 0\},$$
where Roman letters (i, j) and Greek letters (α, β) are used to denote species and complex indices, respectively. A complex is a multi-set of species, and is denoted by the column vector $y_\alpha$ representing the stoichiometry of the multi-set. The state of a CRN is characterized by a population vector $n \in \mathbb{Z}^{|\mathcal{S}|}_{\geq 0}$, and the time evolution of a probability distribution over the state space is given by (for a rigorous microphysical derivation, see [28])