Neural-network solutions to stochastic reaction networks
Ying Tang,1 Jiayu Weng,2,3 and Pan Zhang4,5,6
1International Academic Center of Complex Systems,
Beijing Normal University, Zhuhai 519087, China
2Faculty of Arts and Sciences, Beijing Normal University, Zhuhai 519087, China
3School of Systems Science, Beijing Normal University, Beijing 100875, China
4CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics,
Chinese Academy of Sciences, Beijing 100190, China
5School of Fundamental Physics and Mathematical Sciences,
Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
6International Centre for Theoretical Physics Asia-Pacific, Beijing/Hangzhou, China
The stochastic reaction network, in which chemical species evolve through a set of reactions, is widely used to model stochastic processes in physics, chemistry, and biology. Characterizing the evolving joint probability distribution over the state space of species counts requires solving a system of ordinary differential equations, the chemical master equation, whose state space grows exponentially with the number of species types, making the stochastic reaction network challenging to investigate. Here, we propose a machine-learning approach that uses the variational autoregressive network to solve the chemical master equation. The autoregressive network is trained with the policy-gradient algorithm in the reinforcement-learning framework, which requires no data simulated in advance by another method. Unlike simulating single trajectories, the approach tracks the time evolution of the joint probability distribution, and supports direct sampling of configurations and computation of their normalized joint probabilities. We apply the approach to representative examples in physics and biology, and demonstrate that it accurately generates the probability distribution over time. The variational autoregressive network flexibly represents multimodal distributions, accommodates conservation laws, admits time-dependent reaction rates, and is efficient for high-dimensional reaction networks while allowing a flexible upper count limit. The results suggest a general approach to investigating stochastic reaction networks based on modern machine learning.
The stochastic reaction network is a standard model for stochastic processes in physics [1], chemistry [2, 3], biology [4],
and ecology [5, 6]. Representative examples include the birth-death process [7], the model of spontaneous asymmetric
synthesis [8], and gene regulatory networks [9]. In particular, with the rapid development of techniques for measuring molecules at the single-cell level, it has become increasingly important to model intracellular reaction networks, which have low molecule counts and are subject to random noise [10]. These stochastic reaction networks in a well-mixed condition can be
modeled by the chemical master equation (CME) [11], which describes a time-evolving joint probability distribution
of discrete states representing counts of reactive species. However, the number of possible states grows exponentially
with the number (type) of species; thus, it is challenging to exactly represent the joint probability and solve the CME.
Many efforts have been made to approximately solve the CME with numerical methods. The most popular method, the Gillespie algorithm [12–14], a type of kinetic Monte Carlo method, samples from all possible trajectories to generate statistics of relevant variables. However, achieving high-accuracy statistics of the joint probability distribution
requires a large number of trajectories. Moreover, the dynamics can be dramatically affected by rare but important
trajectories, which are difficult to sample with the Gillespie algorithm [15, 16]. In contrast to such sampling-based methods, asymptotic approximations have been proposed that transform the CME into continuous-state equations, e.g., the chemical Langevin equation [17]. These methods are more computationally efficient, but the continuous-state approximation becomes inaccurate when fluctuations in the counts of species are significant, e.g., in the case of proteins [10]. Another class of methods truncates the CME to a state space covering the majority of the probability
distribution, including the finite state projection [18], the sliding window method [19], and the ACME method [20, 21].
Further advances employ the Krylov subspace approximation [22] and tensor-train representations [23, 24]. However,
the computational cost of these methods is still prohibitive to reach high accuracy when both the number and counts of species become large [25]. Despite these great efforts, we still lack a general method to solve the CME that directly confronts the problem of representing the evolving joint probability distribution.

These authors contributed equally. Corresponding authors: jamestang23@gmail.com; panzhang@itp.ac.cn

arXiv:2210.01169v2 [q-bio.MN] 7 Feb 2023
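As a concrete reference point for the sampling-based baseline, the Gillespie algorithm can be sketched for the simplest birth-death process, 0 → X with rate b and X → 0 with propensity d·n. The snippet below is a minimal illustration; the parameter values and helper names are arbitrary choices for this sketch, not taken from the paper:

```python
import numpy as np

def gillespie_birth_death(b, d, n0, t_max, rng):
    """One trajectory of the birth-death process: 0 -> X with rate b,
    X -> 0 with propensity d * n, sampled by the Gillespie algorithm."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        props = np.array([b, d * n])       # propensities of the two reactions
        total = props.sum()
        t += rng.exponential(1.0 / total)  # exponential waiting time to next event
        if t > t_max:
            break
        # pick the firing reaction with probability proportional to its propensity
        n += 1 if rng.random() < props[0] / total else -1
        times.append(t)
        counts.append(n)
    return np.array(times), np.array(counts)

rng = np.random.default_rng(0)
# Estimate the stationary mean from many trajectories; the exact value is b/d = 10.
finals = [gillespie_birth_death(10.0, 1.0, 0, 20.0, rng)[1][-1] for _ in range(500)]
print(np.mean(finals))  # close to 10
```

Accurate statistics of the full joint distribution, rather than of low-order moments as here, would require far more trajectories, which is the cost the main text refers to.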
[Figure 1 schematic, in three panels labeled System, Ansatz/Algorithm, and Result: the chemical master equation, whose state space increases exponentially with the number of species; the variational autoregressive network, which factorizes the joint distribution as a product of conditionals $\hat P_\theta(\mathbf n) = \prod_i \hat P_\theta(n_i \mid n_1, \ldots, n_{i-1})$; drawn samples and their connected configurations under the transition kernel, with the loss minimized over time points; and the joint probability distribution of all species tracked over time, from which marginal distributions can be produced, in contrast to marginal distributions obtained by simulating trajectories with the Gillespie algorithm.]
Figure 1. Tracking the joint probability distribution of stochastic reaction networks over time. (Upper) For a reaction network, the state space scales exponentially with the number of species $M$, making it difficult to track the time evolution of the joint distribution. The ansatz VAN, such as one with RNN units, can represent the joint distribution $\hat P_\theta(\mathbf n)$. (Middle) Starting from an initial distribution, we minimize the loss function, the KL-divergence between joint distributions at consecutive time steps, to learn the time evolution. The subscript $t$ denotes time points, $\theta$ denotes the parameters of the neural network to be optimized, and $\mathbb{T}$ is the transition kernel of the CME. To train the VAN at time $t+\delta t$, samples are drawn from the distribution $\hat P_{\theta_{t+\delta t}}$. Each sample is illustrated by a column of stacked squares, with the color specifying the species and the number inside the square its count. For each sample, the number of connected configurations, i.e., those that transit into the sample through the transition operator, is equal to the number of chemical reactions $K$. (Bottom) The previous method of simulating trajectories by the Gillespie algorithm can generate marginal distributions, but is computationally prohibitive for accurately producing high-dimensional joint distributions. Here, the VAN tracks the joint distribution of all species over time.
In this paper, we develop a neural-network approach to investigate the time evolution of the joint probability distribution of a stochastic reaction network. Our method is inspired by the strong representational power of neural networks for high-dimensional data [26–28]. In particular, we leverage the variational autoregressive network (VAN) to solve the CME (Fig. 1). The VAN has been applied to statistical physics [29], quantum many-body systems [30–32], open quantum systems [33], quantum circuits [34], and computational biology [35], where it enables efficient sampling of configurations and computation of their normalized probabilities. Here, we extend the VAN to characterize the joint probability distribution of species counts for the stochastic reaction network. As the unit of the VAN, we employ recurrent neural networks (RNN) [30] and the transformer [36], which are flexible enough to represent high-dimensional probability distributions and to adjust the upper count limit. We also enable the VAN to impose a count constraint on each species, or to maintain conservation of the total count of species for specific systems, both of which can improve accuracy by contracting the probability space.
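The autoregressive factorization at the core of the VAN can be illustrated with a toy model: the joint distribution is written as a product of conditionals, $\hat P_\theta(\mathbf n) = \prod_i \hat P_\theta(n_i \mid n_1, \ldots, n_{i-1})$, each normalized by a softmax. The sketch below substitutes random linear weights for a trained RNN or transformer; the sizes and parameterization are illustrative assumptions only:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
M, N = 3, 4  # toy numbers of species and of count values (0..3)

# Hypothetical "network" parameters: logits for species i depend linearly on
# the previously sampled counts. A real VAN uses an RNN or transformer here.
W = rng.normal(size=(M, M, N))
b = rng.normal(size=(M, N))

def conditional(prefix, i):
    """Normalized conditional P(n_i | n_1, ..., n_{i-1}) via a softmax."""
    logits = b[i].copy()
    for j, nj in enumerate(prefix):
        logits = logits + W[i, j] * nj
    p = np.exp(logits - logits.max())
    return p / p.sum()

def sample_with_logprob():
    """Draw one configuration together with its exact normalized log-probability."""
    n, logp = [], 0.0
    for i in range(M):
        p = conditional(n, i)
        ni = int(rng.choice(N, p=p))
        logp += np.log(p[ni])
        n.append(ni)
    return n, logp

cfg, logp = sample_with_logprob()
# Normalization holds by construction: the probabilities of all N**M
# configurations sum to one, with no partition function to estimate.
total = sum(
    np.prod([conditional(list(c[:i]), i)[c[i]] for i in range(M)])
    for c in product(range(N), repeat=M)
)
print(cfg, logp, total)  # total ≈ 1.0
```

This is precisely the property exploited in the main text: configurations are sampled directly, and each sample comes with an exactly normalized joint probability.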
The present approach differs significantly from recently developed methods. Instead of using simulated data from the Gillespie algorithm to train the neural network [37, 38], our approach does not require the aid of any existing data or measurements. This is advantageous especially in cases where the Gillespie algorithm itself is inefficient at capturing multimodal distributions [16]. The present approach gives an automatically normalized distribution as the solution of the CME at arbitrary finite times, in contrast to learning the transition kernels [39]. The obtained joint distribution contains more information than estimates of the marginal statistics alone [25], providing the probability of each configuration in the high-dimensional state space.

To demonstrate the advantages of the proposed approach, we apply it to representative examples of stochastic reaction networks, including the genetic toggle switch with a multimodal distribution [15], a high-dimensional intracellular signaling cascade [25], an early-life self-replicator with an intrinsic constraint of count conservation [6], and an epidemic model with time-dependent rates [40]. The results for the marginal statistics match those from previous numerical methods, such as the Gillespie algorithm and the finite state projection [18]. The present approach further efficiently produces the time evolution of the joint probability distribution. In particular, it can learn multimodal distributions, is effective for systems with feedback regulation, and is computationally efficient for high-dimensional systems, with computational time scaling almost linearly with the number of species, making the approach applicable and adaptable to general stochastic reaction networks.
A. Chemical master equation
We consider discrete-state continuous-time Markovian dynamics with configurations $\mathbf n = \{n_1, n_2, \ldots, n_M\}$, where $M$ is the number of variables. The probability distribution at time $t$ evolves under the stochastic master equation [7]:

$$\partial_t P_t(\mathbf n) = \sum_{\mathbf n' \neq \mathbf n} W_{\mathbf n' \mathbf n} P_t(\mathbf n') - r_{\mathbf n} P_t(\mathbf n), \qquad (1)$$

where $P_t(\mathbf n)$ is the probability of the configuration state $\mathbf n$, $W_{\mathbf n \mathbf n'}$ is the transition rate from $\mathbf n$ to $\mathbf n'$, and the escape rate from $\mathbf n$ is $r_{\mathbf n} = \sum_{\mathbf n' \neq \mathbf n} W_{\mathbf n \mathbf n'}$. Because the probability fluxes into and out of the configurations cancel when summed over all configurations, the total probability is conserved over time.
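Probability conservation in Eq. (1) can be checked numerically on a small random transition matrix. The sketch below (state-space size and rates are arbitrary) writes the master equation as a linear ODE $dP/dt = A P$, whose generator $A$ has zero column sums:

```python
import numpy as np

rng = np.random.default_rng(2)
S = 5  # small state space, for illustration only

W = rng.random((S, S))           # W[n, n'] = transition rate from n to n'
np.fill_diagonal(W, 0.0)
r = W.sum(axis=1)                # escape rates r_n = sum over n' of W[n, n']

# Generator of Eq. (1): gain terms W[n', n], minus the loss r_n on the diagonal.
A = W.T - np.diag(r)             # each column of A sums to zero

P = rng.random(S)
P /= P.sum()                     # arbitrary normalized initial distribution
dt = 1e-3
for _ in range(5000):            # explicit Euler integration of dP/dt = A P
    P = P + dt * (A @ P)

print(P.sum())  # total probability remains 1 (up to floating-point error)
```

Because the columns of $A$ sum to zero, $\sum_{\mathbf n} \partial_t P_t(\mathbf n) = 0$ at every step, which is the conservation statement above.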
One type of master equation describing stochastic reaction networks is the CME. Specifically, a stochastic reaction network with $K$ reactions and $M$ species (each species $X_i$ has count $n_i = 0, 1, \ldots, N_i$ with $1 \le i \le M$) is:

$$\sum_{i=1}^{M} r_{ki} X_i \xrightarrow{\;c_k\;} \sum_{i=1}^{M} p_{ki} X_i, \qquad (2)$$

with reaction rate $c_k$ for the $k$-th reaction. The numbers of reactant and product molecules are denoted by $r_{ki}$ and $p_{ki}$, respectively. The change in the species counts follows the stoichiometric matrix: $s_{ki} = p_{ki} - r_{ki}$. To formulate the CME, a set of propensity functions needs to be specified. The propensities, $a_k(\mathbf n)$, represent the probability of reaction $k$ occurring in an infinitesimal time interval at state $\mathbf n$ [2]. For example, a conventional choice follows the law of mass action [9]: the propensity $a_k$ of each reaction is the reaction rate $c_k$ multiplied by the count of each species $n_i$ raised to the power $r_{ki}$.
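The mass-action rule just described can be written directly in code. This sketch implements the power-law form stated above, $a_k(\mathbf n) = c_k \prod_i n_i^{r_{ki}}$; note that exact discrete-state kinetics often instead use falling factorials $n_i(n_i-1)\cdots(n_i-r_{ki}+1)$, which agree with the power form at large counts:

```python
def mass_action_propensity(c_k, r_k, n):
    """Propensity a_k(n) = c_k * prod_i n_i ** r_ki (the mass-action form above).
    r_k: reactant counts per species for reaction k; n: current state."""
    a = c_k
    for n_i, r_ki in zip(n, r_k):
        a *= n_i ** r_ki
    return a

# Degradation X1 -> 0 with rate c = 2.0 at state n = (5,): a = 2.0 * 5 = 10.0
print(mass_action_propensity(2.0, (1,), (5,)))  # 10.0
```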
Given the stoichiometric matrix, reaction rates, propensities, and an initial distribution $P_0(\mathbf n)$, the system evolves according to the CME [11]:

$$\partial_t P_t(\mathbf n) = \sum_{k=1}^{K} \left[ a_k(\mathbf n - \mathbf s_k)\, P_t(\mathbf n - \mathbf s_k) - a_k(\mathbf n)\, P_t(\mathbf n) \right], \qquad (3)$$

where $\mathbf s_k$ is the $k$-th row of the stoichiometric matrix. We consider reflecting boundary conditions at both boundaries, $n_i = 0$ and $n_i = N_i$, by setting the transition probability out of the boundary to zero, so that the CME conserves probability over time. Other boundary conditions can be employed when necessary. The state space of the probability distribution scales exponentially with the number of species: $N^M$ if each species has a count up to $N$. This exponentially growing state space makes the CME challenging to solve. Based on a neural-network ansatz, we next provide a computational approach to this problem.
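On a truncated state space, Eq. (3) is a linear ODE that can be integrated directly when $M$ is small. The sketch below does so for a one-species birth-death process with the reflecting boundaries described above; the rates, truncation $N$, and the simple Euler integrator are illustrative stand-ins, not the paper's variational scheme:

```python
import numpy as np

# Birth-death process: 0 -> X with rate b, X -> 0 with propensity d * n,
# truncated to counts n = 0..N with reflecting boundaries.
b, d, N = 5.0, 1.0, 40

A = np.zeros((N + 1, N + 1))     # CME generator on the truncated state space
for n in range(N + 1):
    if n < N:                    # birth n -> n + 1 (suppressed at n = N)
        A[n + 1, n] += b
        A[n, n] -= b
    if n > 0:                    # death n -> n - 1 (absent at n = 0)
        A[n - 1, n] += d * n
        A[n, n] -= d * n

P = np.zeros(N + 1)
P[0] = 1.0                       # start from zero molecules
dt = 1e-3
for _ in range(20000):           # Euler integration of Eq. (3) up to t = 20
    P = P + dt * (A @ P)

mean = float(np.arange(N + 1) @ P)
# The long-time distribution approaches a Poisson with mean b/d = 5.
print(P.sum(), mean)  # ≈ 1.0 and ≈ 5.0
```

The generator here is a (N+1) × (N+1) matrix; for $M$ species the analogous matrix has $N^M$ rows, which is exactly the exponential blow-up that motivates the neural-network ansatz.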