
Neural-network solutions to stochastic reaction networks
Ying Tang,1,∗ Jiayu Weng,2,3,† and Pan Zhang4,5,6,‡
1International Academic Center of Complex Systems,
Beijing Normal University, Zhuhai 519087, China
2Faculty of Arts and Sciences, Beijing Normal University, Zhuhai 519087, China
3School of Systems Science, Beijing Normal University, Beijing 100875, China
4CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics,
Chinese Academy of Sciences, Beijing 100190, China
5School of Fundamental Physics and Mathematical Sciences,
Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
6International Centre for Theoretical Physics Asia-Pacific, Beijing/Hangzhou, China
The stochastic reaction network, in which chemical species evolve through a set of reactions, is
widely used to model stochastic processes in physics, chemistry, and biology. Characterizing the
evolving joint probability distribution over the state space of species counts requires solving a system
of ordinary differential equations, the chemical master equation, whose counting state space grows
exponentially with the number of species types, making it challenging to investigate the
stochastic reaction network. Here, we propose a machine-learning approach using the variational
autoregressive network to solve the chemical master equation. Training the autoregressive network
employs the policy gradient algorithm in the reinforcement learning framework, which does not
require any data simulated in advance by another method. Rather than simulating single trajectories,
the approach tracks the time evolution of the joint probability distribution, and supports
direct sampling of configurations and computing their normalized joint probabilities. We apply the
approach to representative examples in physics and biology, and demonstrate that it accurately
generates the probability distribution over time. The variational autoregressive network exhibits
plasticity in representing multimodal distributions, accommodates conservation laws, handles
time-dependent reaction rates, and remains efficient for high-dimensional reaction networks while
allowing a flexible upper count limit. The results suggest a general approach to investigating stochastic reaction
networks based on modern machine learning.
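The two properties highlighted above, direct sampling of configurations and exactly normalized joint probabilities, follow from the autoregressive factorization q(n1, ..., nM) = Π_i q(n_i | n_{<i}). As a minimal sketch (not the authors' architecture: the "network" here is a hypothetical random linear map with softmax conditionals, and the sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 5  # 3 species, counts in {0, ..., 4} (illustrative sizes)

# Placeholder "network": each conditional q(n_i | n_{<i}) is a softmax over
# logits produced by a random linear map of the preceding counts.
W = rng.normal(size=(M, M, N))  # W[i] maps the prefix of counts to N logits
b = rng.normal(size=(M, N))

def conditional(i, prefix):
    """Normalized distribution q(n_i | n_{<i}) via softmax."""
    x = np.zeros(M)
    x[:len(prefix)] = prefix
    logits = x @ W[i] + b[i]
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sample_with_prob():
    """Ancestral sampling: draw n_1, then n_2 | n_1, ..., multiplying the
    conditionals to obtain the exactly normalized joint probability."""
    n, q = [], 1.0
    for i in range(M):
        p = conditional(i, n)
        ni = rng.choice(N, p=p)
        q *= p[ni]
        n.append(int(ni))
    return n, q

def joint(n):
    """Normalized joint probability of a full configuration."""
    q = 1.0
    for i in range(M):
        q *= conditional(i, n[:i])[n[i]]
    return q

# The joint sums to one over the full state space by construction.
total = sum(joint([a, c, d]) for a in range(N)
            for c in range(N) for d in range(N))
print(total)  # ~1.0 up to floating-point error
```

Because each conditional is normalized by the softmax, the product is normalized over the whole exponentially large state space without ever enumerating it; the enumeration above is only a check, feasible here because the toy state space has 125 configurations.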
The stochastic reaction network is a standard model for stochastic processes in physics [1], chemistry [2, 3], biology [4],
and ecology [5, 6]. Representative examples include the birth-death process [7], the model of spontaneous asymmetric
synthesis [8], and gene regulatory networks [9]. In particular, owing to rapid advances in measuring molecules at
the single-cell level, it has become increasingly important to model intracellular reaction networks, which have low counts
of molecules and are subject to random noise [10]. These stochastic reaction networks in a well-mixed condition can be
modeled by the chemical master equation (CME) [11], which describes a time-evolving joint probability distribution
of discrete states representing counts of reactive species. However, the number of possible states grows exponentially
with the number of species types; thus, it is challenging to represent the joint probability exactly and to solve the CME.
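For a single species the CME is a linear ODE system that can be integrated directly. A minimal sketch for a birth-death process (0 → X at rate k, X → 0 at per-molecule rate g, truncated at a maximum count; all parameters and the truncation are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Birth-death process: 0 -> X at rate k, X -> 0 at per-molecule rate g.
# CME: dP_n/dt = k P_{n-1} + g (n+1) P_{n+1} - (k + g n) P_n,
# truncated to counts n = 0..N (illustrative parameters).
k, g, N = 10.0, 1.0, 100

# Build the generator matrix A so that dP/dt = A P.
A = np.zeros((N + 1, N + 1))
for n in range(N + 1):
    A[n, n] -= g * n                 # death out of state n
    if n < N:
        A[n, n] -= k                 # birth out of state n
        A[n + 1, n] += k             # birth into state n+1
        A[n, n + 1] += g * (n + 1)   # death from state n+1 into n

# Forward-Euler time stepping from P(0) concentrated at n = 0.
P = np.zeros(N + 1)
P[0] = 1.0
dt, T = 1e-3, 20.0
for _ in range(int(T / dt)):
    P += dt * (A @ P)

# At long times P approaches a Poisson distribution with mean k/g = 10.
print(P.sum(), P @ np.arange(N + 1))  # total probability ~1, mean ~10
```

Already for a single species the truncated state space has N + 1 states; with M species it has (N + 1)^M, which is the exponential growth that makes a direct representation of the joint probability infeasible.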
Many efforts have been made to solve the CME approximately by numerical methods. The most popular,
the Gillespie algorithm [12–14], a type of kinetic Monte Carlo method, samples from the set of possible trajectories to
generate statistics of relevant variables. However, achieving high-accuracy statistics of the joint probability distribution
requires a large number of trajectories. Moreover, the dynamics can be dramatically affected by rare but important
trajectories, which are difficult to sample with the Gillespie algorithm [15, 16]. In contrast to such sampling-based
methods, asymptotic approximations have been proposed that transform the CME into continuous-state equations,
e.g., the chemical Langevin equation [17]. This approach is more computationally efficient, but the continuous-state
approximation becomes inaccurate when fluctuations in species counts are significant, e.g., in the case of
proteins [10]. Another class of methods truncates the CME to a state space covering the majority of the probability
distribution, including the finite state projection [18], the sliding window method [19], and the ACME method [20, 21].
Further advances employ the Krylov subspace approximation [22] and tensor-train representations [23, 24]. However,
the computational cost of these methods remains prohibitive for reaching high accuracy when both the number and counts
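As a concrete illustration of the trajectory sampling discussed above, a minimal Gillespie simulation of the birth-death process (0 → X at rate k, X → 0 at per-molecule rate g; parameters are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def gillespie_birth_death(k=10.0, g=1.0, n0=0, T=10.0, rng=rng):
    """One trajectory of the birth-death process 0 -> X (rate k),
    X -> 0 (rate g per molecule), returning the count at time T."""
    t, n = 0.0, n0
    while True:
        total = k + g * n                  # total propensity
        t += rng.exponential(1.0 / total)  # waiting time to next reaction
        if t >= T:
            return n
        # choose which reaction fires, proportionally to its propensity
        if rng.random() * total < k:
            n += 1
        else:
            n -= 1

# Estimating even a single marginal statistic requires many trajectories,
# and rare states in the tails are sampled poorly, as noted in the text.
samples = [gillespie_birth_death() for _ in range(1000)]
print(sum(samples) / len(samples))  # close to the stationary mean k/g = 10
```

The sample mean converges only as the inverse square root of the number of trajectories, which is why high-accuracy statistics of the full joint distribution, and in particular of rare configurations, are expensive to obtain this way.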
∗These authors contributed equally; Corresponding authors: jamestang23@gmail.com
†These authors contributed equally
‡Corresponding authors: panzhang@itp.ac.cn
arXiv:2210.01169v2 [q-bio.MN] 7 Feb 2023