
Action Matching:
Learning Stochastic Dynamics from Samples
Kirill Neklyudov 1, Rob Brekelmans 1, Daniel Severo 1 2, Alireza Makhzani 1 2
Abstract
Learning the continuous dynamics of a system from snapshots of its
temporal marginals is a problem that appears throughout the natural
sciences and machine learning, including in quantum systems,
single-cell biological data, and generative modeling. In these
settings, we assume access to cross-sectional samples that are
uncorrelated over time, rather than full trajectories of samples. In
order to better understand the systems under observation, we would
like to learn a model of the underlying process that allows us to
propagate samples in time and thereby simulate entire individual
trajectories. In this work, we propose Action Matching, a method for
learning a rich family of dynamics using only independent samples
from its time evolution. We derive a tractable training objective,
which does not rely on explicit assumptions about the underlying
dynamics and does not require back-propagation through differential
equations or optimal transport solvers. Inspired by connections with
optimal transport, we derive extensions of Action Matching to learn
stochastic differential equations and dynamics involving creation
and destruction of probability mass. Finally, we showcase
applications of Action Matching by achieving competitive performance
in a diverse set of experiments from biology, physics, and
generative modeling.
1. Introduction
Understanding the time evolution of systems of particles or
individuals is a fundamental problem appearing across machine
learning and the natural sciences. In many scenarios, it is expensive
or even physically impossible to observe entire individual
trajectories. For example, in quantum mechanics, the act of
measurement at a given point collapses the wave function (Griffiths &
Schroeter, 2018), while in biological applications, single-cell RNA-
or ATAC-sequencing techniques destroy the cell in question (Macosko
et al., 2015; Klein et al., 2015; Buenrostro et al., 2015).
1 Vector Institute 2 University of Toronto. Correspondence to:
<k.necludov@gmail.com>, <makhzani@vectorinstitute.ai>.
Proceedings of the 40th International Conference on Machine
Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by
the author(s).
Instead, from ‘cross-sectional’ or independent samples at various
points in time, we would like to learn a model which simulates
particles such that their density matches that of the observed
samples. The problem of learning stochastic dynamics from marginal
samples is variously referred to as learning population dynamics
(Hashimoto et al., 2016) or as trajectory inference (Lavenant et al.,
2021), in contrast to time series modeling where entire trajectories
are assumed to be available. Learning such models to predict entire
trajectories holds the promise of facilitating simulation of complex
chemical or physical systems (Vázquez, 2007; Noé et al., 2020) and
understanding developmental processes or treatment effects in biology
(Schiebinger et al., 2019; Tong et al., 2020; Schiebinger, 2021;
Bunne et al., 2021).
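To make the data setting concrete, the sketch below contrasts full trajectories with cross-sectional snapshots. It is an illustration under stated assumptions, not the paper's experimental setup: the drift, the noise scale, the Gaussian initial distribution, and the helper names `simulate_trajectories` and `cross_sectional_snapshots` are all hypothetical. Each snapshot is produced by a fresh population, so no particle is observed at two different times.

```python
import numpy as np


def simulate_trajectories(x0, times, drift, sigma, rng):
    """Euler-Maruyama simulation of dx = drift(x, t) dt + sigma dW on the grid `times`.

    Returns an array of shape (len(times), num_particles, dim).
    """
    xs = [x0]
    x = x0
    for t_prev, t_next in zip(times[:-1], times[1:]):
        dt = t_next - t_prev
        noise = rng.standard_normal(x.shape)
        x = x + drift(x, t_prev) * dt + sigma * np.sqrt(dt) * noise
        xs.append(x)
    return np.stack(xs)


def cross_sectional_snapshots(times, batch_size, drift, sigma, rng, dim=2):
    """For each observation time, simulate a fresh population and keep only
    its state at that time, so snapshots are uncorrelated across time."""
    snapshots = []
    for k in range(len(times)):
        # Assumed initial distribution: standard Gaussian (illustrative only).
        x0 = rng.standard_normal((batch_size, dim))
        traj = simulate_trajectories(x0, times[: k + 1], drift, sigma, rng)
        snapshots.append(traj[-1])  # discard everything except the final state
    return snapshots


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    times = np.linspace(0.0, 1.0, 5)
    # Hypothetical drift pulling particles toward the origin.
    snaps = cross_sectional_snapshots(times, 128, lambda x, t: -x, 0.5, rng)
    for t, s in zip(times, snaps):
        print(f"t={t:.2f}  sample variance per dim: {s.var(axis=0)}")
```

A learner in the trajectory-inference setting would see only the list returned by `cross_sectional_snapshots`, never the stacked array from `simulate_trajectories`.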
Furthermore, recent advances in generative modeling have been built
upon learning stochastic dynamics which interpolate between the data
distribution and a prior distribution. In particular, score-based
diffusion models (Song et al., 2020b; Ho et al., 2020) construct a
stochastic differential equation (SDE) to move samples from the data
distribution to a prior distribution, while score matching (Hyvärinen
& Dayan, 2005) is used to learn a reverse SDE which models the
gradients of the log-densities of intermediate distributions.
However, these methods rely on analytical forms of the SDEs and/or
the tractability of intermediate Gaussian distributions (Lipman et
al., 2022). Since our proposed method can learn dynamics which
simulate an arbitrary path of marginal distributions, it can also be
applied in the context of generative modeling. Namely, we can
approach generative modeling by constructing an interpolating path
between the data and an arbitrary prior distribution, and learning to
model the resulting dynamics.
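As a rough illustration of what "constructing an interpolating path" can mean, the sketch below draws samples from intermediate marginals by linearly mixing a data point with a prior sample. The linear schedule, the standard Gaussian prior, the time convention (t = 0 is data, t = 1 is the prior), and the function name `sample_interpolating_path` are assumptions chosen for simplicity, not necessarily the construction used in the paper.

```python
import numpy as np


def sample_interpolating_path(data, t, rng):
    """Draw one sample per entry of `t` from an assumed interpolating path:

        x_t = (1 - t) * x_data + t * x_prior,   x_prior ~ N(0, I).

    `data` has shape (num_points, dim); `t` has shape (batch,) with values in [0, 1].
    """
    idx = rng.integers(0, data.shape[0], size=t.shape[0])  # random data points
    x_data = data[idx]
    x_prior = rng.standard_normal(x_data.shape)  # assumed Gaussian prior
    t = t[:, None]  # broadcast time over the feature dimension
    return (1.0 - t) * x_data + t * x_prior


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "dataset": points concentrated near (3, 3).
    data = 3.0 + 0.1 * rng.standard_normal((1000, 2))
    for t_value in (0.0, 0.5, 1.0):
        t = np.full(512, t_value)
        x_t = sample_interpolating_path(data, t, rng)
        print(f"t={t_value:.1f}  mean={x_t.mean(axis=0)}")
```

Samples produced this way at random times are exactly the kind of independent, time-indexed marginal samples that a method learning from temporal marginals takes as input.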
In this work, we propose Action Matching, a method for learning
population dynamics from samples of their temporal marginals $q_t$.
Our contributions are as follows:
• In Theorem 2.1, we establish the existence of a unique gradient
field $\nabla s^*_t$ which traces any given time-
arXiv:2210.06662v3 [cs.LG] 8 Jun 2023