Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score-Based Diffusion Models

Louis Sharrock *1 2, Jack Simons *2, Song Liu 2, Mark Beaumont 2
Abstract

We introduce Sequential Neural Posterior Score Estimation (SNPSE), a score-based method for Bayesian inference in simulator-based models. Our method, inspired by the remarkable success of score-based methods in generative modelling, leverages conditional score-based diffusion models to generate samples from the posterior distribution of interest. The model is trained using an objective function which directly estimates the score of the posterior. We embed the model into a sequential training procedure, which guides simulations using the current approximation of the posterior at the observation of interest, thereby reducing the simulation cost. We also introduce several alternative sequential approaches, and discuss their relative merits. We then validate our method, as well as its amortised, non-sequential variant, on several numerical examples, demonstrating comparable or superior performance to existing state-of-the-art methods such as Sequential Neural Posterior Estimation (SNPE).
1. Introduction

Many applications in science, engineering, and economics make use of stochastic numerical simulations to model complex phenomena of interest. Such simulator-based models are often designed by domain experts, using knowledge of the underlying principles of the process of interest. They are thus well suited to domains in which observations are best understood as the result of mechanistic physical processes. These include, amongst others, neuroscience (Sterratt et al., 2011; Gonçalves et al., 2020), evolutionary biology (Beaumont et al., 2002; Ratmann et al., 2007), ecology (Beaumont, 2010; Wood, 2010), epidemiology (Corander et al., 2017), climate science (Holden et al., 2018), cosmology (Alsing et al., 2018), high-energy physics (Brehmer, 2021), and econometrics (Gourieroux et al., 1993).

*Equal contribution. 1 Department of Mathematics and Statistics, Lancaster University, UK. 2 School of Mathematics, University of Bristol, UK. Correspondence to: Louis Sharrock <l.sharrock@lancaster.ac.uk>.

Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s).
In many cases, simulator-based models depend on parameters θ which cannot be identified experimentally, and must be inferred from data x. Bayesian inference provides a principled approach for this task. In particular, given a prior p(θ) and a likelihood p(x|θ), Bayes' Theorem gives the posterior distribution over the parameters as

    p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},    (1)

where p(x) = ∫ p(x|θ) p(θ) dθ is known as the evidence or marginal likelihood. The major difficulty associated with simulator-based models is the absence of a tractable likelihood function p(x|θ). This precludes, in particular, the use of conventional likelihood-based Bayesian inference methods such as Markov chain Monte Carlo (MCMC) (Brooks et al., 2011) or variational inference (VI) (Blei et al., 2017). The resulting inference problem is often referred to as likelihood-free inference or simulation-based inference (SBI) (Cranmer et al., 2020; Sisson et al., 2018).
Traditional methods for performing SBI include approximate Bayesian computation (ABC) (Beaumont et al., 2002; Sisson et al., 2018), whose variants include rejection ABC (Tavaré et al., 1997; Pritchard et al., 1999), MCMC ABC (Marjoram et al., 2003), and sequential Monte Carlo (SMC) ABC (Beaumont et al., 2009; Bonassi & West, 2015). In such methods, one repeatedly samples parameters, and only accepts parameters for which the corresponding samples from the simulator are similar to the observed data x_obs.
More recently, a range of new SBI methods have been introduced, which leverage advances in machine learning such as normalising flows (Papamakarios et al., 2017; 2021) and generative adversarial networks (Goodfellow et al., 2014). These methods often include a sequential training procedure, which adaptively guides simulations to yield more informative data. Such methods include Sequential Neural Posterior Estimation (SNPE) (Papamakarios & Murray, 2016; Lueckmann et al., 2017; Greenberg et al., 2019), Sequential Neural Likelihood Estimation (SNLE) (Lueckmann et al., 2019; Papamakarios et al., 2019), and Sequential Neural Ratio Estimation (SNRE) (Durkan et al., 2020; Hermans et al., 2020; Miller et al., 2021; Thomas et al., 2022). Other more recent algorithms of a similar flavour include Sequential Neural Variational Inference (SNVI) (Glöckler et al., 2022), Generative Adversarial Training for SBI (GATSBI) (Ramesh et al., 2022), Truncated SNPE (TSNPE) (Deistler et al., 2022a), and Sequential Unnormalized Neural Likelihood Estimation (SUNLE) (Glaser et al., 2022).

Figure 1. Visualisation of posterior inference using Neural Posterior Score Estimation (NPSE) in the 'Two Moons' experiment. The forward process transforms samples from the target posterior distribution p(θ|x) to a tractable reference distribution. The backward process transports samples from the reference to the target posterior. The backward process depends on the scores ∇_θ log p_t(θ|x), which can be estimated using score matching techniques given access to samples (θ, x) ∼ p(θ) p(x|θ) (see Section 2.2).
In this paper, we present Neural Posterior Score Estimation (NPSE), as well as its sequential variant (SNPSE). Our method, inspired by the remarkable success of score-based generative models (Song & Ermon, 2019; Song et al., 2021; Ho et al., 2020), utilises a conditional score-based diffusion model to generate samples from the posterior of interest. While similar approaches (e.g., Batzolis et al., 2021; Dhariwal & Nichol, 2021; Song et al., 2021; Tashiro et al., 2021; Chao et al., 2022; Chung & Ye, 2022) have previously found success in a variety of problems, their application to SBI has not yet been widely investigated.¹

In contrast to existing SBI approaches based on normalising flows (e.g., SNLE, SNPE), our approach only requires estimates for the gradient of the log density, or score function, of the intractable likelihood or the posterior, which can be approximated using a neural network via score matching techniques (Hyvärinen, 2005; Vincent, 2011; Song et al., 2020). Since we do not require a normalisable model, our method avoids the need for any strong restrictions on the model architecture. In addition, unlike methods based on generative adversarial networks (e.g., GATSBI), we do not require adversarial training objectives, which are notoriously unstable (Metz et al., 2017; Salimans et al., 2016).
We first discuss how conditional score-based diffusion models can be used for SBI. We then outline how our approach can be embedded within a principled sequential training procedure, which guides simulations towards informative regions using the current approximation of the posterior. We outline in detail a number of possible sequential procedures, several of which could also be used to develop sequential variants of amortised algorithms more recently proposed in the SBI literature (e.g., Dax et al., 2023). We then advocate for our preferred method, Truncated Sequential NPSE (TSNPSE), which uses a series of truncated proposals inspired by the approach in Deistler et al. (2022a). We validate our methods on several benchmark SBI problems as well as a real-world neuroscience problem, obtaining comparable or superior performance to other state-of-the-art methods.

¹ In parallel with an early version of this work, Geffner et al. (2023) also studied the use of diffusion models for SBI. We provide a comparison with this paper in Section 4.3 and Appendix D.
2. Simulation-Based Inference with Diffusion Models

2.1. Simulation-Based Inference

Suppose that we have access to a simulator which, given input parameters θ ∈ R^d, generates synthetic data x ∈ R^p. We assume that parameters are distributed according to some known prior p(θ), but that the likelihood p(x|θ) is intractable. Given an observation x_obs, we are interested in generating samples from the posterior distribution p(θ|x_obs) ∝ p(θ) p(x_obs|θ), given a finite number of i.i.d. samples {(θ_i, x_i)}_{i=1}^N ∼ p(θ) p(x|θ).
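To make this setup concrete, the following is a minimal sketch of generating such a training set; the Gaussian prior and noisy linear "simulator" are illustrative assumptions, standing in for the black-box simulators considered in the paper.

```python
# A minimal sketch of the data-generating setup in Section 2.1. The Gaussian
# prior and the noisy linear "simulator" are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, p, N = 2, 3, 1000                       # dim(theta), dim(x), simulation budget

def sample_prior(n):                       # p(theta): standard Gaussian (assumed)
    return rng.normal(size=(n, d))

def simulate(theta):                       # p(x | theta): noisy linear map (assumed)
    A = np.ones((d, p))
    return theta @ A + 0.1 * rng.normal(size=(theta.shape[0], p))

theta = sample_prior(N)                    # theta_i ~ p(theta)
x = simulate(theta)                        # x_i ~ p(x | theta_i)
dataset = list(zip(theta, x))              # {(theta_i, x_i)}_{i=1}^N ~ p(theta) p(x | theta)
```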
2.2. Diffusion Models for Simulation-Based Inference

We propose to tackle this problem using conditional score-based diffusion models (e.g., Song et al., 2021). In such models, noise is gradually added to the target distribution using a diffusion process, resulting in a tractable reference distribution, e.g., a standard Gaussian. The time-reversal of this process is also a diffusion process, whose dynamics can be approximated using score matching (Hyvärinen, 2005; Vincent, 2011; Song & Ermon, 2020; Song et al., 2021). One can thus generate samples from the target distribution by simulating the approximate reverse-time process, initialised at samples from the reference distribution.
More concretely, we begin by defining a forward noising process (θ_t)_{t∈[0,T]} which, initialised at θ_0 ∼ p(·|x), evolves according to the stochastic differential equation (SDE)

    \mathrm{d}\theta_t = f(\theta_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w_t,    (2)

where f : R^d × R_+ → R^d is the drift coefficient, g : R_+ → R is the diffusion coefficient, and (w_t)_{t≥0} is a standard R^d-valued Brownian motion. The coefficients f and g are chosen such that, for all x ∈ R^p, the forward noising process admits a unique stationary distribution π from which it is easy to sample, e.g., a standard Gaussian.
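As one concrete, purely illustrative instance of (2), the sketch below assumes the variance-preserving (VP) SDE of Song et al. (2021), with f(θ, t) = -0.5 β(t) θ and g(t) = sqrt(β(t)), for which the transition kernel p_{t|0}(θ_t|θ_0) is Gaussian in closed form; the main text leaves f and g generic, so this particular schedule is an assumption.

```python
# A sketch of one standard concrete choice for the forward SDE (2): the
# variance-preserving (VP) SDE of Song et al. (2021), with
# f(theta, t) = -0.5 * beta(t) * theta and g(t) = sqrt(beta(t)). The linear
# noise schedule below is an assumption; the paper leaves f and g generic.
import torch

BETA_MIN, BETA_MAX, T = 0.1, 20.0, 1.0

def alpha(t):
    """alpha_t = exp(-0.5 * int_0^t beta(s) ds) for the linear schedule."""
    integral = BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2 / T
    return torch.exp(-0.5 * integral)

def perturb(theta0, t):
    """Sample theta_t ~ p_{t|0}(theta_t | theta_0) = N(alpha_t theta_0, (1 - alpha_t^2) I)."""
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps
    # Closed-form score of the Gaussian transition kernel, which later serves
    # as the regression target in the denoising objective (7):
    score_target = -eps / std
    return theta_t, score_target
```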
Under mild conditions, the time-reversed process (θ̄_t)_{t∈[0,T]} := (θ_{T−t})_{t∈[0,T]} is also a diffusion process (Anderson, 1982; Föllmer, 1985; Haussmann & Pardoux, 1986). Initialised at θ̄_0 ∼ p_T(·|x), this process evolves according to

    \mathrm{d}\bar{\theta}_t = \left[ -f(\bar{\theta}_t, T-t) + g^2(T-t)\, \nabla_\theta \log p_{T-t}(\bar{\theta}_t \mid x) \right] \mathrm{d}t + g(T-t)\,\mathrm{d}w_t,    (3)

where p_t(·|x) = ∫ p_{t|0}(·|θ_0) p(θ_0|x) dθ_0 denotes the time marginal density of θ_t, conditioned on x. By definition, the marginals of (θ̄_t)_{t∈[0,T]} | x are equal to those of (θ_{T−t})_{t∈[0,T]} | x. Thus, in particular, θ̄_T ∼ p_0(·|x) := p(·|x). Hence, if we could sample θ̄_0 ∼ p_T(·|x), and simulate (θ̄_t)_{t∈[0,T]} according to (3), then its final distribution would be the desired posterior distribution. This process is visualised in Figure 1.
Although this procedure provides an elegant sampling mechanism, it does not allow us to evaluate the density p_0(θ|x) := p(θ|x) of these samples. Fortunately, there exists an ODE with the same marginals as (2), which does enable density evaluation. This deterministic process, known as the probability flow ODE (Song et al., 2021), defines (θ_t)_{t∈[0,T]} according to

    \frac{\mathrm{d}\theta_t}{\mathrm{d}t} = f(\theta_t, t) - \frac{1}{2} g^2(t)\, \nabla_\theta \log p_t(\theta_t \mid x),    (4)

where once again θ_0 ∼ p(·|x). In this case, the log densities log p_t(θ_t|x) can be computed exactly via the instantaneous change-of-variables formula (Chen et al., 2018a):

    \frac{\mathrm{d} \log p_t(\theta_t \mid x)}{\mathrm{d}t} = -\mathrm{Tr}\left( \nabla_\theta \left[ f(\theta_t, t) - \frac{1}{2} g^2(t)\, \nabla_\theta \log p_t(\theta_t \mid x) \right] \right).    (5)
In practice, we cannot simulate (3) or (4) directly, since we do not have access to p_T(·|x), or the scores ∇_θ log p_t(θ_t|x). We will therefore rely on two approximations. First, we will assume that p_T ≈ π. Second, we will approximate ∇_θ log p_t(θ_t|x) using score matching (e.g., Song et al., 2021), and substitute this approximation into (3) or (4). In this case, the ODE in (4) is an instance of a continuous normalising flow (CNF) (Grathwohl et al., 2019).

There are various ways in which we can obtain this approximation. Here, we choose to train a time-varying score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) to directly approximate the score of the perturbed posterior (Dhariwal & Nichol, 2021; Song et al., 2021; Batzolis et al., 2021).²
In this case, a natural objective is the weighted Fisher divergence

    J^{\mathrm{SM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_t(\theta_t, x)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_\theta \log p_t(\theta_t \mid x) \|^2 \right] \mathrm{d}t,    (6)

where λ_t : [0, T] → R_+ is a positive weighting function, and p_t(θ_t, x) denotes the joint distribution of (θ_t, x). In practice, this objective cannot be evaluated directly, since it depends on the posterior scores ∇_θ log p_t(θ_t|x). Fortunately, one can show (e.g., Batzolis et al., 2021; Tashiro et al., 2021; Appendix A.1) that it is equivalent to minimise the conditional denoising posterior score matching objective, given by

    J^{\mathrm{DSM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, p(\theta_0)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t,    (7)

where p_{t|0}(θ_t|θ_0) denotes the transition density defined by (2). In particular, this objective is minimised when s_ψ(θ_t, x, t) = ∇_θ log p_t(θ_t|x) for almost all θ_t ∈ R^d, x ∈ R^p, and t ∈ [0, T].
The expectation in (7) only depends on samples θ_0 ∼ p(θ) from the prior, x ∼ p(x|θ_0) from the simulator, and θ_t ∼ p_{t|0}(θ_t|θ_0) from the forward diffusion (2). Moreover, given a suitable choice for the drift and diffusion coefficients in (2), the scores ∇_{θ_t} log p_{t|0}(θ_t|θ_0) can be computed in closed form. We can thus compute a Monte Carlo estimate of (7), and minimise this to obtain s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x).
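A minimal sketch of such a Monte Carlo estimate is given below, assuming the VP-SDE perturbation sketched earlier (so the regression target is -eps / std), a constant weighting λ_t = 1, and a small fully-connected score network; all of these are illustrative assumptions rather than the choices made in the paper.

```python
# A sketch of a Monte Carlo estimate of the denoising objective (7), under an
# assumed VP-SDE forward process, lambda_t = 1, and a toy score network.
import torch
import torch.nn as nn

d, p = 2, 3                                           # dim(theta), dim(x)

score_net = nn.Sequential(                            # s_psi(theta_t, x, t)
    nn.Linear(d + p + 1, 128), nn.SiLU(),
    nn.Linear(128, 128), nn.SiLU(),
    nn.Linear(128, d),
)

def alpha(t, beta_min=0.1, beta_max=20.0, T=1.0):     # VP-SDE alpha_t (assumed schedule)
    return torch.exp(-0.5 * (beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2 / T))

def dsm_loss(theta0, x, T=1.0):
    """One Monte Carlo estimate of (7) over a batch of (theta_0, x) pairs."""
    t = torch.rand(theta0.shape[0]) * (T - 1e-3) + 1e-3   # avoid t = 0 (std -> 0)
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps                  # theta_t ~ p_{t|0}(. | theta_0)
    target = -eps / std                               # grad_{theta_t} log p_{t|0}(theta_t | theta_0)
    pred = score_net(torch.cat([theta_t, x, t[:, None]], dim=-1))
    return 0.5 * ((pred - target) ** 2).sum(dim=-1).mean()
```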
We now have all of the necessary ingredients to generate approximate samples from the target posterior distribution:

(i) Draw samples θ_0 ∼ p(θ) from the prior, x ∼ p(x|θ_0) from the likelihood, and θ_t ∼ p_{t|0}(θ_t|θ_0) using the forward process (2).

(ii) Using these samples, train a time-varying score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising a Monte Carlo estimate of (7).

(iii) Draw samples θ̄_0 ∼ π(·). Simulate an approximation of the reverse-time process in (3), or the time-reversal of the probability flow ODE in (4), with x = x_obs, replacing ∇_θ log p_t(θ_t|x_obs) → s_ψ(θ_t, x_obs, t).
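As an illustration of step (iii), the sketch below applies an Euler-Maruyama discretisation of the reverse-time SDE (3) under the same assumed VP-SDE, conditioning the learned score network on x_obs; the step count and the score network input convention are assumptions.

```python
# A sketch of step (iii): Euler-Maruyama discretisation of the reverse-time
# SDE (3) under an assumed VP-SDE, with the trained score network evaluated at
# x = x_obs. The input convention [theta_t, x, t] is an illustrative assumption.
import torch

@torch.no_grad()
def sample_posterior(score_net, x_obs, n_samples, d, n_steps=1000,
                     beta_min=0.1, beta_max=20.0, T=1.0):
    beta = lambda t: beta_min + (beta_max - beta_min) * t / T
    dt = T / n_steps
    theta = torch.randn(n_samples, d)                 # theta_bar_0 ~ pi = N(0, I)
    x = x_obs.reshape(1, -1).expand(n_samples, -1)
    for k in range(n_steps):
        t = T - k * dt                                # integrate from t = T down to t ~ 0
        t_vec = torch.full((n_samples, 1), t)
        score = score_net(torch.cat([theta, x, t_vec], dim=-1))   # ~ grad log p_t(theta | x_obs)
        f = -0.5 * beta(t) * theta                    # VP-SDE drift f(theta, t)
        drift = -f + beta(t) * score                  # reverse-time drift in (3), g^2(t) = beta(t)
        theta = theta + drift * dt + (beta(t) * dt) ** 0.5 * torch.randn_like(theta)
    return theta                                      # approximate samples from p(theta | x_obs)
```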
² In Appendix B, we outline an alternative approach which instead trains a score network to approximate the score of the perturbed likelihood ∇_θ log p_t(x|θ_t). We refer to this approach as Neural Likelihood Score Estimation (NLSE).
In line with the current SBI taxonomy, we will refer to this approach as Neural Posterior Score Estimation (NPSE). In Appendix A.2, we provide error bounds for NPSE in the fully deterministic sampling regime, assuming an L² bound on the approximation error and a mild regularity condition on the target posterior p(·|x_obs). Our result is adapted from Benton et al. (2024, Theorem 6).
3. Sequential Neural Score Estimation

Given enough data and a sufficiently flexible model, the optimal score network s_ψ(θ_t, x, t) will equal ∇_θ log p_t(θ_t|x) for almost all x ∈ R^p, θ_t ∈ R^d, and t ∈ [0, T]. Thus, in theory, we can use the methods in the previous section to generate samples θ ∼ p(θ|x) for any observation x.
In practice, we are often only interested in sampling from the posterior for a particular experimental observation x = x_obs. Thus, given a finite simulation budget, it may be more efficient to train the score network using simulated data which is close to x_obs, and thus more informative for learning the posterior scores ∇_θ log p_t(θ_t|x_obs). This can be achieved by drawing initial parameter samples from a suitably chosen proposal prior, θ_0 ∼ p̃(θ), rather than the true prior θ_0 ∼ p(θ). This idea is central to existing sequential SBI algorithms, which use a sequence of adaptively chosen proposals in order to guide simulations towards more informative regions. The central challenge associated with developing a successful sequential algorithm is how to effectively correct for the mismatch between the so-called proposal posterior

    \tilde{p}(\theta \mid x) = p(\theta \mid x)\, \frac{\tilde{p}(\theta)}{p(\theta)}\, \frac{p(x)}{\tilde{p}(x)},    (8)

and the true posterior p(θ|x) ∝ p(θ) p(x|θ). In the following sections, we introduce several possible sequential variants of NPSE, which we collectively refer to as SNPSE. We note, as pointed out in the introduction, that in principle these approaches could also be used to develop sequential variants of the recently proposed flow-matching posterior estimation (FMPE) algorithm (Dax et al., 2023).
We begin by outlining some generic features of the sequential procedure, which hold irrespective of the specific sequential method employed (see Sections 3.1-3.2). In all cases, the sequential procedure will take place over R rounds, indexed by r ≥ 1. Given a total budget of N simulations, we assume the simulations are evenly distributed across rounds: N_r = N/R =: M for r = 1, . . . , R, where N_r is the number of simulations in round r. In the first round, we follow the standard NPSE algorithm (Section 2). In particular, we first generate {θ^1_{0,i}}_{i=1}^M ∼ p(θ) from the prior, and {x^1_i}_{i=1}^M ∼ p(x|θ^1_{0,i}) using the simulator. These samples are used to train a score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising (7). By substituting this into (3), we can generate samples approximately from the target posterior. Following the initial round, there are several conceivable sequential procedures one could use to generate samples from p(θ|x_obs). We now describe several such methods. Broadly speaking, these procedures differ in (i) how they define the proposal prior; and (ii) how they correct for the mismatch between the proposal posterior and the true posterior.
3.1. Truncated Approach

We first introduce our preferred method: Truncated SNPSE (TSNPSE). This algorithm, summarised in Algorithm 1, utilises a series of proposals given by truncated versions of the prior, inspired by the approach in Deistler et al. (2022a). For r ≥ 1, let p^{r−1}_ψ(θ|x_obs) denote the approximation to the target posterior learned in the (r−1)th round, with the convention that p^0_ψ(θ) := p(θ). Then, in the rth round, we will use the highest-probability region of this approximation to define a truncated version of the prior. To be precise, in the rth round, suppose we define

    \bar{p}^r(\theta) \propto p(\theta) \cdot \mathbb{I}\{\theta \in \mathrm{HPR}_\varepsilon(p^{r-1}_\psi(\theta \mid x_{\mathrm{obs}}))\},    (9)

where HPR_ε(·) denotes the highest 1 − ε probability region, defined as the smallest region which contains 1 − ε of the mass; and we adopt the convention that p̄^0(θ) = p(θ). We then define the proposal distribution for this round as p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p̄^s(θ). Additional details regarding how to compute and sample from this proposal distribution are provided in Appendix E.3.
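As a rough, sample-based sketch of one way to realise the truncation in (9), in the spirit of Deistler et al. (2022a): estimate the 1 − ε highest-probability region from samples of the current approximation (assuming its log-density can be evaluated, e.g., via the probability flow ODE (4)-(5)), and then sample the truncated prior by rejection. The helper names below are hypothetical, and this is not necessarily the procedure of Appendix E.3.

```python
# A rough sketch of the truncation in (9). sample_approx and log_prob_approx
# are hypothetical callables for the current approximation p_psi^{r-1}(. | x_obs);
# sample_prior draws from p(theta).
import torch

def hpr_threshold(sample_approx, log_prob_approx, eps=1e-4, n=10_000):
    """Estimate the log-density level bounding the 1 - eps highest-probability region."""
    theta = sample_approx(n)                          # theta ~ p_psi^{r-1}(. | x_obs)
    return torch.quantile(log_prob_approx(theta), eps)

def sample_truncated_prior(sample_prior, log_prob_approx, threshold, n):
    """Rejection sampler for bar{p}^r(theta) in (9): the prior restricted to the HPR."""
    accepted, n_acc = [], 0
    while n_acc < n:
        theta = sample_prior(4 * n)                   # propose from the prior
        keep = log_prob_approx(theta) >= threshold    # accept only points inside the HPR
        accepted.append(theta[keep])
        n_acc += int(keep.sum())
    return torch.cat(accepted)[:n]
```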
Crucially, under the assumption that we do not truncate regions which have non-zero mass under the true posterior p(θ|x_obs), this proposal distribution is proportional to the prior within the support of the posterior. Thus, we do not need to perform a correction. In particular, our loss function remains minimised at the score of the target posterior. This statement is formalised in the following proposition.

Proposition 3.1. Let p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p̄^s(θ), where p̄^0(θ) = p(θ) and p̄^s(θ) is defined by (9) for all s ≥ 1. Suppose that Θ_obs ⊆ HPR_ε(p^s_ψ(θ|x_obs)) for all s ≥ 1, where Θ_obs = supp(p(·|x_obs)). Then, writing p̃^r_t(θ_t, x) for the distribution of (θ_t, x) when (θ_0, x) ∼ p̃^r(θ, x), the minimiser ψ* of the loss function

    J^{\mathrm{TSNPSE-SM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{\tilde{p}^r_t(\theta_t, x)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_\theta \log p_t(\theta_t \mid x) \|^2 \right] \mathrm{d}t,    (10)

or, equivalently, of the loss function

    J^{\mathrm{TSNPSE-DSM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, \tilde{p}^r(\theta_0)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t,    (11)

satisfies s_ψ*(θ_t, x_obs, t) = ∇_θ log p_t(θ_t|x_obs).
Proof. See Appendix C.1.
Algorithm 1 TSNPSE
Inputs: Observation x_obs, prior p(θ) =: p̄^0(θ), simulator p(x|θ), simulation budget N, number of rounds R, simulations per round M = N/R, dataset D = {}.
Outputs: p_ψ(θ|x_obs) ≈ p(θ|x_obs).
for r = 1, . . . , R do
    for i = 1, . . . , M do
        Draw θ_i ∼ p̄^{r−1}(θ), x_i ∼ p(x|θ_i).
        Add (θ_i, x_i) to D.
    end for
    Learn s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising a Monte Carlo estimate of (11) based on dataset D.
    Compute p̄^r(θ) in (9) using s_ψ(θ_t, x_obs, t). See Appendix E.3 for details.
end for
Obtain a sampler for p_ψ(θ|x_obs) by substituting s_ψ(θ_t, x_obs, t) ≈ ∇_θ log p_t(θ_t|x_obs) in (4).
Return: p_ψ(θ|x_obs).
3.2. Alternative Approaches

We now outline several other possible sequential approaches for NPSE. An extensive and detailed discussion of these methods, as well as supporting numerical results, can be found in Appendix C. Broadly speaking, these methods can be viewed as score-based analogues of existing sequential variants of NPE, namely, SNPE-A (Papamakarios & Murray, 2016), SNPE-B (Lueckmann et al., 2017), and SNPE-C (Greenberg et al., 2019). We refer to, e.g., Durkan et al. (2020) for a concise overview of SNPE-A, SNPE-B, and SNPE-C.

Unlike TSNPSE, in each of these methods, the proposal prior is defined directly in terms of the most recent approximation of the posterior. In particular, in the rth round, we now sample new parameters {θ^r_{0,i}}_{i=1}^M ∼ p^{r−1}_ψ(θ|x_obs) and simulate new data {x^r_i}_{i=1}^M ∼ p(x|θ^r_{0,i}). We then concatenate these samples with those from previous rounds to form ∪_{s=1}^r {(θ^s_{0,i}, x^s_i)}_{i=1}^M ∼ p̃^r(θ) p(x|θ), where p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p^s_ψ(θ|x_obs), and p^0_ψ(θ|x_obs) := p(θ).
In this case, if we were to minimise the original score matching objective (7), but using samples θ_0 ∼ p̃^r(θ) rather than θ_0 ∼ p(θ), we would learn a score network which approximates ∇_θ log p̃^r_t(θ_t|x), rather than ∇_θ log p_t(θ_t|x), where p̃^r_t(θ_t|x) = ∫_{R^d} p_{t|0}(θ_t|θ_0) p̃^r(θ_0|x) dθ_0, and p̃^r(θ|x) = p̃^r(θ) p(x|θ) / p̃^r(x). Substituting this score network, evaluated at x = x_obs, into (3) or (4), would then result in samples θ ∼ p̃^r(θ|x_obs), rather than θ ∼ p(θ|x_obs). We thus require a correction to recover samples from the correct posterior.
SNPSE-A. The first approach is to perform a post-hoc importance weight correction using, e.g., sampling-importance resampling (SIR) (Rubin, 1987; 1988; Smith & Gelfand, 1992; Gelman et al., 1995). According to this approach, we first generate {θ̃_i}_{i=1}^{M′} ∼ p̃^r_ψ(·|x_obs), where p̃^r_ψ(·|x_obs) denotes the approximate proposal posterior obtained in the rth round, and M′ ≫ M. We then draw samples {θ_i}_{i=1}^M with or without replacement from {θ̃_i}_{i=1}^{M′}, with sample probabilities, w̃_i, proportional to the importance ratios

    \tilde{h}_i = \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r_\psi(\tilde{\theta}_i \mid x_{\mathrm{obs}})}.    (12)

In the limit as M′ → ∞, this sample will consist of independent draws from p(·|x_obs) (e.g., Smith & Gelfand, 1992). In practice, we cannot evaluate p(·|x_obs) in (12), and thus will instead use sample probabilities w_i proportional to

    h_i = \frac{p(\tilde{\theta}_i)}{\tilde{p}^r(\tilde{\theta}_i)}.    (13)

The importance ratios in (13) are approximately proportional to the correct importance ratios in (12), since

    h_i = \frac{p(\tilde{\theta}_i)}{\tilde{p}^r(\tilde{\theta}_i)} \propto \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r(\tilde{\theta}_i \mid x_{\mathrm{obs}})} \approx \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r_\psi(\tilde{\theta}_i \mid x_{\mathrm{obs}})} = \tilde{h}_i.    (14)

Although SNPSE-A can work well in simple settings, it is fundamentally limited by the approximation introduced in (14). In particular, when there is a significant mismatch between the true proposal, p̃^r(·|x_obs), and the approximate (learned) proposal, p̃^r_ψ(·|x_obs), this approach can lead to inaccurate inference (see Appendix C.2).
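A minimal sketch of this correction, assuming access to a sampler for the learned proposal posterior and to the densities p(θ) and p̃^r(θ) (the three callables below are hypothetical placeholders), is as follows.

```python
# A sketch of the SNPSE-A correction: sampling-importance resampling (SIR)
# with the tractable ratios (13) in place of (12).
import torch

def sir_correction(sample_proposal_posterior, log_prior, log_proposal_prior,
                   n_keep, n_draw=100_000, replacement=True):
    theta = sample_proposal_posterior(n_draw)               # theta_tilde_i ~ p_tilde^r_psi(. | x_obs)
    log_h = log_prior(theta) - log_proposal_prior(theta)    # log h_i, cf. (13)
    weights = torch.softmax(log_h, dim=0)                   # normalised resampling probabilities
    idx = torch.multinomial(weights, n_keep, replacement=replacement)
    return theta[idx]                                       # approximate draws from p(. | x_obs)
```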
SNPSE-B. The second approach is to include an importance weight correction within the denoising score matching objective (7). In particular, in the rth round, we now minimise a Monte Carlo estimate of

    J^{\mathrm{SNPSE-B}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, \tilde{p}^r(\theta_0)}\left[ \frac{p(\theta_0)}{\tilde{p}^r(\theta_0)} \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t.    (15)

It is straightforward to show that this objective is minimised at the score of the true posterior, that is, by ψ* such that s_ψ*(θ_t, x, t) = ∇_θ log p_t(θ_t|x) (see Appendix C.3). Unfortunately, similar to SNPE-B (Lueckmann et al., 2017), the importance weights are often high variance, resulting in unstable training and poor overall algorithm performance (e.g., Papamakarios et al., 2019; Durkan et al., 2019).
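A minimal sketch of a Monte Carlo estimate of (15) is given below, reusing the assumed VP-SDE perturbation from the earlier sketches and hypothetical log-density callables for p(θ) and p̃^r(θ); the per-sample weights are exactly the source of the variance issue noted above.

```python
# A sketch of a Monte Carlo estimate of the SNPSE-B objective (15): the
# denoising loss of (7) reweighted per sample by p(theta_0) / p_tilde^r(theta_0).
import torch

def snpse_b_loss(score_net, theta0, x, log_prior, log_proposal_prior, alpha, T=1.0):
    t = torch.rand(theta0.shape[0]) * (T - 1e-3) + 1e-3
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps
    target = -eps / std                                    # grad log p_{t|0}(theta_t | theta_0)
    pred = score_net(torch.cat([theta_t, x, t[:, None]], dim=-1))
    sq_err = ((pred - target) ** 2).sum(dim=-1)
    w = torch.exp(log_prior(theta0) - log_proposal_prior(theta0))  # p(theta_0) / p_tilde^r(theta_0)
    return 0.5 * (w * sq_err).mean()                       # these weights are often high variance
```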
SNPSE-C. The third approach is to include a score-based correction within the denoising posterior score matching objective (7). In this case, we minimise (7), now