Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score-Based Diffusion Models

Louis Sharrock *1 2, Jack Simons *2, Song Liu 2, Mark Beaumont 2
Abstract

We introduce Sequential Neural Posterior Score Estimation (SNPSE), a score-based method for Bayesian inference in simulator-based models. Our method, inspired by the remarkable success of score-based methods in generative modelling, leverages conditional score-based diffusion models to generate samples from the posterior distribution of interest. The model is trained using an objective function which directly estimates the score of the posterior. We embed the model into a sequential training procedure, which guides simulations using the current approximation of the posterior at the observation of interest, thereby reducing the simulation cost. We also introduce several alternative sequential approaches, and discuss their relative merits. We then validate our method, as well as its amortised, non-sequential variant, on several numerical examples, demonstrating comparable or superior performance to existing state-of-the-art methods such as Sequential Neural Posterior Estimation (SNPE).
1. Introduction

Many applications in science, engineering, and economics make use of stochastic numerical simulations to model complex phenomena of interest. Such simulator-based models are often designed by domain experts, using knowledge of the underlying principles of the process of interest. They are thus well suited to domains in which observations are best understood as the result of mechanistic physical processes. These include, amongst others, neuroscience (Sterratt et al., 2011; Gonçalves et al., 2020), evolutionary biology (Beaumont et al., 2002; Ratmann et al., 2007), ecology (Beaumont, 2010; Wood, 2010), epidemiology (Corander et al., 2017), climate science (Holden et al., 2018), cosmology (Alsing et al., 2018), high-energy physics (Brehmer, 2021), and econometrics (Gourieroux et al., 1993).

*Equal contribution. 1 Department of Mathematics and Statistics, Lancaster University, UK. 2 School of Mathematics, University of Bristol, UK. Correspondence to: Louis Sharrock <l.sharrock@lancaster.ac.uk>.

Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s).
In many cases, simulator-based models depend on parameters θ which cannot be identified experimentally, and must be inferred from data x. Bayesian inference provides a principled approach for this task. In particular, given a prior p(θ) and a likelihood p(x|θ), Bayes' Theorem gives the posterior distribution over the parameters as

    p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},    (1)

where p(x) = ∫ p(x|θ) p(θ) dθ is known as the evidence or marginal likelihood. The major difficulty associated with simulator-based models is the absence of a tractable likelihood function p(x|θ). This precludes, in particular, the use of conventional likelihood-based Bayesian inference methods such as Markov chain Monte Carlo (MCMC) (Brooks et al., 2011) or variational inference (VI) (Blei et al., 2017). The resulting inference problem is often referred to as likelihood-free inference or simulation-based inference (SBI) (Cranmer et al., 2020; Sisson et al., 2018).
Traditional methods for performing SBI include approximate Bayesian computation (ABC) (Beaumont et al., 2002; Sisson et al., 2018), whose variants include rejection ABC (Tavaré et al., 1997; Pritchard et al., 1999), MCMC ABC (Marjoram et al., 2003), and sequential Monte Carlo (SMC) ABC (Beaumont et al., 2009; Bonassi & West, 2015). In such methods, one repeatedly samples parameters, and only accepts parameters for which the corresponding samples from the simulator are similar to the observed data x_obs.
More recently, a range of new SBI methods have been introduced, which leverage advances in machine learning such as normalising flows (Papamakarios et al., 2017; 2021) and generative adversarial networks (Goodfellow et al., 2014). These methods often include a sequential training procedure, which adaptively guides simulations to yield more informative data. Such methods include Sequential Neural Posterior Estimation (SNPE) (Papamakarios & Murray, 2016; Lueckmann et al., 2017; Greenberg et al., 2019), Sequential Neural Likelihood Estimation (SNLE) (Lueckmann et al., 2019; Papamakarios et al., 2019), and Sequential Neural Ratio Estimation (SNRE) (Durkan et al., 2020; Hermans et al., 2020; Miller et al., 2021; Thomas et al., 2022). Other more recent algorithms of a similar flavour include Sequential Neural Variational Inference (SNVI) (Glöckler et al., 2022), Generative Adversarial Training for SBI (GATSBI) (Ramesh et al., 2022), Truncated SNPE (TSNPE) (Deistler et al., 2022a), and Sequential Unnormalized Neural Likelihood Estimation (SUNLE) (Glaser et al., 2022).

Figure 1. Visualisation of posterior inference using Neural Posterior Score Estimation (NPSE) in the 'Two Moons' experiment. The forward process transforms samples from the target posterior distribution p(θ|x) to a tractable reference distribution. The backward process transports samples from the reference to the target posterior. The backward process depends on the scores ∇_θ log p_t(θ|x), which can be estimated using score matching techniques given access to samples (θ, x) ∼ p(θ) p(x|θ) (see Section 2.2).
In this paper, we present Neural Posterior Score Estimation (NPSE), as well as its sequential variant (SNPSE). Our method, inspired by the remarkable success of score-based generative models (Song & Ermon, 2019; Song et al., 2021; Ho et al., 2020), utilises a conditional score-based diffusion model to generate samples from the posterior of interest. While similar approaches (e.g., Batzolis et al., 2021; Dhariwal & Nichol, 2021; Song et al., 2021; Tashiro et al., 2021; Chao et al., 2022; Chung & Ye, 2022) have previously found success in a variety of problems, their application to SBI has not yet been widely investigated.¹

In contrast to existing SBI approaches based on normalising flows (e.g., SNLE, SNPE), our approach only requires estimates for the gradient of the log density, or score function, of the intractable likelihood or the posterior, which can be approximated using a neural network via score matching techniques (Hyvärinen, 2005; Vincent, 2011; Song et al., 2020). Since we do not require a normalisable model, our method avoids the need for any strong restrictions on the model architecture. In addition, unlike methods based on generative adversarial networks (e.g., GATSBI), we do not require adversarial training objectives, which are notoriously unstable (Metz et al., 2017; Salimans et al., 2016).
We first discuss how conditional score-based diffusion models can be used for SBI. We then outline how our approach can be embedded within a principled sequential training procedure, which guides simulations towards informative regions using the current approximation of the posterior. We outline in detail a number of possible sequential procedures, several of which could also be used to develop sequential variants of amortised algorithms more recently proposed in the SBI literature (e.g., Dax et al., 2023). We then advocate for our preferred method, Truncated Sequential NPSE (TSNPSE), which uses a series of truncated proposals inspired by the approach in Deistler et al. (2022a). We validate our methods on several benchmark SBI problems as well as a real-world neuroscience problem, obtaining comparable or superior performance to other state-of-the-art methods.

¹ In parallel with an early version of this work, Geffner et al. (2023) also studied the use of diffusion models for SBI. We provide a comparison with this paper in Section 4.3 and Appendix D.
2. Simulation-Based Inference with Diffusion Models

2.1. Simulation-Based Inference

Suppose that we have access to a simulator which, given input parameters θ ∈ R^d, generates synthetic data x ∈ R^p. We assume that parameters are distributed according to some known prior p(θ), but that the likelihood p(x|θ) is intractable. Given an observation x_obs, we are interested in generating samples from the posterior distribution p(θ|x_obs) ∝ p(θ) p(x_obs|θ), given a finite number of i.i.d. samples {(θ_i, x_i)}_{i=1}^N ∼ p(θ) p(x|θ).
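To make this setup concrete, the following is a minimal sketch of generating such a training set; the Gaussian prior and noisy linear "simulator" are illustrative assumptions, standing in for the black-box simulators considered in the paper.

```python
# A minimal sketch of the data-generating setup in Section 2.1. The Gaussian
# prior and the noisy linear "simulator" are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, p, N = 2, 3, 1000                       # dim(theta), dim(x), simulation budget

def sample_prior(n):                       # p(theta): standard Gaussian (assumed)
    return rng.normal(size=(n, d))

def simulate(theta):                       # p(x | theta): noisy linear map (assumed)
    A = np.ones((d, p))
    return theta @ A + 0.1 * rng.normal(size=(theta.shape[0], p))

theta = sample_prior(N)                    # theta_i ~ p(theta)
x = simulate(theta)                        # x_i ~ p(x | theta_i)
dataset = list(zip(theta, x))              # {(theta_i, x_i)}_{i=1}^N ~ p(theta) p(x | theta)
```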
2.2. Diffusion Models for Simulation-Based Inference

We propose to tackle this problem using conditional score-based diffusion models (e.g., Song et al., 2021). In such models, noise is gradually added to the target distribution using a diffusion process, resulting in a tractable reference distribution, e.g., a standard Gaussian. The time-reversal of this process is also a diffusion process, whose dynamics can be approximated using score matching (Hyvärinen, 2005; Vincent, 2011; Song & Ermon, 2020; Song et al., 2021). One can thus generate samples from the target distribution by simulating the approximate reverse-time process, initialised at samples from the reference distribution.
More concretely, we begin by defining a forward noising process (θ_t)_{t∈[0,T]} which, initialised at θ_0 ∼ p(·|x), evolves according to the stochastic differential equation (SDE)

    \mathrm{d}\theta_t = f(\theta_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w_t,    (2)

where f : R^d × R_+ → R^d is the drift coefficient, g : R_+ → R is the diffusion coefficient, and (w_t)_{t≥0} is a standard R^d-valued Brownian motion. The coefficients f and g are chosen such that, for all x ∈ R^p, the forward noising process admits a unique stationary distribution π from which it is easy to sample, e.g., a standard Gaussian.
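As one concrete, purely illustrative instance of (2), the sketch below assumes the variance-preserving (VP) SDE of Song et al. (2021), with f(θ, t) = -0.5 β(t) θ and g(t) = sqrt(β(t)), for which the transition kernel p_{t|0}(θ_t|θ_0) is Gaussian in closed form; the main text leaves f and g generic, so this particular schedule is an assumption.

```python
# A sketch of one standard concrete choice for the forward SDE (2): the
# variance-preserving (VP) SDE of Song et al. (2021), with
# f(theta, t) = -0.5 * beta(t) * theta and g(t) = sqrt(beta(t)). The linear
# noise schedule below is an assumption; the paper leaves f and g generic.
import torch

BETA_MIN, BETA_MAX, T = 0.1, 20.0, 1.0

def alpha(t):
    """alpha_t = exp(-0.5 * int_0^t beta(s) ds) for the linear schedule."""
    integral = BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2 / T
    return torch.exp(-0.5 * integral)

def perturb(theta0, t):
    """Sample theta_t ~ p_{t|0}(theta_t | theta_0) = N(alpha_t theta_0, (1 - alpha_t^2) I)."""
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps
    # Closed-form score of the Gaussian transition kernel, which later serves
    # as the regression target in the denoising objective (7):
    score_target = -eps / std
    return theta_t, score_target
```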
Under mild conditions, the time-reversed process (θ̄_t)_{t∈[0,T]} := (θ_{T−t})_{t∈[0,T]} is also a diffusion process (Anderson, 1982; Föllmer, 1985; Haussmann & Pardoux, 1986). Initialised at θ̄_0 ∼ p_T(·|x), this process evolves according to

    \mathrm{d}\bar{\theta}_t = \left[ -f(\bar{\theta}_t, T-t) + g^2(T-t)\, \nabla_\theta \log p_{T-t}(\bar{\theta}_t \mid x) \right] \mathrm{d}t + g(T-t)\,\mathrm{d}w_t,    (3)

where p_t(·|x) = ∫ p_{t|0}(·|θ_0) p(θ_0|x) dθ_0 denotes the time marginal density of θ_t, conditioned on x. By definition, the marginals of (θ̄_t)_{t∈[0,T]} | x are equal to those of (θ_{T−t})_{t∈[0,T]} | x. Thus, in particular, θ̄_T ∼ p_0(·|x) := p(·|x). Hence, if we could sample θ̄_0 ∼ p_T(·|x), and simulate (θ̄_t)_{t∈[0,T]} according to (3), then its final distribution would be the desired posterior distribution. This process is visualised in Figure 1.
Although this procedure provides an elegant sampling mechanism, it does not allow us to evaluate the density p_0(θ|x) := p(θ|x) of these samples. Fortunately, there exists an ODE with the same marginals as (2), which does enable density evaluation. This deterministic process, known as the probability flow ODE (Song et al., 2021), defines (θ_t)_{t∈[0,T]} according to

    \frac{\mathrm{d}\theta_t}{\mathrm{d}t} = f(\theta_t, t) - \frac{1}{2} g^2(t)\, \nabla_\theta \log p_t(\theta_t \mid x),    (4)

where once again θ_0 ∼ p(·|x). In this case, the log densities log p_t(θ_t|x) can be computed exactly via the instantaneous change-of-variables formula (Chen et al., 2018a):

    \frac{\mathrm{d} \log p_t(\theta_t \mid x)}{\mathrm{d}t} = -\mathrm{Tr}\left( \nabla_\theta \left[ f(\theta_t, t) - \frac{1}{2} g^2(t)\, \nabla_\theta \log p_t(\theta_t \mid x) \right] \right).    (5)
In practice, we cannot simulate (3) or (4) directly, since we do not have access to p_T(·|x), or the scores ∇_θ log p_t(θ_t|x). We will therefore rely on two approximations. First, we will assume that p_T ≈ π. Second, we will approximate ∇_θ log p_t(θ_t|x) using score matching (e.g., Song et al., 2021), and substitute this approximation into (3) or (4). In this case, the ODE in (4) is an instance of a continuous normalising flow (CNF) (Grathwohl et al., 2019).

There are various ways in which we can obtain this approximation. Here, we choose to train a time-varying score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) to directly approximate the score of the perturbed posterior (Dhariwal & Nichol, 2021; Song et al., 2021; Batzolis et al., 2021).²
In this case, a natural objective is the weighted Fisher divergence

    J^{\mathrm{SM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_t(\theta_t, x)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_\theta \log p_t(\theta_t \mid x) \|^2 \right] \mathrm{d}t,    (6)

where λ_t : [0, T] → R_+ is a positive weighting function, and p_t(θ_t, x) denotes the joint distribution of (θ_t, x). In practice, this objective cannot be evaluated directly, since it depends on the posterior scores ∇_θ log p_t(θ_t|x). Fortunately, one can show (e.g., Batzolis et al., 2021; Tashiro et al., 2021; Appendix A.1) that it is equivalent to minimise the conditional denoising posterior score matching objective, given by

    J^{\mathrm{DSM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, p(\theta_0)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t,    (7)

where p_{t|0}(θ_t|θ_0) denotes the transition density defined by (2). In particular, this objective is minimised when s_ψ(θ_t, x, t) = ∇_θ log p_t(θ_t|x) for almost all θ_t ∈ R^d, x ∈ R^p, and t ∈ [0, T].
The expectation in (7) only depends on samples θ_0 ∼ p(θ) from the prior, x ∼ p(x|θ_0) from the simulator, and θ_t ∼ p_{t|0}(θ_t|θ_0) from the forward diffusion (2). Moreover, given a suitable choice for the drift and diffusion coefficients in (2), the scores ∇_{θ_t} log p_{t|0}(θ_t|θ_0) can be computed in closed form. We can thus compute a Monte Carlo estimate of (7), and minimise this to obtain s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x).
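A minimal sketch of such a Monte Carlo estimate is given below, assuming the VP-SDE perturbation sketched earlier (so the regression target is -eps / std), a constant weighting λ_t = 1, and a small fully-connected score network; all of these are illustrative assumptions rather than the choices made in the paper.

```python
# A sketch of a Monte Carlo estimate of the denoising objective (7), under an
# assumed VP-SDE forward process, lambda_t = 1, and a toy score network.
import torch
import torch.nn as nn

d, p = 2, 3                                           # dim(theta), dim(x)

score_net = nn.Sequential(                            # s_psi(theta_t, x, t)
    nn.Linear(d + p + 1, 128), nn.SiLU(),
    nn.Linear(128, 128), nn.SiLU(),
    nn.Linear(128, d),
)

def alpha(t, beta_min=0.1, beta_max=20.0, T=1.0):     # VP-SDE alpha_t (assumed schedule)
    return torch.exp(-0.5 * (beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2 / T))

def dsm_loss(theta0, x, T=1.0):
    """One Monte Carlo estimate of (7) over a batch of (theta_0, x) pairs."""
    t = torch.rand(theta0.shape[0]) * (T - 1e-3) + 1e-3   # avoid t = 0 (std -> 0)
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps                  # theta_t ~ p_{t|0}(. | theta_0)
    target = -eps / std                               # grad_{theta_t} log p_{t|0}(theta_t | theta_0)
    pred = score_net(torch.cat([theta_t, x, t[:, None]], dim=-1))
    return 0.5 * ((pred - target) ** 2).sum(dim=-1).mean()
```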
We now have all of the necessary ingredients to generate approximate samples from the target posterior distribution:

(i) Draw samples θ_0 ∼ p(θ) from the prior, x ∼ p(x|θ_0) from the likelihood, and θ_t ∼ p_{t|0}(θ_t|θ_0) using the forward process (2).

(ii) Using these samples, train a time-varying score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising a Monte Carlo estimate of (7).

(iii) Draw samples θ̄_0 ∼ π(·). Simulate an approximation of the reverse-time process in (3), or the time-reversal of the probability flow ODE in (4), with x = x_obs, replacing ∇_θ log p_t(θ_t|x_obs) → s_ψ(θ_t, x_obs, t).
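As an illustration of step (iii), the sketch below applies an Euler-Maruyama discretisation of the reverse-time SDE (3) under the same assumed VP-SDE, conditioning the learned score network on x_obs; the step count and the score network input convention are assumptions.

```python
# A sketch of step (iii): Euler-Maruyama discretisation of the reverse-time
# SDE (3) under an assumed VP-SDE, with the trained score network evaluated at
# x = x_obs. The input convention [theta_t, x, t] is an illustrative assumption.
import torch

@torch.no_grad()
def sample_posterior(score_net, x_obs, n_samples, d, n_steps=1000,
                     beta_min=0.1, beta_max=20.0, T=1.0):
    beta = lambda t: beta_min + (beta_max - beta_min) * t / T
    dt = T / n_steps
    theta = torch.randn(n_samples, d)                 # theta_bar_0 ~ pi = N(0, I)
    x = x_obs.reshape(1, -1).expand(n_samples, -1)
    for k in range(n_steps):
        t = T - k * dt                                # integrate from t = T down to t ~ 0
        t_vec = torch.full((n_samples, 1), t)
        score = score_net(torch.cat([theta, x, t_vec], dim=-1))   # ~ grad log p_t(theta | x_obs)
        f = -0.5 * beta(t) * theta                    # VP-SDE drift f(theta, t)
        drift = -f + beta(t) * score                  # reverse-time drift in (3), g^2(t) = beta(t)
        theta = theta + drift * dt + (beta(t) * dt) ** 0.5 * torch.randn_like(theta)
    return theta                                      # approximate samples from p(theta | x_obs)
```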
² In Appendix B, we outline an alternative approach which instead trains a score network to approximate the score of the perturbed likelihood ∇_θ log p_t(x|θ_t). We refer to this approach as Neural Likelihood Score Estimation (NLSE).
In line with the current SBI taxonomy, we will refer to this approach as Neural Posterior Score Estimation (NPSE). In Appendix A.2, we provide error bounds for NPSE in the fully deterministic sampling regime, assuming an L² bound on the approximation error and a mild regularity condition on the target posterior p(·|x_obs). Our result is adapted from Benton et al. (2024, Theorem 6).
3. Sequential Neural Score Estimation

Given enough data and a sufficiently flexible model, the optimal score network s_ψ(θ_t, x, t) will equal ∇_θ log p_t(θ_t|x) for almost all x ∈ R^p, θ_t ∈ R^d, and t ∈ [0, T]. Thus, in theory, we can use the methods in the previous section to generate samples θ ∼ p(θ|x) for any observation x.
In practice, we are often only interested in sampling from the posterior for a particular experimental observation x = x_obs. Thus, given a finite simulation budget, it may be more efficient to train the score network using simulated data which is close to x_obs, and thus more informative for learning the posterior scores ∇_θ log p_t(θ_t|x_obs). This can be achieved by drawing initial parameter samples from a suitably chosen proposal prior, θ_0 ∼ p̃(θ), rather than the true prior θ_0 ∼ p(θ). This idea is central to existing sequential SBI algorithms, which use a sequence of adaptively chosen proposals in order to guide simulations towards more informative regions. The central challenge associated with developing a successful sequential algorithm is how to effectively correct for the mismatch between the so-called proposal posterior

    \tilde{p}(\theta \mid x) = p(\theta \mid x)\, \frac{\tilde{p}(\theta)}{p(\theta)}\, \frac{p(x)}{\tilde{p}(x)},    (8)

and the true posterior p(θ|x) ∝ p(θ) p(x|θ). In the following sections, we introduce several possible sequential variants of NPSE, which we collectively refer to as SNPSE. We note, as pointed out in the introduction, that in principle these approaches could also be used to develop sequential variants of the recently proposed flow-matching posterior estimation (FMPE) algorithm (Dax et al., 2023).
We begin by outlining some generic features of the sequential procedure, which hold irrespective of the specific sequential method employed (see Sections 3.1-3.2). In all cases, the sequential procedure will take place over R rounds, indexed by r ≥ 1. Given a total budget of N simulations, we assume the simulations are evenly distributed across rounds: N_r = N/R =: M for r = 1, . . . , R, where N_r is the number of simulations in round r. In the first round, we follow the standard NPSE algorithm (Section 2). In particular, we first generate {θ^1_{0,i}}_{i=1}^M ∼ p(θ) from the prior, and {x^1_i}_{i=1}^M ∼ p(x|θ^1_{0,i}) using the simulator. These samples are used to train a score network s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising (7). By substituting this into (3), we can generate samples approximately from the target posterior. Following the initial round, there are several conceivable sequential procedures one could use to generate samples from p(θ|x_obs). We now describe several such methods. Broadly speaking, these procedures differ in (i) how they define the proposal prior; and (ii) how they correct for the mismatch between the proposal posterior and the true posterior.
3.1. Truncated Approach

We first introduce our preferred method: Truncated SNPSE (TSNPSE). This algorithm, summarised in Algorithm 1, utilises a series of proposals given by truncated versions of the prior, inspired by the approach in Deistler et al. (2022a). For r ≥ 1, let p^{r−1}_ψ(θ|x_obs) denote the approximation to the target posterior learned in the (r−1)th round, with the convention that p^0_ψ(θ) := p(θ). Then, in the rth round, we will use the highest-probability region of this approximation to define a truncated version of the prior. To be precise, in the rth round, suppose we define

    \bar{p}^r(\theta) \propto p(\theta) \cdot \mathbb{I}\{\theta \in \mathrm{HPR}_\varepsilon(p^{r-1}_\psi(\theta \mid x_{\mathrm{obs}}))\},    (9)

where HPR_ε(·) denotes the highest 1 − ε probability region, defined as the smallest region which contains 1 − ε of the mass; and we adopt the convention that p̄^0(θ) = p(θ). We then define the proposal distribution for this round as p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p̄^s(θ). Additional details regarding how to compute and sample from this proposal distribution are provided in Appendix E.3.
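As a rough, sample-based sketch of one way to realise the truncation in (9), in the spirit of Deistler et al. (2022a): estimate the 1 − ε highest-probability region from samples of the current approximation (assuming its log-density can be evaluated, e.g., via the probability flow ODE (4)-(5)), and then sample the truncated prior by rejection. The helper names below are hypothetical, and this is not necessarily the procedure of Appendix E.3.

```python
# A rough sketch of the truncation in (9). sample_approx and log_prob_approx
# are hypothetical callables for the current approximation p_psi^{r-1}(. | x_obs);
# sample_prior draws from p(theta).
import torch

def hpr_threshold(sample_approx, log_prob_approx, eps=1e-4, n=10_000):
    """Estimate the log-density level bounding the 1 - eps highest-probability region."""
    theta = sample_approx(n)                          # theta ~ p_psi^{r-1}(. | x_obs)
    return torch.quantile(log_prob_approx(theta), eps)

def sample_truncated_prior(sample_prior, log_prob_approx, threshold, n):
    """Rejection sampler for bar{p}^r(theta) in (9): the prior restricted to the HPR."""
    accepted, n_acc = [], 0
    while n_acc < n:
        theta = sample_prior(4 * n)                   # propose from the prior
        keep = log_prob_approx(theta) >= threshold    # accept only points inside the HPR
        accepted.append(theta[keep])
        n_acc += int(keep.sum())
    return torch.cat(accepted)[:n]
```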
Crucially, under the assumption that we do not truncate regions which have non-zero mass under the true posterior p(θ|x_obs), this proposal distribution is proportional to the prior within the support of the posterior. Thus, we do not need to perform a correction. In particular, our loss function remains minimised at the score of the target posterior. This statement is formalised in the following proposition.

Proposition 3.1. Let p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p̄^s(θ), where p̄^0(θ) = p(θ) and p̄^s(θ) is defined by (9) for all s ≥ 1. Suppose that Θ_obs ⊆ HPR_ε(p^s_ψ(θ|x_obs)) for all s ≥ 1, where Θ_obs = supp(p(·|x_obs)). Then, writing p̃^r_t(θ_t, x) for the distribution of (θ_t, x) when (θ_0, x) ∼ p̃^r(θ, x), the minimiser ψ* of the loss function

    J^{\mathrm{TSNPSE-SM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{\tilde{p}^r_t(\theta_t, x)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_\theta \log p_t(\theta_t \mid x) \|^2 \right] \mathrm{d}t,    (10)

or, equivalently, of the loss function

    J^{\mathrm{TSNPSE-DSM}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, \tilde{p}^r(\theta_0)}\left[ \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t,    (11)

satisfies s_ψ*(θ_t, x_obs, t) = ∇_θ log p_t(θ_t|x_obs).
Proof. See Appendix C.1.
Algorithm 1 TSNPSE
Inputs: Observation x_obs, prior p(θ) =: p̄^0(θ), simulator p(x|θ), simulation budget N, number of rounds R, simulations per round M = N/R, dataset D = {}.
Outputs: p_ψ(θ|x_obs) ≈ p(θ|x_obs).
for r = 1, . . . , R do
    for i = 1, . . . , M do
        Draw θ_i ∼ p̄^{r−1}(θ), x_i ∼ p(x|θ_i).
        Add (θ_i, x_i) to D.
    end for
    Learn s_ψ(θ_t, x, t) ≈ ∇_θ log p_t(θ_t|x) by minimising a Monte Carlo estimate of (11) based on dataset D.
    Compute p̄^r(θ) in (9) using s_ψ(θ_t, x_obs, t). See Appendix E.3 for details.
end for
Obtain a sampler for p_ψ(θ|x_obs) by substituting s_ψ(θ_t, x_obs, t) ≈ ∇_θ log p_t(θ_t|x_obs) in (4).
Return: p_ψ(θ|x_obs).
3.2. Alternative Approaches

We now outline several other possible sequential approaches for NPSE. An extensive and detailed discussion of these methods, as well as supporting numerical results, can be found in Appendix C. Broadly speaking, these methods can be viewed as score-based analogues of existing sequential variants of NPE, namely, SNPE-A (Papamakarios & Murray, 2016), SNPE-B (Lueckmann et al., 2017), and SNPE-C (Greenberg et al., 2019). We refer to, e.g., Durkan et al. (2020) for a concise overview of SNPE-A, SNPE-B, and SNPE-C.

Unlike TSNPSE, in each of these methods, the proposal prior is defined directly in terms of the most recent approximation of the posterior. In particular, in the rth round, we now sample new parameters {θ^r_{0,i}}_{i=1}^M ∼ p^{r−1}_ψ(θ|x_obs) and simulate new data {x^r_i}_{i=1}^M ∼ p(x|θ^r_{0,i}). We then concatenate these samples with those from previous rounds to form ∪_{s=1}^r {(θ^s_{0,i}, x^s_i)}_{i=1}^M ∼ p̃^r(θ) p(x|θ), where p̃^r(θ) = (1/r) Σ_{s=0}^{r−1} p^s_ψ(θ|x_obs), and p^0_ψ(θ|x_obs) := p(θ).
In this case, if we were to minimise the original score matching objective (7), but using samples θ_0 ∼ p̃^r(θ) rather than θ_0 ∼ p(θ), we would learn a score network which approximates ∇_θ log p̃^r_t(θ_t|x), rather than ∇_θ log p_t(θ_t|x), where p̃^r_t(θ_t|x) = ∫_{R^d} p_{t|0}(θ_t|θ_0) p̃^r(θ_0|x) dθ_0, and p̃^r(θ|x) = p̃^r(θ) p(x|θ) / p̃^r(x). Substituting this score network, evaluated at x = x_obs, into (3) or (4), would then result in samples θ ∼ p̃^r(θ|x_obs), rather than θ ∼ p(θ|x_obs). We thus require a correction to recover samples from the correct posterior.
SNPSE-A. The first approach is to perform a post-hoc importance weight correction using, e.g., sampling-importance resampling (SIR) (Rubin, 1987; 1988; Smith & Gelfand, 1992; Gelman et al., 1995). According to this approach, we first generate {θ̃_i}_{i=1}^{M′} ∼ p̃^r_ψ(·|x_obs), where p̃^r_ψ(·|x_obs) denotes the approximate proposal posterior obtained in the rth round, and M′ ≫ M. We then draw samples {θ_i}_{i=1}^M with or without replacement from {θ̃_i}_{i=1}^{M′}, with sample probabilities, w̃_i, proportional to the importance ratios

    \tilde{h}_i = \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r_\psi(\tilde{\theta}_i \mid x_{\mathrm{obs}})}.    (12)

In the limit as M′ → ∞, this sample will consist of independent draws from p(·|x_obs) (e.g., Smith & Gelfand, 1992). In practice, we cannot evaluate p(·|x_obs) in (12), and thus will instead use sample probabilities w_i proportional to

    h_i = \frac{p(\tilde{\theta}_i)}{\tilde{p}^r(\tilde{\theta}_i)}.    (13)

The importance ratios in (13) are approximately proportional to the correct importance ratios in (12), since

    h_i = \frac{p(\tilde{\theta}_i)}{\tilde{p}^r(\tilde{\theta}_i)} \propto \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r(\tilde{\theta}_i \mid x_{\mathrm{obs}})} \approx \frac{p(\tilde{\theta}_i \mid x_{\mathrm{obs}})}{\tilde{p}^r_\psi(\tilde{\theta}_i \mid x_{\mathrm{obs}})} = \tilde{h}_i.    (14)

Although SNPSE-A can work well in simple settings, it is fundamentally limited by the approximation introduced in (14). In particular, when there is a significant mismatch between the true proposal, p̃^r(·|x_obs), and the approximate (learned) proposal, p̃^r_ψ(·|x_obs), this approach can lead to inaccurate inference (see Appendix C.2).
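A minimal sketch of this correction, assuming access to a sampler for the learned proposal posterior and to the densities p(θ) and p̃^r(θ) (the three callables below are hypothetical placeholders), is as follows.

```python
# A sketch of the SNPSE-A correction: sampling-importance resampling (SIR)
# with the tractable ratios (13) in place of (12).
import torch

def sir_correction(sample_proposal_posterior, log_prior, log_proposal_prior,
                   n_keep, n_draw=100_000, replacement=True):
    theta = sample_proposal_posterior(n_draw)               # theta_tilde_i ~ p_tilde^r_psi(. | x_obs)
    log_h = log_prior(theta) - log_proposal_prior(theta)    # log h_i, cf. (13)
    weights = torch.softmax(log_h, dim=0)                   # normalised resampling probabilities
    idx = torch.multinomial(weights, n_keep, replacement=replacement)
    return theta[idx]                                       # approximate draws from p(. | x_obs)
```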
SNPSE-B. The second approach is to include an importance weight correction within the denoising score matching objective (7). In particular, in the rth round, we now minimise a Monte Carlo estimate of

    J^{\mathrm{SNPSE-B}}_{\mathrm{post}}(\psi) = \frac{1}{2} \int_0^T \lambda_t\, \mathbb{E}_{p_{t|0}(\theta_t \mid \theta_0)\, p(x \mid \theta_0)\, \tilde{p}^r(\theta_0)}\left[ \frac{p(\theta_0)}{\tilde{p}^r(\theta_0)} \| s_\psi(\theta_t, x, t) - \nabla_{\theta_t} \log p_{t|0}(\theta_t \mid \theta_0) \|^2 \right] \mathrm{d}t.    (15)

It is straightforward to show that this objective is minimised at the score of the true posterior, that is, by ψ* such that s_ψ*(θ_t, x, t) = ∇_θ log p_t(θ_t|x) (see Appendix C.3). Unfortunately, similar to SNPE-B (Lueckmann et al., 2017), the importance weights are often high variance, resulting in unstable training and poor overall algorithm performance (e.g., Papamakarios et al., 2019; Durkan et al., 2019).
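A minimal sketch of a Monte Carlo estimate of (15) is given below, reusing the assumed VP-SDE perturbation from the earlier sketches and hypothetical log-density callables for p(θ) and p̃^r(θ); the per-sample weights are exactly the source of the variance issue noted above.

```python
# A sketch of a Monte Carlo estimate of the SNPSE-B objective (15): the
# denoising loss of (7) reweighted per sample by p(theta_0) / p_tilde^r(theta_0).
import torch

def snpse_b_loss(score_net, theta0, x, log_prior, log_proposal_prior, alpha, T=1.0):
    t = torch.rand(theta0.shape[0]) * (T - 1e-3) + 1e-3
    a = alpha(t)[:, None]
    std = torch.sqrt(1.0 - a ** 2)
    eps = torch.randn_like(theta0)
    theta_t = a * theta0 + std * eps
    target = -eps / std                                    # grad log p_{t|0}(theta_t | theta_0)
    pred = score_net(torch.cat([theta_t, x, t[:, None]], dim=-1))
    sq_err = ((pred - target) ** 2).sum(dim=-1)
    w = torch.exp(log_prior(theta0) - log_proposal_prior(theta0))  # p(theta_0) / p_tilde^r(theta_0)
    return 0.5 * (w * sq_err).mean()                       # these weights are often high variance
```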
SNPSE-C. The third approach is to include a score-based correction within the denoising posterior score matching objective (7). In this case, we minimise (7), now