MULTI-FIDELITY MONTE CARLO:
A PSEUDO-MARGINAL APPROACH
By Diana Cai and Ryan P. Adams
Princeton University
Markov chain Monte Carlo (MCMC) is an established approach
for uncertainty quantification and propagation in scientific applica-
tions. A key challenge in applying MCMC to scientific domains is
computation: the target density of interest is often a function of ex-
pensive computations, such as a high-fidelity physical simulation, an
intractable integral, or a slowly-converging iterative algorithm. Thus,
using MCMC algorithms with an expensive target density becomes
impractical, as these expensive computations need to be evaluated
at each iteration of the algorithm. In practice, these computations are
often approximated via a cheaper, low-fidelity computation, leading to
bias in the resulting target density. Multi-fidelity MCMC algorithms
combine models of varying fidelities in order to obtain an approximate
target density with lower computational cost. In this paper, we de-
scribe a class of asymptotically exact multi-fidelity MCMC algorithms
for the setting where a sequence of models of increasing fidelity can
be computed that approximates the expensive target density of inter-
est. We take a pseudo-marginal MCMC approach for multi-fidelity
inference that utilizes a cheaper, randomized-fidelity unbiased esti-
mator of the perfect-fidelity target density, constructed via random truncation of a
telescoping series of the low-fidelity sequence of models. Finally, we
discuss and evaluate the proposed multi-fidelity MCMC approach on
several applications, including log-Gaussian Cox process modeling,
Bayesian ODE system identification, PDE-constrained optimization,
and Gaussian process regression parameter inference.
1. Introduction.
Simulation and computational modeling play a key role in science, engineering, economics,
and many other areas. When these models are high-quality and accurate,
they are important for scientific discovery, design, and data-driven decision making. However,
the ability to accurately model complex physical phenomena often comes with a significant
cost—many models involve expensive computations that then need to be evaluated repeatedly in,
for instance, a sampling or optimization algorithm. Examples of model classes with expensive
computations include intractable integrals or sums, expensive quantum simulations (Troyer and
Wiese, 2005), expensive numerical simulations arising from partial differential equations (PDEs)
(Raissi et al., 2017), and large systems of ordinary differential equations (ODEs).
In many situations, one has the ability to trade off computational cost against fidelity or
accuracy in the result. Such a tradeoff might arise from the choice of discretization or the number
of basis functions when solving a PDE, or the number of quadrature points when estimating
an integral. It is often possible to leverage lower-fidelity models to help accelerate high-quality
solutions, e.g., by using multigrid methods (Hackbusch,2013) for spatial discretizations. More
generally, multi-fidelity methods combine multiple models of varying cost and fidelity to accelerate
computational algorithms and have been applied to solving inverse problems (Higdon et al.,2002;
Cui et al.,2015;Raissi et al.,2017), trust region optimization (Alexandrov et al.,1998;Arian
Keywords and phrases: Markov Chain Monte Carlo, multi-fidelity models, inverse modeling, simulation
arXiv:2210.01534v1 [stat.ML] 4 Oct 2022
Fig 1: Examples of low-fidelity sequences of models. (a) A sequence of trapezoid quadrature estimates I_k, where I_k is the trapezoid rule with 2^k trapezoids (k = 1, 2, 3). (b) Lotka-Volterra ODE solutions for prey u(t) (blue) and predator v(t) (red) using Euler's method with step size dt = 1/k.
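The hierarchy in Fig 1(a) is easy to reproduce. Below is a minimal sketch of the trapezoid sequence I_k, where each refinement doubles the number of trapezoids; the integrand sin on [0, π] is an arbitrary illustrative choice, not one used in the paper:

```python
import numpy as np

def trapezoid_estimate(f, a, b, k):
    """I_k: composite trapezoid rule with 2**k trapezoids on [a, b]."""
    x = np.linspace(a, b, 2 ** k + 1)          # 2**k subintervals
    y = f(x)
    h = (b - a) / 2 ** k
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

# Fidelity increases with k: each I_k is cheap, and I_k -> integral as k grows.
estimates = [trapezoid_estimate(np.sin, 0.0, np.pi, k) for k in range(1, 9)]
errors = [abs(I - 2.0) for I in estimates]     # exact integral of sin on [0, pi] is 2
```

Each halving of the step size roughly quarters the error, giving a natural cost/fidelity ladder.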
et al.,2000;Fahl and Sachs,2003;Robinson et al.,2008;March and Willcox,2012), Bayesian
optimization (Jones et al.,1998;Gramacy and Lee,2009;Song et al.,2019;Wu et al.,2020;Li
et al.,2020;Brevault et al.,2020), Bayesian quadrature (Gessner et al.,2020;Xi et al.,2018),
and sequential learning (Gundersen et al.,2021;Palizhati et al.,2022).
One critically important tool for scientific and engineering computation is Markov chain Monte
Carlo (MCMC), which is widely used for uncertainty quantification, optimization, and integration.
MCMC methods are recipes for constructing a Markov chain with some desired target distribution
as the limiting distribution. Pseudo-random numbers are used to simulate transitions of the
Markov chain in order to produce samples from the target distribution. However, MCMC often
becomes impractical for high-fidelity models, where a single step of the Markov chain may, for
instance, involve a numerical simulation that takes hours or days to complete. Multi-fidelity
methods for MCMC focus on constructing Markov chain transition operators that are sometimes
able to use inexpensive low-fidelity evaluations instead of expensive high-fidelity evaluations.
The goal is to increase the effective number of samples generated by the algorithm, given a
constrained computational budget. A large focus of the multi-fidelity MCMC literature is on
two-stage Metropolis-Hastings (M-H) methods (Christen and Fox,2005;Efendiev et al.,2006),
which use a single low-fidelity model for early rejection of a proposed sample, thereby often
short-circuiting the evaluation of the expensive, high-fidelity model.
However, there are several limitations of two-stage multi-fidelity Monte Carlo. First, in many
applications, a hierarchy of cheaper, low-fidelity models is available; for instance, in the case of
integration, k-point quadrature estimates form a hierarchy of low-fidelity models, and in the case
of a PDE, discretizations of varying resolution do. Thus, the two-stage approach does not fully utilize the
availability of a hierarchy of fidelities and may be more suitable for settings where the high- and
low-fidelity models are not hierarchically related, e.g., semi-empirical methods vs. Hartree-Fock
in computational chemistry. In addition, in such applications, there is often a limiting model of
interest, such as a continuous function that the low-fidelity discretizations approximate. Two-stage
MCMC does not asymptotically sample from this limiting target density and will at best sample
from an approximation of the biased, high-fidelity posterior. Finally, the two-stage method is
unnatural to generalize to more sophisticated MCMC algorithms such as slice sampling and
Hamiltonian Monte Carlo (HMC).
We propose a class of multi-fidelity MCMC methods designed for applications with a hierarchy
of low-fidelity models available. More specifically, we assume access to a sequence of low-fidelity
models that converge to a “perfect-fidelity” model in the limit. Within an MCMC algorithm, we
can approximate the perfect-fidelity target density with an unbiased estimator constructed from a
randomized truncation of the infinite telescoping series of low-fidelity target densities. This class
of multi-fidelity MCMC is an example of a pseudo-marginal MCMC (PM-MCMC) algorithm—the
unbiased estimator essentially guarantees that the algorithm is asymptotically exact in that the
limiting distribution recovers the perfect-fidelity target distribution as its marginal distribution.
Our approach introduces the fidelity of a model as an auxiliary random variable that is evolved
separately from the target variable according to its own conditional target distribution; this
technique can be used in conjunction with any suitable MCMC update that leaves the conditional
update for the target variable of interest invariant, such as M-H, slice sampling, elliptical slice
sampling, or Hamiltonian Monte Carlo. We apply the pseudo-marginal multi-fidelity MCMC
approach to several problems, including log-Gaussian Cox process modeling, Bayesian ODE
system identification, PDE-constrained optimization, and Gaussian process parameter inference.
1.1. Related work. Multi-fidelity MCMC methods are commonly applied in a two-stage proce-
dure, where the goal is to reduce the computational cost of using a single expensive high-fidelity
model by using a cheap low-fidelity model as a low-pass filter for a delayed acceptance/rejection
algorithm (Christen and Fox,2005;Efendiev et al.,2006;Cui et al.,2015); see Peherstorfer
et al. (2018) for a survey. Higdon et al. (2002) propose coupling a high-fidelity Markov chain
with a low-fidelity Markov chain via a product chain. In contrast, our approach aims to sample
from a “perfect-fidelity” target density while reducing computational cost; two-stage MCMC
algorithms result in biased estimates with respect to this target density. A related class of methods
is multilevel Monte Carlo (Giles,2008,2013;Dodwell et al.,2015;Warne et al.,2021), which uses
a hierarchy of multi-fidelity models for Monte Carlo estimation by expressing the expectation of a
high-fidelity model as a telescoping sum of low-fidelity models. Dodwell et al. (2015) use the M-H
algorithm to form the multilevel Monte Carlo estimates, simulating from a separate Markov chain
for each level of the telescoping sum. In practice, multilevel Monte Carlo requires choosing a finite
number of fidelities, inducing bias in the estimator with respect to the (limiting) perfect-fidelity
model. In contrast, our method uses a randomized fidelity within a single Markov chain with the
perfect-fidelity model as the target.
Our approach applies pseudo-marginal MCMC to multi-fidelity problems. There is a rich
literature developing pseudo-marginal MCMC methods (Beaumont,2003;Andrieu and Roberts,
2009) for so-called “doubly-intractable” likelihoods, which are likelihoods that are intractable to
evaluate. Several approaches in the pseudo-marginal MCMC literature are particularly relevant to
our work. The first are the PM-MCMC methods introduced by Lyne et al. (2015), which describe
a class of pseudo-marginal M-H methods that use Russian roulette estimators to obtain unbiased
estimators of the likelihood. However, this method samples the variable of interest jointly with
the auxiliary randomness, which often leads to the chain sticking.
Alternatively, several methods have considered sampling the randomness separately. The idea of
clamping random numbers is explored in depth by Andrieu et al. (2010) and Murray and Graham
(2016); the latter applies this to pseudo-marginal slice sampling. In particular, our approach applies
these ideas to the specific setting of multi-fidelity models, where the random fidelity is treated as
an auxiliary variable. Finally, while our approach applies to doubly-intractable problems, we are
also motivated by a larger class of multi-fidelity problems studied in the computational sciences
that may not even be inference problems, such as quantum simulations and PDE-constrained
optimization.
2. Multi-fidelity MCMC.
Monte Carlo methods approximate integrals and sums that can be expressed as an expectation:

E_π[h(θ)] = ∫ h(θ) π(θ) dθ ≈ (1/T) ∑_{t=1}^{T} h(θ^(t)),  where θ^(t) ∼ π,  (1)

and where π : Θ → ℝ₊ is the target density, which may only be known up to a constant, h(θ) is a
function of interest, and {θ^(t)}_{t=1}^{T} are samples from π. Markov chain Monte Carlo methods are
then used to generate samples θ^(t) from π by simulating from a Markov chain with target π.
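As a minimal illustration of the estimate in Equation (1), with π a standard normal and h(θ) = θ² (both arbitrary choices for illustration), the sample average converges to the expectation; here π can be sampled directly, so no Markov chain is needed:

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo estimate of E_pi[h(theta)] with pi = N(0, 1) and h(theta) = theta**2.
samples = rng.normal(size=100_000)
estimate = np.mean(samples ** 2)               # approximates E[theta^2] = 1
```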
In many settings, pointwise evaluations of the target function π(θ) are expensive or even
intractable; from here on we will assume that the goal is to compute statistics of a quantity of
interest h(θ) with respect to a perfect-fidelity target density π(θ). In practice, the expectation in
Equation (1) is instead estimated using a cheaper, low-fidelity density π_k(θ), where k ∈ ℕ := {1, 2, . . .}.
In particular, we consider settings where there is a sequence of low-fidelity densities
available that converge to the target, i.e., π_k(θ) → π(θ) as k → ∞. We assume that as k increases, the
model becomes higher in fidelity (with respect to π) but more costly to evaluate, increasing in
expense super-linearly with k.
For instance, π could represent a target density that depends on an intractable integral, the
solution of a PDE, the solution of a large system of ODEs, the solution of a large system of
linear equations, or the minimizer of a function. Thus, a typical evaluation of π requires an
approximation at a fidelity k with a tolerable level of bias for a given computational budget.
Here, increasing k could correspond to finer discretizations of differential equations, increasing
numbers of quadrature points, or performing a larger number of iterations in a linear solver or
optimization routine.
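For the ODE case, the fidelity ladder of Fig 1(b) can be sketched with a forward-Euler solver whose step size dt plays the role of 1/k; the Lotka-Volterra parameters below are arbitrary illustrative choices:

```python
import numpy as np

def lotka_volterra_euler(u0, v0, alpha, beta, gamma, delta, t_end, dt):
    """Forward-Euler solution of du/dt = alpha*u - beta*u*v,
    dv/dt = delta*u*v - gamma*v. Halving dt doubles the cost but
    shrinks the discretization error."""
    u, v = u0, v0
    for _ in range(int(round(t_end / dt))):
        u, v = (u + dt * (alpha * u - beta * u * v),
                v + dt * (delta * u * v - gamma * v))
    return u, v

# A fidelity ladder over step sizes dt = 1/k (parameters are arbitrary).
solutions = [lotka_volterra_euler(1.5, 1.0, 0.6, 0.3, 0.3, 0.2, 5.0, 1.0 / k)
             for k in (1, 2, 4, 8, 16)]
```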
In the multi-fidelity setting, the goal is to combine several models of varying fidelity within an
MCMC algorithm to reduce the computational cost of estimating Equation (1). In this paper,
we describe a class of MCMC algorithms that leverages the sequence of low-fidelity models π_k.
Our strategy for multi-fidelity MCMC (MF-MCMC) will be to construct an unbiased estimator
of π(θ) using random choices of the fidelity K and then to include K in the Markov chain
as an auxiliary variable. By carefully constructing such a Markov chain, it will be possible to
asymptotically estimate the functional in Equation (1) as though the samples were taken from
the perfect-fidelity model; each step of the Markov chain will nevertheless only require a finite
amount of computation. Finally, our approach allows us to essentially plug in any valid MCMC
algorithm, and we apply this strategy to develop multi-fidelity variants of a number of MCMC
algorithms, such as M-H and slice sampling.
2.1. Pseudo-marginal MCMC for the multi-fidelity setting. Pseudo-marginal MCMC (Beaumont, 2003;
Andrieu and Roberts, 2009) is a class of auxiliary-variable MCMC algorithms that
replaces the target density π(θ) with an estimator π̂(θ) that is a function of a random variable.
If the estimator is nonnegative and unbiased, i.e., for all θ ∈ Θ, π̂(θ) ≥ 0 and E[π̂(θ)] = π(θ),
then MCMC transitions that use the estimator still have π(θ) as their invariant distribution.
This property is sometimes referred to as “exact-approximate” MCMC, as the transitions are
approximate but the limiting distribution is exact. Estimators can be constructed from a variety
of methods, including particle filtering (Andrieu and Roberts, 2009); our approach will use
randomized series truncations, which have been considered in pseudo-marginal MCMC methods such
as Lyne et al. (2015), Georgoulas et al. (2017), and Biron-Lattes et al. (2022).
We now apply the pseudo-marginal approach to the multi-fidelity setting. Here the target
density estimator arises from a random choice of the fidelity K ∈ ℕ that is governed by a
distribution µ on ℕ. We denote the estimator by π̂_K(θ) to make the dependence on the random
fidelity K explicit. The estimator is constructed such that it is unbiased with respect to µ, i.e.,

∑_{k=1}^{∞} µ(k) π̂_k(θ) = π(θ).  (2)
The distribution µ is also constructed by the user: ideally, the estimator π̂_K(θ) will prefer
smaller values of K while having sufficiently low variance as to allow the Markov chain to mix
effectively. Thus the simulations can be run at inexpensive low fidelities, while the estimates will
be as though the perfect-fidelity model were being used.
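One standard construction satisfying Equation (2) (the paper's general approach is described later; this is a common "single-term" randomized truncation, sketched here as an illustration) writes π as the telescoping series π_1 + ∑_k (π_{k+1} − π_k), draws K ~ µ, and weights the K-th increment by 1/µ(K). The toy scalar sequence π_k = 1 − 2^{-k} below is purely an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def pi_k(k):
    """Toy low-fidelity sequence (illustrative assumption): pi_k -> pi = 1."""
    return 1.0 - 0.5 ** k

def single_term_estimator(rng, r=0.6):
    """Draw K ~ mu with mu(k) = (1 - r) * r**(k - 1), and return the randomly
    truncated telescoping series pi_1 + (pi_{K+1} - pi_K) / mu(K); taking the
    expectation over K telescopes back to the limit pi = 1."""
    K = rng.geometric(1.0 - r)                 # support {1, 2, ...}
    mu_K = (1.0 - r) * r ** (K - 1)
    return pi_k(1) + (pi_k(K + 1) - pi_k(K)) / mu_K

draws = np.array([single_term_estimator(rng) for _ in range(50_000)])
# Each draw touches only finitely many fidelities, yet the estimator is unbiased
# for the perfect-fidelity value.
```

Nonnegativity holds for this toy sequence but is not automatic in general, which is exactly the difficulty discussed at the end of this section.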
The standard pseudo-marginal MCMC approach is to construct a Markov chain that has the
following joint density as its stationary distribution:
π(θ, K) = µ(K) π̂_K(θ).  (3)
Observe that while Equation (3) does not depend on the perfect-fidelity target density π, it
returns the desired marginal π via Equation (2). As a concrete example, a pseudo-marginal
M-H algorithm generates a new state θ′ and fidelity K′ jointly, using q(θ′; θ) as the proposal
for θ′ and q(K′; K) = µ(K′) as the proposal distribution for the fidelity, and accepts/rejects the state
according to

a = [π(θ′, K′) q(θ; θ′) q(K; K′)] / [π(θ, K) q(θ′; θ) q(K′; K)] = [π̂_{K′}(θ′) q(θ; θ′)] / [π̂_K(θ) q(θ′; θ)],  (4)

where the equality holds since the distribution terms for K and K′ cancel. Note that the right-hand
side of Equation (4) is the standard M-H ratio, but with the target density π replaced by the
estimator π̂_K.
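The accept step of Equation (4) can be sketched as follows. The unnormalized densities π̂_k here are a toy Gaussian hierarchy (an illustrative assumption, not one of the paper's applications); the point is the mechanics of proposing K′ ~ µ so that the µ terms cancel:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy unnormalized low-fidelity densities for a 1-D state: Gaussians whose
# variance converges as k grows (a stand-in for a real fidelity hierarchy).
def pi_hat(theta, k):
    var = 1.0 + 0.5 ** k
    return np.exp(-0.5 * theta ** 2 / var)

def pm_mh_step(theta, K, rng, step=1.0, r=0.5):
    """One joint pseudo-marginal M-H step (Equation 4). Proposing K' ~ mu with
    mu(k) = (1 - r) * r**(k - 1) makes the mu terms cancel in the ratio."""
    theta_prop = theta + step * rng.normal()   # symmetric proposal q(theta'; theta)
    K_prop = rng.geometric(1.0 - r)            # independent fidelity proposal
    a = pi_hat(theta_prop, K_prop) / pi_hat(theta, K)
    if rng.uniform() < a:
        return theta_prop, K_prop
    return theta, K

theta, K = 0.0, 1
thetas = []
for _ in range(10_000):
    theta, K = pm_mh_step(theta, K, rng)
    thetas.append(theta)
```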
However, standard pseudo-marginal MCMC using joint proposals of the state and fidelity
can “get stuck” when the estimator is noisy and fail to accept new states. Thus, we apply
the approach of Murray and Graham (2016), which augments the Markov chain to include the
randomness of the estimator via a separate update; here the randomness of the estimator arises
from the fidelity K. Concretely, we construct a Markov chain that simulates from Equation (3)
by alternating sampling between the conditional target densities π(K|θ) and π(θ|K) (steps 5 and
6 of Algorithm 1, respectively). We refer to this strategy as multi-fidelity MCMC (MF-MCMC),
since by conditioning on K = k, the update for the state θ becomes a standard deterministic
update applied to a low-fidelity model π̂_k(θ), and any appropriate MCMC update can be used
here, making it straightforward to use complex MCMC methods, such as slice sampling and
HMC. Similarly, any suitable MCMC update for the fidelity K can be used with the conditional
target π(K|θ).
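One MF-MCMC sweep of this alternating scheme can be sketched as follows, again with a toy Gaussian hierarchy and geometric µ as illustrative assumptions: first an independence M-H update of K given θ, with proposal µ so the µ terms cancel, then any standard M-H update of θ at the fixed fidelity K:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins (illustrative assumptions): pi_hat_k is a cheap unnormalized
# density at fidelity k, and mu(k) = (1 - r) * r**(k - 1) is geometric.
def pi_hat(theta, k):
    var = 1.0 + 0.5 ** k
    return np.exp(-0.5 * theta ** 2 / var)

def mf_mcmc_step(theta, K, rng, step=1.0, r=0.5):
    # Step 1: update K | theta. With an independence proposal K' ~ mu, the mu
    # terms in pi(K|theta), proportional to mu(K) * pi_hat_K(theta), cancel in
    # the M-H ratio.
    K_prop = rng.geometric(1.0 - r)
    if rng.uniform() < pi_hat(theta, K_prop) / pi_hat(theta, K):
        K = K_prop
    # Step 2: update theta | K with any standard kernel; here, random-walk M-H
    # targeting the fixed low-fidelity density pi_hat_K.
    theta_prop = theta + step * rng.normal()
    if rng.uniform() < pi_hat(theta_prop, K) / pi_hat(theta, K):
        theta = theta_prop
    return theta, K

theta, K = 0.0, 1
thetas, Ks = [], []
for _ in range(10_000):
    theta, K = mf_mcmc_step(theta, K, rng)
    thetas.append(theta)
    Ks.append(K)
```

Because step 2 conditions on a fixed fidelity, the random-walk kernel here could be swapped for slice sampling or HMC without changing the invariant joint distribution.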
Many techniques can be used to construct an unbiased estimator of π with randomness K;
we describe a general approach in the next section. However, it is generally difficult to guarantee
that the estimator is nonnegative, as required by pseudo-marginal MCMC. One technique, considered
by Lin et al. (2000) and Lyne et al. (2015), is to instead sample from the target distribution
induced by the absolute value of the estimator and apply a sign correction to the final Monte
Carlo estimate in Equation (1), an approach borrowed from the quantum Monte Carlo literature,
where it is necessary for modeling fermionic particles. This approach has been applied to the M-H