Transport Reversible Jump Proposals
Laurence Davies1,3 Robert Salomone2,3 Matthew Sutton1,3 Christopher Drovandi1,3
1School of Mathematical Sciences, Queensland University of Technology
2School of Computer Science, Queensland University of Technology
3Centre for Data Science, Queensland University of Technology
Abstract
Reversible jump Markov chain Monte Carlo
(RJMCMC) proposals that achieve reasonable
acceptance rates and mixing are notoriously dif-
ficult to design in most applications. Inspired
by recent advances in deep neural network-based
normalizing flows and density estimation, we
demonstrate an approach to enhance the effi-
ciency of RJMCMC sampling by performing
transdimensional jumps involving reference dis-
tributions. In contrast to other RJMCMC propos-
als, the proposed method is the first to apply a
non-linear transport-based approach to construct
efficient proposals between models with compli-
cated dependency structures. It is shown that, in
the setting where exact transports are used, our
RJMCMC proposals have the desirable property
that the acceptance probability depends only on
the model probabilities. Numerical experiments
demonstrate the efficacy of the approach.
1. INTRODUCTION
The problem of interest is sampling from a probability distribution $\pi$ with support

$$\mathsf{X} = \bigcup_{k \in \mathcal{K}} \left(\{k\} \times \Theta_k\right), \qquad (1)$$

where $\mathcal{K}$ is a discrete index set and $\Theta_k \subseteq \mathbb{R}^{n_k}$, where the $n_k$ may differ, and hence $\mathsf{X}$ is a transdimensional space. The choice of notation $\Theta_k$ is made as the problem typically arises in Bayesian model selection, where such sets correspond to the space of model parameters, and $k \in \mathcal{K}$ is a model index or indicator. The reversible jump Markov chain Monte Carlo (RJMCMC) algorithm, formally introduced by Green (1995), generalizes the Metropolis–Hastings algorithm (Hastings, 1970) via the introduction of user-specified diffeomorphisms¹ $h_{k,k'}: \Theta_k \times \mathcal{U}_k \to \Theta_{k'} \times \mathcal{U}_{k'}$, where $\mathcal{U}_k, \mathcal{U}_{k'}$ are (possibly empty) sets chosen to ensure dimensions are matched. Due to this additional complexity, RJMCMC proposals that perform well in practice are generally challenging to design.
RJMCMC methods are a frequently revisited research topic
for which many approaches exist. Brooks et al. (2003)
outline several approaches to improve the efficiency of
jump proposals on various problem types, including vari-
able selection and nested models. Green and Mira (2001)
identify the efficiency issues associated with naïve transdi-
mensional proposals between non-overlapping targets and
propose a delayed-rejection auxiliary-proposal mechanism.
Al-Awadhi et al. (2004) employ an auxiliary target density
instead of an auxiliary proposal. Hastie (2005) proposes
an approach for (potentially) multi-modal conditional tar-
get densities, achieved by fitting a Gaussian mixture model
to each individual conditional target, and using a shift and
whitening transformation corresponding to a randomly se-
lected mixture component. Fan et al. (2008) propose a con-
ditional factorization of a differentiable target density to
sequentially construct a proposal density. Karagiannis and
Andrieu (2013) propose the construction of a Markov chain
through an annealed sequence of intermediate distributions
between models to encourage higher acceptance rates be-
tween model jumps. Farr et al. (2015) propose a KD-tree
approximation of the target density for auxiliary variable
draws to improve the efficiency of RJMCMC proposals.
Gagnon (2021) uses the locally-informed MCMC approach
for discrete spaces developed by Zanella (2020) to improve
the exploration efficiency of the model space when global
jump proposals are unavailable. However, there appears to
be no general strategy for the design of across-model pro-
posals that is widely applicable, theoretically justified, and
of practical use.
¹ Bijective functions that are differentiable and have a differentiable inverse.

The use of measure transport to enhance sampling methods is an area of recent interest. This was first formalized and demonstrated in Parno and Marzouk (2018), where approximate transport maps (TMs) are used for accelerating MCMC. The form of the approximate transports as applied
to MCMC sampling is described in general terms, but their
choice of transports uses systems of orthogonal multivari-
ate polynomials. The application of approximate TMs to
enhance sampling methods includes mapping a determinis-
tic step within sequential Monte Carlo (Arbel et al.,2021);
the transformation of a continuous target distribution to
an easier-to-sample distribution via a map learned using
stochastic variational inference (Hoffman et al.,2018); and
the use of approximate TMs for the construction of in-
dependence proposals in an adaptive MCMC algorithm
(Gabrié et al.,2022).
However, despite the considerable promise of incorpo-
rating approximate TMs into sampling methodology, to
our knowledge, such ideas have not been considered in
the transdimensional sampling setting. Such an omission
is somewhat surprising, as RJMCMC samplers are con-
structed using a form of invertible maps involving distri-
butions, and hence it would intuitively appear natural that
distributional transport could feature in a useful capacity.
This work aims to develop such an approach and to demon-
strate its benefits.
Contribution. The primary contributions of this work
are as follows:
I. A new class of RJMCMC proposals for across-model moves, called transport reversible jump (TRJ) proposals, is developed. In the idealized case where exact transports are used, the proposals are shown to have a desirable property (Proposition 1).
II. A numerical study is conducted on challenging exam-
ples demonstrating the efficacy of the proposed approach
in the setting where approximate transport maps are used.
III. An alternative “all-in-one” approach to training ap-
proximate TMs is developed, which involves combining a
saturated state space formulation of the target distribution
with conditional TMs.
Code for the numerical experiments is made available at
https://github.com/daviesl/trjp.
Structure of this Article. The remainder of this article is structured as follows: Section 2 discusses the required background concepts regarding RJMCMC, transport maps, and flow-based models. Section 3 introduces a general strategy for using transport maps within RJMCMC and discusses its properties. Section 4 conducts a numerical study to demonstrate the efficacy of the strategy in the case where approximate transports are used. Section 5 explores an alternative “all-in-one” approach to training transport maps and provides an associated numerical example. Section 6 concludes the paper.
Notation. For a function $T$ and distribution $\nu$, $T_{\sharp}\nu$ denotes the pushforward of $\nu$ under $T$. That is, if $Z \sim \nu$, then $T_{\sharp}\nu$ is the probability distribution of $T(Z)$. For a univariate probability distribution $\nu$, define $\otimes^n \nu$ as $\nu \otimes \cdots \otimes \nu$ ($n$ times). Throughout, univariate functions are to be interpreted as applied element-wise when provided with a vector as input. The symbol $\odot$ denotes the Hadamard (element-wise) product. The univariate standard normal probability density function is denoted as $\varphi$, $\varphi_d$ is the $d$-dimensional multivariate standard normal probability density function, and $\varphi_{\Sigma_{d \times d}}$ is the $d$-dimensional multivariate normal probability density function centered at $\mathbf{0}_d$ with covariance $\Sigma_{d \times d}$. For a function $f: \mathbb{R}^n \to \mathbb{R}^n$, the notation $|J_f(\boldsymbol{\theta})|$ denotes the absolute value of the determinant of the Jacobian matrix of $f$ evaluated at some $\boldsymbol{\theta} \in \mathbb{R}^n$. For distributions $\pi$ defined on sets of the form in (1), we write $\pi_k$ for the distribution conditional on $k$, and $\pi(k)$ for its $k$-marginal distribution.
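To make the pushforward notation concrete, the following small check (not from the paper; the map $T(z) = \exp(z)$ and the use of NumPy/SciPy are illustrative assumptions) verifies numerically that if $Z \sim \nu$ with $\nu$ the standard normal distribution, then $T_{\sharp}\nu$ is the standard lognormal distribution.

```python
# Minimal illustration of the pushforward T_sharp(nu): draws from nu pushed
# through T are distributed according to T_sharp(nu). Here T(z) = exp(z) and
# nu is standard normal, so T_sharp(nu) is the standard lognormal distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)   # Z ~ nu
x = np.exp(z)                      # T(Z) ~ T_sharp(nu)

for q in (0.5, 1.0, 2.0):
    empirical = np.mean(x <= q)                # empirical CDF of T(Z) at q
    analytic = stats.lognorm(s=1.0).cdf(q)     # CDF of the standard lognormal
    print(f"P(T(Z) <= {q}): empirical {empirical:.3f}, analytic {analytic:.3f}")
```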
2. BACKGROUND
2.1 Reversible Jump Markov Chain Monte Carlo
For a distribution $\pi$ defined on a space of the form in (1), with associated probability density function $\pi(\boldsymbol{x})$, the standard method to construct a $\pi$-invariant Metropolis–Hastings kernel (and thus an associated MCMC sampler) is the reversible jump approach introduced in the seminal work by Green (1995). The proposal mechanism is constructed to take $\boldsymbol{x} = (k, \boldsymbol{\theta}_k)$ to $\boldsymbol{x}' = (k', \boldsymbol{\theta}'_{k'})$, where the dimensions of $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}'_{k'}$ are $n_k$ and $n_{k'}$, respectively. The approach employs dimension matching, introducing auxiliary random variables $\boldsymbol{u}_k \sim g_{k,k'}$ and $\boldsymbol{u}'_{k'} \sim g_{k',k}$ of dimensions $w_k$ and $w_{k'}$, which are arbitrary provided that $n_k + w_k = n_{k'} + w_{k'}$. A proposal is then made using these auxiliary random variables and a chosen diffeomorphism $h_{k,k'}$ defined so that $(\boldsymbol{\theta}'_{k'}, \boldsymbol{u}'_{k'}) = h_{k,k'}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)$. A discrete proposal distribution $j_k$ is also specified for each $k \in \mathcal{K}$, where $j_k(k')$ defines the probability of proposing to model $k'$ from model $k$. More generally, the distributions $j_k$ may also depend on $\boldsymbol{\theta}_k$, but we do not consider this case. With the above formulation of the proposal, the RJMCMC acceptance probability is

$$\alpha(\boldsymbol{x}, \boldsymbol{x}') = 1 \wedge \frac{\pi(\boldsymbol{x}')\, j_{k'}(k)\, g_{k',k}(\boldsymbol{u}'_{k'})}{\pi(\boldsymbol{x})\, j_k(k')\, g_{k,k'}(\boldsymbol{u}_k)}\, \left|J_{h_{k,k'}}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)\right|. \qquad (2)$$
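As a concrete illustration of (2), the sketch below evaluates the acceptance probability on the log scale. It is a schematic outline rather than the authors' implementation: every callable (log_target, log_j, log_aux_density, log_abs_det_jac) is a hypothetical placeholder that must be supplied for a specific problem, with log_aux_density returning 0 when no auxiliary variables are drawn in a given direction.

```python
# Schematic evaluation of the RJMCMC acceptance probability in (2), on the log
# scale for numerical stability. All callables are user-supplied placeholders.
import numpy as np

def rjmcmc_accept_prob(x, x_prime, u_k, u_k_prime,
                       log_target, log_j, log_aux_density, log_abs_det_jac):
    """x = (k, theta_k), x_prime = (k', theta'_{k'}); returns alpha(x, x')."""
    k, theta_k = x
    k_prime, _ = x_prime
    log_ratio = (
        log_target(x_prime) - log_target(x)          # pi(x') / pi(x)
        + log_j(k_prime, k) - log_j(k, k_prime)      # j_{k'}(k) / j_k(k')
        + log_aux_density(k_prime, k, u_k_prime)     # g_{k',k}(u'_{k'})
        - log_aux_density(k, k_prime, u_k)           # g_{k,k'}(u_k)
        + log_abs_det_jac(k, k_prime, theta_k, u_k)  # log |J_{h_{k,k'}}(theta_k, u_k)|
    )
    return min(1.0, float(np.exp(log_ratio)))
```

In a sampler, one would draw $k' \sim j_k$, draw $\boldsymbol{u}_k \sim g_{k,k'}$, compute $(\boldsymbol{\theta}'_{k'}, \boldsymbol{u}'_{k'}) = h_{k,k'}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)$, and accept the proposed state with this probability.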
2.2 Transport Maps and Flow-Based Models
Consider two random vectors $\boldsymbol{\theta} \sim \mu_{\boldsymbol{\theta}}$ and $\boldsymbol{Z} \sim \mu_{\boldsymbol{z}}$, such that their distributions $\mu_{\boldsymbol{\theta}}$ and $\mu_{\boldsymbol{z}}$ are absolutely continuous with respect to $n$-dimensional Lebesgue measure. A function $T: \mathbb{R}^n \to \mathbb{R}^n$ is called a transport map (TM) from $\mu_{\boldsymbol{\theta}}$ to $\mu_{\boldsymbol{z}}$ if $\mu_{\boldsymbol{z}} = T_{\sharp}\mu_{\boldsymbol{\theta}}$. In this setting, we refer to $\mu_{\boldsymbol{\theta}}$ as the target distribution and $\mu_{\boldsymbol{z}}$ as the reference distribution. Transport maps between two prescribed distributions
are known to exist under mild conditions (see e.g., Parno
and Marzouk (2018) and the references therein).
One strategy for obtaining approximate TMs from samples of a target $\pi$ is via density estimation with a family of distributions arising from transformations. Let $\{T_{\boldsymbol{\psi}}\}$ be a family of diffeomorphisms parameterized by $\boldsymbol{\psi} \in \Psi$ with domain on the support of some arbitrary base distribution $\mu_{\boldsymbol{z}}$. Then, for fixed $\boldsymbol{\psi}$, the probability density function of the random vector $\boldsymbol{\zeta} = T_{\boldsymbol{\psi}}(\boldsymbol{Z})$ is

$$\mu_{\boldsymbol{\zeta}}(\boldsymbol{\zeta}; \boldsymbol{\psi}) = \mu_{\boldsymbol{z}}\!\left(T_{\boldsymbol{\psi}}^{-1}(\boldsymbol{\zeta})\right)\left|J_{T_{\boldsymbol{\psi}}^{-1}}(\boldsymbol{\zeta})\right|, \qquad \boldsymbol{\zeta} \in \mathbb{R}^n. \qquad (3)$$

An approximate TM from $\mu_{\boldsymbol{z}}$ to some prescribed target distribution $\pi$ can be learned as a result of fitting a model of the above form to samples from $\pi$ by minimizing the KL divergence between the model and an empirical distribution of possibly-approximate samples from $\pi$, which is equivalent to maximum likelihood estimation.
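To make the training objective concrete, the following sketch fits a simple diagonal affine flow to samples by maximizing the likelihood implied by (3). It is an illustrative, assumption-laden example (PyTorch, the diagonal affine map, and the function name are not from the paper); the spline-based flows used later are far more flexible, but the objective is the same.

```python
# Maximum likelihood fitting of a toy flow T_psi(z) = mu + exp(s) * z, so that
# the pushforward of a standard normal base approximates samples from pi.
import torch

def fit_affine_flow(theta_samples, n_iters=2000, lr=1e-2):
    """theta_samples: (N, n) tensor of (possibly approximate) samples from pi."""
    n = theta_samples.shape[1]
    mu = torch.zeros(n, requires_grad=True)
    log_scale = torch.zeros(n, requires_grad=True)
    base = torch.distributions.Normal(0.0, 1.0)
    opt = torch.optim.Adam([mu, log_scale], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        # Change of variables (3): density of theta is base(T^{-1}(theta)) |J_{T^{-1}}(theta)|.
        z = (theta_samples - mu) * torch.exp(-log_scale)   # T_psi^{-1}(theta)
        log_det_inv = -log_scale.sum()                     # log |J_{T_psi^{-1}}| (constant here)
        log_lik = base.log_prob(z).sum(dim=1) + log_det_inv
        loss = -log_lik.mean()                             # negative log-likelihood
        loss.backward()
        opt.step()
    return mu.detach(), log_scale.detach()
```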
Families of distributions arising from the specification of highly flexible and scalable $\{T_{\boldsymbol{\psi}}\}$ have been an area of intense recent interest in the machine learning literature, often referred to as flow-based models, with the families $\{T_{\boldsymbol{\psi}}\}$ often referred to as flows or normalizing flows (NF). A general class of transform is autoregressive flows, briefly described as follows. Let $\tau(\,\cdot\,; \boldsymbol{\omega})$, called the transformer, be a univariate diffeomorphism parametrized by $\boldsymbol{\omega} \in \Omega$, where $\Omega$ is the set of admissible parameters. Then, the transformation is defined elementwise via

$$T(\boldsymbol{Z})_i = \tau\!\left(Z_i;\, \zeta_i(\boldsymbol{z}_{<i}; \boldsymbol{\psi})\right), \qquad i = 1, \ldots, n, \qquad (4)$$

where the functions $\{\zeta_i: \mathbb{R}^{i-1} \to \Omega\}$ are called the conditioners. In practice, the $\{\zeta_i\}$ are each individually subsets of the output of a single neural network that takes $\boldsymbol{Z}$ as input and is designed to respect the autoregressive structure. When fitting approximate transports to samples in our
experiments in Section 4, for additional flexibility the flow-
based model used arises from several chained transforma-
tions of the form in (4), each with different parameters,
and where the transformer is a piecewise function involving
monotonic rational-quadratic splines (Durkan et al.,2019).
For further details and a comprehensive overview of other
flows, see Papamakarios et al. (2021).
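The sketch below makes the structure of (4) explicit using a simple affine transformer $\tau(z; \boldsymbol{\omega}) = a z + b$ and a single hypothetical user-supplied conditioner function; it is illustrative only and stands in for the masked neural networks and monotonic rational-quadratic spline transformers described above. Because each output coordinate depends only on the preceding inputs, the Jacobian is triangular and its log-determinant is a simple sum.

```python
# One autoregressive layer of the form (4) with an affine transformer. The
# conditioner is a placeholder; real flows compute all conditioner outputs in a
# single masked network pass rather than a Python loop.
import numpy as np

def autoregressive_affine_forward(z, conditioner):
    """z: (n,) array; conditioner(z_prefix) -> (log_a, b) for the next coordinate."""
    n = len(z)
    x = np.empty(n)
    log_det = 0.0
    for i in range(n):
        log_a, b = conditioner(z[:i])        # parameters depend only on z_{<i}
        x[i] = np.exp(log_a) * z[i] + b      # invertible univariate transformer tau
        log_det += log_a                     # triangular Jacobian: sum of log-diagonals
    return x, log_det

# Example with a trivial fixed conditioner (purely for illustration):
x, log_det = autoregressive_affine_forward(
    np.array([0.3, -1.2, 0.7]),
    conditioner=lambda z_prefix: (0.1 * z_prefix.sum(), 0.5),
)
```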
3. TRANSPORT REVERSIBLE JUMP PROPOSALS

We introduce a special form of general proposal between two models, corresponding to indices $k$ and $k'$. Let $\nu$ be some univariate reference distribution, and let $\{T_k : \mathbb{R}^{n_k} \to \mathsf{Z}^{n_k}\}$ be a collection of diffeomorphisms such that $T_{k\,\sharp}\,\pi_k \approx \otimes^{n_k}\nu$, $k \in \mathcal{K}$, i.e., $T_k$ is an approximate TM from the conditional target corresponding to index $k$ to an appropriately-dimensioned product of independent reference distributions. Similarly, $T_k^{-1}$ is an approximate TM from the reference to the conditional target for index $k$. The general idea of the proposal scheme from model $k$ to model $k'$ is to first apply the transformation that would (approximately) transport $\pi_k$ to its product reference, drawing auxiliary variables (if proposing to a higher-dimensional model), optionally applying a product-reference-measure-preserving diffeomorphism, discarding auxiliary variables (if proposing to a lower-dimensional model), and then applying the transformation that would (approximately) transport the augmented vector to the new conditional target. Figure 1 illustrates this idea for jumps between targets of one and two dimensions.

Figure 1: Illustration of the proposal class. Here, the reference $\nu$ is Gaussian, and the functions $\bar{h}_{k,k'}$ are the identity map. (The figure depicts the maps $z_1 = T_1(\theta_1)$, $\theta_1 = T_1^{-1}(z_1)$, $(\theta_2^{(1)}, \theta_2^{(2)}) = T_2^{-1}(z_1, u_1)$, and $(z_2^{(1)}, z_2^{(2)}) = T_2(\theta_2^{(1)}, \theta_2^{(2)})$.)
Formally, we first restrict the dimension of proposed auxiliary variables $\boldsymbol{u}$ (if any) to $w_k$, which is defined to be the dimension difference between $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}'_{k'}$, i.e., $n_k + w_k = n_{k'}$. Then, assuming $w_k \geq 0$ (a jump is proposed to a model of equal or higher dimension), the proposal is obtained via

$$\begin{aligned} \boldsymbol{z}_k &= T_k(\boldsymbol{\theta}_k),\\ \boldsymbol{z}_{k'} &= \bar{h}_{k,k'}(\boldsymbol{z}_k, \boldsymbol{u}_k), \quad \text{where } \boldsymbol{u}_k \sim \otimes^{w_k}\nu,\\ \boldsymbol{\theta}'_{k'} &= T_{k'}^{-1}(\boldsymbol{z}_{k'}), \end{aligned} \qquad (5)$$
where each $\bar{h}_{k,k'}: \mathbb{R}^{n_k} \times \mathbb{R}^{w_k} \to \mathbb{R}^{n_{k'}}$ is a diffeomorphism that both satisfies the pushforward-invariance condition

$$\bar{h}_{k,k'\,\sharp}\left(\otimes^{\max\{n_k,\,n_{k'}\}}\nu\right) = \otimes^{\max\{n_k,\,n_{k'}\}}\nu, \qquad (6)$$

and is volume-preserving (i.e., the absolute value of the Jacobian determinant is always equal to one).
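As a concrete sketch of (5), assuming a jump to a model of equal or higher dimension and taking $\bar{h}_{k,k'}$ to be the identity map (as in Figure 1), a proposal can be generated as below. The callables $T_k$, $T_{k'}^{-1}$, and the reference sampler are hypothetical placeholders for fitted transport maps; this is not the implementation in the accompanying repository. The reverse move applies the maps in the opposite order and discards the trailing auxiliary coordinates.

```python
# Schematic TRJ proposal (5) from model k to a model k' with n_{k'} >= n_k,
# using the identity map for h_bar. reference_sample(w) returns w i.i.d. draws
# from the univariate reference nu (an empty array when w == 0).
import numpy as np

def trj_propose(theta_k, n_k_prime, T_k, T_k_prime_inv, reference_sample):
    z_k = T_k(theta_k)                         # push theta_k to the product reference
    w_k = n_k_prime - len(theta_k)             # dimension difference, assumed >= 0
    u_k = reference_sample(w_k)                # u_k ~ product of w_k copies of nu
    z_k_prime = np.concatenate([z_k, u_k])     # identity h_bar: append auxiliaries
    theta_k_prime = T_k_prime_inv(z_k_prime)   # pull back to the new conditional target
    return theta_k_prime, u_k
```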