Transport Reversible Jump Proposals
Laurence Davies1,3 Robert Salomone2,3 Matthew Sutton1,3 Christopher Drovandi1,3
1School of Mathematical Sciences, Queensland University of Technology
2School of Computer Science, Queensland University of Technology
3Centre for Data Science, Queensland University of Technology
Abstract
Reversible jump Markov chain Monte Carlo
(RJMCMC) proposals that achieve reasonable
acceptance rates and mixing are notoriously dif-
ficult to design in most applications. Inspired
by recent advances in deep neural network-based
normalizing flows and density estimation, we
demonstrate an approach to enhance the effi-
ciency of RJMCMC sampling by performing
transdimensional jumps involving reference dis-
tributions. In contrast to other RJMCMC propos-
als, the proposed method is the first to apply a
non-linear transport-based approach to construct
efficient proposals between models with compli-
cated dependency structures. It is shown that, in
the setting where exact transports are used, our
RJMCMC proposals have the desirable property
that the acceptance probability depends only on
the model probabilities. Numerical experiments
demonstrate the efficacy of the approach.
1. INTRODUCTION
The problem of interest is sampling from a probability distribution $\pi$ with support

$$\mathsf{X} = \bigcup_{k \in \mathcal{K}} \left(\{k\} \times \Theta_k\right), \qquad (1)$$

where $\mathcal{K}$ is a discrete index set and $\Theta_k \subseteq \mathbb{R}^{n_k}$, where the $n_k$ may differ, and hence $\mathsf{X}$ is a transdimensional space. The choice of notation $\Theta_k$ is made as the problem typically arises in Bayesian model selection, where such sets correspond to the space of model parameters, and $k \in \mathcal{K}$ is a model index or indicator. The reversible jump Markov chain Monte Carlo (RJMCMC) algorithm, formally introduced by Green (1995), generalizes the Metropolis–Hastings algorithm (Hastings, 1970) via the introduction of user-specified diffeomorphisms¹ $h_{k,k'}: \Theta_k \times \mathcal{U}_k \to \Theta_{k'} \times \mathcal{U}_{k'}$, where $\mathcal{U}_k, \mathcal{U}_{k'}$ are (possibly empty) sets chosen to ensure dimensions are matched. Due to this additional complexity, RJMCMC proposals that perform well in practice are generally challenging to design.
RJMCMC methods are a frequently revisited research topic
for which many approaches exist. Brooks et al. (2003)
outline several approaches to improve the efficiency of
jump proposals on various problem types, including vari-
able selection and nested models. Green and Mira (2001)
identify the efficiency issues associated with naïve transdi-
mensional proposals between non-overlapping targets and
propose a delayed-rejection auxiliary-proposal mechanism.
Al-Awadhi et al. (2004) employ an auxiliary target density
instead of an auxiliary proposal. Hastie (2005) proposes
an approach for (potentially) multi-modal conditional tar-
get densities, achieved by fitting a Gaussian mixture model
to each individual conditional target, and using a shift and
whitening transformation corresponding to a randomly se-
lected mixture component. Fan et al. (2008) propose a con-
ditional factorization of a differentiable target density to
sequentially construct a proposal density. Karagiannis and
Andrieu (2013) propose the construction of a Markov chain
through an annealed sequence of intermediate distributions
between models to encourage higher acceptance rates be-
tween model jumps. Farr et al. (2015) propose a KD-tree
approximation of the target density for auxiliary variable
draws to improve the efficiency of RJMCMC proposals.
Gagnon (2021) uses the locally-informed MCMC approach
for discrete spaces developed by Zanella (2020) to improve
the exploration efficiency of the model space when global
jump proposals are unavailable. However, there appears to
be no general strategy for the design of across-model pro-
posals that is widely applicable, theoretically justified, and
of practical use.
¹ Bijective functions that are differentiable and have a differentiable inverse.

The use of measure transport to enhance sampling methods is an area of recent interest. This was first formalized and demonstrated in Parno and Marzouk (2018), where approximate transport maps (TMs) are used for accelerating MCMC. The form of the approximate transports as applied
to MCMC sampling is described in general terms, but their
choice of transports uses systems of orthogonal multivari-
ate polynomials. The application of approximate TMs to
enhance sampling methods includes mapping a determinis-
tic step within sequential Monte Carlo (Arbel et al.,2021);
the transformation of a continuous target distribution to
an easier-to-sample distribution via a map learned using
stochastic variational inference (Hoffman et al.,2018); and
the use of approximate TMs for the construction of in-
dependence proposals in an adaptive MCMC algorithm
(Gabrié et al.,2022).
However, despite the considerable promise of incorpo-
rating approximate TMs into sampling methodology, to
our knowledge, such ideas have not been considered in
the transdimensional sampling setting. Such an omission
is somewhat surprising, as RJMCMC samplers are con-
structed using a form of invertible maps involving distri-
butions, and hence it would intuitively appear natural that
distributional transport could feature in a useful capacity.
This work aims to develop such an approach and to demon-
strate its benefits.
Contribution. The primary contributions of this work
are as follows:
I. A new class of RJMCMC proposals for across-model moves, called transport reversible jump (TRJ) proposals, is developed. In the idealized case where exact transports are used, the proposals are shown to have a desirable property (Proposition 1).
II. A numerical study is conducted on challenging exam-
ples demonstrating the efficacy of the proposed approach
in the setting where approximate transport maps are used.
III. An alternative “all-in-one” approach to training ap-
proximate TMs is developed, which involves combining a
saturated state space formulation of the target distribution
with conditional TMs.
Code for the numerical experiments is made available at
https://github.com/daviesl/trjp.
Structure of this Article. The remainder of this article is structured as follows: Section 2 discusses the required background concepts regarding RJMCMC, transport maps, and flow-based models. Section 3 introduces a general strategy for using transport maps within RJMCMC and discusses its properties. Section 4 conducts a numerical study to demonstrate the efficacy of the strategy in the case where approximate transports are used. Section 5 explores an alternative “all-in-one” approach to training transport maps and provides an associated numerical example. Section 6 concludes the paper.
Notation. For a function $T$ and distribution $\nu$, $T_{\sharp}\nu$ denotes the pushforward of $\nu$ under $T$. That is, if $Z \sim \nu$, then $T_{\sharp}\nu$ is the probability distribution of $T(Z)$. For a univariate probability distribution $\nu$, define $\otimes^n \nu$ as $\nu \otimes \cdots \otimes \nu$ ($n$ times). Throughout, univariate functions are to be interpreted as applied element-wise when provided with a vector as input. The symbol $\odot$ denotes the Hadamard (element-wise) product. The univariate standard normal probability density function is denoted as $\varphi$, $\varphi_d$ is the $d$-dimensional multivariate standard normal probability density function, and $\varphi_{\Sigma_{d \times d}}$ is the $d$-dimensional multivariate normal probability density function centered at $\mathbf{0}_d$ with covariance $\Sigma_{d \times d}$. For a function $f: \mathbb{R}^n \to \mathbb{R}^n$, the notation $|J_f(\boldsymbol{\theta})|$ denotes the absolute value of the determinant of the Jacobian matrix of $f$ evaluated at some $\boldsymbol{\theta} \in \mathbb{R}^n$. For distributions $\pi$ defined on sets of the form in (1), we write $\pi_k$ for the distribution conditional on $k$, and $\pi(k)$ for its $k$-marginal distribution.
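To make the pushforward notation concrete, the following small check (not from the paper; the map $T(z) = \exp(z)$ and the use of NumPy/SciPy are illustrative assumptions) verifies numerically that if $Z \sim \nu$ with $\nu$ the standard normal distribution, then $T_{\sharp}\nu$ is the standard lognormal distribution.

```python
# Minimal illustration of the pushforward T_sharp(nu): draws from nu pushed
# through T are distributed according to T_sharp(nu). Here T(z) = exp(z) and
# nu is standard normal, so T_sharp(nu) is the standard lognormal distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)   # Z ~ nu
x = np.exp(z)                      # T(Z) ~ T_sharp(nu)

for q in (0.5, 1.0, 2.0):
    empirical = np.mean(x <= q)                # empirical CDF of T(Z) at q
    analytic = stats.lognorm(s=1.0).cdf(q)     # CDF of the standard lognormal
    print(f"P(T(Z) <= {q}): empirical {empirical:.3f}, analytic {analytic:.3f}")
```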
2. BACKGROUND
2.1 Reversible Jump Markov Chain Monte Carlo
For a distribution $\pi$ defined on a space of the form in (1), with associated probability density function $\pi(\boldsymbol{x})$, the standard method to construct a $\pi$-invariant Metropolis–Hastings kernel (and thus an associated MCMC sampler) is the reversible jump approach introduced in the seminal work by Green (1995). The proposal mechanism is constructed to take $\boldsymbol{x} = (k, \boldsymbol{\theta}_k)$ to $\boldsymbol{x}' = (k', \boldsymbol{\theta}'_{k'})$, where the dimensions of $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}'_{k'}$ are $n_k$ and $n_{k'}$, respectively. The approach employs dimension matching, introducing auxiliary random variables $\boldsymbol{u}_k \sim g_{k,k'}$ and $\boldsymbol{u}'_{k'} \sim g_{k',k}$ of dimensions $w_k$ and $w_{k'}$, which are arbitrary provided that $n_k + w_k = n_{k'} + w_{k'}$. A proposal is then made using these auxiliary random variables and a chosen diffeomorphism $h_{k,k'}$ defined so that $(\boldsymbol{\theta}'_{k'}, \boldsymbol{u}'_{k'}) = h_{k,k'}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)$. A discrete proposal distribution $j_k$ is also specified for each $k \in \mathcal{K}$, where $j_k(k')$ defines the probability of proposing to model $k'$ from model $k$. More generally, the distributions $j_k$ may also depend on $\boldsymbol{\theta}_k$, but we do not consider this case. With the above formulation of the proposal, the RJMCMC acceptance probability is

$$\alpha(\boldsymbol{x}, \boldsymbol{x}') = 1 \wedge \frac{\pi(\boldsymbol{x}')\, j_{k'}(k)\, g_{k',k}(\boldsymbol{u}'_{k'})}{\pi(\boldsymbol{x})\, j_k(k')\, g_{k,k'}(\boldsymbol{u}_k)}\, \left|J_{h_{k,k'}}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)\right|. \qquad (2)$$
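As a concrete illustration of (2), the sketch below evaluates the acceptance probability on the log scale. It is a schematic outline rather than the authors' implementation: every callable (log_target, log_j, log_aux_density, log_abs_det_jac) is a hypothetical placeholder that must be supplied for a specific problem, with log_aux_density returning 0 when no auxiliary variables are drawn in a given direction.

```python
# Schematic evaluation of the RJMCMC acceptance probability in (2), on the log
# scale for numerical stability. All callables are user-supplied placeholders.
import numpy as np

def rjmcmc_accept_prob(x, x_prime, u_k, u_k_prime,
                       log_target, log_j, log_aux_density, log_abs_det_jac):
    """x = (k, theta_k), x_prime = (k', theta'_{k'}); returns alpha(x, x')."""
    k, theta_k = x
    k_prime, _ = x_prime
    log_ratio = (
        log_target(x_prime) - log_target(x)          # pi(x') / pi(x)
        + log_j(k_prime, k) - log_j(k, k_prime)      # j_{k'}(k) / j_k(k')
        + log_aux_density(k_prime, k, u_k_prime)     # g_{k',k}(u'_{k'})
        - log_aux_density(k, k_prime, u_k)           # g_{k,k'}(u_k)
        + log_abs_det_jac(k, k_prime, theta_k, u_k)  # log |J_{h_{k,k'}}(theta_k, u_k)|
    )
    return min(1.0, float(np.exp(log_ratio)))
```

In a sampler, one would draw $k' \sim j_k$, draw $\boldsymbol{u}_k \sim g_{k,k'}$, compute $(\boldsymbol{\theta}'_{k'}, \boldsymbol{u}'_{k'}) = h_{k,k'}(\boldsymbol{\theta}_k, \boldsymbol{u}_k)$, and accept the proposed state with this probability.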
2.2 Transport Maps and Flow-Based Models
Consider two random vectors $\boldsymbol{\theta} \sim \mu_{\boldsymbol{\theta}}$ and $\boldsymbol{Z} \sim \mu_{\boldsymbol{z}}$, such that their distributions $\mu_{\boldsymbol{\theta}}$ and $\mu_{\boldsymbol{z}}$ are absolutely continuous with respect to $n$-dimensional Lebesgue measure. A function $T: \mathbb{R}^n \to \mathbb{R}^n$ is called a transport map (TM) from $\mu_{\boldsymbol{\theta}}$ to $\mu_{\boldsymbol{z}}$ if $\mu_{\boldsymbol{z}} = T_{\sharp}\mu_{\boldsymbol{\theta}}$. In this setting, we refer to $\mu_{\boldsymbol{\theta}}$ as the target distribution and $\mu_{\boldsymbol{z}}$ as the reference distribution. Transport maps between two prescribed distributions
are known to exist under mild conditions (see e.g., Parno
and Marzouk (2018) and the references therein).
One strategy for obtaining approximate TMs from samples of a target $\pi$ is via density estimation with a family of distributions arising from transformations. Let $\{T_{\boldsymbol{\psi}}\}$ be a family of diffeomorphisms parameterized by $\boldsymbol{\psi} \in \Psi$ with domain on the support of some arbitrary base distribution $\mu_{\boldsymbol{z}}$. Then, for fixed $\boldsymbol{\psi}$, the probability density function of the random vector $\boldsymbol{\zeta} = T_{\boldsymbol{\psi}}(\boldsymbol{Z})$ is

$$\mu_{\boldsymbol{\zeta}}(\boldsymbol{\zeta}; \boldsymbol{\psi}) = \mu_{\boldsymbol{z}}\!\left(T_{\boldsymbol{\psi}}^{-1}(\boldsymbol{\zeta})\right)\left|J_{T_{\boldsymbol{\psi}}^{-1}}(\boldsymbol{\zeta})\right|, \qquad \boldsymbol{\zeta} \in \mathbb{R}^n. \qquad (3)$$

An approximate TM from $\mu_{\boldsymbol{z}}$ to some prescribed target distribution $\pi$ can be learned as a result of fitting a model of the above form to samples from $\pi$ by minimizing the KL divergence between the model and an empirical distribution of possibly-approximate samples from $\pi$, which is equivalent to maximum likelihood estimation.
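To make the training objective concrete, the following sketch fits a simple diagonal affine flow to samples by maximizing the likelihood implied by (3). It is an illustrative, assumption-laden example (PyTorch, the diagonal affine map, and the function name are not from the paper); the spline-based flows used later are far more flexible, but the objective is the same.

```python
# Maximum likelihood fitting of a toy flow T_psi(z) = mu + exp(s) * z, so that
# the pushforward of a standard normal base approximates samples from pi.
import torch

def fit_affine_flow(theta_samples, n_iters=2000, lr=1e-2):
    """theta_samples: (N, n) tensor of (possibly approximate) samples from pi."""
    n = theta_samples.shape[1]
    mu = torch.zeros(n, requires_grad=True)
    log_scale = torch.zeros(n, requires_grad=True)
    base = torch.distributions.Normal(0.0, 1.0)
    opt = torch.optim.Adam([mu, log_scale], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        # Change of variables (3): density of theta is base(T^{-1}(theta)) |J_{T^{-1}}(theta)|.
        z = (theta_samples - mu) * torch.exp(-log_scale)   # T_psi^{-1}(theta)
        log_det_inv = -log_scale.sum()                     # log |J_{T_psi^{-1}}| (constant here)
        log_lik = base.log_prob(z).sum(dim=1) + log_det_inv
        loss = -log_lik.mean()                             # negative log-likelihood
        loss.backward()
        opt.step()
    return mu.detach(), log_scale.detach()
```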
Families of distributions arising from the specification of highly flexible and scalable $\{T_{\boldsymbol{\psi}}\}$ have been an area of intense recent interest in the machine learning literature, often referred to as flow-based models, with the families $\{T_{\boldsymbol{\psi}}\}$ often referred to as flows or normalizing flows (NF). A general class of transform is autoregressive flows, briefly described as follows. Let $\tau(\,\cdot\,; \boldsymbol{\omega})$, called the transformer, be a univariate diffeomorphism parametrized by $\boldsymbol{\omega} \in \Omega$, where $\Omega$ is the set of admissible parameters. Then, the transformation is defined elementwise via

$$T(\boldsymbol{Z})_i = \tau\!\left(Z_i;\, \zeta_i(\boldsymbol{z}_{<i}; \boldsymbol{\psi})\right), \qquad i = 1, \ldots, n, \qquad (4)$$

where the functions $\{\zeta_i: \mathbb{R}^{i-1} \to \Omega\}$ are called the conditioners. In practice, the $\{\zeta_i\}$ are each individually subsets of the output of a single neural network that takes $\boldsymbol{Z}$ as input and is designed to respect the autoregressive structure. When fitting approximate transports to samples in our
experiments in Section 4, for additional flexibility the flow-
based model used arises from several chained transforma-
tions of the form in (4), each with different parameters,
and where the transformer is a piecewise function involving
monotonic rational-quadratic splines (Durkan et al.,2019).
For further details and a comprehensive overview of other
flows, see Papamakarios et al. (2021).
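The sketch below makes the structure of (4) explicit using a simple affine transformer $\tau(z; \boldsymbol{\omega}) = a z + b$ and a single hypothetical user-supplied conditioner function; it is illustrative only and stands in for the masked neural networks and monotonic rational-quadratic spline transformers described above. Because each output coordinate depends only on the preceding inputs, the Jacobian is triangular and its log-determinant is a simple sum.

```python
# One autoregressive layer of the form (4) with an affine transformer. The
# conditioner is a placeholder; real flows compute all conditioner outputs in a
# single masked network pass rather than a Python loop.
import numpy as np

def autoregressive_affine_forward(z, conditioner):
    """z: (n,) array; conditioner(z_prefix) -> (log_a, b) for the next coordinate."""
    n = len(z)
    x = np.empty(n)
    log_det = 0.0
    for i in range(n):
        log_a, b = conditioner(z[:i])        # parameters depend only on z_{<i}
        x[i] = np.exp(log_a) * z[i] + b      # invertible univariate transformer tau
        log_det += log_a                     # triangular Jacobian: sum of log-diagonals
    return x, log_det

# Example with a trivial fixed conditioner (purely for illustration):
x, log_det = autoregressive_affine_forward(
    np.array([0.3, -1.2, 0.7]),
    conditioner=lambda z_prefix: (0.1 * z_prefix.sum(), 0.5),
)
```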
3. TRANSPORT REVERSIBLE JUMP PROPOSALS

We introduce a special form of general proposal between two models, corresponding to indices $k$ and $k'$. Let $\nu$ be some univariate reference distribution, and let $\{T_k : \mathbb{R}^{n_k} \to \mathsf{Z}^{n_k}\}$ be a collection of diffeomorphisms such that $T_{k\,\sharp}\,\pi_k \approx \otimes^{n_k}\nu$, $k \in \mathcal{K}$, i.e., $T_k$ is an approximate TM from the conditional target corresponding to index $k$ to an appropriately-dimensioned product of independent reference distributions. Similarly, $T_k^{-1}$ is an approximate TM from the reference to the conditional target for index $k$. The general idea of the proposal scheme from model $k$ to model $k'$ is to first apply the transformation that would (approximately) transport $\pi_k$ to its product reference, drawing auxiliary variables (if proposing to a higher-dimensional model), optionally applying a product-reference-measure-preserving diffeomorphism, discarding auxiliary variables (if proposing to a lower-dimensional model), and then applying the transformation that would (approximately) transport the augmented vector to the new conditional target. Figure 1 illustrates this idea for jumps between targets of one and two dimensions.

Figure 1: Illustration of the proposal class. Here, the reference $\nu$ is Gaussian, and the functions $\bar{h}_{k,k'}$ are the identity map. (The figure depicts the maps $z_1 = T_1(\theta_1)$, $\theta_1 = T_1^{-1}(z_1)$, $(\theta_2^{(1)}, \theta_2^{(2)}) = T_2^{-1}(z_1, u_1)$, and $(z_2^{(1)}, z_2^{(2)}) = T_2(\theta_2^{(1)}, \theta_2^{(2)})$.)
Formally, we first restrict the dimension of proposed auxiliary variables $\boldsymbol{u}$ (if any) to $w_k$, which is defined to be the dimension difference between $\boldsymbol{\theta}_k$ and $\boldsymbol{\theta}'_{k'}$, i.e., $n_k + w_k = n_{k'}$. Then, assuming $w_k \geq 0$ (a jump is proposed to a model of equal or higher dimension), the proposal is obtained via

$$\begin{aligned} \boldsymbol{z}_k &= T_k(\boldsymbol{\theta}_k),\\ \boldsymbol{z}_{k'} &= \bar{h}_{k,k'}(\boldsymbol{z}_k, \boldsymbol{u}_k), \quad \text{where } \boldsymbol{u}_k \sim \otimes^{w_k}\nu,\\ \boldsymbol{\theta}'_{k'} &= T_{k'}^{-1}(\boldsymbol{z}_{k'}), \end{aligned} \qquad (5)$$
where each $\bar{h}_{k,k'}: \mathbb{R}^{n_k} \times \mathbb{R}^{w_k} \to \mathbb{R}^{n_{k'}}$ is a diffeomorphism that both satisfies the pushforward-invariance condition

$$\bar{h}_{k,k'\,\sharp}\left(\otimes^{\max\{n_k,\,n_{k'}\}}\nu\right) = \otimes^{\max\{n_k,\,n_{k'}\}}\nu, \qquad (6)$$

and is volume-preserving (i.e., the absolute value of the Jacobian determinant is always equal to one).
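As a concrete sketch of (5), assuming a jump to a model of equal or higher dimension and taking $\bar{h}_{k,k'}$ to be the identity map (as in Figure 1), a proposal can be generated as below. The callables $T_k$, $T_{k'}^{-1}$, and the reference sampler are hypothetical placeholders for fitted transport maps; this is not the implementation in the accompanying repository. The reverse move applies the maps in the opposite order and discards the trailing auxiliary coordinates.

```python
# Schematic TRJ proposal (5) from model k to a model k' with n_{k'} >= n_k,
# using the identity map for h_bar. reference_sample(w) returns w i.i.d. draws
# from the univariate reference nu (an empty array when w == 0).
import numpy as np

def trj_propose(theta_k, n_k_prime, T_k, T_k_prime_inv, reference_sample):
    z_k = T_k(theta_k)                         # push theta_k to the product reference
    w_k = n_k_prime - len(theta_k)             # dimension difference, assumed >= 0
    u_k = reference_sample(w_k)                # u_k ~ product of w_k copies of nu
    z_k_prime = np.concatenate([z_k, u_k])     # identity h_bar: append auxiliaries
    theta_k_prime = T_k_prime_inv(z_k_prime)   # pull back to the new conditional target
    return theta_k_prime, u_k
```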