Understanding Linchpin Variables in Markov Chain Monte Carlo
Dootika Vats*, Felipe Acosta†, Mark L. Huber‡, Galin L. Jones§
October 26, 2022
Abstract
An introduction to the use of linchpin variables in Markov chain Monte Carlo (MCMC) is provided.
Before the widespread adoption of MCMC methods, conditional sampling using linchpin variables was
essentially the only practical approach for simulating from multivariate distributions. With the advent
of MCMC, linchpin variables were largely ignored. However, there has been a resurgence of interest in
using them in conjunction with MCMC methods and there are good reasons for doing so. A simple
derivation of the method is provided, its validity, benefits, and limitations are discussed, and some
examples in the research literature are presented.
1 Introduction
Modern statistical models are often sufficiently complicated so as to require the use of simulation for
inference. Since the seminal work of Gelfand and Smith (1990), Markov chain Monte Carlo (MCMC)
has become the default method for doing so, especially in the context of Bayesian inference. The
Metropolis-Hastings (MH) algorithm (Hastings, 1970; Metropolis et al., 1953) is a commonly-used
MCMC method due to its flexibility, ease of implementation, and theoretical validity under weak
conditions. However, it is often challenging to develop effective MH algorithms, particularly when the
target distribution is high-dimensional or has substantial correlation between components. A standard
approach is to consider component-wise MCMC methods (Johnson et al., 2013; Jones et al., 2014)
such as Gibbs samplers or conditional MH, also called Metropolis-within-Gibbs, perhaps using data
augmentation (Hobert, 2011; Tanner and Wong, 1987). However, component-wise approaches can
produce Markov chains that suffer from slow mixing (Bélisle, 1998; Jonasson, 2017; Matthews, 1993).
*Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, dootika@iitk.ac.in
†Natera, San Carlos, California, acosta.felipe@gmail.com
‡Department of Mathematics and Computer Science, Claremont McKenna College, mhuber@cmc.edu
§School of Statistics, University of Minnesota, galin@umn.edu
arXiv:2210.13574v1 [stat.CO] 24 Oct 2022
The limitations of standard MCMC methods in modern applications have brought about a plethora of new approaches in specific statistical settings. Our goal, however, is to highlight an old and now under-appreciated technique, linchpin variables, which can often simplify the sampling process and provides an organizing device for many of the novel sampling methods.
Before the widespread use of MCMC, the only potentially practical, general tool for sampling from multivariate joint distributions was the conditional sampling method (Devroye, 1986; Hörmann et al., 2004; Johnson, 1986). Let f(x, y) be a density function on X × Y ⊆ R^{d1} × R^{d2}, let f_{X|Y} be the density function of the conditional distribution of X given Y, and let f_Y be the density function of the marginal distribution of Y. If sampling from f_{X|Y} is straightforward, then Y is called a linchpin variable (Huber, 2016) since

f(x, y) = f_{X|Y}(x | y) f_Y(y).    (1)

Thus exact samples can be obtained by first simulating Y ~ f_Y followed by X ~ f_{X|Y}. This idea is easily extended to the setting with more than two variables through the usual properties of joint probability functions.
Example 1. Consider the Rosenbrock (or banana) density on R^2,

f(x, y) ∝ exp{ −(1/20) [ 100(x − y^2)^2 + (1 − y)^2 ] }.

This has become a popular and useful toy example for illustrating the performance of MCMC methods in highly correlated settings. In particular, because the contour plots resemble the shape of a banana, it can be a challenge to implement an effective MH algorithm. Notice that by inspection of the joint density, X | Y = y ~ N(y^2, 10^{−1}), and integrating f(x, y) with respect to x yields Y ~ N(1, 10). Hence Y is a linchpin variable and it is simple to implement conditional sampling.
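The two-step conditional sampling in Example 1 can be sketched as follows. This is a minimal NumPy implementation; the sample size and the seed are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_rosenbrock(n):
    """Draw n exact samples from f(x, y) propto exp{-(1/20)[100(x - y^2)^2 + (1 - y)^2]}."""
    # Linchpin (marginal) draw: Y ~ N(1, 10), so sd = sqrt(10).
    y = rng.normal(loc=1.0, scale=np.sqrt(10.0), size=n)
    # Conditional draw: X | Y = y ~ N(y^2, 1/10), so sd = sqrt(0.1).
    x = rng.normal(loc=y ** 2, scale=np.sqrt(0.1), size=n)
    return x, y

x, y = sample_rosenbrock(100_000)
```

Because both draws are exact, the pairs (x, y) are i.i.d. from the Rosenbrock density, with no Markov chain and no convergence diagnostics needed.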
Often the linchpin density, f_Y, is complex enough to prevent direct sampling from it. When it is difficult to sample from f_Y directly, it is natural to turn to MCMC methods for doing so, yielding a so-called linchpin variable sampler. Our goal is to present advantages of using linchpin variable samplers,
highlight some fundamental theoretical properties, and illustrate examples from the literature where
they have been employed successfully.
An obvious potential benefit of the linchpin variable sampler is that it naturally reduces the dimension of the MCMC sampling problem, since the target density is the marginal f_Y(y) instead of the joint f(x, y). Also, the linchpin variable sampler can be particularly effective when X and Y are heavily correlated (as demonstrated in a motivating example below); and finally, since information on X is not required to sample Y, all post-processing (like thinning) can first be done on the linchpin variable before sampling X; see Owen (2017) for guidance on when thinning a Markov chain simulation might
be useful.
Example 2. Consider sampling from a p-variate normal distribution with mean μ and covariance Σ:

(X_1, X_2)' ~ N_p( (μ_1, μ_2)',  Σ ),   Σ = [ Σ_11  Σ_12 ; Σ_21  Σ_22 ],    (2)

where μ_1 ∈ R^{p−r} and μ_2 ∈ R^r, r < p. The full conditional distributions are

X_1 | X_2 = x_2 ~ N_{p−r}( μ_1 + Σ_12 Σ_22^{−1} (x_2 − μ_2),  Σ_11 − Σ_12 Σ_22^{−1} Σ_21 )  and
X_2 | X_1 = x_1 ~ N_r( μ_2 + Σ_21 Σ_11^{−1} (x_1 − μ_1),  Σ_22 − Σ_21 Σ_11^{−1} Σ_12 ).
Let p = 5, r = 1, and let Σ be the 5 × 5 autocorrelation matrix with autocorrelation ρ ∈ {.5, .99}. MCMC algorithms are easily implemented in this example. For example, the above full conditionals make it easy to implement a Gibbs sampler, while a linchpin variable sampler with linchpin variable X_2 is also straightforward. Here, for the marginal of X_2, consider an MH algorithm with proposal Uniform(x_2 − h, x_2 + h), with h chosen to yield the approximate optimal scaling of Roberts et al. (1997). Starting from the origin, both samplers are run for 5000 steps. The results are given in Figure 1. When ρ = .5, both methods perform similarly; however, as expected (cf. Raftery and Lewis, 1992), when ρ = .99 the Gibbs sampler suffers from slow convergence. The linchpin variable sampler is unaffected by the higher correlation in the target distribution, as this correlation does not affect the marginal distribution of X_2.
[Figure 1: two panels of trace plots, each comparing the Gibbs and linchpin samplers over iterations 4000–5000.]

Figure 1: Trace plots of the last 1000 samples for ρ = .50 (left) and ρ = .99 (right).
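The two samplers compared in Example 2 can be sketched as below. This is a minimal NumPy implementation assuming μ = 0; the step size h = 2, the number of iterations, and the seed are illustrative choices, not the tuned values used for Figure 1.

```python
import numpy as np

rng = np.random.default_rng(1)
p, r = 5, 1
rho = 0.99
# Autocorrelation matrix: Sigma[i, j] = rho^{|i - j|}.
idx = np.arange(p)
Sigma = rho ** np.abs(np.subtract.outer(idx, idx))
mu1, mu2 = np.zeros(p - r), np.zeros(r)  # mu = 0 for illustration
S11, S12 = Sigma[:p - r, :p - r], Sigma[:p - r, p - r:]
S21, S22 = Sigma[p - r:, :p - r], Sigma[p - r:, p - r:]

# Pieces of the conditional X1 | X2 (used by both samplers).
A12 = S12 @ np.linalg.inv(S22)
L1 = np.linalg.cholesky(S11 - A12 @ S21)

def draw_x1_given_x2(x2):
    # Exact draw from X1 | X2 = x2.
    return mu1 + A12 @ (x2 - mu2) + L1 @ rng.normal(size=p - r)

def gibbs_sampler(n_iter):
    # Alternate exact draws from the two full conditionals.
    A21 = S21 @ np.linalg.inv(S11)
    L2 = np.linalg.cholesky(S22 - A21 @ S12)
    x2, out = np.zeros(r), np.empty((n_iter, p))
    for t in range(n_iter):
        x1 = draw_x1_given_x2(x2)
        x2 = mu2 + A21 @ (x1 - mu1) + L2 @ rng.normal(size=r)
        out[t] = np.concatenate([x1, x2])
    return out

def linchpin_sampler(n_iter, h=2.0):
    # Random-walk MH on the marginal of X2 (here N(mu2, S22)),
    # followed by an exact draw of X1 | X2.
    def log_marg(z):
        return -0.5 * (z - mu2) @ np.linalg.solve(S22, z - mu2)
    x2, out = np.zeros(r), np.empty((n_iter, p))
    for t in range(n_iter):
        prop = x2 + rng.uniform(-h, h, size=r)
        if np.log(rng.uniform()) < log_marg(prop) - log_marg(x2):
            x2 = prop
        out[t] = np.concatenate([draw_x1_given_x2(x2), x2])
    return out

chain_gibbs = gibbs_sampler(5000)
chain_linch = linchpin_sampler(5000)
```

Note that the linchpin sampler runs MH in one dimension regardless of p, and its X_2-marginal is untouched by the strong cross-correlation that slows the Gibbs sampler.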
2 Linchpin variable sampler
Linchpin variable samplers yield valid MCMC algorithms and provide an organizing principle for seem-
ingly disconnected Monte Carlo methods, but some basic MCMC concepts are required to get to that