carrying out the MCMC sampling, and standard MCMC methods such as the Metropolis-Hastings (MH) algorithm (Hastings, 1970; Metropolis et al., 1953) can thus be applied. This approach is model-dependent and cannot always be used.
The second category of methods, which applies more generally, approximates the likelihood function (including the
normalising function) and substitutes the approximation in place of the exact likelihood in the estimation procedure.
The pseudo-marginal (PM) method (Andrieu and Roberts, 2009; Beaumont, 2003) is often used when a positive
and unbiased estimator of the likelihood is available through Monte Carlo simulation. However, in some problems,
including doubly intractable models, forming an unbiased estimator that is almost surely positive is prohibitively
expensive (Jacob and Thiery, 2015). The so-called Russian roulette (RR) estimator (Lyne et al., 2015), for example, provides an unbiased estimate of the likelihood function in doubly intractable models, although the estimate is not necessarily positive.
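To fix ideas, the following is a minimal sketch of one standard PM Metropolis-Hastings iteration for the case where a positive, unbiased likelihood estimator is available; the helper names (likelihood_estimator, log_prior), the random-walk proposal and the step size are illustrative assumptions, not part of any particular package or of the method proposed in this paper.

import numpy as np

def pm_mh_step(theta, log_like_hat, log_prior, likelihood_estimator, step_size, rng):
    # One pseudo-marginal MH iteration (illustrative sketch; helper names are assumptions).
    # likelihood_estimator(theta, rng) returns a positive, unbiased Monte Carlo estimate
    # of p(y | theta). The current state's estimate log_like_hat is recycled rather than
    # recomputed, which is what makes the chain target the exact posterior even though
    # the likelihood is only estimated.
    theta_prop = theta + step_size * rng.standard_normal(theta.shape)   # random-walk proposal
    log_like_hat_prop = np.log(likelihood_estimator(theta_prop, rng))   # fresh estimate at proposal
    log_alpha = (log_like_hat_prop + log_prior(theta_prop)) - (log_like_hat + log_prior(theta))
    if np.log(rng.uniform()) < log_alpha:
        return theta_prop, log_like_hat_prop                            # accept
    return theta, log_like_hat                                          # reject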
We propose a method for exact inference on posterior expectations in doubly intractable problems based on the ap-
proach in Lyne et al. (2015), where an unbiased, but not necessarily positive, estimator of the likelihood function is
used. The algorithm targets a posterior density that uses the absolute value of the likelihood estimate, resulting in iterates from
a perturbed target density. We follow Lyne et al. (2015) and reweight the samples from the perturbed target density
using importance sampling to obtain simulation-consistent estimates of the expectation of any function of the param-
eters with respect to the true posterior density. Although our method does not sample from the target of interest, we refer to it as exact because it yields simulation-consistent estimates of posterior expectations.
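Concretely, following Lyne et al. (2015), the reweighting can be expressed through the sign of the estimator; the notation below is generic and introduced only for illustration: u denotes the auxiliary random numbers used by the estimator, p̂(y|θ, u) the unbiased but possibly negative likelihood estimate, m(u) the density of u and p(θ) the prior. The sampler targets the perturbed density

π̄(θ, u|y) ∝ |p̂(y|θ, u)| p(θ) m(u),

and, with s(θ, u) = sign(p̂(y|θ, u)), posterior expectations satisfy

E[h(θ)|y] = E_π̄[s(θ, u) h(θ)] / E_π̄[s(θ, u)].

Replacing both expectations on the right-hand side with averages over the MCMC draws from π̄ gives the simulation-consistent estimate referred to above.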
Our main contribution is to explore the use of the block-Poisson (BP) estimator (Quiroz et al., 2021) for estimating doubly intractable models within the signed pseudo-marginal Metropolis-Hastings (PMMH) approach. Our method provides the following
advantages over the Russian roulette method. First, the BP estimator has a much simpler structure and is more compu-
tationally efficient. Second, the block form of our estimator makes it possible to correlate the estimators of the doubly
intractable posterior at the current and proposed draws in the MH algorithm. Introducing such correlation dramatically
improves the efficiency of PM algorithms (Deligiannidis et al., 2018; Tran et al., 2016). Finally, under simplifying assumptions, we derive some statistical properties of the logarithm of the absolute value of our estimator and use them to obtain heuristic guidelines for optimally tuning its hyperparameters. We demonstrate empirically that our
method outperforms that of Lyne et al. (2015) when estimating the Ising model. To the best of our knowledge, our method, that of Lyne et al. (2015), and its extensions are the only approaches within the PM framework that perform exact inference (in the sense of simulation-consistent estimates of posterior expectations) for general doubly intractable problems. Compared with algorithms that use auxiliary variables to avoid evaluating the normalising function, signed PMMH algorithms are more widely applicable and generic, as they do not require exact sampling from the likelihood.
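To illustrate the blocking idea mentioned above (a sketch in the spirit of the correlated PM literature, e.g. Tran et al. (2016); the block layout and the one-block-per-iteration update rule are assumptions made for exposition, not a description of the exact scheme used in this paper), the auxiliary random numbers driving the estimator can be stored in blocks and only one block refreshed when proposing a new draw, so that the estimates at the current and proposed draws share most of their underlying randomness:

import numpy as np

def refresh_one_block(u_blocks, rng):
    # u_blocks is a list of arrays holding the auxiliary random numbers that drive the
    # likelihood estimator. Refreshing only one randomly chosen block per iteration keeps
    # the estimates at the current and proposed parameter draws highly correlated, since
    # they share the random numbers in all remaining blocks.
    u_prop = [u.copy() for u in u_blocks]
    g = rng.integers(len(u_prop))                      # block chosen for refreshing
    u_prop[g] = rng.standard_normal(u_prop[g].shape)   # redraw only that block
    return u_prop

Sharing most of the randomness reduces the variance of the difference between the two log-estimates entering the acceptance ratio, which is the quantity that governs the efficiency of PM algorithms.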
The rest of the paper is organised as follows. Section 2 formally introduces the doubly intractable problem and
discusses previous research. Section 3 introduces our methodology and establishes the guidelines for tuning the
hyperparameters of the estimator. Section 4 demonstrates the proposed method in two simulation studies: the Ising
model and the Kent distribution. Section 5 analyses four real-world datasets using the Kent distribution. Section 6
concludes and outlines future research. The paper has an online supplement that contains all proofs and details of
the simulation studies. The supplement also contains an additional example applying our method to a constrained
Gaussian process (GP), where the normalising function arises from the GP prior.
2 Doubly intractable problems
2.1 Doubly intractable posterior distributions
Let p(y|θ) denote the density of the data vector y, where θ is the vector of model parameters. Suppose p(y|θ) = f(y|θ)/Z(θ), where f(y|θ) is computable while the normalising function Z(θ) is not. Z(θ) may be intractable because it is prohibitively expensive to evaluate numerically or because it lacks a closed form. Two examples are given below to demonstrate the intractability for both discrete and continuous observations y.
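Before turning to the examples, it may help to make the terminology explicit (this is the standard formulation of the problem rather than anything specific to our method). With a prior p(θ), the posterior is

π(θ|y) = p(y|θ) p(θ) / p(y) = f(y|θ) p(θ) / ( Z(θ) p(y) ),   with   p(y) = ∫ f(y|θ) p(θ) / Z(θ) dθ,

so two normalising terms are intractable: the normalising function Z(θ) inside the likelihood and the marginal likelihood p(y). This is the sense in which the posterior is doubly intractable.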