Penalty parameter selection and asymmetry corrections to Laplace approximations in Bayesian P-splines models

Philippe Lambert 1,2 and Oswaldo Gressani 3
1 Institut de Mathématique, Université de Liège, Belgium
2 Institut de Statistique, Biostatistique et Sciences Actuarielles (ISBA), Université catholique de Louvain, Belgium
3 Interuniversity Institute for Biostatistics and statistical Bioinformatics (I-BioStat), Data Science Institute, Hasselt University, Belgium
Corresponding author: p.lambert@uliege.be
October 5, 2022
Abstract
Laplacian-P-splines (LPS) combine the P-splines smoother and the Laplace approximation in a unifying framework for fast and flexible inference under the Bayesian paradigm. Gaussian Markov random field priors imposed on penalized latent variables and the Bernstein-von Mises theorem typically ensure a razor-sharp accuracy of the Laplace approximation to the posterior distribution of these variables. This accuracy can be seriously compromised for some unpenalized parameters, especially when the information synthesized by the prior and the likelihood is sparse. We propose a refined version of the LPS methodology by splitting the latent space into two subsets. The first set involves latent variables for which the joint posterior distribution is approached from a non-Gaussian perspective with an approximation scheme that is particularly well tailored to capture asymmetric patterns, while the posterior distribution for parameters in the complementary latent set undergoes a traditional treatment with Laplace approximations. As such, the dichotomization of the latent space provides the necessary structure for a separate treatment of model parameters, yielding improved estimation accuracy as compared to a setting where posterior quantities are uniformly handled with Laplace. In addition, the proposed enriched version of LPS remains entirely sampling-free, so that it operates at a computing speed far out of reach for any existing Markov chain Monte Carlo approach. The methodology is illustrated on the additive proportional odds model with an application to ordinal survey data.
Keywords: Additive model; P-splines; Laplace approximation; Skewness.
arXiv:2210.01668v1 [stat.ME] 4 Oct 2022

1 Motivation

By publishing his Mémoire sur la probabilité des causes par les événements (Laplace, 1774), the young French polymath Pierre-Simon de Laplace (1749-1827) seeded an idea today known as the Laplace approximation. At that time, Laplace probably could not have imagined that almost two centuries later, his approximation technique would be resurrected (see e.g. Leonard, 1982; Tierney and Kadane, 1986; Rue et al., 2009) to play a pivotal role in the modern Bayesian literature.
Essentially, the Laplace approximation is a Gaussian distribution centered around the maximum a posteriori (MAP) of the target distribution, with a variance-covariance matrix that coincides with the inverse of the negative Hessian of the log-posterior target evaluated at the MAP. Recently, the ingenuity of Laplace's approximation crossed paths with P-splines, the brainchild of Paul Eilers and Brian Marx (Eilers and Marx, 1996), to inaugurate a new approximate Bayesian methodology labelled "Laplacian-P-splines" (LPS) with promising applications in survival analysis (Gressani and Lambert, 2018; Gressani et al., 2022), generalized additive models (Gressani and Lambert, 2021), nonparametric double additive location-scale models for censored data (Lambert, 2021) and infectious disease epidemiology (Gressani et al., 2021). The sampling-free inference scheme delivered by Laplace approximations, combined with the possibility of flexibly smoothing different model components with P-splines, paves the way for a robust and much faster alternative to existing simulation-based methods. At the same time, the LPS toolbox helps P-spline users gain access to the full potential of Bayesian methods without having to endure the long and burdensome CPU and real time often required by Markov chain Monte Carlo (MCMC) samplers.
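For readers who want to see this construction in code, the following is a minimal one-dimensional sketch (ours, not code from the paper): the mode of the log-posterior is located numerically, and the curvature at the mode supplies the Gaussian variance. The Gamma-type toy target, the optimizer settings and the finite-difference step are illustrative assumptions.

```python
import numpy as np
from scipy import optimize

def laplace_approx(log_post, x0, bounds=None, h=1e-5):
    """Gaussian N(mode, 1/curvature) approximation to a 1-D posterior."""
    res = optimize.minimize(lambda z: -log_post(z[0]), x0=[x0],
                            method="L-BFGS-B", bounds=bounds)
    mode = res.x[0]
    # curvature = negative second derivative of log_post at the mode
    curv = -(log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h**2
    return mode, 1.0 / curv

# Toy target: unnormalized Gamma(shape=3, rate=2) log-density, mode (3-1)/2 = 1,
# curvature 2/x^2 = 2 at the mode, hence an approximating N(1, 0.5).
log_post = lambda x: 2 * np.log(x) - 2 * x
mode, var = laplace_approx(log_post, x0=0.5, bounds=[(1e-8, None)])
print(mode, var)  # approximately 1.0 and 0.5; the Gamma's right skew is ignored
```

Note that the approximation is symmetric by construction: the pronounced right skew of the Gamma target is invisible to it, which is precisely the misfit addressed later in the paper.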
Although LPS shares some methodological aspects with the popular integrated nested Laplace approximations (INLA) approach (Rue et al., 2009), there are fundamental points of divergence worth mentioning. First, the tools in INLA and its associated R-INLA software are originally built to compute approximate posteriors of univariate latent variables, contrary to LPS, which natively delivers approximations to the (multivariate) joint posterior distribution of the latent vector. The key benefit of working with an approximate version of the joint posterior is that pointwise estimators and credible intervals for subsets of the latent vector (and functions thereof) can be straightforwardly constructed. Second, by working with closed-form expressions for the gradients and Hessians involved in the model, LPS is computationally more efficient than the numerical differentiation treatment proposed in INLA. Third, while INLA can be combined with various techniques for smoothing nonlinear model components, LPS is entirely devoted to P-spline smoothers, with the key advantage of full control over the penalization scheme (as the approximate posterior distribution of the penalty parameter(s) is analytically available); in that direction, LPS has a closer connection to the work of Wood and Fasiolo (2017), especially in the class of (generalized) additive models (Wood, 2017).
The success of Laplace approximations in Bayesian statistics owes much to a central limit type argument. Under certain regularity conditions, the Bernstein-von Mises theorem (see e.g. Van der Vaart, 2000) ensures that posterior distributions in differentiable models converge to a Gaussian distribution in large samples. In situations involving small to moderate sample sizes, the asymptotic validity of the Laplace approximation can be seriously undermined, as it does not account for features involving non-zero skewness (i.e. lack of symmetry) (Ruli et al., 2016). Even with relatively large samples, the Laplace approximation might fail in scenarios involving binary data, as the latter are poorly informative for the model parameters and can result in a flat log-likelihood function, thus complicating inference (Ferkingstad and Rue, 2015; Gressani and Lambert, 2021).
Laplacian-P-splines originally belong to the class of latent Gaussian models, where model parameters are dichotomized between a vector of latent variables ξ (including penalized B-spline coefficients, regression coefficients and other parameters of interest) that are assigned a Gaussian prior, and another vector of hyperparameters η that involves nuisance parameters, such as the smoothing parameter inherent to P-splines, and for which prior assumptions need not be Gaussian. Combining Bayes' rule and a simplified Laplace approximation, the conditional posterior distribution of ξ under the LPS framework is approximated by a Gaussian distribution denoted by p̃_G(ξ | η̂, D), where η̂ is a summary statistic of the posterior hyperparameter vector (e.g. the MAP or posterior mean/median) and D denotes the observed data. Although the latter approximation is typically accurate for penalized B-spline coefficients, it might be less appropriate for other
[Figure 1 schematic: the latent vector ξ = (γ⊤, θ⊤)⊤ is split into two subsets, with a Laplace approximation applied to θ and a non-Gaussian approximation to γ, both informed by the hyperparameter space and the observed data.]
Figure 1 – A different approximation scheme is proposed for disjoint subsets of the latent space.
As such, the asymmetric patterns for parameters that are suspected to have posterior distributions
deviating from Gaussianity can be captured more accurately.
candidates in ξ with large prior variance. In that case, the misfit between the Laplace approximation and a potentially asymmetric (or heavy-tailed) target posterior distribution for a parameter can have a detrimental effect on posterior summary statistics and on any results relying on the generated approximation for the posterior distribution of the model parameters. This motivates us to develop an approach that corrects for potential posterior misfits produced by the Laplace approximation.
A recent technique proposed by Chiuchiolo et al. (2022) in the INLA framework consists in using a skew Gaussian copula to correct for skewness when posterior latent variables have a non-negligible deviation from Gaussianity. Our proposal in models involving P-splines consists in splitting the latent parameter space into a set of parameters γ for which the posterior distribution (conditional on the hyperparameters) is approximated in a non-Gaussian fashion with an emphasis on capturing asymmetries, and a set of parameters θ for which the conditional posterior is approached with Laplace approximations. Figure 1 illustrates the separation of the latent space into two subsets, i.e. ξ = (γ⊤, θ⊤)⊤. The posterior approximation scheme for γ takes asymmetric patterns into account, while the latent variables in θ (which typically comprise penalized B-spline coefficients) are approximated by a Gaussian density. Our refined LPS approach thus makes it possible to obtain an approximate version of the joint posterior distribution for all the components in ξ, together with a posterior approximation to the hyperparameter components in η, without relying on an MCMC sampling scheme.
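To make the idea of a skewness-aware approximation concrete, here is a small self-contained sketch (our own toy construction, not the scheme developed in this paper): a univariate skewed posterior is summarized by its first three moments on a fine grid, and these are matched exactly by a skew-normal density, whereas a plain Laplace approximation would force the skewness to zero. The Gamma target and the grid bounds are illustrative assumptions, and the moment match is feasible only while the target skewness stays below the skew-normal bound of about 0.995.

```python
import numpy as np
from scipy import stats

# Skewed toy posterior: unnormalized Gamma(shape=6, rate=2) log-density
# (mean 3, variance 1.5, skewness 2/sqrt(6) ~ 0.82), evaluated on a fine grid.
grid = np.linspace(1e-6, 15, 6000)
p = np.exp(5 * np.log(grid) - 2 * grid)
p /= p.sum()                                    # discrete normalization

m = (grid * p).sum()                            # posterior mean
v = ((grid - m) ** 2 * p).sum()                 # posterior variance
g1 = ((grid - m) ** 3 * p).sum() / v ** 1.5     # posterior skewness

# Method-of-moments skew-normal SN(xi, omega, alpha) matching (m, v, g1).
r = (2 * abs(g1) / (4 - np.pi)) ** (1 / 3)
d = np.sign(g1) * r / np.sqrt(1 + r ** 2)       # d = delta * sqrt(2/pi)
delta = d * np.sqrt(np.pi / 2)
alpha = delta / np.sqrt(1 - delta ** 2)         # skew-normal shape parameter
omega = np.sqrt(v / (1 - d ** 2))               # scale
xi = m - omega * d                              # location
sn = stats.skewnorm(alpha, loc=xi, scale=omega)
print(sn.mean(), sn.std() ** 2, float(sn.stats(moments="s")))
```

By construction the fitted skew-normal reproduces the target's mean, variance and skewness, while a Laplace approximation centred at the same posterior would report zero skewness.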
A simple motivating example inspired by the infectious disease model of Gressani et al. (2021) helps frame the problem. Let D = {y1, . . . , yn} be an i.i.d. sample of size n from a negative binomial distribution NB(µ(x), γ) having a probability mass function following the parameterization of Piegorsch (1990), with mean E(y|x) = µ(x), variance V(y|x) = µ(x) + µ(x)²/γ and overdispersion parameter γ > 0. We model the mean with P-splines, log(µ(x)) = θ⊤b(x), where b(·) is a cubic B-spline basis on the interval [1, n] and θ is a vector of B-spline coefficients. Note that lim_{γ→+∞} V(y|x) = µ(x), i.e. a Poisson is obtained as the limiting distribution when the overdispersion parameter γ tends to infinity.
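The setup of this example can be sketched in code as follows (a minimal illustration, not the authors' implementation): for brevity we simulate from the Poisson limiting case, build an equidistant cubic B-spline basis in the spirit of Eilers and Marx (1996), and fit the penalized model by Newton iterations with a fixed, hand-picked penalty λ = 10; the basis dimension and the simulated mean curve are our own illustrative choices.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(x, xl, xr, ndx=8, deg=3):
    """Equidistant B-spline basis on [xl, xr], Eilers-Marx style (ndx+deg columns)."""
    dx = (xr - xl) / ndx
    knots = xl + dx * np.arange(-deg, ndx + deg + 1)
    return BSpline.design_matrix(x, knots, deg).toarray()

rng = np.random.default_rng(123)
n = 200
x = np.linspace(1, n, n)
mu_true = np.exp(2 + np.sin(2 * np.pi * x / n))
y = rng.poisson(mu_true)              # Poisson = NB limit as gamma -> infinity

B = bspline_design(x, 1, n)           # n x (ndx + deg) design matrix b(x)
K = B.shape[1]
D = np.diff(np.eye(K), n=2, axis=0)   # second-order differences of theta
P = 10.0 * D.T @ D                    # fixed penalty lambda = 10 for illustration

theta = np.full(K, np.log(y.mean()))  # B-splines sum to 1, so B @ theta is constant
for _ in range(100):                  # penalized Newton / Fisher scoring
    mu = np.exp(B @ theta)
    step = np.linalg.solve(B.T @ (mu[:, None] * B) + P,   # penalized neg. Hessian
                           B.T @ (y - mu) - P @ theta)    # penalized gradient
    theta += step
    if np.max(np.abs(step)) < 1e-8:
        break

fit = np.exp(B @ theta)
print(np.corrcoef(fit, mu_true)[0, 1])  # close to 1: the smooth tracks mu(x)
```

In the full LPS treatment the penalty is not fixed but carries its own (analytically available) approximate posterior, and the overdispersion parameter γ is exactly the kind of unpenalized quantity whose skewed posterior motivates the correction developed in this paper.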