Tierney and Kadane, 1986; Rue et al., 2009) to play a pivotal role in the modern Bayesian literature.
Essentially, the Laplace approximation is a Gaussian distribution centered at the maximum a
posteriori (MAP) estimate of the target distribution, with a variance-covariance matrix that coincides with
the inverse of the negative Hessian of the log posterior evaluated at the MAP.
ingenuity of Laplace's approximation crossed paths with P-splines, the brainchild of Paul Eilers
and Brian Marx (Eilers and Marx, 1996), to inaugurate a new approximate Bayesian methodology
labelled “Laplacian-P-splines” (LPS), with promising applications in survival analysis (Gressani
and Lambert, 2018; Gressani et al., 2022), generalized additive models (Gressani and Lambert,
2021), nonparametric double additive location-scale models for censored data (Lambert, 2021)
and infectious disease epidemiology (Gressani et al., 2021). The sampling-free inference scheme
delivered by Laplace approximations combined with the possibility of smoothing different model
components with P-splines in a flexible fashion paves the way for a robust and much faster alter-
native to existing simulation-based methods. At the same time, the LPS toolbox helps P-spline
users gain access to the full potential of Bayesian methods without having to endure the long and
burdensome CPU and wall-clock times often required by Markov chain Monte Carlo (MCMC) samplers.
Although LPS shares some methodological aspects with the popular integrated nested Laplace
approximations (INLA) approach (Rue et al., 2009), there are fundamental points of divergence
worth mentioning. First, the tools in INLA and its associated R-INLA software are originally
built to compute approximate posteriors of univariate latent variables, contrary to LPS, which
natively delivers approximations to the (multivariate) joint posterior distribution of the latent
vector. The key benefit of working with an approximate version of the joint posterior is that
pointwise estimators and credible intervals for subsets of the latent vector (and functions thereof)
can be straightforwardly constructed (see the sketch following this paragraph). Second, by working with closed-form expressions for the
gradients and Hessians involved in the model, LPS is computationally more efficient than the
numerical differentiation treatment proposed in INLA. Third, while INLA can be combined with
various techniques for smoothing nonlinear model components, LPS is entirely devoted to P-spline
smoothers, with the key advantage of having full control over the penalization scheme (as the
approximate posterior distribution of the penalty parameter(s) is analytically available); in
that respect, LPS has a closer connection to the work of Wood and Fasiolo (2017), especially in
the class of (generalized) additive models (Wood, 2017).
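To illustrate the first point, here is a minimal sketch in Python, assuming one already has a joint Gaussian approximation to the posterior of the latent vector; all names and quantities below are illustrative placeholders, not output of the LPS machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint Gaussian approximation to the posterior of the
# latent vector xi: mean xi_hat and covariance Sigma (e.g. the inverse
# of the negative Hessian of the log posterior at the MAP).
K = 5                                   # latent dimension (illustrative)
xi_hat = rng.normal(size=K)             # placeholder posterior mean
A = rng.normal(size=(K, K))
Sigma = A @ A.T + K * np.eye(K)         # placeholder SPD covariance

# Pointwise estimate and 95% credible interval for a linear functional
# f = a' xi (e.g. a spline fit evaluated at a single covariate value).
a = rng.normal(size=K)
f_hat = a @ xi_hat                      # posterior mean of f
f_sd = np.sqrt(a @ Sigma @ a)           # posterior standard deviation of f
z = 1.96                                # approximate 97.5% normal quantile
print(f"estimate: {f_hat:.3f}, "
      f"95% CI: ({f_hat - z * f_sd:.3f}, {f_hat + z * f_sd:.3f})")
```

Nonlinear functions of the latent vector can be handled analogously by sampling directly from the joint Gaussian approximation.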
The success of Laplace approximations in Bayesian statistics owes much to a central limit
type argument. Under certain regularity conditions, the Bernstein-von Mises theorem (see e.g.
Van der Vaart, 2000) ensures that posterior distributions in differentiable models converge to a
Gaussian distribution in large samples.
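Informally, and in generic notation, the theorem states that the total variation distance between the posterior and a Gaussian distribution vanishes asymptotically:
\begin{equation*}
\Big\| \Pi\big(\cdot \mid X_1, \dots, X_n\big) - \mathcal{N}\big(\widehat{\theta}_n, \, n^{-1} I(\theta_0)^{-1}\big) \Big\|_{TV} \xrightarrow{P} 0,
\end{equation*}
where $\widehat{\theta}_n$ is an efficient estimator (e.g. the maximum likelihood estimator) and $I(\theta_0)$ the Fisher information at the true parameter $\theta_0$. In situations involving small to moderate sample sizes,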
the asymptotic validity of the Laplace approximation can be seriously compromised, as it does
not account for features such as non-zero skewness (i.e. lack of symmetry) (Ruli et al., 2016).
Even in relatively large samples, the Laplace approximation might fail in scenarios involving
binary data, as the latter are poorly informative about the model parameters and can result in a
flat log-likelihood function, thus complicating inference (Ferkingstad and Rue, 2015; Gressani and
Lambert, 2021).
Laplacian-P-splines originally belong to the class of latent Gaussian models, where model parameters
are dichotomized between a vector of latent variables $\xi$ (including penalized B-spline
coefficients, regression coefficients and other parameters of interest) that are assigned a Gaussian
prior, and another vector of hyperparameters $\eta$ that involves nuisance parameters, such as the
smoothing parameter inherent to P-splines, and for which prior assumptions need not be Gaussian.
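Schematically, and in generic notation introduced here purely for illustration (the precision matrix $Q(\eta)$ is a placeholder), the latent Gaussian model hierarchy can be sketched as
\begin{align*}
(\xi \mid \eta) &\sim \mathcal{N}\big(0, \, Q^{-1}(\eta)\big), \\
\eta &\sim p(\eta),
\end{align*}
where the dependence of the prior precision $Q(\eta)$ on $\eta$ typically encodes the P-spline roughness penalty through the smoothing parameter.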
Combining Bayes' rule and a simplified Laplace approximation, the conditional posterior
distribution of $\xi$ under the LPS framework is approximated by a Gaussian distribution denoted by
$\widetilde{p}_G(\xi \mid \widehat{\eta}, \mathcal{D})$, where $\widehat{\eta}$ is a summary statistic of the posterior hyperparameter vector (e.g. the MAP
or posterior mean/median) and $\mathcal{D}$ denotes the observed data. Although the latter approximation
is typically accurate for penalized B-spline coefficients, it might be less appropriate for other