Generalized Bayes Approach to Inverse Problems
with Model Misspecification
Youngsoo Baek$^1$, Wilkins Aquino$^2$, Sayan Mukherjee$^{1,3,4,5}$
$^1$ Department of Statistical Science, Duke University, NC
$^2$ Department of Mechanical Engineering and Materials Science, Duke University, NC
$^3$ Department of Mathematics, Computer Science, Biostatistics & Bioinformatics, Duke University, NC
$^4$ Center for Scalable Data Analytics and Artificial Intelligence, Universität Leipzig
$^5$ Max Planck Institute for Mathematics in the Sciences, Leipzig
Abstract. We propose a general framework for obtaining probabilistic solutions to PDE-
based inverse problems. Bayesian methods are attractive for uncertainty quantification but
assume knowledge of the likelihood model or data generation process. This assumption is
difficult to justify in many inverse problems, where the specification of the data generation
process is not obvious. We adopt a Gibbs posterior framework that directly posits a
regularized variational problem on the space of probability distributions of the parameter.
We propose a novel model comparison framework that evaluates the optimality of a given loss
based on its “predictive performance”. We provide cross-validation procedures to calibrate
the regularization parameter of the variational objective and compare multiple loss functions.
Some novel theoretical properties of Gibbs posteriors are also presented. We illustrate
the utility of our framework via a simulated example, motivated by dispersion-based wave
models used to characterize arterial vessels in ultrasound vibrometry.
1. Introduction
Quantification of uncertainty in the context of inverse problems is increasingly demanded
by many applications [Stuart, 2010]. Bayesian statistics provides a useful viewpoint for this
demand [Cotter et al., 2009]. In a Bayesian framework, one prescribes a prior distribution
summarizing relative uncertainties about possible solutions to the inverse problem. After
observing noisy data, one updates the probabilities to obtain a posterior distribution of
the possible solutions. A fundamental component of a Bayesian formulation is the data-
generating process or likelihood. Specification of a likelihood is often invoked as a necessary
condition to guarantee theoretical properties of the posterior. However, it is difficult to
specify the data-generating process in nonlinear inverse problems due to two main sources
of model uncertainty: uncertainty in the forward model, i.e., in the underlying system
dynamics, and uncertainty, or lack of knowledge, about the distribution of the noise. The
possibility of model misspecification raises a serious concern about using Bayesian methods.
In this paper, we propose to solve inverse problems using an alternative framework, the
Gibbs posterior or generalized Bayes framework, proposed by Jiang and Tanner [2008],
Bissiri et al. [2016], Dunlop and Yang [2021], and Zou et al. [2019]. The framework
similarly requires a prior distribution and outputs a probability update conditional on the
data. Gibbs posteriors do not rely on knowledge of the likelihood. They are derived as a
solution to a variational problem on the space of probability measures over the space of
solutions, and they require the choice of a loss function that measures the mismatch between
the model and the data.
To use the Gibbs/generalized Bayes approach to solve inverse problems, several
questions need to be addressed. Without knowledge of the underlying data-generating
mechanism, how can we make a good choice of loss? In the variational objective that needs
to be minimized, how do we determine the regularization parameter? The regularization
parameter plays a vital role in balancing the trade-off between fidelity to the observed data
and adherence to prior information. Finally, is the variational problem well-posed?
1.1. Contributions
The main contributions of this paper are the following:
1. A theory of model comparison for Gibbs posteriors that enables the comparison of loss
functions. We define a notion of "predictive performance" for Gibbs posteriors and
study its theoretical properties.
2. We develop a particle filter and importance sampling method to simultaneously sample
from the underlying Gibbs posterior and calibrate the regularization parameter that
balances the loss function and regularization with respect to the prior. Our calibration
procedure minimizes a novel leave-one-out cross-validation (LOOCV) objective. Due
to the distributional nature of the solution, existing cross-validation algorithms are not
immediately applicable.
3. We prove the stability and consistency of Gibbs posteriors. We show the continuity
of the Gibbs posterior as a mapping of the data in various distances for probability
distributions. Our proposed upper bound improves on existing upper bounds that are
vacuous when the perturbation to the data is large. We also study the asymptotic
behavior of Gibbs posteriors in the large-sample limit. The technical aspects of the
consistency proof rely on tools from the robust Bayes estimation literature. We also
study the asymptotics of a predictive distribution associated with the Gibbs posterior
and used for model selection.
1.2. Prior work
The relation between the regularized least-squares problem proposed by Tikhonov and
Arsenin [1977] and the maximum a posteriori (MAP) estimation problem in Bayesian
statistics has been known for some time. Bayesian methods for inverse problems have been
successfully adopted in diverse domains, nicely summarized by Kaipio and Somersalo [2005].
Recent literature [Cotter et al., 2009, Stuart, 2010, Cotter et al., 2013] has extended the
Bayesian framework with Gaussian likelihood to infinite-dimensional settings. The Gibbs
posterior framework [Bissiri et al., 2016, Jiang and Tanner, 2008, Martin et al., 2017] is not
new, and its application to inverse problems was studied by Zou et al. [2019] and Dunlop and
Yang [2021]. Similar concepts have been studied by Grünwald and Langford [2007], Grünwald
and van Ommen [2017], Miller and Dunson [2019], Bhattacharya et al. [2019], among others,
for improving the robustness of Bayesian inference under model misspecification. The novel
model selection theory we develop in this paper can be viewed as an analog of the theory
of Bayesian model selection and Bayesian cross-validation under model misspecification
[Bernardo and Smith, 2009]. Computationally, we rely on sequential Monte Carlo and
particle filter algorithms. These algorithms have gained recent attention for potential use in
Bayesian inverse problems: Kantas et al. [2014] and Beskos et al. [2015] have used particle
filters to solve parabolic and elliptic inverse problems, and Zou et al. [2019] have proposed a
combination of particle filters and reduced-order models for improved computational efficiency.
A vast amount of literature exists on quantifying uncertainty in inverse problems.
We place our method in context with previous ideas. Our variational formulation shares
similarities with variational Bayes methods used in nonlinear inverse problems [Franck
and Koutsourelakis, 2016] and stochastic design [Koutsourelakis, 2016]. In these works,
an objective involving a complex posterior distribution is minimized under the constraint
that the approximating distribution is easy to sample from. In contrast, we use the
variational problem to define the distribution of interest. Second, when the likelihood is
intractable, approximate Bayesian computation (ABC) has been proposed as a viable
approximation method [Lyne et al., 2015, Zeng et al., 2019]. However, these
procedures can be computationally costly and do not address model misspecification.
Finally, several methods have been proposed for solving stochastic inverse problems and
nonparametric probability measure estimation. Gradient-based optimization methods have
been used to solve stochastic inverse problems [Narayanan and Zabaras, 2004, Borggaard
and Van Wyk, 2015, Warner et al., 2015]. When the unknown parameter itself is a probability
distribution, Banks et al. [2015], Banks and Thompson [2015] have proposed minimizing
a discretized objective stated in terms of the Prohorov metric. An exciting avenue we do
not pursue in this work is to investigate possible connections between these non-Bayesian
approaches and the Bayes/Gibbs posterior frameworks.
1.3. Outline of the Paper
Section 2 presents the foundations for the Gibbs posterior framework in the setup of inverse
problems with model uncertainty. In Section 3, we offer results on stability and asymptotic
properties of Gibbs posteriors. We also present our novel contributions to model selection for
different loss functions that are not intrinsically comparable to each other. In Section 4, we
describe a Monte Carlo algorithm that simultaneously learns the regularization parameter
based on the LOOCV criterion and samples from the underlying Gibbs posterior. The
algorithm is novel and relies on recent advances in particle filtering and importance sampling
for Bayesian LOOCV. Section 5 presents numerical experiments illustrating the benefits of
our approach. We conclude the paper with a discussion of future directions in Section 6. All
proofs are collected in the Appendix.
2. Gibbs Posterior with Model Selection
We review the foundations for the Gibbs posterior framework and describe properties of
the Gibbs posterior proposed by Bissiri et al. [2016]. Section 2.4 describes the problem
of model comparison and our original contribution of predictive model selection theory for
Gibbs posteriors.
2.1. Notations
We fix some notation used throughout the text. We denote by $\|\cdot\|$ the norm in a Euclidean
space $\mathbb{R}^m$. We write $\Delta(X)$ for the space of all probability distributions on $X \subseteq \mathbb{R}^m$, assuming
standard Borel $\sigma$-algebras. For two probability measures $\mu, \nu \in \Delta(X)$, $d_{TV}(\mu, \nu)$, $d_H(\mu, \nu)$,
and $D_{KL}(\mu \| \nu)$ denote the total variation metric, Hellinger metric, and Kullback-Leibler (KL)
divergence [Gibbs and Su, 2002], respectively. For two probability measures $\mu, \nu$ (possibly
on different spaces $X_1$ and $X_2$), we denote by $\mu \otimes \nu$ their product measure. For a probability
measure $\mu \in \Delta(X)$, $L^q(X; \mu)$ is the space of all functions $f: X \to \mathbb{R}$ that are $L^q$-integrable
with respect to $\mu$, where $q \in [1, \infty]$.
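As a concrete reference point, the following sketch (not from the paper; the distributions and their support are illustrative) evaluates the three discrepancies above for two discrete distributions on a common finite support.

```python
# A minimal sketch, assuming discrete distributions mu, nu on a shared
# finite support; the probability values below are illustrative.
import numpy as np

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.4, 0.4, 0.2])

d_tv = 0.5 * np.abs(mu - nu).sum()                             # total variation
d_h = np.sqrt(0.5 * ((np.sqrt(mu) - np.sqrt(nu)) ** 2).sum())  # Hellinger
d_kl = (mu * np.log(mu / nu)).sum()                            # KL divergence (requires mu << nu)

print(d_tv, d_h, d_kl)
```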
2.2. Parametric Inverse Problems with Model Uncertainty
Throughout this paper, we assume observing $n$ i.i.d. variables that take values in $Y \subseteq \mathbb{R}^d$
with an unknown probability distribution $P$:
$$y_i \overset{\text{iid}}{\sim} P \approx P_{\mathcal{F}(\theta_0)}. \tag{1}$$
Here, the parameter $\theta_0$ is a physically meaningful parameter that characterizes the observed
system. The parameter-to-observation map $\mathcal{F}(\theta)$ is often defined in relation to the
parameterized PDE model
$$\mathcal{M}(u(\theta); \theta) = 0, \quad u(\theta) \in \mathcal{U}, \quad \mathcal{M}: \mathcal{U} \to \mathcal{V}^*, \tag{2}$$
where $\mathcal{U}, \mathcal{V}$ are Hilbert spaces with $\mathcal{V}^*$ being the dual space of $\mathcal{V}$. We assume that for every $\theta$
there exists a unique $u(\theta)$ satisfying (2). The parameter-to-observation map is defined as
$\mathcal{F}(\theta) := \mathcal{D}u(\theta)$, where $\mathcal{D}$ is the observation operator.
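To make the abstract setup concrete, the sketch below instantiates $\mathcal{M}$, $u(\theta)$, and $\mathcal{D}$ for a toy one-dimensional elliptic problem, $-\theta u'' = f$ on $(0,1)$ with zero Dirichlet boundary conditions, discretized by centered finite differences. The model, grid size, source term, and sensor locations are illustrative choices, not taken from the paper.

```python
# A minimal sketch of F(theta) = D u(theta) for a hypothetical 1-D elliptic
# model -theta * u'' = f on (0, 1), u(0) = u(1) = 0. The source term, grid,
# and sensors are illustrative assumptions.
import numpy as np

def forward_map(theta, m=99, sensors=(24, 49, 74)):
    """Solve the discretized PDE for u(theta), then observe at sensor nodes."""
    h = 1.0 / (m + 1)
    x = np.linspace(h, 1.0 - h, m)        # interior grid nodes
    f = np.sin(np.pi * x)                 # fixed, known source term
    # Tridiagonal stiffness matrix of -theta * u'' with Dirichlet conditions.
    A = (theta / h**2) * (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
    u = np.linalg.solve(A, f)             # unique solution u(theta) for theta > 0
    return u[list(sensors)]               # observation operator D: point samples

y_clean = forward_map(theta=2.0)          # noiseless observations at 3 sensors
```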
In the classical Bayesian framework, the parameterization of the sampling distribution
(1) by the forward model (2) is known. Examples include the additive white noise model
[Knapik et al., 2011] and the Poisson likelihood [Barmherzig and Sun, 2022]. However, in
practice, this need not be true, because either the hypothesized parameterization is incorrect
or the model uncertainties are so large that such a parameterization is difficult. Both
errors in the forward model and errors in the noise distribution contribute to an incorrect
parameterization of the likelihood. There are many ways in which such a mismatch can arise.
A concrete example in an ultrasound vibrometry application is reviewed in Section 5.2. Here,
we only mention that both the philosophical and the asymptotic justifications of Bayesian
inference are more tenuous under model misspecification. While the modeler may use a
"surrogate likelihood" to define a misspecified Bayes posterior and argue that one can obtain
good approximations when the surrogate is "close" to $P_\theta$, defining this "closeness" in
nonlinear inverse problems is not trivial.
In the next Section, we review a variational formulation that bypasses these difficulties.
Instead of trying to define the correctly parameterized $Q_\theta = P_\theta$, the variational perspective
defines a discrepancy between the posited forward model and the observed data. The relative
weights given to possible parameters $\theta$ are higher if they yield a smaller discrepancy, or loss.
2.3. Variational Framework for Gibbs Posteriors
Let $L: \Theta \times \mathbb{R}^d \to \mathbb{R}$ be a loss function and let $\rho_0 \in \Delta(\Theta)$. We propose to solve the variational
problem of Bissiri et al. [2016]:
$$\hat\rho^W_n(d\theta) := \operatorname*{arg\,min}_{\rho \in \Delta(\Theta)} \left[ R^W(\rho) = \int_\Theta \frac{1}{n} \sum_{i=1}^n L(\theta, y_i)\, \rho(d\theta) + \frac{1}{nW} D_{KL}(\rho \,\|\, \rho_0) \right]. \tag{3}$$
Here, $\rho_0$ is the distribution quantifying our prior and $W > 0$ is a regularization parameter
that we assume is given for now. If $\rho$ is not absolutely continuous with respect to $\rho_0$, the
divergence is defined to be $+\infty$. Often we will abbreviate the average loss over all data by
$R_n(\theta) := \frac{1}{n}\sum_{i=1}^n L(\theta, y_i)$.
To ensure the existence of a solution, we will make assumptions on the structure of the
problem, motivated by the assumptions of Cotter et al. [2009] and Stuart [2010].
Assumption 1. Let the loss $L(\theta, y)$ have the form $l(\mathcal{F}(\theta), y)$ and satisfy the following.
(i) $L(\theta, y)$ is uniformly bounded from below:
$$\inf_{\theta, y} L(\theta, y) \geq B > -\infty.$$
We assume $B = 0$ without loss of generality.
(ii) For every $\theta, y$ there exists $K \equiv K(\|\theta\|, \|y\|) \in L^1(\Theta \times Y; \rho_0 \otimes P)$ such that
$L(\theta, y) \leq K(\|\theta\|, \|y\|)$.
(iii) For every $r > 0$ there exists $C_1(r, y) > 0$ such that whenever $\|\theta_1\|, \|\theta_2\| < r$,
$$|L(\theta_1, y) - L(\theta_2, y)| \leq C_1(r, y)\, \|\theta_1 - \theta_2\|,$$
with $C_1(y) \equiv C_1(r, y) \in L^1(Y; P)$.
(iv) For every $r > 0$ there exists $C_2(r, \theta) > 0$ such that whenever $\|y_1\|, \|y_2\| < r$,
$$|L(\theta, y_1) - L(\theta, y_2)| \leq C_2(r, \theta)\, \|y_1 - y_2\|,$$
with $\exp(C_2(\theta)) \equiv \exp(C_2(r, \theta)) \in L^1(\Theta; \rho_0)$.
Remark 1. Note that because $L$ is defined to be a mapping on $\Theta \times Y$, the regularity
assumptions implicitly place restrictions on the forward model $\mathcal{F}$. We will use the squared
$\ell_2$ loss as an example for understanding the regularity conditions:
$$L(\theta, y) \equiv l(\mathcal{F}(\theta), y) = \|y - \mathcal{F}(\theta)\|^2.$$
The properties of $\mathcal{F}$ dictate whether the loss $L$ satisfies the assumptions. Various PDE-based
models used in the literature, combined with the popular Gaussian prior distribution, satisfy
assumptions (i) and (iv) for the squared $\ell_2$ loss; see, e.g., Section 3 of Stuart [2010]. On the
other hand, the integrability conditions (ii) and (iii) depend on the unknown $P$. One can check
how mild or severe these conditions turn out to be for a specific loss function by hypothesizing
models like (20), without specifying a likelihood.
We will also make a mild smoothness assumption on the density of the prior distribution
$\rho_0$. The Gaussian prior satisfies this assumption.
Assumption 2. $\rho_0$ has a positive density everywhere. Furthermore, for every $r > 0$ there
exists $C_3(r) > 0$ such that whenever $\|\theta_1\|, \|\theta_2\| < r$,
$$|\log \rho_0(\theta_1) - \log \rho_0(\theta_2)| \leq C_3(r)\, \|\theta_1 - \theta_2\|.$$
There exists a unique solution to (3) in $\Delta(\Theta)$, which has the following density for fixed
$W > 0$:
$$\hat\rho^W_n(d\theta) := \frac{\exp\{-nW R_n(\theta)\}\, \rho_0(d\theta)}{Z^W_n}, \tag{4}$$
where the normalizing constant, or "partition function," $Z^W_n$ is defined as
$$Z^W_n \equiv \int \exp\{-nW R_n(\theta)\}\, \rho_0(d\theta). \tag{5}$$
To derive this formula, the objective functional can be rewritten as
$$R^W(\rho) = \frac{1}{nW} \left[ D_{KL}(\rho \,\|\, \hat\rho^W_n) - \log Z^W_n \right]. \tag{6}$$
The first term is non-negative and uniquely attains zero at $\rho \equiv \hat\rho^W_n$. The second term does
not depend on $\rho$, so the minimum of the functional is achieved at $R^W(\hat\rho^W_n) = -\frac{1}{nW} \log Z^W_n$.
The possible technical issues are the measurability of $\exp(-nW R_n(\theta))$ and the finiteness of
$\log Z^W_n$, both of which follow from Assumption 1.
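Since (4) requires only the unnormalized weights $\exp\{-nW R_n(\theta)\}$ against the prior, one simple way to approximate Gibbs posterior expectations is self-normalized importance sampling with $\rho_0$ as the proposal, which sidesteps computing $Z^W_n$ explicitly. The sketch below is a minimal illustration on a hypothetical scalar problem with squared-error loss and an identity forward map; it is not the particle filter sampler developed later in the paper (Section 4).

```python
# A minimal sketch of (4)-(5): approximate Gibbs posterior expectations by
# self-normalized importance sampling with the prior rho_0 as the proposal.
# The toy data, prior, and loss are hypothetical, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.5, 0.5, size=20)             # toy observed data, n = 20
n, W = len(y), 1.0

def R_n(theta):
    """Average squared-error loss with an identity forward map F."""
    return np.mean((y - theta) ** 2)

thetas = rng.normal(0.0, 2.0, size=5000)      # draws theta_j ~ rho_0 = N(0, 4)
log_w = -n * W * np.array([R_n(t) for t in thetas])
w = np.exp(log_w - log_w.max())               # stabilize before normalizing
w /= w.sum()                                  # self-normalization replaces 1/Z_n^W

post_mean = (w * thetas).sum()                # Gibbs posterior mean estimate
ess = 1.0 / (w ** 2).sum()                    # effective sample size diagnostic
print(post_mean, ess)
```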
Remark 2. When we fix $W = 1$ and choose the loss to be the negative log-likelihood,
$L(\theta, y) = -\log p(y \,|\, \mathcal{F}(\theta))$, the Gibbs posterior coincides with a Bayes posterior update using
$p(y \,|\, \mathcal{F}(\theta))$ as its likelihood component. Thus, our Gibbs posterior solution strictly generalizes
the Bayes posterior.
We close the Section with some intuition about the role of $W$ in the Gibbs posterior. In
the limit $W \to 0$, $\hat\rho^W_n \to \rho_0$, so there is no update of information from the prior. Smaller
$W$ thus weighs prior information more heavily. In the limit $W \to \infty$, $\hat\rho^W_n$ concentrates
on a set of $\theta$'s minimizing the loss over the observed data. Larger $W$ thus weighs
information from the data more heavily and leads to increased sensitivity to perturbations of
the data by noise. Intuition suggests that a prior $\rho_0$ that is strictly positive on $\Theta$ should reflect
large uncertainty, and that $W$ must be carefully chosen based on the amount of information in
the data relative to the prior. Inspection of (3) suggests that we are implicitly implementing a
discrepancy principle [Nair, 2009], since the divergence penalty has less influence when the
sample size is large.
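These limits are easy to verify numerically in one dimension by evaluating the unnormalized density (4) on a grid. The sketch below (a hypothetical scalar example with squared-error loss and a Gaussian prior, not taken from the paper) shows the posterior spread interpolating between the prior spread at small $W$ and concentration near the minimizer of $R_n$ at large $W$.

```python
# A grid-based illustration of the limits above: W -> 0 recovers the prior,
# large W concentrates near the minimizer of R_n. All quantities here are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.5, 0.5, size=20)                  # toy data, n = 20
n = len(y)

def R_n(theta):
    return np.mean((y - theta) ** 2)               # squared-error loss

grid = np.linspace(-4.0, 4.0, 801)
dx = grid[1] - grid[0]
log_prior = -0.5 * (grid / 2.0) ** 2               # rho_0 = N(0, 4), up to a constant

for W in (0.01, 1.0, 100.0):
    log_dens = -n * W * np.array([R_n(t) for t in grid]) + log_prior
    dens = np.exp(log_dens - log_dens.max())
    dens /= dens.sum() * dx                        # normalize on the grid
    mean = (grid * dens).sum() * dx
    sd = np.sqrt(((grid - mean) ** 2 * dens).sum() * dx)
    print(f"W={W:>6}: mean={mean:.3f}, sd={sd:.3f}")   # sd shrinks as W grows
```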
2.4. Extension to Model Selection
2.4.1. Predictive Model Selection. Solving the variational problem (3) still requires a pre-
specified choice of loss $L$. It may appear that this requirement is as restrictive as positing the
generating process, since a better choice of loss hinges on knowledge of, or assumptions on, $P$.
Our first proposal is to define a valid way to compare two different losses without requiring
knowledge of the data-generating mechanism. The key idea is to compare them based on their
ability to make accurate predictions, measuring their discrepancy on a future observation.
As mentioned, this principle is not new and has been used to improve the robustness of
Bayesian model prediction and model checking. The novelty lies in the definition of a
predictive density without assuming the likelihood; this density will serve as a natural
discrepancy measure between the new observation and the prediction.
Consider a common prior distribution $\rho_0$ and multiple competing losses $L_1, \ldots, L_k$,
defined on subsets $\Theta_1, \ldots, \Theta_k$ of $\Theta$. Given the corresponding set of Gibbs posteriors
$\hat\rho^{W_1}_{n,1}, \ldots, \hat\rho^{W_k}_{n,k}$, we propose the following predictive model comparison principle: map each