Consistent inference for diffusions from low frequency measurements


Submitted to the Annals of Statistics

CONSISTENT INFERENCE FOR DIFFUSIONS FROM LOW FREQUENCY MEASUREMENTS

BY RICHARD NICKL¹

¹Department of Pure Mathematics and Mathematical Statistics, University of Cambridge*; nickl@maths.cam.ac.uk

Let $(X_t)$ be a reflected diffusion process in a bounded convex domain in $\mathbb{R}^d$, solving the stochastic differential equation

$$dX_t = \nabla f(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t, \quad t \ge 0,$$

with $W_t$ a $d$-dimensional Brownian motion. The data $X_0, X_D, \dots, X_{ND}$ consist of discrete measurements and the time interval $D$ between consecutive observations is fixed so that one cannot ‘zoom’ into the observed path of the process. The goal is to infer the diffusivity $f$ and the associated transition operator $P_{t,f}$. We prove injectivity theorems and stability inequalities for the maps $f \mapsto P_{t,f} \mapsto P_{D,f}$, $t < D$. Using these estimates we establish the statistical consistency of a class of Bayesian algorithms based on Gaussian process priors for the infinite-dimensional parameter $f$, and show optimality of some of the convergence rates obtained. We discuss an underlying relationship between the degree of ill-posedness of this inverse problem and the ‘hot spots’ conjecture from spectral geometry.

CONTENTS

1 Introduction
2 Main results
  2.1 Optimal recovery of the transition operator $P_{D,f}$
  2.2 Injectivity of $f \mapsto P_{t,f} \mapsto P_{D,f}$, $t < D$
  2.3 Bayesian inference in the diffusion model
  2.4 Posterior consistency theorems
3 Proofs
  3.1 Analytical background: reflected diffusions and their generators
  3.2 Heat equation, transition operator, and a perturbation identity
  3.3 Information distances and small ball probabilities
  3.4 Proofs of stability estimates
  3.5 Minimax estimation of the transition operator $P_{D,f}$
  3.6 Bayesian contraction results
  3.7 Neumann eigenfunctions on cylindrical domains
  3.8 Proofs of auxiliary results
References

1. Introduction. Diffusion describes a random process for the evolution over time of phenomena such as heat flow, electric conductance, chemical reactions, or molecular dynamics, to name just a few examples. The density of a diffusing substance in an insulated medium, say a bounded convex subset $\mathcal{O}$ of $\mathbb{R}^d$, $d \ge 1$, is described by the solutions $u$ to the parabolic partial differential equation (PDE) known as the heat equation, $\partial u/\partial t = L_{f,U} u$, with a divergence form elliptic second order differential operator

$$L_{f,U} = \frac{1}{m} \nabla \cdot (m f \nabla), \qquad m \equiv m_U \propto e^{-U},$$

*I would like to thank James Norris and Gabriel Paternain for helpful discussions, three anonymous referees and the associate editor for their critical remarks, and Matteo Giordano for generating Figs. 1-2.

arXiv:2210.13008v3 [math.ST] 27 Jan 2024

FIG 1. Left: a reflected diffusion path $(X_t : 0 \le t \le T)$ initialised at $X_0$ and run until time $T = 5$. Right: $N = 500$ discrete observations $(X_{iD})_{i=0}^{N}$ at sampling frequency $D = 0.05$ ($T = 25$). The diffusivity $f$ is given in Fig. 2.

and equipped with Neumann boundary conditions. Here $f : \mathcal{O} \to [f_{\min}, \infty)$, $f_{\min} > 0$, is a positive scalar ‘diffusivity’ function and $U : \mathcal{O} \to \mathbb{R}$ is a ‘force’ potential inducing a Gibbs measure $\mu = \mu_U$ with (Lebesgue-) probability density $m_U$. If $W_t$ is a $d$-dimensional Brownian motion then the corresponding ‘microscopic’ statistical model for a diffusing particle is provided by solutions $(X_t)$ to the stochastic differential equation (SDE)

$$dX_t = \nabla f(X_t)\,dt + f(X_t)\nabla U(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t + \nu(X_t)\,dL_t, \quad t \ge 0, \tag{1}$$

started at $X_0 = x \in \mathcal{O}$. The process is reflected when hitting the boundary $\partial\mathcal{O}$ of its state space: $L_t$ is a ‘local time’ process acting only when $X_t \in \partial\mathcal{O}$ and $\nu(x)$ is the (inward) pointing normal vector at $x \in \partial\mathcal{O}$. When $f, \nabla f, \nabla U$ are Lipschitz maps on $\mathcal{O}$, a continuous time Markov process $(X_t : t \ge 0)$ giving a unique pathwise solution to (1) exists [69].

Real world observations of diffusion are necessarily discrete and often subject to a lower bound on the time that elapses between consecutive measurements. We denote this ‘observation distance’ by $D > 0$ and assume for simplicity that it is the same at each measurement. The data are $X_0, X_D, \dots, X_{ND}$ for some $N \in \mathbb{N}$, that is, we are tracking the trajectory of a given particle along discrete points in time, see Fig. 1. In practice one may be observing several independent particles, which essentially corresponds to (linearly) augmenting the sample size $N$; we consider the one-particle model without loss of generality. We investigate the possibility of inferring $f$, $U$ and the transition operator $P_{t,f,U}$ of the Markov process $(X_t)$ both at $t = D$ and at ‘unobserved’ times $t > 0$ by a statistical algorithm, that is, by a computable function of $(X_{iD} : i = 1, \dots, N)$. We are interested in the scenario where $D > 0$ is fixed (but known) as the sample size $N \to \infty$. This is often the most appropriate observational model: for instance the speed at which particles or molecules traverse the medium $\mathcal{O}$ may be much faster than the frequency at which images can be taken. Following [29] we refer to this as the ‘low measurement frequency’ scenario. See [33,34] or also Ch. 4 in [49] for such situations in the biological sciences, or [48,62,44] in the context of data assimilation problems.

The invariant ‘equilibrium’ distribution of the Markov process (1) is well known ([5], Sec. 1.11.3) to equal $\mu_U$ and hence identifies the potential $U$ via its probability density $m_U$. The infinite-dimensional parameter $\mu_U$ (and thus $U$) can then be estimated from a discrete sample by standard linear density estimators $\hat\mu$ that smooth the empirical measure of the $X_{iD}$’s near any point $x \in \mathcal{O}$ (cf. [29] or also, with continuous data, [19,66,28]). Using exponentially fast mixing of ergodic averages of the Markov process towards their $\mu_U$-expectations (e.g., via [60] combined with Thm 4.9.3 and Sec. 1.11.3 in [5], or also with [15]) one can then obtain excellent statistical guarantees for $\|\hat\mu - \mu_U\|$ in relevant norms $\|\cdot\|$ (e.g., as after (30) in [56]). But the invariant measure $\mu_U$ contains no information about the diffusivity $f$ in eq. (1), and in a ‘low frequency’ measurement scheme, standard statistics of the data such as the quadratic variation (‘mean square displacements’) of the process provide no valid inference on $f$ either (not even along the observed path). We conclude not only that recovering $f$ is a much harder problem than estimation of $U$, but also that the problems essentially decouple and can be treated separately. Therefore, to simplify the exposition of our main contributions we henceforth assume that $U = 1$ in (1) and consider the model

$$dX_t = \nabla f(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t + \nu(X_t)\,dL_t, \quad t \ge 0, \tag{2}$$

started uniformly at random $X_0 \sim \mathrm{Unif}(\mathcal{O})$. We denote by $\mathbb{P}_f$ the resulting probability law of $(X_t : t \ge 0)$ (in path space). Our statistical results could be generalised to the case of unknown $U$ in (1) as we discuss in Remark 4 below.
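As a rough illustration of the observation scheme, the following Python sketch simulates model (2) in dimension $d = 1$ on $\mathcal{O} = (0,1)$ with an Euler scheme on a fine time grid, reflects at the boundary by mirror folding, and records the path only every $D$ time units. The diffusivity `f`, its derivative `df`, the number of `substeps`, and all function names are hypothetical choices made for this sketch; the Euler discretisation only approximates the law $\mathbb{P}_f$.

```python
import math
import random

def fold_unit_interval(x):
    """Reflect a real number back into [0, 1] by mirroring at 0 and 1."""
    x = math.fmod(x, 2.0)
    if x < 0.0:
        x += 2.0
    return 2.0 - x if x > 1.0 else x

def simulate_low_frequency(f, df, D, N, substeps=50, seed=0):
    """Approximate samples X_0, X_D, ..., X_{ND} from model (2) on (0, 1)."""
    rng = random.Random(seed)
    h = D / substeps                 # fine Euler step, much smaller than D
    x = rng.random()                 # X_0 ~ Unif(0, 1)
    observations = [x]
    for _ in range(N):
        for _ in range(substeps):
            drift = df(x) * h                                   # grad f (X_t) dt
            noise = math.sqrt(2.0 * f(x) * h) * rng.gauss(0.0, 1.0)
            x = fold_unit_interval(x + drift + noise)           # reflection step
        observations.append(x)       # only every D-th time point is recorded
    return observations

# Hypothetical smooth diffusivity, bounded below by 1/2, and its derivative.
f = lambda x: 0.5 + 0.25 * math.sin(math.pi * x) ** 2
df = lambda x: 0.25 * math.pi * math.sin(2.0 * math.pi * x)

data = simulate_low_frequency(f, df, D=0.05, N=200)
```

The fine-grid path is discarded on purpose: a statistical procedure in the low frequency regime only ever sees the list `data`.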

The problem of determining diffusivity parameters $f$ from data has a long history in mathematical inverse problems; we mention here [13,41,68,52,73,1] in the context of the Calderón problem as well as [63,22,67,38,10,27,54] in the context of Darcy’s flow problem, and the many references therein. All these settings consider a simplified observational model where one is given a ‘steady state’ measurement of diffusion, returning the (typically ‘noisy’) solution of a time-independent elliptic PDE. The potential inferential barrier arising with low frequency measurements disappears in the reduction from a time evolution equation to the elliptic PDE and hence does not inform the statistical setting investigated here.

As the invariant measure $\mu$ is identical for all $f$, the information contained in low frequency discrete data from (2) is encoded in the transition operator $P_{D,f}$ of the underlying Markov process $(X_t)$. Little is known about how to conduct statistically valid inference in this setting, with notable exceptions being the one-dimensional case $d = 1$ studied in [29,56]. We also mention the consistency results [75,32] as well as [46] for Markovian transition operators, but these do not concern the conductivities $f$ themselves. A first question is whether the task of identifying $f$ from $P_{D,f}$ for fixed observation distance $D > 0$ is even well-posed, that is, whether the (non-linear) map $f \mapsto P_{D,f}$ is injective. The answer to this question is positive at least if $f$ is prescribed near $\partial\mathcal{O}$. Denote by $L^2(\mathcal{O})$ the Hilbert space of square Lebesgue integrable functions on $\mathcal{O}$.

THEOREM 1. Suppose positive diffusion coefficients $f_1, f_2 \in C^2(\mathcal{O})$ are bounded away from zero on $\mathcal{O}$ and such that $f_1 = f_2$ near $\partial\mathcal{O}$. Then if $P_{D,f_1} = P_{D,f_2}$ coincide as bounded linear operators on $L^2(\mathcal{O})$ for some $D > 0$, we must have $f_1 = f_2$ on $\mathcal{O}$.

See Theorem 5 for details. That $f$ should be known near $\partial\mathcal{O}$ can be explained by the fact that the reflection (which is independent of $f$) dominates the local dynamics near $\partial\mathcal{O}$.

Statistical algorithms are often motivated by ‘population version’ identification equations for unknown parameters, as in the one-dimensional case $d = 1$ considered in [29,56], who use ordinary differential equation (ODE) techniques to derive identities for $f$ in terms of the first eigenfunction of the transition operator $P_{D,f}$. This approach appears of limited use in the present multi-dimensional context $d > 1$. Instead we shall maintain $\{P_{D,f} : f \in \mathcal{F}\}$ as our statistical model for natural choices of parameter spaces $\mathcal{F} \subset L^2(\mathcal{O})$ of sufficiently smooth, positive, functions. This makes available the algorithmic toolbox of Bayesian statistics in infinite-dimensional parameter spaces which does not require any identification equations or inversion formulae. Instead one employs a Gaussian process prior $\Pi$ for the function-valued parameter $f$, see [76,67,24,54], and updates according to Bayes’ rule: if $p_{D,f}$ are the transition densities of $P_{D,f}$ (fundamental solutions), the posterior distribution is

$$\Pi(B \mid X_0, X_D, \dots, X_{ND}) = \frac{\int_B \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\,d\Pi(f)}{\int_{\mathcal{F}} \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\,d\Pi(f)}, \qquad B \text{ measurable}.$$

As the ‘forward map’ $f \mapsto p_{D,f}$ can be evaluated by numerical PDE techniques for parabolic equations, one can leverage ideas from [16] (see also [30,18,8]) to propose computationally feasible MCMC methodology that draws approximate samples from $\Pi(\cdot \mid X_0, X_D, \dots, X_{ND})$, and the resulting ergodic averages approximate the posterior mean vector, which in turn gives an estimated output for $f$. See Section 2.3, specifically Remark 3, for details.
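To make Bayes’ rule above concrete, here is a minimal Python sketch in a deliberately simplified setting that is not the method analysed in this paper: $d = 1$, $\mathcal{O} = (0,1)$, and a prior supported on finitely many constant diffusivities instead of a Gaussian process. For constant $f$ the transition density has the Neumann cosine expansion $p_{D,f}(x,y) = 1 + 2\sum_{j \ge 1} e^{-D f \pi^2 j^2} \cos(j\pi x)\cos(j\pi y)$, and the reflected process can be sampled exactly by folding Gaussian increments into $(0,1)$. All function names, the candidate grid, the uniform prior and the truncation level `n_terms` are assumptions of this sketch.

```python
import math
import random

def transition_density(t, f, x, y, n_terms=60):
    """Truncated cosine series for p_{t,f}(x, y), constant f, Neumann on (0, 1)."""
    total = 1.0
    for j in range(1, n_terms + 1):
        decay = math.exp(-t * f * (math.pi * j) ** 2)
        total += 2.0 * decay * math.cos(j * math.pi * x) * math.cos(j * math.pi * y)
    return total

def fold(x):
    """Reflect a real number back into [0, 1]."""
    x = math.fmod(x, 2.0)
    if x < 0.0:
        x += 2.0
    return 2.0 - x if x > 1.0 else x

def simulate_constant_f(f, D, N, seed=1):
    """Exact low frequency samples of reflected Brownian motion with variance 2 f t."""
    rng = random.Random(seed)
    x = rng.random()
    path = [x]
    for _ in range(N):
        x = fold(x + math.sqrt(2.0 * f * D) * rng.gauss(0.0, 1.0))
        path.append(x)
    return path

def log_likelihood(f, data, D):
    """Sum of log p_{D,f}(X_{(i-1)D}, X_{iD}) over consecutive pairs."""
    return sum(math.log(max(transition_density(D, f, a, b), 1e-300))
               for a, b in zip(data, data[1:]))

def grid_posterior(candidates, prior, data, D):
    """Posterior weights: prior times likelihood, normalised in a stable way."""
    logs = [math.log(p) + log_likelihood(f, data, D)
            for f, p in zip(candidates, prior)]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

D = 0.05
data = simulate_constant_f(0.5, D, N=400)
candidates = [0.3, 0.5, 1.0, 2.0]
posterior = grid_posterior(candidates, [0.25] * 4, data, D)
```

In the infinite-dimensional setting of the paper the sum over candidates becomes an integral against $\Pi$ and the series is replaced by a numerical PDE solver, so the normalising constant is no longer available in closed form and MCMC is used instead.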

Recent progress in Bayesian theory for non-linear inverse problems [53,50,59,57], [54] has clarified that such Bayesian methods can solve non-linear problems without ‘inversion formulae’ as long as appropriate stability estimates for the forward map, here $f \mapsto P_{D,f}$, are available. Following this strategy we prove here a first statistical consistency result in multi-dimensional diffusion models with such ‘low frequency’ measurements.

THEOREM 2. Let $D > 0$ and consider data $X_0, X_D, \dots, X_{ND}$ generated from the diffusion (2) in a bounded smooth convex domain $\mathcal{O}$. Assume the ground truth $f_0 > 1/4$ is sufficiently regular in a Sobolev sense and equals $1/2$ near $\partial\mathcal{O}$. Assign an appropriate Gaussian process prior $\Pi$ to $(\theta(x) : x \in \mathcal{O})$, form $f_\theta = (1 + e^\theta)/4$, and consider the random field

$$(\bar f_N \equiv f_{\bar\theta_N}(x) : x \in \mathcal{O}), \qquad \bar\theta_N = E^\Pi[\theta \mid X_0, X_D, \dots, X_{ND}],$$

arising from the posterior mean function. Then the posterior inference for the transition operators $P_{t,f_0}$, $t > 0$, as well as for $f_0$ is consistent, that is, as $N \to \infty$ and in $\mathbb{P}_{f_0}$-probability,

$$\|P_{t,\bar f_N} - P_{t,f_0}\|_{L^2 \to L^2} \to 0, \quad \text{and} \quad \|\bar f_N - f_0\|_{L^2} \to 0,$$

where $\|\cdot\|_{L^2 \to L^2}$ denotes the operator norm on $L^2 = L^2(\mathcal{O})$.
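The link function in Theorem 2 can be made explicit in a few lines of Python. The sketch below illustrates that $f_\theta = (1 + e^\theta)/4$ always exceeds $1/4$, that $\theta = 0$ corresponds to $f = 1/2$ (the value prescribed near $\partial\mathcal{O}$), and that the map is inverted by $\theta = \log(4f - 1)$. The function names are hypothetical and chosen only for this illustration.

```python
import math

def f_from_theta(theta):
    """Link function of Theorem 2: f_theta = (1 + exp(theta)) / 4 > 1/4."""
    return (1.0 + math.exp(theta)) / 4.0

def theta_from_f(f):
    """Inverse link, defined only for diffusivities f > 1/4."""
    if f <= 0.25:
        raise ValueError("the link only reaches diffusivities above 1/4")
    return math.log(4.0 * f - 1.0)

# theta = 0 maps to the boundary value f = 1/2.
boundary_value = f_from_theta(0.0)
```

One consequence of this parameterisation is that a function $\theta$ vanishing near $\partial\mathcal{O}$ automatically yields a diffusivity $f_\theta$ equal to $1/2$ there.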

See Theorems 9 and 10 for details. Next to the stability estimates underlying Theorem 1, a main ingredient of our proofs is an estimate (Theorem 11) on the ‘information’ (Kullback-Leibler) distance of the underlying statistical experiment in terms of a negative Sobolev norm on $\mathcal{F}$. This result is of independent interest and also sharp (in view of Theorems 3, 4).

Our proofs provide a rate of convergence in the last limits, and the rate obtained for $P_{t,f}$ cannot be improved (as we show) at the ‘observed time’ $t = D$, corresponding to ‘prediction risk’. For the parameters $f$ and $P_{t,f}$, $t < D$, our inversion rates are potentially slow (i.e., only inverse logarithmic in $N$). The question of optimal recovery in these non-linear inverse problems is delicate as they (implicitly or explicitly) involve solving a ‘backward heat equation’ from knowledge of $P_{D,f}$ alone. We shed some light on the issue and exhibit infinite-dimensional parameter spaces of $f$’s where faster than logarithmic rates (algebraic in $1/N$) can be obtained. These are based on certain spectral ‘symmetry’ hypotheses on the domain $\mathcal{O}$ and on the diffusion process. For $d = 1$ these hypotheses are always satisfied and our theory thus recovers the one-dimensional results from [29,56] as a special case (but with novel proofs based on PDE theory). In multi-dimensions $d \ge 2$ and for $f$ in a $\|\cdot\|_\infty$-neighbourhood of the constant function, we show that the required symmetries of $\mathcal{O}$ can be related to the ‘hot spots conjecture’ from spectral geometry [4,12,36,3,64,37], providing further incentives for the study of this topic. The topic of ‘fast’ rates beyond that conjecture will be investigated in future research.

In principle, the Bayesian approach can be expected to give valid inferences for any measurement regime and hence should work irrespective of whether $D \to 0$ or not. In fact, a ‘high frequency’ regime is explicitly investigated in the recent contribution [35], which shows posterior consistency if $D \to 0$ sufficiently fast compared to $N$ (but still such that the observation horizon $ND \to \infty$). We also refer to Sec. 3.3 in [28] for a discussion of the hypothetical case when the entire trajectory of $(X_t)$ is observed. More generally, the recent contributions [65,55,28,2] to non-parametric inference for multi-dimensional diffusions (Bayesian or not) contain many further references.


2. Main results. We are given discrete observations $X_0, X_D, \dots, X_{ND}$, $N \in \mathbb{N}$, of the solution $(X_t : t \ge 0)$ of the SDE (2) where $X_0 \sim \mathrm{Unif}(\mathcal{O})$, that is, the diffusion is started in its (constant) invariant distribution. If $X_0 = x$ for some fixed $x$, then our proofs work as well in view of the exponentially fast mixing (36) of the process towards the uniform law $\mu$, by just discarding the ‘burn-in phase’, that is, by letting the process evolve for a while before one starts to record measurements. We emphasise again that the time interval $D > 0$ between consecutive observations remains fixed in the $N \to \infty$ asymptotics.

The domain $\mathcal{O}$ supporting our diffusion process is a bounded convex open subset of $\mathbb{R}^d$, and to avoid technicalities we assume that the boundary of $\mathcal{O}$ is smooth, ensuring in particular the existence of all ‘reflecting’ normal vectors $\nu$ at $\partial\mathcal{O}$. Throughout $L^2(\mathcal{O})$ will denote the Hilbert space of square integrable functions for Lebesgue measure $dx$ on $\mathcal{O}$. We also assume (solely for notational convenience) that the volume of $\mathcal{O}$ is normalised to one, $\mathrm{vol}(\mathcal{O}) = 1$.

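The physical model underlying (2) describes the intensity $(u(t,x) : t > 0, x \in \mathcal{O})$ of diffusion in an insulated medium by the equation $\partial u/\partial t = -\nabla \cdot J$ for flux $J = -f\nabla u$ (e.g., p. 361f. in [70], and after (31) below). For smooth test functions $\phi$, let the elliptic operator $L_f$ be given by the action

$$L_f \phi = \nabla \cdot (f \nabla \phi) = \nabla f \cdot \nabla \phi + f \Delta \phi = \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\Big(f \frac{\partial}{\partial x_j} \phi\Big), \tag{3}$$

where $\nabla, \nabla\cdot, \Delta$ denote the gradient, divergence and Laplace operator, respectively. Then $u$ solves the heat equation for $L_f$ with Neumann boundary conditions $\partial u/\partial\nu = 0$ on $\partial\mathcal{O}$. Its fundamental solutions $p_{t,f}(\cdot,\cdot) : \mathcal{O} \times \mathcal{O} \to [0,\infty)$ describe the probabilities $\int_U p_{t,f}(x,y)\,dy$ for the position of a diffusing particle to lie in a region $U$ at time $t_0 + t$ when it was at $x \in \mathcal{O}$ at time $t_0$. More generally the transition operator $P_{t,f}$ describes a self-adjoint action on $L^2(\mathcal{O})$,

$$P_{t,f}(\phi) = \int_{\mathcal{O}} p_{t,f}(\cdot, y)\phi(y)\,dy, \qquad \phi \in L^2(\mathcal{O}). \tag{4}$$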

The process $(X_t : t \ge 0)$ from (2) is the unique Markov random process with these transition probabilities, infinitesimal generator $L_f$, and equilibrium (invariant) probability density $d\mu = 1$ on $\mathcal{O}$. The generator $L_f$ with Neumann boundary condition is characterised by an infinite sequence of (orthonormal) eigen-pairs $(e_{j,f}, -\lambda_{j,f}) \in L^2(\mathcal{O}) \times (-\infty, 0]$, $j \ge 0$, where $e_{0,f}$ is the constant eigenfunction corresponding to $\lambda_0 = 0$. By ellipticity the first eigenvalue satisfies the spectral gap estimate $\lambda_{1,f} > 0$ (see (25) below). The transition operators $P_{t,f}$ from (4) can be described in this eigen-basis via the eigenvalues $\mu_{j,f} = e^{-t\lambda_{j,f}}$, and their densities $p_{t,f}$ are uniformly bounded over $\mathcal{O} \times \mathcal{O}$. These well-known facts are reviewed in Sec. 3.
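The spectral description can be checked numerically in the simplest case, which is an assumption of this sketch and not the general setting of the paper: $d = 1$, $\mathcal{O} = (0,1)$ and constant $f$. Then $L_f = f\,d^2/dx^2$ with Neumann conditions has eigenfunctions $e_0 = 1$, $e_j(x) = \sqrt{2}\cos(j\pi x)$ and eigenvalues $\lambda_j = f\pi^2 j^2$. The Python code below builds the truncated kernel $p_{t,f}(x,y) = \sum_j e^{-t\lambda_j} e_j(x) e_j(y)$ and applies $P_{t,f}$ from (4) by a midpoint rule; the function names, the truncation level and the grid size are hypothetical.

```python
import math

def transition_density(t, f, x, y, n_terms=60):
    """Truncated eigen-expansion of p_{t,f}(x, y) for constant f on (0, 1)."""
    total = 1.0                                   # j = 0 term, lambda_0 = 0
    for j in range(1, n_terms + 1):
        lam = f * (math.pi * j) ** 2              # lambda_j = f * pi^2 * j^2
        total += (2.0 * math.exp(-t * lam)
                  * math.cos(j * math.pi * x) * math.cos(j * math.pi * y))
    return total

def apply_transition(t, f, phi, x, grid=2000):
    """Midpoint-rule approximation of (P_{t,f} phi)(x) as defined in (4)."""
    h = 1.0 / grid
    return h * sum(transition_density(t, f, x, (k + 0.5) * h) * phi((k + 0.5) * h)
                   for k in range(grid))

# First non-constant eigenfunction and the eigen-relation P_t e_1 = exp(-t lambda_1) e_1.
e1 = lambda y: math.sqrt(2.0) * math.cos(math.pi * y)
t, f, x = 0.05, 0.5, 0.3
lhs = apply_transition(t, f, e1, x)
rhs = math.exp(-t * f * math.pi ** 2) * e1(x)
mass = apply_transition(t, f, lambda y: 1.0, x)   # total probability mass
```

Comparing `lhs` with `rhs` tests the eigenvalue relation $\mu_{1,f} = e^{-t\lambda_{1,f}}$, while `mass` should be close to one because $p_{t,f}(x,\cdot)$ is a probability density.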

Some more notation: $C(\bar{\mathcal{O}})$ denotes the space of uniformly continuous functions on $\mathcal{O}$. The Sobolev and Hölder spaces $H^\alpha(\mathcal{O})$, $C^\alpha(\mathcal{O})$ of maps defined on $\mathcal{O}$ are defined as all functions that have partial derivatives up to order $\alpha \in \mathbb{N}$ defining elements of $L^2(\mathcal{O})$, $C(\bar{\mathcal{O}})$, respectively, and we set $C^\infty(\mathcal{O}) = \cap_{\alpha > 0} C^\alpha(\mathcal{O})$, $C^0(\mathcal{O}) = C(\bar{\mathcal{O}})$ by convention. Attaching the subscript $c$ to any of the preceding spaces denotes the linear subspaces of such functions of compact support within $\mathcal{O}$. The Sobolev sub-spaces $H^k_0$ of $H^k$ are the completions of $C^\infty_c(\mathcal{O})$ for the $H^k$-norms. The symbols $\|\cdot\|_{H \to H}$, $\|\cdot\|_{HS}$ denote the operator and Hilbert-Schmidt (HS) norm of a linear operator on a Banach space $H$, respectively. We denote by $\|\cdot\|_\infty$ the supremum norm and by $\|\cdot\|_B$ the norm of a normed space $B$, with dual space $B^*$. Throughout, $\lesssim, \gtrsim, \simeq$ denote inequalities (in the last case two-sided) up to fixed multiplicative constants, while $Z \sim \mu$ means that a random variable $Z$ has law $\mu$.
