Consistent inference for diffusions from low frequency measurements

Submitted to the Annals of Statistics
CONSISTENT INFERENCE FOR DIFFUSIONS FROM
LOW FREQUENCY MEASUREMENTS
BY RICHARD NICKL$^1$
$^1$Department of Pure Mathematics and Mathematical Statistics, University of Cambridge*; nickl@maths.cam.ac.uk
Let $(X_t)$ be a reflected diffusion process in a bounded convex domain in
$\mathbb{R}^d$, solving the stochastic differential equation
$dX_t = \nabla f(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t, \quad t \ge 0,$
with $W_t$ a $d$-dimensional Brownian motion. The data $X_0, X_D, \dots, X_{ND}$
consist of discrete measurements and the time interval $D$ between consecutive
observations is fixed so that one cannot ‘zoom’ into the observed path of
the process. The goal is to infer the diffusivity $f$ and the associated transition
operator $P_{t,f}$. We prove injectivity theorems and stability inequalities for
the maps $f \mapsto P_{t,f} \mapsto P_{D,f}$, $t < D$. Using these estimates we establish the
statistical consistency of a class of Bayesian algorithms based on Gaussian
process priors for the infinite-dimensional parameter $f$, and show optimality
of some of the convergence rates obtained. We discuss an underlying relationship
between the degree of ill-posedness of this inverse problem and the
‘hot spots’ conjecture from spectral geometry.
CONTENTS
1 Introduction
2 Main results
2.1 Optimal recovery of the transition operator $P_{D,f}$
2.2 Injectivity of $f \mapsto P_{t,f} \mapsto P_{D,f}$, $t < D$
2.3 Bayesian inference in the diffusion model
2.4 Posterior consistency theorems
3 Proofs
3.1 Analytical background: reflected diffusions and their generators
3.2 Heat equation, transition operator, and a perturbation identity
3.3 Information distances and small ball probabilities
3.4 Proofs of stability estimates
3.5 Minimax estimation of the transition operator $P_{D,f}$
3.6 Bayesian contraction results
3.7 Neumann eigenfunctions on cylindrical domains
3.8 Proofs of auxiliary results
References
1. Introduction. Diffusion describes a random process for the evolution over time of
phenomena such as heat flow, electric conductance, chemical reactions, or molecular
dynamics, to name just a few examples. The density of a diffusing substance in an insulated
medium, say a bounded convex subset $\mathcal{O}$ of $\mathbb{R}^d$, $d \ge 1$, is described by the solutions $u$ to the
parabolic partial differential equation (PDE) known as the heat equation, $\partial u/\partial t = L_{f,U} u$,
with a divergence form elliptic second order differential operator
$L_{f,U} = \frac{1}{m} \nabla \cdot (m f \nabla), \quad m \equiv m_U \propto e^{U},$
*I would like to thank James Norris and Gabriel Paternain for helpful discussions, three anonymous referees
and the associate editor for their critical remarks, and Matteo Giordano for generating Figs. 1-2.
arXiv:2210.13008v3 [math.ST] 27 Jan 2024
FIG 1. Left: a reflected diffusion path $(X_t : 0 \le t \le T)$ initialised at $X_0$ and run until time $T = 5$.
Right: $N = 500$ discrete observations $(X_{iD})_{i=0}^N$ at sampling frequency $D = 0.05$ ($T = 25$).
The diffusivity $f$ is given in Fig. 2.
and equipped with Neumann boundary conditions. Here $f : \mathcal{O} \to [f_{\min}, \infty)$, $f_{\min} > 0$, is a
positive scalar ‘diffusivity’ function and $U : \mathcal{O} \to \mathbb{R}$ is a ‘force’ potential inducing a Gibbs
measure $\mu = \mu_U$ with (Lebesgue-) probability density $m_U$. If $W_t$ is a $d$-dimensional Brownian
motion then the corresponding ‘microscopic’ statistical model for a diffusing particle is
provided by solutions $(X_t)$ to the stochastic differential equation (SDE)
(1)  $dX_t = \nabla f(X_t)\,dt + f(X_t)\nabla U(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t + \nu(X_t)\,dL_t, \quad t \ge 0,$
started at $X_0 = x \in \mathcal{O}$. The process is reflected when hitting the boundary $\partial\mathcal{O}$ of its state
space: $L_t$ is a ‘local time’ process acting only when $X_t \in \partial\mathcal{O}$ and $\nu(x)$ is the (inward)
pointing normal vector at $x \in \partial\mathcal{O}$. When $\nabla f, f, \nabla U$ are Lipschitz maps on $\mathcal{O}$, a continuous
time Markov process $(X_t : t \ge 0)$ giving a unique pathwise solution to (1) exists [69].
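The link between the operator $L_{f,U}$ and the drift terms in (1) can be checked by expanding the divergence. The short derivation below is a sketch under the assumption that the Gibbs density is normalised as $m_U \propto e^{U}$ (the sign convention that makes the drift $f\nabla U$ in (1) appear), and that, away from the boundary, an SDE with drift $b$ and dispersion $\sqrt{2f}$ has generator $b \cdot \nabla\varphi + f\Delta\varphi$.

```latex
\begin{align*}
L_{f,U}\varphi
  &= \frac{1}{m}\,\nabla\cdot\big(m f \nabla\varphi\big)
   = f\,\Delta\varphi + \nabla f\cdot\nabla\varphi
     + f\,\frac{\nabla m}{m}\cdot\nabla\varphi \\
  &= f\,\Delta\varphi + \big(\nabla f + f\,\nabla U\big)\cdot\nabla\varphi ,
  \qquad \text{since } \nabla \log m = \nabla U \text{ when } m \propto e^{U}.
\end{align*}
```

Reading off $b = \nabla f + f\nabla U$ and the second order coefficient $f$ recovers the drift and the $\sqrt{2f}$ noise term of (1); the reflection term $\nu(X_t)\,dL_t$ corresponds to the Neumann boundary condition.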
Real world observations of diffusion are necessarily discrete and often subject to a lower
bound on the time that elapses between consecutive measurements. We denote this ‘observation
distance’ by $D > 0$ and assume for simplicity that it is the same at each measurement.
The data are $X_0, X_D, \dots, X_{ND}$ for some $N \in \mathbb{N}$, that is, we are tracking the trajectory of a
given particle along discrete points in time, see Fig. 1. In practice one may be observing
several independent particles, which essentially corresponds to (linearly) augmenting the sample
size $N$; we consider the one-particle model without loss of generality. We investigate the
possibility of inferring $f$, $U$ and the transition operator $P_{t,f,U}$ of the Markov process $(X_t)$ both
at $t = D$ and at ‘unobserved’ times $t > 0$ by a statistical algorithm, that is, by a computable
function of $(X_{iD} : i = 1, \dots, N)$. We are interested in the scenario where $D > 0$ is fixed (but
known) as sample size $N \to \infty$. This is often the most appropriate observational model: for
instance the speed at which particles or molecules traverse the medium $\mathcal{O}$ may be much
faster than the frequency at which images can be taken. Following [29] we refer to this as the
‘low measurement frequency’ scenario. See [33,34] or also Ch. 4 in [49] for such situations
in the biological sciences, or [48,62,44] in the context of data assimilation problems.
The invariant ‘equilibrium’ distribution of the Markov process (1) is well known ([5],
Sec. 1.11.3) to equal $\mu_U$ and hence identifies the potential $U$ via its probability density $m_U$.
The infinite-dimensional parameter $\mu_U$ (and thus $U$) can then be estimated from a discrete
sample by standard linear density estimators $\hat\mu$ that smooth the empirical measure of the
$X_{iD}$'s near any point $x \in \mathcal{O}$ (cf. [29] or also, with continuous data, [19,66,28]). Using
exponentially fast mixing of ergodic averages of the Markov process towards their
$\mu_U$-expectations (e.g., via [60] combined with Thm 4.9.3 and Sec. 1.11.3 in [5], or also with
[15]) one can then obtain excellent statistical guarantees for $\hat\mu - \mu_U$ in relevant norms $\|\cdot\|$
(e.g., as after (30) in [56]). But the invariant measure $\mu_U$ contains no information about the
diffusivity $f$ in eq. (1), and in a ‘low frequency’ measurement scheme, standard statistics of
the data such as the quadratic variation (‘mean square displacements’) of the process provide
no valid inference on $f$ either (not even along the observed path). We conclude not only that
recovering $f$ is a much harder problem than estimation of $U$, but also that the problems
essentially decouple and can be treated separately. Therefore, to simplify the exposition of our
main contributions we henceforth assume that $U = 1$ in (1) and consider the model
(2)  $dX_t = \nabla f(X_t)\,dt + \sqrt{2f(X_t)}\,dW_t + \nu(X_t)\,dL_t, \quad t \ge 0,$
started uniformly at random $X_0 \sim \mathrm{Unif}(\mathcal{O})$. We denote by $P_f$ the resulting probability law
of $(X_t : t \ge 0)$ (in path space). Our statistical results could be generalised to the case of
unknown $U$ in (1) as we discuss in Remark 4 below.
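To make the observation scheme concrete, the following is a minimal simulation sketch of model (2) in dimension $d = 1$ on $\mathcal{O} = (0,1)$. It is an illustration only and not taken from the paper: the Euler step with mirror reflection, the step size `dt`, and the example diffusivity `f` (with derivative `grad_f`) are all assumptions made here, and the names `reflect` and `simulate_low_frequency` are hypothetical.

```python
import math
import numpy as np

def reflect(x):
    """Fold a real number into [0, 1] by mirror reflection at both endpoints."""
    y = x % 2.0
    return 2.0 - y if y > 1.0 else y

def simulate_low_frequency(f, grad_f, D, N, dt=1e-3, seed=0):
    """Euler scheme for dX = f'(X) dt + sqrt(2 f(X)) dW on [0, 1] with reflection.

    Only the values X_0, X_D, ..., X_{ND} are returned; the fine path that is
    simulated between observation times is discarded.
    """
    rng = np.random.default_rng(seed)
    steps = max(1, int(round(D / dt)))   # fine steps per observation gap
    h = D / steps                        # actual fine step size
    x = rng.uniform()                    # X_0 ~ Unif(O)
    obs = [x]
    for _ in range(N):
        for z in rng.standard_normal(steps):
            x = reflect(x + grad_f(x) * h + math.sqrt(2.0 * f(x) * h) * z)
        obs.append(x)
    return np.array(obs)

# Hypothetical diffusivity, bounded below by 1/2, chosen only for illustration.
f = lambda x: 0.5 + 0.2 * math.sin(math.pi * x) ** 2
grad_f = lambda x: 0.2 * math.pi * math.sin(2.0 * math.pi * x)

obs = simulate_low_frequency(f, grad_f, D=0.05, N=200)
print(obs[:5])
```

The statistician only sees `obs`; decreasing `dt` refines the simulation but does not change the observation distance `D`, which is the point of the low frequency regime.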
The problem of determining diffusivity parameters $f$ from data has a long history in
mathematical inverse problems; we mention here [13,41,68,52,73,1] in the context of the
Calderón problem as well as [63,22,67,38,10,27,54] in the context of Darcy’s flow
problem, and the many references therein. All these settings consider a simplified observational
model where one is given a ‘steady state’ measurement of diffusion, returning the (typically
‘noisy’) solution of a time-independent elliptic PDE. The potential inferential barrier arising
with low frequency measurements disappears in the reduction from a time evolution equation
to the elliptic PDE and hence does not inform the statistical setting investigated here.
As the invariant measure $\mu$ is identical for all $f$, the information contained in low
frequency discrete data from (2) is encoded in the transition operator $P_{D,f}$ of the underlying
Markov process $(X_t)$. Little is known about how to conduct statistically valid inference in
this setting, with notable exceptions being the one-dimensional case $d = 1$ studied in [29,56].
We also mention the consistency results [75,32] as well as [46] for Markovian transition
operators, but these do not concern the conductivities $f$ themselves. A first question is whether
the task of identifying $f$ from $P_{D,f}$ for fixed observation distance $D > 0$ is even well-posed,
that is, whether the (non-linear) map $f \mapsto P_{D,f}$ is injective. The answer to this question is
positive at least if $f$ is prescribed near $\partial\mathcal{O}$. Denote by $L^2(\mathcal{O})$ the Hilbert space of square
Lebesgue integrable functions on $\mathcal{O}$.
THEOREM 1. Suppose positive diffusion coefficients $f_1, f_2 \in C^2(\mathcal{O})$ are bounded away
from zero on $\mathcal{O}$ and such that $f_1 = f_2$ near $\partial\mathcal{O}$. Then if $P_{D,f_1} = P_{D,f_2}$ coincide as bounded
linear operators on $L^2(\mathcal{O})$ for some $D > 0$, we must have $f_1 = f_2$ on $\mathcal{O}$.
See Theorem 5 for details. That $f$ should be known near $\partial\mathcal{O}$ can be explained by the fact
that the reflection (which is independent of $f$) dominates the local dynamics near $\partial\mathcal{O}$.
Statistical algorithms are often motivated by ‘population version’ identification equations
for unknown parameters, as in the one-dimensional case $d = 1$ considered in [29,56], who
use ordinary differential equation (ODE) techniques to derive identities for $f$ in terms of the
first eigenfunction of the transition operator $P_{D,f}$. This approach appears of limited use in the
present multi-dimensional context $d > 1$. Instead we shall maintain $\{P_{D,f} : f \in \mathcal{F}\}$ as our
statistical model for natural choices of parameter spaces $\mathcal{F} \subset L^2(\mathcal{O})$ of sufficiently smooth,
positive, functions. This makes available the algorithmic toolbox of Bayesian statistics in
infinite-dimensional parameter spaces which does not require any identification equations or
inversion formulae. Instead one employs a Gaussian process prior $\Pi$ for the function-valued
parameter $f$, see [76,67,24,54], and updates according to Bayes’ rule: if $p_{D,f}$ are the
transition densities of $P_{D,f}$ (fundamental solutions), the posterior distribution is
$$\Pi(B \mid X_0, X_D, \dots, X_{ND}) = \frac{\int_B \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\, d\Pi(f)}{\int_{\mathcal{F}} \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\, d\Pi(f)}, \quad B \text{ measurable}.$$
As the ‘forward map’ $f \mapsto p_{D,f}$ can be evaluated by numerical PDE techniques for parabolic
equations, one can leverage ideas from [16] (see also [30,18,8]) to propose computationally
feasible MCMC methodology that draws approximate samples from $\Pi(\cdot \mid X_0, X_D, \dots, X_{ND})$,
and the resulting ergodic averages approximate the posterior mean vector, which in turn gives
an estimated output for $f$. See Section 2.3, specifically Remark 3, for details.
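As an illustration of how the forward map $f \mapsto p_{D,f}$ and the likelihood in Bayes’ formula might be evaluated numerically, here is a minimal sketch in dimension $d = 1$ on $\mathcal{O} = (0,1)$. It is not the methodology of [16] or of the paper: the finite-volume discretisation, the grid size `n`, the binning of observations into grid cells, and the function names are all assumptions made for this example.

```python
import numpy as np

def generator_matrix(f, n):
    """Finite-volume approximation of L_f = d/dx (f d/dx) on [0, 1].

    Zero-flux (Neumann) ends: the matrix is symmetric with rows summing to zero.
    `f` must accept NumPy arrays.
    """
    h = 1.0 / n
    c = f(np.arange(1, n) * h) / h**2    # f at the n - 1 interior cell interfaces
    L = np.zeros((n, n))
    i = np.arange(n - 1)
    L[i, i + 1] = c
    L[i + 1, i] = c
    L[i, i] -= c
    L[i + 1, i + 1] -= c
    return L

def transition_density(f, D, n):
    """Approximate p_{D,f}(x_i, x_j) on cell centres via exp(D L) / h."""
    w, V = np.linalg.eigh(generator_matrix(f, n))
    P = (V * np.exp(D * w)) @ V.T        # matrix exponential of the symmetric L
    return P * n                         # divide by the cell width h = 1 / n

def log_likelihood(f, obs, D, n=100):
    """Sum of log p_{D,f}(X_{(i-1)D}, X_{iD}) with observations binned to cells."""
    p = transition_density(f, D, n)
    cells = np.minimum((np.asarray(obs) * n).astype(int), n - 1)
    return float(np.sum(np.log(p[cells[:-1], cells[1:]])))

# Hypothetical diffusivity and a few made-up observations, for illustration only.
f = lambda x: 0.5 + 0.2 * np.sin(np.pi * x) ** 2
print(log_likelihood(f, [0.1, 0.5, 0.9, 0.4], D=0.05))
```

An MCMC sampler for the posterior would call `log_likelihood` once per proposed $f$; the cost is dominated by the eigendecomposition of the discretised generator.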
Recent progress in Bayesian theory for non-linear inverse problems [53,50,59,57], [54]
has clarified that such Bayesian methods can solve non-linear problems without ‘inversion
formulae’ as long as appropriate stability estimates for the forward map, here $f \mapsto P_{D,f}$,
are available. Following this strategy we prove here a first statistical consistency result in
multi-dimensional diffusion models with such ‘low frequency’ measurements.
THEOREM 2. Let $D > 0$ and consider data $X_0, X_D, \dots, X_{ND}$ generated from the
diffusion (2) in a bounded smooth convex domain $\mathcal{O}$. Assume the ground truth $f_0 > 1/4$ is
sufficiently regular in a Sobolev sense and equals $1/2$ near $\partial\mathcal{O}$. Assign an appropriate Gaussian
process prior $\Pi$ to $(\theta(x) : x \in \mathcal{O})$, form $f_\theta = (1 + e^\theta)/4$, and consider the random field
$$(\bar f_N = f_{\bar\theta_N}(x) : x \in \mathcal{O}), \quad \bar\theta_N = E^\Pi[\theta \mid X_0, X_D, \dots, X_{ND}],$$
arising from the posterior mean function. Then the posterior inference for the transition
operators $P_{t,f_0}$, $t > 0$, as well as for $f_0$ is consistent, that is, as $N \to \infty$ and in $P_{f_0}$-probability,
$$\|P_{t,\bar f_N} - P_{t,f_0}\|_{L^2 \to L^2} \to 0, \quad \text{and} \quad \|\bar f_N - f_0\|_{L^2} \to 0,$$
where $\|\cdot\|_{L^2 \to L^2}$ denotes the operator norm on $L^2 = L^2(\mathcal{O})$.
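The parametrisation $f_\theta = (1 + e^\theta)/4$ guarantees $f_\theta > 1/4$ for any real-valued $\theta$, and $f_\theta = 1/2$ wherever $\theta = 0$. The sketch below illustrates this in $d = 1$ with a random $\theta$ that vanishes near the boundary of $(0,1)$. The specific construction of the prior draw (a truncated cosine series with polynomially decaying Gaussian coefficients, multiplied by a smooth cutoff) is an assumption made here for illustration; the paper's actual prior is specified in its Section 2.3, and the names `link`, `cutoff` and `draw_theta` are hypothetical.

```python
import numpy as np

def link(theta):
    """Link function f_theta = (1 + exp(theta)) / 4, always larger than 1/4."""
    return (1.0 + np.exp(theta)) / 4.0

def cutoff(x, margin=0.1):
    """Smooth function on [0, 1] that is exactly 0 within `margin` of 0 and 1."""
    d = np.minimum(x, 1.0 - x) - margin
    out = np.zeros_like(x)
    inside = d > 0
    out[inside] = np.exp(-margin / d[inside])
    return out

def draw_theta(x, J=30, alpha=2.0, seed=0):
    """One draw of a truncated Gaussian series, forced to vanish near the boundary.

    Coefficients are N(0, j^(-2 alpha - 1)); J and alpha are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    j = np.arange(1, J + 1)
    coeffs = rng.standard_normal(J) * j ** (-alpha - 0.5)
    basis = np.cos(np.pi * np.outer(x, j))   # cosine basis evaluated on the grid
    return cutoff(x) * (basis @ coeffs)

x = np.linspace(0.0, 1.0, 201)
f_draw = link(draw_theta(x))
print(f_draw.min(), f_draw[0], f_draw[-1])
```

Because the cutoff is exactly zero near the endpoints, every such draw satisfies the boundary constraint $f = 1/2$ near $\partial\mathcal{O}$ assumed for the ground truth in Theorem 2.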
See Theorems 9 and 10 for details. Next to the stability estimates underlying Theorem 1,
a main ingredient of our proofs is an estimate (Theorem 11) on the ‘information’
(Kullback-Leibler) distance of the underlying statistical experiment in terms of a negative Sobolev norm
on $\mathcal{F}$. This result is of independent interest and also sharp (in view of Theorems 3, 4).
Our proofs provide a rate of convergence in the last limits, and the rate obtained for $P_{t,f}$
cannot be improved (as we show) at the ‘observed time’ $t = D$, corresponding to
‘prediction risk’. For the parameters $f$ and $P_{t,f}$, $t < D$, our inversion rates are potentially slow (i.e.,
only inverse logarithmic in $N$). The question of optimal recovery in these non-linear inverse
problems is delicate as they (implicitly or explicitly) involve solving a ‘backward heat
equation’ from knowledge of $P_{D,f}$ alone. We shed some light on the issue and exhibit
infinite-dimensional parameter spaces of $f$'s where faster than logarithmic rates (algebraic in $1/N$)
can be obtained. These are based on certain spectral ‘symmetry’ hypotheses on the domain
$\mathcal{O}$ and on the diffusion process. For $d = 1$ these hypotheses are always satisfied and our
theory thus recovers the one-dimensional results from [29,56] as a special case (but with novel
proofs based on PDE theory). In multi-dimensions $d \ge 2$ and for $f$ in a $\|\cdot\|_\infty$-neighbourhood
of the constant function, we show that the required symmetries of $\mathcal{O}$ can be related to the ‘hot
spots conjecture’ from spectral geometry [4,12,36,3,64,37], providing further incentives
for the study of this topic. The topic of ‘fast’ rates beyond that conjecture will be investigated
in future research.
In principle, the Bayesian approach can be expected to give valid inferences for any
measurement regime and hence should work irrespective of whether $D \to 0$ or not. In fact, a
‘high frequency’ regime is explicitly investigated in the recent contribution [35], who show
posterior consistency if $D \to 0$ sufficiently fast compared to $N$ (but still such that the
observation horizon $ND \to \infty$). We also refer to Sec. 3.3 in [28] for a discussion of the hypothetical
case when the entire trajectory of $(X_t)$ is observed. More generally, the recent contributions
[65,55,28,2] to non-parametric inference for multi-dimensional diffusions (Bayesian or not)
contain many further references.
2. Main results. We are given discrete observations $X_0, X_D, \dots, X_{ND}$, $N \in \mathbb{N}$, of the
solution $(X_t : t \ge 0)$ of the SDE (2) where $X_0 \sim \mathrm{Unif}(\mathcal{O})$, that is, the diffusion is started
in its (constant) invariant distribution. If $X_0 = x$ for some fixed $x$, then our proofs work as
well in view of the exponentially fast mixing (36) of the process towards the uniform law $\mu$,
by just discarding the ‘burn-in phase’, that is, by letting the process evolve for a while before
one starts to record measurements. We emphasise again that the time interval $D > 0$ between
consecutive observations remains fixed in the $N \to \infty$ asymptotics.
The domain $\mathcal{O}$ supporting our diffusion process is a bounded convex open subset of $\mathbb{R}^d$,
and to avoid technicalities we assume that the boundary of $\mathcal{O}$ is smooth, ensuring in particular
the existence of all ‘reflecting’ normal vectors $\nu$ at $\partial\mathcal{O}$. Throughout $L^2(\mathcal{O})$ will denote the
Hilbert space of square integrable functions for Lebesgue measure $dx$ on $\mathcal{O}$. We also assume
(solely for notational convenience) that the volume of $\mathcal{O}$ is normalised to one, $\mathrm{vol}(\mathcal{O}) = 1$.
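The physical model underlying (2) describes the intensity $(u(t, x) : t > 0, x \in \mathcal{O})$ of
diffusion in an insulated medium by the equation $\partial u/\partial t = -\nabla \cdot J$ for flux $J = -f\nabla u$ (e.g.,
p. 361f. in [70], and after (31) below). For smooth test functions $\varphi$, let the elliptic operator $L_f$
be given by the action
(3)  $L_f\varphi = \nabla \cdot (f\nabla\varphi) = \nabla f \cdot \nabla\varphi + f\Delta\varphi = \sum_{j=1}^d \frac{\partial}{\partial x_j}\Big(f \frac{\partial}{\partial x_j}\varphi\Big),$
where $\nabla, \nabla\cdot, \Delta$ denote the gradient, divergence and Laplace operator, respectively. Then $u$
solves the heat equation for $L_f$ with Neumann boundary conditions $\partial u/\partial\nu = 0$ on $\partial\mathcal{O}$. Its
fundamental solutions $p_{t,f}(\cdot,\cdot) : \mathcal{O} \times \mathcal{O} \to [0,\infty)$ describe the probabilities $\int_U p_{t,f}(x, y)\,dy$
for the position of a diffusing particle to lie in a region $U$ at time $t_0 + t$ when it was at $x \in \mathcal{O}$ at
time $t_0$. More generally the transition operator $P_{t,f}$ describes a self-adjoint action on $L^2(\mathcal{O})$,
(4)  $P_{t,f}(\varphi) = \int_{\mathcal{O}} p_{t,f}(\cdot, y)\varphi(y)\,dy, \quad \varphi \in L^2(\mathcal{O}).$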
The process $(X_t : t \ge 0)$ from (2) is the unique Markov random process with these transition
probabilities, infinitesimal generator $L_f$, and equilibrium (invariant) probability density equal to
$1$ on $\mathcal{O}$. The generator $L_f$ with Neumann boundary condition is characterised by an infinite
sequence of (orthonormal) eigen-pairs $(e_{j,f}, -\lambda_{j,f}) \in L^2(\mathcal{O}) \times (-\infty, 0]$, $j \ge 0$, where $e_{0,f}$ is
the constant eigenfunction corresponding to $\lambda_0 = 0$. By ellipticity the first eigenvalue satisfies
the spectral gap estimate $\lambda_{1,f} > 0$ (see (25) below). The transition operators $P_{t,f}$ from (4) can
be described in this eigen-basis via the eigenvalues $\mu_{j,f} = e^{-t\lambda_{j,f}}$, and their densities $p_{t,f}$ are
uniformly bounded over $\mathcal{O} \times \mathcal{O}$. These well-known facts are reviewed in Sec. 3.
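The spectral description can be checked numerically in the simplest case. The sketch below takes $d = 1$, $\mathcal{O} = (0,1)$ and a constant diffusivity $f \equiv c$, for which the Neumann eigenvalues of $L_f$ are known to be $-c\pi^2 j^2$; the finite-difference discretisation, the grid size `n` and the numerical values of `c` and `D` are assumptions made for this illustration only.

```python
import numpy as np

n, c, D = 200, 0.5, 0.05
h = 1.0 / n

# Discretised generator L_f = c * d^2/dx^2 with zero-flux (Neumann) ends.
off = np.full(n - 1, c / h**2)
diag = np.zeros(n)
diag[:-1] -= off
diag[1:] -= off
L = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

w, V = np.linalg.eigh(L)      # eigenvalues of L in ascending order, all <= 0
lam = -w[::-1]                # lam[0] ~ 0 < lam[1] <= lam[2] <= ...
E = V[:, ::-1]                # matching orthonormal eigenvectors (columns)

def P(t):
    """Discretised transition operator: sum_j exp(-t lam_j) e_j e_j^T."""
    return (E * np.exp(-t * lam)) @ E.T

mu = np.exp(-D * lam)         # eigenvalues of P(D)
print("lambda_0, lambda_1:", lam[0], lam[1], "  c*pi^2 =", c * np.pi**2)
print("first eigenvalues of P_D:", mu[:6])
print("semigroup check:", np.allclose(P(0.02) @ P(0.03), P(D)))
```

The rapid decay of the eigenvalues $e^{-D\lambda_j}$ of $P_D$ printed here is one way to see why recovering $P_{t,f}$ for $t < D$, which amounts to raising these eigenvalues to the power $t/D < 1$, behaves like a backward heat equation.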
Some more notation: $C(\bar{\mathcal{O}})$ denotes the space of uniformly continuous functions on $\mathcal{O}$.
The Sobolev and Hölder spaces $H^\alpha(\mathcal{O}), C^\alpha(\mathcal{O})$ of maps defined on $\mathcal{O}$ are defined as all
functions that have partial derivatives up to order $\alpha \in \mathbb{N}$ defining elements of $L^2(\mathcal{O}), C(\bar{\mathcal{O}})$,
respectively, and we set $C^\infty(\mathcal{O}) = \cap_{\alpha>0} C^\alpha(\mathcal{O})$, $C^0(\mathcal{O}) = C(\bar{\mathcal{O}})$ by convention. Attaching
the subscript $c$ to any of the preceding spaces denotes the linear subspaces of such functions
of compact support within $\mathcal{O}$. The Sobolev sub-spaces $H^k_0$ of $H^k$ are the completions of
$C^\infty_c(\mathcal{O})$ for the $H^k$-norms. The symbols $\|\cdot\|_{H \to H}$, $\|\cdot\|_{HS}$ denote the operator and
Hilbert-Schmidt (HS) norm of a linear operator on a Banach space $H$, respectively. We denote by
$\|\cdot\|_\infty$ the supremum norm and by $\|\cdot\|_B$ the norm of a normed space $B$, with dual space $B^*$.
Throughout, $\lesssim, \gtrsim, \simeq$ denote inequalities (in the last case two-sided) up to fixed
multiplicative constants, while $Z \sim \mu$ means that a random variable $Z$ has law $\mu$.