
(e.g., as after (30) in [56]). But the invariant measure $\mu_U$ contains no information about the
diffusivity $f$ in eq. (1), and in a ‘low frequency’ measurement scheme, standard statistics of
the data such as the quadratic variation (‘mean square displacements’) of the process provide
no valid inference on $f$ either (not even along the observed path). We conclude not only that
recovering $f$ is a much harder problem than estimation of $U$, but also that the problems
essentially decouple and can be treated separately. Therefore, to simplify the exposition of our
main contributions we henceforth assume that $U = 1$ in (1) and consider the model
$$dX_t = \nabla f(X_t)\,dt + \sqrt{2 f(X_t)}\,dW_t + \nu(X_t)\,dL_t, \qquad t \ge 0, \tag{2}$$
started uniformly at random, $X_0 \sim \mathrm{Unif}(O)$. We denote by $P_f$ the resulting probability law
of $(X_t : t \ge 0)$ (in path space). Our statistical results could be generalised to the case of
unknown $U$ in (1) as we discuss in Remark 4 below.
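To make the low frequency observation scheme concrete, the following is a minimal simulation sketch of model (2), assuming $d = 1$ and $O = (0,1)$. The diffusivity `f`, the Euler step `dt`, and the treatment of the boundary term $\nu(X_t)\,dL_t$ by folding the path back into $[0,1]$ are illustrative assumptions and are not taken from the text.

```python
import math
import numpy as np

def f(x):
    # Hypothetical diffusivity: smooth and bounded away from zero on [0, 1].
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x) ** 2

def grad_f(x):
    # Derivative of f above; this is the drift term of model (2) when d = 1.
    return math.pi * math.sin(4.0 * math.pi * x)

def reflect(x):
    # Fold a point of the real line back into [0, 1]; a simple stand-in for
    # the boundary local time term nu(X_t) dL_t (an assumption of this sketch).
    x = x % 2.0
    return 2.0 - x if x > 1.0 else x

def simulate(D, N, dt=1e-3, seed=0):
    """Return X_0, X_D, ..., X_{ND} from an Euler scheme with step about dt."""
    rng = np.random.default_rng(seed)
    steps = max(1, int(round(D / dt)))   # fine Euler steps per observation gap
    h = D / steps
    x = rng.uniform(0.0, 1.0)            # X_0 ~ Unif(O)
    path = [x]
    for _ in range(N):
        for z in rng.standard_normal(steps):
            x = reflect(x + grad_f(x) * h + math.sqrt(2.0 * f(x) * h) * float(z))
        path.append(x)                   # only the values at multiples of D are kept
    return np.array(path)

X = simulate(D=0.5, N=100)
```

Only the array `X` would be available to the statistician; the fine Euler path between observation times is discarded, which is why quadratic variation type statistics cannot be formed from such data.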
The problem of determining diffusivity parameters $f$ from data has a long history in
mathematical inverse problems; we mention here [13,41,68,52,73,1] in the context of the
Calderón problem as well as [63,22,67,38,10,27,54] in the context of Darcy’s flow
problem, and the many references therein. All these settings consider a simplified observational
model where one is given a ‘steady state’ measurement of diffusion, returning the (typically
‘noisy’) solution of a time-independent elliptic PDE. The potential inferential barrier arising
with low frequency measurements disappears in the reduction from a time evolution equation
to the elliptic PDE and hence does not inform the statistical setting investigated here.
As the invariant measure $\mu$ is identical for all $f$, the information contained in low
frequency discrete data from (2) is encoded in the transition operator $P_{D,f}$ of the underlying
Markov process $(X_t)$. Little is known about how to conduct statistically valid inference in
this setting, with notable exceptions being the one-dimensional case $d = 1$ studied in [29,56].
We also mention the consistency results [75,32] as well as [46] for Markovian transition
operators, but these do not concern the conductivities $f$ themselves. A first question is whether
the task of identifying $f$ from $P_{D,f}$ for fixed observation distance $D > 0$ is even well-posed,
that is, whether the (non-linear) map $f \mapsto P_{D,f}$ is injective. The answer to this question is
positive at least if $f$ is prescribed near $\partial O$. Denote by $L^2(O)$ the Hilbert space of square
Lebesgue integrable functions on $O$.
THEOREM 1. Suppose positive diffusion coefficients $f_1, f_2 \in C^2(O)$ are bounded away
from zero on $O$ and such that $f_1 = f_2$ near $\partial O$. Then if $P_{D,f_1} = P_{D,f_2}$ coincide as bounded
linear operators on $L^2(O)$ for some $D > 0$, we must have $f_1 = f_2$ on $O$.
See Theorem 5 for details. That $f$ should be known near $\partial O$ can be explained by the fact
that the reflection (which is independent of $f$) dominates the local dynamics near $\partial O$.
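As a numerical illustration of the two facts just discussed (the uniform invariant measure does not depend on $f$, whereas the transition operator does), one may discretise $P_{D,f} = e^{D L_f}$ with $L_f u = \nabla \cdot (f \nabla u)$ and Neumann boundary conditions. The sketch below assumes $d = 1$, $O = (0,1)$; the finite-volume grid, the grid size `n`, and the two coefficients `f1`, `f2` (which agree near the boundary) are hypothetical choices made only for this example and prove nothing about the continuum problem.

```python
import numpy as np

def transition_matrix(f, D, n=100):
    """Finite-volume approximation of exp(D * L_f) on n cells of (0, 1)."""
    h = 1.0 / n
    faces = np.arange(1, n) * h           # interior cell interfaces
    w = f(faces) / h**2                   # flux weights f(x_{i+1/2}) / h^2
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i, i + 1] = w
    A[i + 1, i] = w
    A[np.arange(n), np.arange(n)] = -A.sum(axis=1)   # zero row sums: no flux through the boundary
    lam, V = np.linalg.eigh(A)            # A is symmetric, so A = V diag(lam) V^T
    return (V * np.exp(D * lam)) @ V.T    # matrix exponential exp(D * A)

def bump(x):
    # Compactly supported perturbation, vanishing for |x - 1/2| >= 1/4.
    return np.where(np.abs(x - 0.5) < 0.25, np.cos(2 * np.pi * (x - 0.5)) ** 4, 0.0)

f1 = lambda x: np.ones_like(x)            # constant diffusivity
f2 = lambda x: 1.0 + 0.5 * bump(x)        # equals f1 near the boundary

D = 0.05
P1 = transition_matrix(f1, D)
P2 = transition_matrix(f2, D)

uniform = np.full(P1.shape[0], 1.0 / P1.shape[0])
invariant_1 = np.allclose(uniform @ P1, uniform)   # uniform law is preserved by P1
invariant_2 = np.allclose(uniform @ P2, uniform)   # and by P2
operator_gap = np.abs(P1 - P2).max()               # yet the two operators differ
```

Both matrices are symmetric with unit row sums, so the uniform distribution is invariant for each of them, while `operator_gap` is strictly positive for this pair of coefficients.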
Statistical algorithms are often motivated by ‘population version’ identification equations
for unknown parameters, as in the one-dimensional case $d = 1$ considered in [29,56], where
ordinary differential equation (ODE) techniques are used to derive identities for $f$ in terms of the
first eigenfunction of the transition operator $P_{D,f}$. This approach appears of limited use in the
present multi-dimensional context $d > 1$. Instead we shall maintain $\{P_{D,f} : f \in F\}$ as our
statistical model for natural choices of parameter spaces $F \subset L^2(O)$ of sufficiently smooth,
positive functions. This makes available the algorithmic toolbox of Bayesian statistics in
infinite-dimensional parameter spaces, which does not require any identification equations or
inversion formulae. Instead one employs a Gaussian process prior $\Pi$ for the function-valued
parameter $f$, see [76,67,24,54], and updates according to Bayes’ rule: if $p_{D,f}$ are the
transition densities of $P_{D,f}$ (fundamental solutions), the posterior distribution is
$$\Pi(B \mid X_0, X_D, \dots, X_{ND}) = \frac{\int_B \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\, d\Pi(f)}{\int_F \prod_{i=1}^N p_{D,f}(X_{(i-1)D}, X_{iD})\, d\Pi(f)}, \qquad B \text{ measurable}.$$
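The structure of this posterior can be sketched numerically in a heavily simplified setting. The example below assumes $d = 1$, $O = (0,1)$, replaces the Gaussian process prior by a uniform prior on three hypothetical candidate coefficients, approximates $p_{D,f}$ by a finite-volume discretisation of $e^{D L_f}$, and uses synthetic data drawn from the discretised chain itself. None of these choices come from the text; the sketch is meant only to show how the product of transition densities enters Bayes’ rule.

```python
import numpy as np

def bump(x):
    # Compactly supported perturbation, vanishing for |x - 1/2| >= 1/4.
    return np.where(np.abs(x - 0.5) < 0.25, np.cos(2 * np.pi * (x - 0.5)) ** 4, 0.0)

def transition_matrix(f, D, n):
    """Finite-volume approximation of exp(D * L_f), L_f u = (f u')', Neumann b.c."""
    h = 1.0 / n
    w = f(np.arange(1, n) * h) / h**2
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i, i + 1] = w
    A[i + 1, i] = w
    A[np.arange(n), np.arange(n)] = -A.sum(axis=1)
    lam, V = np.linalg.eigh(A)
    return (V * np.exp(D * lam)) @ V.T

def log_likelihood(P, cells, n):
    # p_{D,f}(x, y) is approximated by n * P[i, j] for x in cell i and y in cell j.
    dens = np.maximum(P[cells[:-1], cells[1:]], 1e-300) * n
    return np.log(dens).sum()

D, n, N = 0.05, 60, 2000
amplitudes = np.array([0.0, 0.5, 1.0])                 # hypothetical candidate family
candidates = [lambda x, a=a: 1.0 + a * bump(x) for a in amplitudes]
prior = np.full(len(candidates), 1.0 / len(candidates))
mats = [transition_matrix(f, D, n) for f in candidates]

# Synthetic low frequency data: cell indices of X_0, X_D, ..., X_{ND}, drawn
# from the discretised chain of the middle candidate (an assumption of the demo).
rng = np.random.default_rng(1)
P_true = np.clip(mats[1], 0.0, None)
P_true /= P_true.sum(axis=1, keepdims=True)
cells = np.empty(N + 1, dtype=int)
cells[0] = rng.integers(n)                             # uniform initial cell
for k in range(1, N + 1):
    cells[k] = rng.choice(n, p=P_true[cells[k - 1]])

# Bayes' rule on the finite candidate set, computed on the log scale for stability.
log_post = np.log(prior) + np.array([log_likelihood(P, cells, n) for P in mats])
log_post -= log_post.max()
posterior = np.exp(log_post)
posterior /= posterior.sum()
```

Working on the log scale avoids underflow of the product over $N$ transition densities. With a genuine Gaussian process prior the normalising integral is not a finite sum, and posterior computation would require other tools that this sketch does not attempt to represent.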