CONVERGENCE RATES FOR ANSATZ-FREE DATA-DRIVEN
INFERENCE IN PHYSICALLY CONSTRAINED PROBLEMS
S. CONTI1, F. HOFFMANN1 AND M. ORTIZ2,3
1Institut für Angewandte Mathematik, Universität Bonn, Germany
2Hausdorff Center for Mathematics, Universität Bonn, Germany
3Division of Engineering and Applied Science, California Institute of Technology,
Pasadena
Abstract. We study a Data-Driven approach to inference in physical systems in
a measure-theoretic framework. The systems under consideration are characterized
by two measures defined over the phase space: i) A physical likelihood measure
expressing the likelihood that a state of the system be admissible, in the sense of
satisfying all governing physical laws; ii) A material likelihood measure expressing
the likelihood that a local state of the material be observed in the laboratory. We
assume deterministic loading, which means that the first measure is supported on
a linear subspace. We additionally assume that the second measure is only known
approximately through a sequence of empirical (discrete) measures. We develop
a method for the quantitative analysis of convergence based on the flat metric and
obtain error bounds both for annealing and the discretization or sampling procedure,
leading to the determination of appropriate quantitative annealing rates. Finally,
we provide an example illustrating the application of the theory to transportation
networks.
1. Introduction
We consider the problem of inferring the probability of finding a physical system
in a given state z in a linear space Z, or phase space, which we assume to be
finite-dimensional. For instance, if the system under consideration is an electrical circuit, then
the state of the system consists of the array of potential differences across the elements of
the circuit and the corresponding array of electric currents; if the system is a hydraulic
network, then the state of the system consists of the array of head differences across
each pipe and the corresponding array of mass fluxes; if the system is a mechanical truss
structure, then the state of the system consists of the array of displacement differences,
or strains, across each member and the corresponding array of internal forces, or stresses;
et cetera. We note that, in all these examples, the state of the system consists of a pair
of dual variables and the dimension of phase space is even.
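To fix ideas, the paired structure of such states can be sketched as follows (an illustrative toy example with made-up numbers, not taken from the paper):

```python
import numpy as np

# Phase space Z = R^{2N} for a circuit with N = 3 elements:
# a state z pairs the potential differences across the elements
# with the corresponding electric currents.
N = 3
voltages = np.array([1.5, -0.5, 1.0])     # potential differences (V)
currents = np.array([0.3, -0.1, 0.2])     # electric currents (A)
z = np.concatenate([voltages, currents])  # state z in Z

print(z.size)  # 6 = 2N: the dimension of phase space is even
```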
Physical systems obey field equations, which place hard constraints on the possible
states attainable by the system. These constraints are material independent and can
be regarded as a restriction of the set of admissible states of the system. The view of
field equations as constraints for purposes of analysis has a long-standing tradition in
continuum mechanics and electromagnetism, and constitutes the foundation of recent
arXiv:2210.02846v1 [math.OC] 20 Sep 2022
methods of data-driven analysis [8, 3], physics-informed neural networks (PINNs) [11]
and other applications of modern data science. Classically, deterministic problems in
mathematical physics are closed by further restricting the states of the system to lie in
a subset representing the material law of the system, i. e., the locus of states attainable
by a specific material.
[Figure 1: two panels (a) and (b), axes ε (horizontal) and σ (vertical), each showing the constraint set E.]
Figure 1. Classical inference. a) Material likelihood function L_D, here
in the form of a sliding Gaussian (dark: low likelihood; light: high likelihood),
constraint set E and likelihood function L obtained by restricting
L_D to E. b) Empirical likelihood measure µ_{D,h} sampled from L_D.
In this paper, we work within a general framework [2] for systems in which the
material law and the admissibility constraints are described by positive Radon likelihood
measures µ_D ∈ M(Z) and µ_E ∈ M(Z), respectively, representing the likelihood of y ∈ Z
being a (local) material state observed in the laboratory and of z ∈ Z being admissible.
Before presenting our new contributions, the main ideas underlying the work may be
summarized as follows. The admissible states of the system may be random, e. g., due
to the application of random forcing to the system. The observed material states may be
random either because the material itself is random or because of experimental scatter,
cf. Fig. 1a. We expect the material states y ∈ Z and admissible states z ∈ Z of the
system to be distributed according to a notion of intersection measure µ_D ∩ µ_E ∈ M(Z × Z),
which can be qualitatively understood as the product measure µ_D × µ_E conditioned to
y = z. In the special case in which µ_D and µ_E are regular with respect to the Lebesgue
measure, with continuous densities L_D and L_E, the likelihood of finding the system at
state z ∈ Z is, simply, L_D(z) L_E(z), which determines the intersection µ_D ∩ µ_E. In
particular, if L_E(z) L_D(z) is integrable and non-zero, the expression
(1) E[f] = \frac{\int_Z f(z) L_D(z) L_E(z) \, d\mathcal{L}^{2N}(z)}{\int_Z L_D(z) L_E(z) \, d\mathcal{L}^{2N}(z)} = \int_Z f(z) L(z) \, d\mathcal{L}^{2N}(z)
gives the expected value of a quantity of interest f ∈ C_c(Z). Similarly, if µ_D = L_D L^{2N}
with L_D continuous and µ_E = H^N ⌊ E, corresponding to deterministic loading, then
µ_D ∩ µ_E = L_D H^N ⌊ E, Fig. 1a. If Z = R², µ_D = H^1 ⌊ (Ra) and µ_E = H^1 ⌊ (Rb), with a
and b ∈ R² not parallel, then µ_D ∩ µ_E = δ_0.
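When the densities are known, rule (1) can be evaluated directly by quadrature. The following sketch (with illustrative Gaussian densities chosen for this example only, not taken from the paper) approximates E[f] on a tensor-product grid in Z = R²; the quadrature weights cancel in the ratio:

```python
import numpy as np

# Illustrative densities on Z = R^2 (placeholders for L_D and L_E):
def L_D(z):
    return np.exp(-np.sum((z - np.array([1.0, 0.0]))**2, axis=-1))

def L_E(z):
    return np.exp(-np.sum((z - np.array([0.0, 1.0]))**2, axis=-1))

def f(z):
    return z[..., 0]  # quantity of interest: first coordinate of the state

# Tensor-product grid quadrature for the ratio of integrals in (1).
x = np.linspace(-6.0, 6.0, 401)
X, Y = np.meshgrid(x, x, indexing="ij")
Z = np.stack([X, Y], axis=-1)
w = L_D(Z) * L_E(Z)                # unnormalized likelihood L_D * L_E
Ef = np.sum(f(Z) * w) / np.sum(w)  # grid weights cancel in the ratio
print(Ef)  # ≈ 0.5 (midpoint of the two Gaussian centers)
```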
Suppose now that, as is often the case, the likelihood measure µ_D is not known exactly,
but only approximately through sequences of empirical measures (µ_{D,h}) obtained, e. g.,
by means of material testing. Suppose further that the empirical measures supply an
increasingly better approximation of µ_D, e. g., as a result of increasingly accurate and
extensive measurements. We may then expect that, under appropriate conditions of
convergence of (µ_{D,h}), the sequence of approximate intersections (µ_{D,h} ∩ µ_E) converges to
the exact limiting likelihood measure µ_D ∩ µ_E, thus defining a convergent approximation
scheme for the inference problem.
A fundamental difficulty that arises immediately is that, for most notions of intersections
of measures, the intersection of certain pairs of measures may not be well-defined
or may be zero. Consider for example the setting in which both µ_D and µ_E are
approximated by empirical measures (µ_{D,h}) and (µ_{E,h}). A conventional response to this
challenge is to introduce Lebesgue-regular approximations (µ̃_{D,h}) and (µ̃_{E,h}) fitted to
the data (µ_{D,h}) and (µ_{E,h}) by means of some method of regression. By regularity, (µ̃_{D,h})
and (µ̃_{E,h}) then have well-defined, continuous densities L̃_{D,h} and L̃_{E,h}, respectively, and
the intersections (µ̃_{D,h} ∩ µ̃_{E,h}), which are intended as approximations of the exact
intersection µ_D ∩ µ_E, are simply given by (L̃_{D,h}(z) L̃_{E,h}(z)) L^{2N}(z). However, there is no
guarantee that this approximation will work in general, and the approximations (µ̃_{D,h})
and (µ̃_{E,h}) need to be chosen appropriately. Here, we take a different approach using
thermalizations.
Overall, there are three main cases of interest:
(1) Lebesgue-regular likelihoods;
(2) empirical measures;
(3) likelihood measures supported on linear subspaces.
The general framework presented here allows for the physical likelihood and the material
likelihood to be in any of these three classes independently of each other. For (2), we
may consider that there is a sequence of approximating empirical measures with the
limiting measure belonging to class (1) or (3). In this work, we focus on the case in
which the material likelihood µ_D is in (2), approximating (1), and the physical likelihood
µ_E is in (3); see (5) and (6) below.
Whereas the measure-theoretical framework just outlined is remarkable for its directness
and simplicity, a Bayesian reinterpretation of the rules of inference is often favored
in the literature (cf. [14] and references therein). A common ansatz is to introduce the
representation z = (ε, σ) and a sequence of functions g_{D,h}, parameterized by a set of
parameters p_h, providing the model
(2) \sigma = g_{D,h}(\varepsilon; p_h) + \eta,
where η is a random variable, interpreted as observational noise, with likelihood f_{D,h}(·; q_h),
parameterized by further parameters q_h, and to assume the approximate material likelihood
to be of the form
(3) \tilde{L}_{D,h}((\varepsilon, \sigma)) = f_{D,h}\big(\sigma - g_{D,h}(\varepsilon; p_h); q_h\big).
Evidently, if f_{D,h} attains its maximum at 0, then (2) represents the most likely material
law given the ansatz and may thus be regarded as an identified, or learned, material
model. A common choice for g_{D,h} is a neural network, in the context of machine
learning [7], whereas a common choice of f_{D,h} is Gaussian [10, 4]. Common methods of
regression used to determine the parameters from the data include classical methods of
statistical inference such as maximum likelihood [9, 5], variational approaches based on
the introduction of a loss function [14], and measure-theoretical approaches based, e. g., on
the Wasserstein distance [1] or the Kullback-Leibler discrepancy [12].
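For concreteness, the Bayesian ansatz (2)-(3) can be sketched as follows, with an illustrative linear model g_{D,h} and Gaussian observational noise f_{D,h} (both hypothetical choices standing in for the parameterized families in the text):

```python
import numpy as np

def g(eps, p):
    # Hypothetical material-model ansatz: linear law sigma = p * eps,
    # standing in for a neural network or other parameterized g_{D,h}.
    return p * eps

def f_noise(eta, q):
    # Gaussian observational-noise likelihood with variance q,
    # a common (but by no means canonical) choice for f_{D,h}.
    return np.exp(-eta**2 / (2.0 * q)) / np.sqrt(2.0 * np.pi * q)

def L_tilde(eps, sigma, p, q):
    # Approximate material likelihood (3): f_{D,h}(sigma - g_{D,h}(eps; p); q).
    return f_noise(sigma - g(eps, p), q)

# States on the model line sigma = 2*eps are most likely;
# states off the line are penalized by the noise likelihood.
print(L_tilde(1.0, 2.0, p=2.0, q=0.1))  # on-model state
print(L_tilde(1.0, 3.0, p=2.0, q=0.1))  # off-model state
```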
An essential problem with this approach is that the choices of material model g_{D,h},
observational noise f_{D,h}, priors, loss functions and parameterizations thereof are often
not prescribed by theory or fundamental considerations but instead dictated by
convenience. Worse still, the form of f_{D,h} is often fixed throughout the sequence, e. g., to be
Gaussian, which renders the approximation scheme non-convergent in cases where the
underlying likelihood measures µ_D and µ_E are not of the same form. Since the limiting
likelihood measures µ_D and µ_E are often not known in practice, it is generally not
possible to ensure that approximation schemes tied to particular choices of models and
priors be convergent. In addition, it is clear that, even in the best of circumstances,
representations of the form (2) and (3) introduce modeling bias and error and incur a
loss of information relative to the data sets themselves.
The ansatz-free approach of [2] adopted here leads to a direct connection between data
and inference and is therefore lossless and free of modeling bias. In addition, it makes it
possible to treat unbounded likelihoods, a setting in which it is not clear how to set up a
Bayesian framework that is able to address the questions of inference and approximation. Our
approach overcomes the problem of unbounded likelihoods and of zero intersection between
the approximating likelihood measures (µ_{D,h}) and µ_E by recourse to thermalization
and annealing. Specifically, we consider a sequence β_h → +∞ of reciprocal temperatures
as h → +∞, and replace µ_h = µ_{D,h} × µ_E by its thermalization
(4) \mu_{h,\beta_h} := B_{\beta_h}^{-1} \, e^{-\beta_h \|y - z\|^2} \mu_h, \qquad B_{\beta_h} := \int_Z e^{-\beta_h \|\xi\|^2} \, d\mathcal{L}^{2N}(\xi).
As h → ∞, this regularization increasingly concentrates µ_h on the diagonal diag(Z × Z)
and is therefore expected to deliver the sought intersection µ_D ∩ µ_E in the limit. Suppose,
for instance, that the approximate material likelihood measure is
(5) \mu_{D,h} = \sum_{p \in P_h} m_p \, \delta_p,
where (P_h) are point data sets in Z and m_p ≥ 0 are weights. Suppose, in addition, that
the loading is deterministic,
(6) \mu_E = \mathcal{H}^N \llcorner E,
where E is an affine subspace of Z of dimension N and H^N ⌊ E is the Hausdorff
measure restricted to E. Then, the approximate expectation of a quantity f ∈ C_b(Z × Z)
corresponding to (4) is, cf. Section 4.2,
(7) E_h[f] = \frac{\sum_{p \in P_h} \int_E m_p B_{\beta_h}^{-1} e^{-\beta_h \|p - z\|^2} f(p, z) \, d\mathcal{H}^N(z)}{\sum_{p \in P_h} \int_E m_p B_{\beta_h}^{-1} e^{-\beta_h \|p - z\|^2} \, d\mathcal{H}^N(z)},
which is explicit in the data and eschews the need for ansätze of any type, be they
material models or priors. We do not consider the definition of the constraint set E as a
modeling step, given that it encodes the governing physical laws. Data-driven inference
rules such as (7) are amenable to efficient numerical implementation in combination with
stochastic quadrature formulas for the evaluation of the integrals [13].
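A minimal numerical sketch of the inference rule (7), under illustrative assumptions not taken from the paper: Z = R², E a one-dimensional subspace discretized by quadrature nodes along its direction, and a synthetic Gaussian point cloud P_h with uniform weights. The normalization constants B_{β_h} cancel between numerator and denominator and are therefore omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic material data set P_h in Z = R^2 with uniform weights m_p
# (illustrative; real data would come from material testing).
P = rng.normal(loc=[1.0, 1.0], scale=0.1, size=(200, 2))
m = np.full(len(P), 1.0 / len(P))

# Deterministic loading: E = {t * e : t in R}, a 1D subspace of Z,
# discretized by quadrature nodes along its direction e.
e = np.array([1.0, 1.0]) / np.sqrt(2.0)
t = np.linspace(-5.0, 5.0, 1001)
z = t[:, None] * e[None, :]          # quadrature nodes on E

def E_h(f, beta):
    # Data-driven expectation (7); the constant B_beta cancels in the
    # ratio, so it is omitted.  Quadrature weights on E cancel likewise.
    d2 = np.sum((P[:, None, :] - z[None, :, :])**2, axis=-1)
    w = m[:, None] * np.exp(-beta * d2)          # thermalized weights
    num = np.sum(w * f(P[:, None, :], z[None, :, :]))
    return num / np.sum(w)

# Expected first component of the admissible state z.
print(E_h(lambda p, zz: zz[..., 0], beta=50.0))  # ≈ 1.0 near the data cloud
```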
However, the analysis of [2], based on the concept of weak convergence, is not
quantitative. It does not permit one to obtain convergence rates, either for the convergence of µ_β
to some limit µ_∞ or for the convergence of µ_{h,β} to µ_β. In particular, it is not clear how the
thermalization parameter β_h in (7) should be chosen in practice.
1.1. Main Results. Our aim is to obtain quantitative estimates for the convergence
of µ_β to its limit µ_∞ and for the convergence of µ_{h,β} to µ_β, leading in particular to a
prescription for the choice of β_h which ensures the desired convergence µ_{h,β_h} → µ_∞. In
order to make convergence quantitative, we work in a metric setting and not only in
terms of weak convergence as in [2]. Our starting point is this observation:
The flat norm metrizes weak convergence on bounded and tight sets of measures.
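For reference, a standard formulation of the flat norm (bounded Lipschitz norm) of a measure µ ∈ M(Z) is the following (the precise normalization used later in the paper may differ):

```latex
\|\mu\|_{\flat} \;=\; \sup\left\{ \int_Z f \, d\mu \;:\; f \in C^{0,1}(Z),\ \|f\|_{\infty} \le 1,\ \operatorname{Lip}(f) \le 1 \right\}.
```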
We adopt, in this metric setting, the notions of transversality and of diagonal concentration
for (possibly unbounded) measures via thermalization as developed in [2]. We
introduce a weaker concept, weak transversality, which corresponds to transversality along
subsequences, see Definition 3.1, and circumvents the need for regularity assumptions on
the measure µ. Then, Prokhorov’s theorem shows that:
If µ is such that the measures µ_β are uniformly bounded and uniformly tight,
then it is weakly transversal and, in particular, it has one or more diagonal
concentrations.
Having framed the thermalization problem in a metric setting, we may make use of
standard devices such as uniform convergence and diagonal subsequences. This permits
us to decouple the thermalization problem (β → ∞) from the approximation problem
(h → ∞). A typical statement (not detailed here) is:
Assume that i) the µ_h are uniformly transversal and ii) the µ_h have thermalizations
that are uniformly bounded, uniformly tight, and uniformly approximate the
thermalization of µ. Then µ is transversal and its diagonal concentration is
approximated by the diagonal concentrations of the µ_h.
In particular, the diagonal concentration of an unknown measure µ is recovered from the
diagonal concentrations of an approximating sequence of measures.
We demonstrate the usefulness of this abstract framework by considering specific
classes of measures. We first focus on sub-Gaussian material likelihoods combined with
a deterministic physical likelihood. Specifically, we consider measures of the form
(8) \mu = e^{-\Phi} \mathcal{L}^{2N} \times \mathcal{H}^N \llcorner E
for some N-dimensional affine subspace E of Z and Φ : Z → R satisfying
(9) \beta_0 \|y - z\|^2 + \Phi(y) \ge c \, (\|y\|^2 + \|z\|^2) - b \quad \text{for all } y \in Z, \ z \in E
for some constants β_0 > 0, c > 0 and b > 0. Condition (9) can be interpreted as a
transversality condition in view of the following result.
Theorem 1.1 (informal, see Prop. 3.3 and Prop. 3.4). Measures µ satisfying (8)-(9)
are weakly transversal and admit a diagonal concentration. Further, if Φ ∈ C¹ and its
derivative does not grow too fast, then µ is strongly transversal and the thermalizations
µ_β of µ converge to the diagonal concentration µ_∞ with rate β^{-1/2}.
We remark that, for fully deterministic systems, the potential Φ is the indicator
function of a set D ⊂ Z, in which case (9) corresponds precisely to the definition of
transversality introduced in [3].
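The β^{-1/2} rate in Theorem 1.1 can be illustrated numerically in a toy one-dimensional setting (an assumption of this sketch, not the paper's general setup): take Φ(y) = y² and E = R, and measure the mean diagonal spread E|y − z| under the thermalized weight e^{−β(y−z)²} e^{−y²}; quadrupling β should halve the spread.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_spread(beta, n=200_000):
    # Toy thermalization with joint weight e^{-beta (y - z)^2} e^{-y^2}:
    # the weight factorizes, so we can sample y ~ N(0, 1/2) and then
    # z | y ~ N(y, 1/(2 beta)) directly, and average the distance to
    # the diagonal {y = z}.
    y = rng.normal(0.0, np.sqrt(0.5), n)
    z = rng.normal(y, np.sqrt(1.0 / (2.0 * beta)))
    return np.mean(np.abs(y - z))

s1, s4 = mean_spread(100.0), mean_spread(400.0)
print(s1 / s4)  # ≈ 2: the spread decays like beta^{-1/2}
```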