Mean-field neural networks: learning mappings on
Wasserstein space∗†
Huyên Pham‡    Xavier Warin§
September 19, 2023
Abstract
We study the machine learning task for models with operators mapping
between the Wasserstein space of probability measures and a space of
functions, as arising e.g. in mean-field games/control problems. Two
classes of neural networks, based on bin density and on cylindrical
approximation, are proposed to learn these so-called mean-field
functions, and are theoretically supported by universal approximation
theorems. We perform several numerical experiments for training these
two mean-field neural networks, and show their accuracy and efficiency
by computing the generalization error on various test distributions. Finally, we
present different algorithms relying on mean-field neural networks for
solving time-dependent mean-field problems, and illustrate our results
with numerical tests for the example of a semi-linear partial differential
equation in the Wasserstein space of probability measures.
1 Introduction
Deep neural networks have been successfully used for approximating solutions
to high-dimensional partial differential equations (PDEs) and control
problems, and various methods, based either on physics-informed representations
([1], [2]) or on probabilistic and backward stochastic differential equation
(BSDE) representations ([3], [4], [5]), have recently been developed in the
literature; see e.g. the survey papers [6] and [7].
∗This work is supported by FiME, Laboratoire de Finance des Marchés de l'Energie,
and the "Finance and Sustainable Development" EDF-CACIB Chair.
†We thank Maximilien Germain and Mathieu Laurière for helpful discussions.
‡LPSM, Université Paris Cité, & FiME, pham at lpsm.paris
§EDF R&D & FiME, xavier.warin at edf.fr
In recent years, a novel class of control problems has emerged with the
theory of mean-field games/control, dealing with models of large populations
of interacting agents. Solutions to mean-field problems are represented by
functions that depend not only on the state variable of the system, but also
on its probability distribution, representing the population state distribution.
They can be characterized in terms of PDEs in the Wasserstein space of
probability measures (the so-called Master equation) or BSDEs of McKean-Vlasov
(MKV) type, and we refer to the two-volume monograph [8], [9] for a comprehensive
treatment of this topic. In such problems, the input is a probability
measure on $\mathbb{R}^d$, hence valued in the infinite-dimensional Wasserstein space,
and the output is a function defined on the support of the input probability
measure.
In this paper, we aim to approximate the infinite-dimensional mean-field
function by proposing two classes of neural network architectures. The first
approach starts from the approximation of a probability measure with density
by a piecewise-constant density function on a given fixed partition,
of size $K$, of a truncated support of the measure, called bins; see Figure 1
in the case of a Gaussian distribution. This allows us to approximate the
infinite-dimensional mapping by a function on an input space of dimension $K$,
corresponding to the bin density weights, which can be learned by a
standard deep neural network. We show a universal approximation theorem
that theoretically justifies the use of such a bin density neural network. The
second approach takes probability measures directly as input, but through a
finite-dimensional neural network function in cylindrical form, for which we
also state a universal approximation theorem.
Next, we show how to effectively learn a mean-field function by means of
these two classes of mean-field neural networks. This is achieved by generating
a data set consisting of simulated probability measures following two
proposed methods, and then by training the parameters of the mean-field
neural networks via stochastic gradient methods. We perform several numerical
tests illustrating the efficiency and accuracy of these two mean-field neural
networks on various examples of mean-field functions, and we validate
our results on different test distributions by computing the generalization
error.
As an application of these mean-field neural networks, we consider dy-
namic mean-field problems arising typically from mean-field type control,
and design different algorithms of local or global type, based on regression
or BSDE representation, for computing the solution. We illustrate the per-
formance of our algorithms with the example of a semi-linear PDE on the
Wasserstein space. More applications and examples from mean-field con-
trol problems and Master equations are investigated in a companion paper
[10] where we provide a global comparison of the different neural network
algorithms.
Related works. Several methods have recently been proposed for numerically
solving mean-field game/control problems. We mention for instance the
papers [11], [12] using Hamilton-Jacobi equations and Lagrangian methods,
the works [13], [14] relying on backward stochastic differential equations
and the maximum principle, or the work [15] that approximates the mean-field
control problem by particle systems, reducing it to a finite but
possibly very high-dimensional problem. In the latter paper, the symmetry
of the particle system is exploited in the numerical resolution by using
a specific class of neural networks, called DeepSets [16], which significantly
reduces the computational complexity. However, in all these cited
references, as the probability distribution of the state process is a deterministic
function of time, the value function and optimal control are viewed as
functions of time and of the state, and are approximated by neural networks on
finite-dimensional spaces. The solution obtained is thus valid only for a given
initial distribution of the population state, and when the initial distribution
varies, the solution has to be computed again with another neural network.
In this work, we instead develop a numerical scheme for approximating, by
a suitable neural network, the solution at any initial distribution.
The paper is organized as follows. In Section 2, we formulate the learning
problem, present the two network architectures, bin-density and cylindrical
neural networks, and explain the data generation and training procedures.
Numerical tests are developed in Section 3, and applications to time-dependent
mean-field problems are given in Section 4, with various algorithms and
numerical results. The proofs of the universal approximation theorems for
mean-field neural networks are postponed to Appendix A.
Notations. Denote by $\mathcal{P}_2(\mathbb{R}^d)$ the Wasserstein space of square integrable
probability measures on $\mathbb{R}^d$, equipped with the 2-Wasserstein distance $\mathcal{W}_2$. Given
$\mu \in \mathcal{P}_2(\mathbb{R}^d)$, we denote by $L^2(\mu)$ the set of measurable functions $\varphi$ on $\mathbb{R}^d$
such that
$$|\varphi|_\mu^2 \;:=\; \int |\varphi(x)|^2 \,\mu(\mathrm{d}x) \;<\; \infty.$$
(Here $|\cdot|$ denotes the Euclidean norm.) Given some $\mu \in \mathcal{P}_2(\mathbb{R}^d)$, and $\varphi$ a
measurable function on $\mathbb{R}^d$ satisfying a quadratic growth condition, hence in $L^2(\mu)$,
we set $\mathbb{E}_{X\sim\mu}[\varphi(X)] := \int \varphi(x)\,\mu(\mathrm{d}x)$. We also denote $\bar\mu := \mathbb{E}_{X\sim\mu}[X]$.
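As a quick illustration, these quantities can be approximated from samples of $\mu$. Below is a minimal sketch (not from the paper) of a Monte Carlo estimate of $|\varphi|_\mu^2$; the function and sampler names are hypothetical.

```python
import numpy as np

def l2_mu_norm_sq(phi, sample_mu, n=100_000):
    """Monte Carlo estimate of |phi|_mu^2 = E_{X~mu}|phi(X)|^2 from i.i.d. draws X_i ~ mu."""
    x = sample_mu(n)                           # array of shape (n, d)
    v = np.atleast_2d(phi(x))                  # values phi(X_i), shape (n, p)
    return float(np.mean(np.sum(v**2, axis=-1)))

# Example: mu = N(0, I_2) and phi(x) = x, so |phi|_mu^2 = E|X|^2 = 2.
rng = np.random.default_rng(0)
print(l2_mu_norm_sq(lambda x: x, lambda n: rng.standard_normal((n, 2))))  # ~ 2.0
```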
2 Learning mean-field functions
Given a function $V$ on $\mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d)$, valued in $\mathbb{R}^p$, with quadratic growth
condition w.r.t. the first argument in $\mathbb{R}^d$, we aim to approximate the infinite-dimensional
mapping
$$\mathcal{V}: \mu \in \mathcal{P}_2(\mathbb{R}^d) \;\longmapsto\; V(\cdot, \mu) \in L^2(\mu), \tag{2.1}$$
called mean-field function, by a map $\mathcal{N}$ constructed from suitable classes
of neural networks. The mean-field network $\mathcal{N}$ takes inputs composed of
two parts, a probability measure $\mu$ and a point $x$ in the support of $\mu$, and outputs
$\mathcal{N}(\mu)(x)$. The quality of this approximation is measured by the error
$$\mathcal{L}(\mathcal{N}) \;:=\; \int_{\mathcal{P}_2(\mathbb{R}^d)} \mathcal{E}_{\mathcal{N}}(\mu)\,\nu(\mathrm{d}\mu),
\qquad \text{with } \; \mathcal{E}_{\mathcal{N}}(\mu) \;:=\; \big|\mathcal{V}(\mu) - \mathcal{N}(\mu)\big|_\mu^2
\;=\; \mathbb{E}_{X\sim\mu}\big|V(X,\mu) - \mathcal{N}(\mu)(X)\big|^2,$$
where $\nu$ is a probability measure on $\mathcal{P}_2(\mathbb{R}^d)$, called the training measure. The
learning of the mean-field function $\mathcal{V}$ is then performed by minimizing
over the parameters of the neural network operator $\mathcal{N}$ the loss function
$$\mathcal{L}_M(\mathcal{N}) \;:=\; \frac{1}{M} \sum_{m=1}^M \mathcal{E}_{\mathcal{N}}(\mu^{(m)}), \tag{2.2}$$
where $\mu^{(m)}$, $m = 1, \ldots, M$, are training samples of $\nu$. We denote by $\widehat{\mathcal{N}}_M$ the
functional learned from this minimization problem, and for test data $\mu_{\mathrm{test}}$
(different from the training data set $(\mu^{(m)})_m$), we compute the test
(generalization) error $\mathcal{E}_{\widehat{\mathcal{N}}_M}(\mu_{\mathrm{test}})$.
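For concreteness, a minimal training-loss sketch is given below (this is not the authors' implementation): each training measure $\mu^{(m)}$ is assumed to expose a sampler, the inner expectation $\mathcal{E}_{\mathcal{N}}(\mu)$ is estimated by Monte Carlo, and the outer average gives $\mathcal{L}_M(\mathcal{N})$ of (2.2). The callables `net`, `V_target` and the method `mu.sample` are illustrative assumptions.

```python
import torch

def empirical_loss(net, V_target, training_measures, n_inner=1024):
    """Estimate L_M(N) = (1/M) sum_m E_N(mu^(m)) by Monte Carlo over X ~ mu^(m)."""
    losses = []
    for mu in training_measures:               # mu^(1), ..., mu^(M)
        x = mu.sample(n_inner)                 # (n_inner, d) tensor of draws X ~ mu
        err = V_target(x, mu) - net(x, mu)     # (n_inner, p): V(X, mu) - N(mu)(X)
        losses.append((err ** 2).sum(dim=-1).mean())   # Monte Carlo estimate of E_N(mu)
    return torch.stack(losses).mean()          # empirical loss L_M(N)
```

This loss is then minimized over the network parameters by a standard stochastic gradient optimizer.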
2.1 Neural networks approximations
Bin density-based approximation. Let us denote by $\mathcal{D}_2(\mathbb{R}^d)$ the subset
of probability measures $\mu$ in $\mathcal{P}_2(\mathbb{R}^d)$ which admit a density function $p^\mu$ with
respect to the Lebesgue measure $\lambda_d$ on $\mathbb{R}^d$. Fix a bounded rectangular
domain $\mathcal{K}$ in $\mathbb{R}^d$, and divide $\mathcal{K}$ into a number $K$ of bins $\mathrm{Bin}(k)$, $k = 1, \ldots, K$,
with $\cup_{k=1}^K \mathrm{Bin}(k) = \mathcal{K}$, of center $x_k$, and with the same area size $h = \lambda_d(\mathcal{K})/K$. Given
$\mu \in \mathcal{D}_2(\mathbb{R}^d)$, we consider the bin approximation of its density function (see
Figure 1), that is the truncated piecewise-constant density function defined
on $\mathcal{K}$ by
$$\hat p^\mu_K(x) \;=\; p^\mu_k \;:=\; \frac{p^\mu(x_k)}{\sum_{k'=1}^K p^\mu(x_{k'})\, h},
\qquad \text{if } x \in \mathrm{Bin}(k), \; k = 1, \ldots, K,$$
and $\hat p^\mu_K(x) = 0$ for $x \in \mathbb{R}^d \setminus \mathcal{K}$. We set $p^\mu := (p^\mu_k)_{k \in \llbracket 1,K\rrbracket}$, which lies in
$\mathcal{D}_K := \{ p = (p_k)_{k \in \llbracket 1,K\rrbracket} \in \mathbb{R}^K_+ : \sum_{k=1}^K p_k\, h = 1 \}$, and is called the density bins of the
probability measure in $\mathcal{D}_2(\mathbb{R}^d)$ with density function $\hat p^\mu_K$, denoted by $\hat\mu_K$, with
support on $\mathcal{K}$.
Figure 1: Bin approximation of a Gaussian distribution.
Conversely, given $p = (p_k)_{k \in \llbracket 1,K\rrbracket} \in \mathcal{D}_K$, we can associate the piecewise-constant
density function defined on $\mathbb{R}^d$ by
$$p(x) \;=\; p_k, \quad \text{if } x \in \mathrm{Bin}(k), \; k = 1, \ldots, K, \qquad p(x) = 0, \quad x \in \mathbb{R}^d \setminus \mathcal{K}. \tag{2.3}$$
We then denote by $\mu = \mathcal{L}_D(p)$ the bin density probability measure in $\mathcal{P}_2(\mathbb{R}^d)$
with piecewise-constant density function $p$ as in (2.3), hence with support
on $\mathcal{K}$, and we note that $\hat\mu_K = \mathcal{L}_D(p^\mu)$.
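Under these conventions, the map $\mu \mapsto p^\mu$ and its converse $\mathcal{L}_D$ can be sketched as follows in dimension $d = 1$; this is an illustrative implementation under assumed conventions, not the authors' code.

```python
import numpy as np

def bin_weights(density, a, b, K):
    """Bin centers x_k and weights p_k = density(x_k) / (sum_k' density(x_k') * h), so sum_k p_k * h = 1."""
    h = (b - a) / K
    centers = a + (np.arange(K) + 0.5) * h
    raw = density(centers)                     # p^mu(x_k) at the bin centers
    return centers, raw / (raw.sum() * h)      # normalized bin weights p^mu in D_K

def piecewise_density(p, a, b):
    """L_D(p): piecewise-constant density equal to p_k on Bin(k) and 0 outside [a, b)."""
    K, h = len(p), (b - a) / len(p)
    def density(x):
        x = np.asarray(x, dtype=float)
        k = np.clip(np.floor((x - a) / h).astype(int), 0, K - 1)
        return np.where((x >= a) & (x < b), p[k], 0.0)
    return density

# Example: bin approximation of a standard Gaussian truncated to K = [-4, 4], with K = 40 bins.
gaussian_pdf = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
xk, p_mu = bin_weights(gaussian_pdf, -4.0, 4.0, K=40)
print(p_mu.sum() * (8.0 / 40))                 # = 1.0 by construction
```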
A mean-field density-based network is an operator on $\mathcal{D}_2(\mathbb{R}^d)$ of the form
$$\mathcal{N}_D(\mu) \;=\; \Phi(\cdot, p^\mu),$$
where $\Phi = \Phi_\theta$ is a neural network function from $\mathbb{R}^d \times \mathcal{D}_K$ into $\mathbb{R}^p$, whose
architecture can be constructed as follows:
(i) Classical feedforward neural network, i.e. of the form
$$(x, p) \in \mathbb{R}^d \times \mathcal{D}_K \;\longmapsto\; \Phi_\theta(x, p)
\;=\; \mathcal{A}_{L+1} \circ \sigma \circ \mathcal{A}_L \circ \cdots \circ \sigma \circ \mathcal{A}_1(x, p) \;\in\; \mathbb{R}^p,$$
with affine layers $\mathcal{A}_\ell(y) = w_\ell\, y + b_\ell \in \mathbb{R}^{d_\ell}$, the first layer acting on the
concatenated input $(x, p) \in \mathbb{R}^{d+K}$, and $d_{L+1} = p$. Here $L$ is the number of hidden
layers (layer $\ell$ has $d_\ell$ neurons), the parameters are $\theta = (w_\ell, b_\ell)_\ell$ with weights $w_\ell$
and biases $b_\ell$, and $\sigma$ is an activation function from $\mathbb{R}$ into $\mathbb{R}$ (applied
componentwise), e.g. tanh, sigmoid, or ReLU.
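A minimal PyTorch sketch of such a feedforward $\Phi_\theta$ acting on the concatenated input $(x, p)$ is given below; the hyperparameters (layer widths, activation) are illustrative assumptions, not the choices used in the numerical experiments.

```python
import torch
import torch.nn as nn

class BinDensityNetwork(nn.Module):
    """Feedforward Phi_theta(x, p): input (x, p) in R^{d+K}, output in R^{p_out}."""
    def __init__(self, d, K, p_out=1, hidden=(64, 64), activation=nn.Tanh):
        super().__init__()
        layers, d_in = [], d + K
        for d_hidden in hidden:                    # hidden layers A_1, ..., A_L with activation sigma
            layers += [nn.Linear(d_in, d_hidden), activation()]
            d_in = d_hidden
        layers.append(nn.Linear(d_in, p_out))      # output layer A_{L+1}, no activation
        self.phi = nn.Sequential(*layers)

    def forward(self, x, p):
        # x: (batch, d) points in the support of mu; p: (batch, K) density-bin weights p^mu
        return self.phi(torch.cat([x, p], dim=-1))

# Usage on a batch of 32 points of a 1-d measure discretized with K = 40 bins:
net = BinDensityNetwork(d=1, K=40)
out = net(torch.randn(32, 1), torch.rand(32, 40))  # shape (32, 1)
```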