Lasso Monte Carlo, a Variation on Multi Fidelity Methods for High
Dimensional Uncertainty Quantification
Arnau Albà1,2, Romana Boiger1, Dimitri Rochman1, and Andreas Adelmann*,1
1Paul Scherrer Institut, Forschungstrasse 111, 5232 Villigen, Switzerland
2ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland
*andreas.adelmann@psi.ch
1st September 2023
Abstract
Uncertainty quantification (UQ) is an active area of research, and an essential technique used in all fields of science
and engineering. The most common methods for UQ are Monte Carlo and surrogate modelling. The former is independent of dimensionality but converges slowly, while the latter has been shown to yield large computational
speedups with respect to Monte Carlo. However, surrogate models suffer from the so-called curse of dimensionality, and
become costly to train for high-dimensional problems, where UQ might become computationally prohibitive. In this paper
we present a new technique, Lasso Monte Carlo (LMC), which combines a Lasso surrogate model with the multifidelity
Monte Carlo technique, in order to perform UQ in high-dimensional settings, at a reduced computational cost. We provide
mathematical guarantees for the unbiasedness of the method, and show that LMC can be more accurate than simple Monte
Carlo. The theory is numerically tested with benchmarks on toy problems, as well as on a real example of UQ from the field
of nuclear engineering. In all presented examples LMC is more accurate than simple Monte Carlo and other multifidelity
methods. In relevant cases, LMC reduces computational costs by more than a factor of 5 with respect to simple MC.
1 Introduction
Uncertainty Quantification (UQ) aims to calculate the effect of unknown or uncertain system parameters on the outcome
of an experiment or computation. It is an active area of research, and an essential tool to test the robustness and accuracy
of methods used in many domains of science and engineering, such as risk assessment in civil engineering [1], design and
optimisation of particle accelerators [2,3], weather prediction [4], medical physics [5], and nuclear engineering [6,7,8].
The UQ process can be described as follows: let $f(\bm{x})$ be a deterministic function which represents the numerical experiment
$$f : \mathbb{R}^d \to \mathbb{R}^m, \qquad \bm{x} \mapsto f(\bm{x}),$$
with input and output dimensions of size $d$ and $m$, respectively. Let $\bm{x}_0 = (x_1, x_2, \dots, x_d)$ be an input vector with an associated uncertainty. The uncertainty can be modelled by letting the input be a random variable $X$, centred around $\bm{x}_0$, such that $\mathbb{E}[X] = \bm{x}_0$. A common approach is to model the input with a multivariate normal distribution $X \sim \mathcal{N}(\bm{x}_0, \Sigma)$, where $\Sigma$ is a known covariance matrix. The aim of UQ, and more specifically of response variability methods, is to estimate the mean $\mu$ and variance $\sigma^2$ of the output distribution $f(X)$, which is then written as $f(\bm{x}_0) = \mu \pm \sigma$. Without loss of generality, in the rest of the paper it is assumed that the output is one-dimensional, i.e. $m = 1$.
When $f$ is a black-box function one has to rely on non-intrusive UQ methods, such as Monte Carlo (henceforth referred to as simple MC) [9,10] or surrogate modelling [11,12].
With simple MC, $N$ independent and identically distributed (i.i.d.) multivariate random variables $X_1, X_2, \dots, X_N$ are sampled to obtain a set of input vectors $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_N$. Then $f$ is evaluated at each input to obtain a set of outputs
$f(\bm{x}_1), f(\bm{x}_2), \dots, f(\bm{x}_N)$, from which the sample mean and sample variance are calculated. Monte Carlo methods are known to converge as $\mathcal{O}(N^{-1/2})$, which, if $f$ is expensive to evaluate, can make such methods computationally expensive or even prohibitive.
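For concreteness, here is a minimal sketch of simple MC in Python, where a cheap toy function stands in for the expensive model $f$, and the Gaussian input distribution is an assumption of the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 20, 1000                       # input dimension and MC budget

def f(x):
    # hypothetical cheap stand-in for the expensive black-box model
    return np.sum(x**2) + np.sin(x[0])

x0 = np.zeros(d)                      # nominal input, E[X] = x0
Sigma = 0.1 * np.eye(d)               # assumed known input covariance

xs = rng.multivariate_normal(x0, Sigma, size=N)   # N i.i.d. samples of X
ys = np.array([f(x) for x in xs])                 # N expensive evaluations

mu, sigma = ys.mean(), ys.std(ddof=1)
print(f"f(x0) = {mu:.3f} +/- {sigma:.3f}")
```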
The slow convergence of the errors with simple MC can in some cases be bypassed by using surrogate models [11].
With this approach, the 𝑁input-output samples of 𝑓are used to train a surrogate model 𝑓(𝑁). Then 𝑓(𝑁)is evaluated
𝑀times, and the outputs are used to estimate the mean and variance. The advantage of this method is that 𝑓(𝑁)is
computationally cheaper than 𝑓, and evaluating it 𝑀times with 𝑀 ≫ 𝑁 has negligible runtime. The bottleneck of this
method is in obtaining the 𝑁samples for the training set. If 𝑓is low-dimensional such that 𝑁 ≫ 𝑑, the surrogate model
is likely to have a small bias, however in high-dimensional cases with 𝑁 < 𝑑 the surrogate will be biased (see the curse
of dimensionality [13,14]), making also the mean and variance estimations biased. The specific case where 𝑁 < 𝑑, and
where increasing 𝑁is not possible or computationally expensive, is the main focus of this paper. The aim is to find a
method more accurate than simple MC for a fixed computational budget 𝑁.
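A sketch of the surrogate approach follows, here with a Lasso regressor standing in for a generic data-fit surrogate, in keeping with the paper's later use of Lasso; the toy model and the penalty value are hypothetical:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, N, M = 100, 50, 100_000            # N < d: the high-dimensional regime

def f(x):                             # hypothetical expensive model
    return 3.0 * x[0] - 2.0 * x[3] + 0.5 * x[7]

X_train = rng.normal(size=(N, d))     # N expensive evaluations = training set
y_train = np.array([f(x) for x in X_train])

surrogate = Lasso(alpha=0.1).fit(X_train, y_train)

Z = rng.normal(size=(M, d))           # M >> N cheap surrogate evaluations
y_sur = surrogate.predict(Z)
print(y_sur.mean(), y_sur.var(ddof=1))  # estimates inherit the surrogate's bias
```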
In this regard, multifidelity Monte Carlo (MFMC) [15,16,17,18] offers a promising approach. Multifidelity methods combine low- and high-fidelity models to accelerate the solution of an outer-loop application, such as uncertainty quantification, sensitivity analysis, or optimisation. The low-fidelity models, which are either simplified models, projection-based models, or data-fit surrogates, are used for speed-up, while the high-fidelity models are kept in the loop for accuracy and/or convergence. Thus MFMC offers a framework to combine a biased surrogate model $f^{(n)}$ with the high-fidelity model $f$ in such a manner that unbiased estimates of the mean and variance are computed. The crucial point with MFMC is then, for a given number of high-fidelity evaluations $N$, optimising the trade-off between how many of them are used in the multifidelity estimators and how many for training the surrogate. This optimisation problem has recently been addressed in [19]. Nevertheless, there is no guarantee that, for a given $N$, the MFMC estimates will be more accurate than simple MC, especially in high-dimensional cases with $N < d$, where $f^{(n)}$ is likely to have a large bias.
To address the challenges that high-dimensional UQ poses for existing approaches, we introduce the Lasso Monte Carlo (LMC) method, a variation on MFMC. With LMC we propose a new data-management strategy in which the high-fidelity samples are reused several times, both to train multiple surrogates and in the multifidelity estimators. The resulting algorithm guarantees, under certain assumptions, that the estimates are at least as accurate as those of simple MC and MFMC. This new approach can be viewed as a variance reduction technique based on a two-level version of MFMC and on Lasso regression (least absolute shrinkage and selection operator), which simultaneously performs regression analysis, variable selection, and regularisation [20].
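As a minimal illustration of the variable selection that makes Lasso attractive when $n < d$ (synthetic data, arbitrarily chosen penalty $\alpha$):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, d = 40, 200                        # fewer training samples than dimensions
X = rng.normal(size=(n, d))
y = 2.0 * X[:, 5] - 1.5 * X[:, 17]    # only two of the d inputs matter

lasso = Lasso(alpha=0.05).fit(X, y)
print(np.flatnonzero(lasso.coef_))    # typically a small set containing 5 and 17
```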
It is worth mentioning the relation between multifidelity, multilevel, and multi-index methods: multilevel Monte Carlo (MLMC) [21] is a special case within the broader framework of multifidelity methods, in which a hierarchy of low-fidelity models is typically derived by varying a parameter (e.g. the mesh width). A generalisation of MLMC, the so-called multi-index Monte Carlo method, is introduced in [22]; it uses multidimensional levels and high-order mixed differences to reduce the computational cost and the variance of the resulting estimator.
The remainder of the paper is organised as follows: we start with a review of MFMC for the estimation of central moments, and in particular we derive the expressions for the two-level estimators. This is followed by a discussion of the trade-off between accuracy and computational cost in MFMC. The new LMC method is then introduced, and we prove that it is at least as accurate as simple MC. We then review the theory behind the Lasso regression method and show how it can be used in the LMC algorithm. Finally, in Section 3 LMC is benchmarked on a variety of examples. Proofs of all theorems and lemmas are provided in Appendix A.
1.1 Notation & Assumptions
Throughout the paper, bold letters represent vectors $\bm{x} = (x_1, x_2, \dots, x_d)$, with $d$ the dimension. A lower-case letter $x$ is a realisation of a random variable $X$; in the case where $X$ is a multivariate random variable of dimension $d$, a realisation of it is a vector $(x_1, x_2, \dots, x_d)$. For a random variable $X$ with probability density function (PDF) $\phi(x)$, we calculate the expectation value or mean with $\mathbb{E}[X] = \int x\,\phi(x)\,dx$. We also use $\mathbb{E}[f(X)]$ or $\mathbb{E}[f]$ for the mean of the function $f$, whose input follows the distribution of $X$. The variance is defined as $\mathrm{Var}[X] = \mathbb{E}\left[(X - \mathbb{E}[X])^2\right]$, the covariance as $\mathrm{Cov}[X, Y] = \mathbb{E}\left[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\right]$, the fourth central moment as $m_4[X] = \mathbb{E}\left[(X - \mathbb{E}[X])^4\right]$, and finally the multivariate second moment as $m_{2,2}[X, Y] = \mathbb{E}\left[(X - \mathbb{E}[X])^2 (Y - \mathbb{E}[Y])^2\right]$.
The function space $L^p\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$, for $1 \le p < \infty$, is defined as the space of functions satisfying
$$\left( \int_{\mathbb{R}^d} \left| f(\bm{x}) \right|^p \phi(\bm{x})\,d\bm{x} \right)^{1/p} < \infty.$$
The function $f$ is the ground truth, i.e. the expensive model, while $f^{(n)}$ is a cheap-to-evaluate surrogate model that was fitted to a training set of $n$ samples of $f$. Similarly, $\hat{\mu}_N$ and $\hat{\mu}^{(n)}_N$ are the sample estimators for the mean of a sample set of size $N$, calculated with the true and surrogate model respectively. Likewise, $\hat{\sigma}^2_N$ and $\hat{\sigma}^{2(n)}_N$ are the sample estimators of the variance of a sample set of size $N$, computed with the true and surrogate models respectively. We write the two-level estimators as $\hat{\mu}^{(n)}_{N,M}$ and $\hat{\sigma}^{2(n)}_{N,M}$, and the LMC estimators as $\hat{M}_{N,M}$ and $\hat{\Sigma}^2_{N,M}$. It is assumed that the computational cost of training and evaluating $f^{(n)}$ is negligible compared to the cost of evaluating $f$. Therefore the cost of an estimator is given by $N$, the number of evaluations of $f$.
A normal distribution with mean $\bm{\mu}$ and covariance matrix $\Sigma$ is written as $\mathcal{N}(\bm{\mu}, \Sigma)$, and $U[a, b]$ is a uniform distribution between $a$ and $b$ with $a < b$. The mean squared error (MSE) is defined as $\mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) = \mathbb{E}\left[\left(\hat{\mu}_N - \mathbb{E}[f]\right)^2\right]$.
Without loss of generality, we assume that all the data sets used have been centred around zero, such that
$$\frac{1}{N} \sum_{i=1}^{N} f(\bm{x}_i) = 0, \qquad \text{and} \qquad \frac{1}{N} \sum_{i=1}^{N} x_{ik} = 0, \quad k = 1, 2, \dots, d.$$
The following assumption is made for any surrogate model $f^{(n)}$, and Section 2.4 discusses its validity.
Assumption 1.1 Let $f^{(n)}$ be a surrogate model that has been fitted on a training set of $n$ inputs $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_n$, sampled from i.i.d. random variables distributed as $X$, and $n$ outputs $f(\bm{x}_1), f(\bm{x}_2), \dots, f(\bm{x}_n)$. Then the following inequalities are satisfied:
$$\mathrm{Var}\left[f - f^{(n)}\right] \le \mathrm{Var}[f], \tag{1a}$$
$$m_{2,2}\left[f + f^{(n)},\, f - f^{(n)}\right] + \frac{1}{N-1}\,\mathrm{Var}\left[f + f^{(n)}\right] \mathrm{Var}\left[f - f^{(n)}\right] - \frac{N-2}{N-1} \left( \mathrm{Var}[f] - \mathrm{Var}\left[f^{(n)}\right] \right)^2 \le m_4[f] - \frac{N-3}{N-1}\,\mathrm{Var}^2[f], \tag{1b}$$
for any integers $n > 0$ and $N > 3$.
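Inequality (1a) can be probed numerically for a given trained surrogate. The following sketch, under assumed ingredients (a hypothetical model, a Lasso surrogate, and a large fresh sample to approximate the variances), checks it in one instance:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, d, M = 30, 50, 20_000              # small training set, large test set

def f(x):                             # hypothetical model
    return x[0] - 2.0 * x[1] + 0.1 * np.sum(x**2)

X_train = rng.normal(size=(n, d))
y_train = np.apply_along_axis(f, 1, X_train)
surrogate = Lasso(alpha=0.1).fit(X_train, y_train)

Z = rng.normal(size=(M, d))           # fresh samples to estimate variances
fz = np.apply_along_axis(f, 1, Z)
residual = fz - surrogate.predict(Z)
print(residual.var(ddof=1), fz.var(ddof=1))   # (1a): first value <= second
```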
2 Theory
2.1 Two-Level Monte Carlo
The proposed method, LMC, is based on multilevel and multifidelity Monte Carlo methods [21,16,15]. As with any MC method, one wants to estimate $\mathbb{E}[f]$ for some function $f$. Several models $f_1, f_2, \dots, f_L$ are available that approximate $f$, with increasing cost and increasing accuracy, i.e. $f_L$ is the most accurate and expensive model, while $f_1$ is computationally cheap but inaccurate. The difference in naming between multilevel and multifidelity methods comes from how these levels of accuracy are defined: in multilevel Monte Carlo the levels are obtained by coarsening or refining a grid, or by changing the step size of the integrator, whereas in multifidelity Monte Carlo the different functions are given by any general lower-order model, such as data-fit surrogates or projection-based models. In both cases, however, the goal of the method is to reduce the overall computational cost of computing $\mathbb{E}[f]$ with respect to traditional MC, by optimally balancing the number of evaluations at each level of accuracy.
Originally MLMC was developed for estimating only the mean of a distribution, but in more recent years it has been extended [23,24,25] to estimate higher-order moments of the distribution, which are necessary for UQ.
In the following paragraphs, a two-level version of MFMC is derived, in which two levels of accuracy are considered: the expensive, unbiased, true model $f$, and a computationally cheap, biased, surrogate model $f^{(n)}$.
2.1.1 Mean Estimator
Let $f(\bm{x}) \in L^2\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ be a function whose input is distributed according to the multivariate $d$-dimensional random variable $X$, with probability density function (PDF) $\phi(\bm{x})$. Let $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_N$ and $\bm{z}_1, \bm{z}_2, \dots, \bm{z}_M$ be two sets of input samples of size $N$ and $M$, drawn from i.i.d. random variables distributed as $X$. The aim is to estimate the mean $\mathbb{E}[f]$ with the minimum number of evaluations of the function $f$. The simple MC estimator is
$$\hat{\mu}_N = \frac{1}{N} \sum_{i=1}^{N} f(\bm{x}_i). \tag{2}$$
The mean squared error of the estimator is
$$\mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) = \frac{\mathrm{Var}[f]}{N}. \tag{3}$$
It is therefore an unbiased estimator, with $\lim_{N \to \infty} \hat{\mu}_N = \mathbb{E}[f]$. With this method we require $N = \mathrm{Var}[f] / \mathrm{MSE}$ samples to obtain an estimation with a mean squared error of $\mathrm{MSE}$.
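The $1/N$ decay in (3) can be checked empirically by repeating the estimator over many independent trials; a small sketch with a hypothetical one-dimensional model:

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 100, 5000
f = lambda x: x**2        # hypothetical model; for X ~ N(0, 1) it has
                          # E[f] = 1 and Var[f] = 2
estimates = np.array([f(rng.normal(size=N)).mean() for _ in range(trials)])
print(np.mean((estimates - 1.0)**2), 2.0 / N)   # empirical MSE vs. Var[f]/N
```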
Now let $f^{(N)} \in L^2\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ be a surrogate model that was trained on $N$ evaluations of $f$, and is much cheaper to evaluate. Using this surrogate model to compute the sample mean, the estimator is
$$\hat{\mu}^{(N)}_M = \frac{1}{M} \sum_{i=1}^{M} f^{(N)}(\bm{z}_i). \tag{4}$$
Note that this estimator has the same cost as (2), since the number of evaluations of $f$ is the same. The error is
$$\mathrm{MSE}\left(\hat{\mu}^{(N)}_M, \mathbb{E}[f]\right) = \left( \mathbb{E}\left[f^{(N)}\right] - \mathbb{E}[f] \right)^2 + \frac{\mathrm{Var}\left[f^{(N)}\right]}{M}. \tag{5}$$
The first term is the bias, and the second term is the variance. The variance term quickly vanishes if we assume that $f^{(N)}$ has a negligible runtime, so that $M \to \infty$. However, the bias term can only be reduced by increasing the training set size $N$ (and hence improving the accuracy of the surrogate model), which might be impossible or computationally demanding. This is especially problematic in high-dimensional cases, since the volume of the input space to be sampled increases exponentially with $d$ [14], and unless the training set is large, i.e. $N \gg d$, the surrogate will be heavily biased. Additionally, even if the training set were large and $N \to \infty$, the bias would in general not decay to zero, due to model bias.
Let us now combine the surrogate model with $f$ into a two-level estimator. For this, assume that the set of $N$ samples is split into a training subset $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_n$ of size $n$ with $1 \le n \le N$, and an evaluation subset $\bm{x}_{n+1}, \bm{x}_{n+2}, \dots, \bm{x}_N$ of size $N - n$. The training samples are used to fit a surrogate $f^{(n)}$. Then the two-level estimator reads
$$\hat{\mu}^{(n)}_{N-n,M} = \frac{1}{M} \sum_{i=1}^{M} f^{(n)}(\bm{z}_i) + \frac{1}{N-n} \sum_{i=n+1}^{N} \left( f(\bm{x}_i) - f^{(n)}(\bm{x}_i) \right) = \hat{\mu}^{(n)}_M + \hat{\mu}_{N-n} - \hat{\mu}^{(n)}_{N-n}. \tag{6}$$
As with the previous estimator, the cost is $N$, but in this case the estimate is unbiased, since the error
$$\mathrm{MSE}\left(\hat{\mu}^{(n)}_{N-n,M}, \mathbb{E}[f]\right) = \frac{\mathrm{Var}\left[f^{(n)}\right]}{M} + \frac{\mathrm{Var}\left[f - f^{(n)}\right]}{N-n} \tag{7}$$
has variance terms but no bias term. Here again we assume that $f^{(n)}$ has negligible runtime and that $M \to \infty$, so the first term vanishes. Then the following statement can be made regarding the MSE of simple MC and of the two-level estimator:
$$\lim_{M \to \infty} \mathrm{MSE}\left(\hat{\mu}^{(n)}_{N-n,M}, \mathbb{E}[f]\right) \le \mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) \iff \frac{N}{N-n} \le \frac{\mathrm{Var}[f]}{\mathrm{Var}\left[f - f^{(n)}\right]}. \tag{8}$$
Therefore, for a given computational budget $N$, the two-level estimator is at least as accurate as simple MC if and only if $n$ and $f^{(n)}$ are such that (8) is satisfied. Note that the fraction on the far right of (8) is guaranteed to be at least $1$ by Assumption (1a). Note also that the denominator of this fraction could in principle be zero; however, we assume that this never happens, due to the model bias of $f^{(n)}$.
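A minimal sketch of the estimator (6), with a hypothetical model and a Lasso surrogate trained on the first $n$ of the $N$ high-fidelity samples (both choices are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
d, N, n, M = 50, 60, 30, 100_000      # budget N, training split n, M >> N

def f(x):                             # hypothetical expensive model
    return 2.0 * x[0] - x[4] + 0.05 * np.sum(x**2)

xs = rng.normal(size=(N, d))          # the N high-fidelity input samples
ys = np.apply_along_axis(f, 1, xs)

surrogate = Lasso(alpha=0.1).fit(xs[:n], ys[:n])   # first n samples: training
zs = rng.normal(size=(M, d))                       # M cheap input samples

mu_hat = (surrogate.predict(zs).mean()        # mu^(n)_M
          + ys[n:].mean()                     # mu_{N-n}
          - surrogate.predict(xs[n:]).mean()) # mu^(n)_{N-n}
print(mu_hat)                         # unbiased two-level estimate of E[f]
```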
2.1.2 Variance Estimation
Let $f(\bm{x}) \in L^4\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$, and let there be two sets of input samples of size $N$ and $M$, as in Section 2.1.1. Then the simple MC estimator for the variance is
$$\hat{\sigma}^2_N = \frac{1}{N-1} \sum_{i=1}^{N} \left( f(\bm{x}_i) - \sum_{j=1}^{N} \frac{f(\bm{x}_j)}{N} \right)^2, \tag{9}$$
which is unbiased and has an error
$$\mathrm{MSE}\left(\hat{\sigma}^2_N, \mathrm{Var}[f]\right) = \frac{1}{N} \left( m_4[f] - \frac{N-3}{N-1} \mathrm{Var}^2[f] \right). \tag{10}$$
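Note that (9) is simply the standard unbiased sample variance; as a quick check against NumPy's implementation (with stand-in outputs):

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(size=100)              # stand-in for outputs f(x_1), ..., f(x_N)

manual = np.sum((y - y.mean())**2) / (len(y) - 1)   # eq. (9) written out
print(np.isclose(manual, y.var(ddof=1)))            # True
```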
Using a surrogate model $f^{(N)} \in L^4\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ trained on $N$ evaluations of $f$, the variance estimator is
$$\hat{\sigma}^{2(N)}_M = \frac{1}{M-1} \sum_{i=1}^{M} \left( f^{(N)}(\bm{z}_i) - \sum_{j=1}^{M} \frac{f^{(N)}(\bm{z}_j)}{M} \right)^2, \tag{11}$$
which has an error
$$\mathrm{MSE}\left(\hat{\sigma}^{2(N)}_M, \mathrm{Var}[f]\right) = \left( \mathrm{Var}\left[f^{(N)}\right] - \mathrm{Var}[f] \right)^2 + \frac{1}{M} \left( m_4\left[f^{(N)}\right] - \frac{M-3}{M-1} \mathrm{Var}^2\left[f^{(N)}\right] \right).$$
The surrogate estimation of the variance is biased. The second term vanishes if $M \to \infty$, while the first term is affected by model bias and decays slowly due to the curse of dimensionality.
Now assume that the $N$ input samples are split into a subset of $n$ training samples, which are used to fit a surrogate $f^{(n)}$, and a subset of $N - n$ evaluation samples. Then the two-level estimator for the variance is
$$\hat{\sigma}^{2(n)}_{N-n,M} = \hat{\sigma}^{2(n)}_M + \hat{\sigma}^2_{N-n} - \hat{\sigma}^{2(n)}_{N-n}. \tag{12}$$
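In the same hypothetical setting as the mean-estimator sketch above, (12) can be assembled from three sample variances:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
d, N, n, M = 50, 60, 30, 100_000

def f(x):                             # hypothetical expensive model
    return 2.0 * x[0] - x[4] + 0.05 * np.sum(x**2)

xs = rng.normal(size=(N, d))
ys = np.apply_along_axis(f, 1, xs)
surrogate = Lasso(alpha=0.1).fit(xs[:n], ys[:n])
zs = rng.normal(size=(M, d))

var_hat = (surrogate.predict(zs).var(ddof=1)        # sigma^2(n)_M
           + ys[n:].var(ddof=1)                     # sigma^2_{N-n}
           - surrogate.predict(xs[n:]).var(ddof=1)) # sigma^2(n)_{N-n}
print(var_hat)                        # two-level estimate of Var[f]
```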
This estimator has a computational cost of $N$ and is unbiased. The error is