Lasso Monte Carlo, a Variation on Multi Fidelity Methods for High
Dimensional Uncertainty Quantification
Arnau Albà1,2, Romana Boiger1, Dimitri Rochman1, and Andreas Adelmann*,1
1Paul Scherrer Institut, Forschungstrasse 111, 5232 Villigen, Switzerland
2ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland
*andreas.adelmann@psi.ch
1st September 2023
Abstract
Uncertainty quantification (UQ) is an active area of research, and an essential technique used in all fields of science
and engineering. The most common methods for UQ are Monte Carlo and surrogate modelling. The former is independent of dimensionality but converges slowly, while the latter has been shown to yield large computational
speedups with respect to Monte Carlo. However, surrogate models suffer from the so-called curse of dimensionality, and
become costly to train for high-dimensional problems, where UQ might become computationally prohibitive. In this paper
we present a new technique, Lasso Monte Carlo (LMC), which combines a Lasso surrogate model with the multifidelity
Monte Carlo technique, in order to perform UQ in high-dimensional settings, at a reduced computational cost. We provide
mathematical guarantees for the unbiasedness of the method, and show that LMC can be more accurate than simple Monte
Carlo. The theory is numerically tested with benchmarks on toy problems, as well as on a real example of UQ from the field
of nuclear engineering. In all presented examples LMC is more accurate than simple Monte Carlo and other multifidelity
methods. In relevant cases, LMC reduces computational costs by more than a factor of 5 with respect to simple MC.
1 Introduction
Uncertainty Quantification (UQ) aims to calculate the effect of unknown or uncertain system parameters on the outcome
of an experiment or computation. It is an active area of research, and an essential tool to test the robustness and accuracy
of methods used in many domains of science and engineering, such as risk assessment in civil engineering [1], design and
optimisation of particle accelerators [2,3], weather prediction [4], medical physics [5], and nuclear engineering [6,7,8].
The UQ process can be described as follows: let $f(\bm{x})$ be a deterministic function which represents the numerical experiment
$$f : \mathbb{R}^d \to \mathbb{R}^m, \qquad \bm{x} \mapsto f(\bm{x}),$$
with input and output dimensions of size $d$ and $m$, respectively. Let $\bm{x}_0 = (x_1, x_2, \dots, x_d)$ be an input vector with an associated uncertainty. The uncertainty can be modelled by letting the input be a random variable $X$, centred around $\bm{x}_0$, such that $\mathbb{E}[X] = \bm{x}_0$. A common approach is to model the input with a multivariate normal distribution $X \sim \mathcal{N}(\bm{x}_0, \Sigma)$, where $\Sigma$ is a known covariance matrix. The aim of UQ, and more specifically of response variability methods, is to estimate the mean $\mu$ and variance $\sigma^2$ of the output distribution $f(X)$, which is then written as $f(\bm{x}_0) = \mu \pm \sigma$. Without loss of generality, in the rest of the paper it is assumed that the output is one-dimensional, i.e. $m = 1$.
When $f$ is a black-box function one has to rely on non-intrusive UQ methods, such as Monte Carlo (henceforth referred to as simple MC) [9,10] or surrogate modelling [11,12].
With simple MC, $N$ independent and identically distributed (i.i.d.) multivariate random variables $X_1, X_2, \dots, X_N$ are sampled to obtain a set of input vectors $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_N$. Then $f$ is evaluated at each input to obtain a set of outputs
$f(\bm{x}_1), f(\bm{x}_2), \dots, f(\bm{x}_N)$, from which the sample mean and sample variance are calculated. Monte Carlo methods are known to converge as $\mathcal{O}(N^{-1/2})$, which, if $f$ is expensive to evaluate, can make such methods computationally expensive or even prohibitive.
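For concreteness, here is a minimal sketch of simple MC in Python, where a cheap toy function stands in for the expensive model $f$, and the Gaussian input distribution is an assumption of the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 20, 1000                       # input dimension and MC budget

def f(x):
    # hypothetical cheap stand-in for the expensive black-box model
    return np.sum(x**2) + np.sin(x[0])

x0 = np.zeros(d)                      # nominal input, E[X] = x0
Sigma = 0.1 * np.eye(d)               # assumed known input covariance

xs = rng.multivariate_normal(x0, Sigma, size=N)   # N i.i.d. samples of X
ys = np.array([f(x) for x in xs])                 # N expensive evaluations

mu, sigma = ys.mean(), ys.std(ddof=1)
print(f"f(x0) = {mu:.3f} +/- {sigma:.3f}")
```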
The slow convergence of the errors with simple MC can in some cases be bypassed by using surrogate models [11].
With this approach, the 𝑁input-output samples of 𝑓are used to train a surrogate model 𝑓(𝑁). Then 𝑓(𝑁)is evaluated
𝑀times, and the outputs are used to estimate the mean and variance. The advantage of this method is that 𝑓(𝑁)is
computationally cheaper than 𝑓, and evaluating it 𝑀times with 𝑀 ≫ 𝑁 has negligible runtime. The bottleneck of this
method is in obtaining the 𝑁samples for the training set. If 𝑓is low-dimensional such that 𝑁 ≫ 𝑑, the surrogate model
is likely to have a small bias, however in high-dimensional cases with 𝑁 < 𝑑 the surrogate will be biased (see the curse
of dimensionality [13,14]), making also the mean and variance estimations biased. The specific case where 𝑁 < 𝑑, and
where increasing 𝑁is not possible or computationally expensive, is the main focus of this paper. The aim is to find a
method more accurate than simple MC for a fixed computational budget 𝑁.
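A sketch of the surrogate approach follows, here with a Lasso regressor standing in for a generic data-fit surrogate, in keeping with the paper's later use of Lasso; the toy model and the penalty value are hypothetical:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, N, M = 100, 50, 100_000            # N < d: the high-dimensional regime

def f(x):                             # hypothetical expensive model
    return 3.0 * x[0] - 2.0 * x[3] + 0.5 * x[7]

X_train = rng.normal(size=(N, d))     # N expensive evaluations = training set
y_train = np.array([f(x) for x in X_train])

surrogate = Lasso(alpha=0.1).fit(X_train, y_train)

Z = rng.normal(size=(M, d))           # M >> N cheap surrogate evaluations
y_sur = surrogate.predict(Z)
print(y_sur.mean(), y_sur.var(ddof=1))  # estimates inherit the surrogate's bias
```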
In this regard, multifidelity Monte Carlo (MFMC) [15,16,17,18] offers a promising approach. Multifidelity methods combine low- and high-fidelity models to accelerate the solution of an outer-loop application, such as uncertainty quantification, sensitivity analysis, or optimisation. The low-fidelity models, which are either simplified models, projection-based models, or data-fit surrogates, are used for speed-up, while the high-fidelity models are kept in the loop for accuracy and/or convergence. Thus MFMC offers a framework to combine a biased surrogate model $f^{(n)}$ with the high-fidelity model $f$ in such a manner that unbiased estimates of the mean and variance are computed. The crucial point with MFMC is then, for a given number of high-fidelity evaluations $N$, optimising the trade-off between how many of them are used in the multifidelity estimators and how many for training the surrogate. This optimisation problem has recently been addressed in [19]. Nevertheless, there is no guarantee that, for a given $N$, the MFMC estimates will be more accurate than simple MC, especially in high-dimensional cases with $N < d$, where $f^{(n)}$ is likely to have a large bias.
To address the challenges that high-dimensional UQ poses for existing approaches, we introduce the Lasso Monte Carlo (LMC) method, a variation on MFMC. With LMC we propose a new data-management strategy in which the high-fidelity samples are reused several times, both to train multiple surrogates and in the multifidelity estimators. The resulting algorithm guarantees, under certain assumptions, that the estimates are at least as accurate as those of simple MC and MFMC. This new approach can be viewed as a variance reduction technique based on a two-level version of MFMC and on Lasso regression (least absolute shrinkage and selection operator), which simultaneously performs regression analysis, variable selection, and regularisation [20].
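As a minimal illustration of the variable selection that makes Lasso attractive when $n < d$ (synthetic data, arbitrarily chosen penalty $\alpha$):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, d = 40, 200                        # fewer training samples than dimensions
X = rng.normal(size=(n, d))
y = 2.0 * X[:, 5] - 1.5 * X[:, 17]    # only two of the d inputs matter

lasso = Lasso(alpha=0.05).fit(X, y)
print(np.flatnonzero(lasso.coef_))    # typically a small set containing 5 and 17
```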
It is worth mentioning the relation between multifidelity, multilevel, and multi-index methods: multilevel Monte Carlo (MLMC) [21] is a special case within the broader framework of multifidelity methods, in which a hierarchy of low-fidelity models is typically derived by varying a parameter (e.g. the mesh width). A generalisation of MLMC, the so-called multi-index Monte Carlo method, is introduced in [22]; it uses multidimensional levels and high-order mixed differences to reduce the computational cost and the variance of the resulting estimator.
The remainder of the paper is organised as follows: we start with a review of MFMC for the estimation of central moments, and in particular we derive the expressions for the two-level estimators. This is followed by a discussion of the trade-off between accuracy and computational cost in MFMC. The new LMC method is then introduced, and we prove that it is at least as accurate as simple MC. We then review the theory behind the Lasso regression method and show how it can be used in the LMC algorithm. Finally, in Section 3 LMC is benchmarked on a variety of examples. Proofs of all theorems and lemmas are provided in Appendix A.
1.1 Notation & Assumptions
Throughout the paper, bold letters represent vectors $\bm{x} = (x_1, x_2, \dots, x_d)$, with $d$ the dimension. A lower-case letter $x$ is a realisation of a random variable $X$; in the case where $X$ is a multivariate random variable of dimension $d$, a realisation of it is a vector $(x_1, x_2, \dots, x_d)$. For a random variable $X$ with probability density function (PDF) $\phi(x)$, we calculate the expectation value or mean with $\mathbb{E}[X] = \int x\,\phi(x)\,dx$. We also use $\mathbb{E}[f(X)]$ or $\mathbb{E}[f]$ for the mean of the function $f$, whose input follows the distribution of $X$. The variance is defined as $\mathrm{Var}[X] = \mathbb{E}\left[(X - \mathbb{E}[X])^2\right]$, the covariance as $\mathrm{Cov}[X, Y] = \mathbb{E}\left[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\right]$, the fourth central moment as $m_4[X] = \mathbb{E}\left[(X - \mathbb{E}[X])^4\right]$, and finally the multivariate second moment as $m_{2,2}[X, Y] = \mathbb{E}\left[(X - \mathbb{E}[X])^2 (Y - \mathbb{E}[Y])^2\right]$.
The function space $L^p\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$, for $1 \le p < \infty$, is defined as the space of functions satisfying
$$\left( \int_{\mathbb{R}^d} \left| f(\bm{x}) \right|^p \phi(\bm{x})\,d\bm{x} \right)^{1/p} < \infty.$$
The function $f$ is the ground truth, i.e. the expensive model, while $f^{(n)}$ is a cheap-to-evaluate surrogate model that was fitted to a training set of $n$ samples of $f$. Similarly, $\hat{\mu}_N$ and $\hat{\mu}^{(n)}_N$ are the sample estimators for the mean of a sample set of size $N$, calculated with the true and surrogate model respectively. Likewise, $\hat{\sigma}^2_N$ and $\hat{\sigma}^{2(n)}_N$ are the sample estimators of the variance of a sample set of size $N$, computed with the true and surrogate models respectively. We write the two-level estimators as $\hat{\mu}^{(n)}_{N,M}$ and $\hat{\sigma}^{2(n)}_{N,M}$, and the LMC estimators as $\hat{M}_{N,M}$ and $\hat{\Sigma}^2_{N,M}$. It is assumed that the computational cost of training and evaluating $f^{(n)}$ is negligible compared to the cost of evaluating $f$. Therefore the cost of an estimator is given by $N$, the number of evaluations of $f$.
A normal distribution with mean $\bm{\mu}$ and covariance matrix $\Sigma$ is written as $\mathcal{N}(\bm{\mu}, \Sigma)$, and $U[a, b]$ is a uniform distribution between $a$ and $b$ with $a < b$. The mean squared error (MSE) is defined as $\mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) = \mathbb{E}\left[\left(\hat{\mu}_N - \mathbb{E}[f]\right)^2\right]$.
Without loss of generality, we assume that all the data sets used have been centred around zero, such that
$$\frac{1}{N} \sum_{i=1}^{N} f(\bm{x}_i) = 0, \qquad \text{and} \qquad \frac{1}{N} \sum_{i=1}^{N} x_{ik} = 0, \quad k = 1, 2, \dots, d.$$
The following assumption is made for any surrogate model $f^{(n)}$, and Section 2.4 discusses its validity.
Assumption 1.1 Let $f^{(n)}$ be a surrogate model that has been fitted on a training set of $n$ inputs $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_n$, sampled from i.i.d. random variables distributed as $X$, and $n$ outputs $f(\bm{x}_1), f(\bm{x}_2), \dots, f(\bm{x}_n)$. Then the following inequalities are satisfied:
$$\mathrm{Var}\left[f - f^{(n)}\right] \le \mathrm{Var}[f], \tag{1a}$$
$$m_{2,2}\left[f + f^{(n)},\, f - f^{(n)}\right] + \frac{1}{N-1}\,\mathrm{Var}\left[f + f^{(n)}\right] \mathrm{Var}\left[f - f^{(n)}\right] - \frac{N-2}{N-1} \left( \mathrm{Var}[f] - \mathrm{Var}\left[f^{(n)}\right] \right)^2 \le m_4[f] - \frac{N-3}{N-1}\,\mathrm{Var}^2[f], \tag{1b}$$
for any integers $n > 0$ and $N > 3$.
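Inequality (1a) can be probed numerically for a given trained surrogate. The following sketch, under assumed ingredients (a hypothetical model, a Lasso surrogate, and a large fresh sample to approximate the variances), checks it in one instance:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, d, M = 30, 50, 20_000              # small training set, large test set

def f(x):                             # hypothetical model
    return x[0] - 2.0 * x[1] + 0.1 * np.sum(x**2)

X_train = rng.normal(size=(n, d))
y_train = np.apply_along_axis(f, 1, X_train)
surrogate = Lasso(alpha=0.1).fit(X_train, y_train)

Z = rng.normal(size=(M, d))           # fresh samples to estimate variances
fz = np.apply_along_axis(f, 1, Z)
residual = fz - surrogate.predict(Z)
print(residual.var(ddof=1), fz.var(ddof=1))   # (1a): first value <= second
```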
2 Theory
2.1 Two-Level Monte Carlo
The proposed method, LMC, is based on multilevel and multifidelity Monte Carlo methods [21,16,15]. As with any MC method, one wants to estimate $\mathbb{E}[f]$ for some function $f$. Several models $f_1, f_2, \dots, f_L$ are available that approximate $f$, with increasing cost and increasing accuracy, i.e. $f_L$ is the most accurate and expensive model, while $f_1$ is computationally cheap but inaccurate. The difference in naming between multilevel and multifidelity methods comes from how these levels of accuracy are defined: in multilevel Monte Carlo the levels are obtained by coarsening or refining a grid, or by changing the step size of the integrator, whereas in multifidelity Monte Carlo the different functions are given by any general lower-order model, such as data-fit surrogates or projection-based models. In both cases, however, the goal of the method is to reduce the overall computational cost of computing $\mathbb{E}[f]$ with respect to traditional MC, by optimally balancing the number of evaluations at each level of accuracy.
Originally MLMC was developed for estimating only the mean of a distribution, but in more recent years it has been extended [23,24,25] to estimate higher-order moments of the distribution, which are necessary for UQ.
In the following paragraphs, a two-level version of MFMC is derived, in which two levels of accuracy are considered: the expensive, unbiased, true model $f$, and a computationally cheap, biased, surrogate model $f^{(n)}$.
2.1.1 Mean Estimator
Let $f(\bm{x}) \in L^2\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ be a function whose input is distributed according to the multivariate $d$-dimensional random variable $X$, with probability density function (PDF) $\phi(\bm{x})$. Let $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_N$ and $\bm{z}_1, \bm{z}_2, \dots, \bm{z}_M$ be two sets of input samples of size $N$ and $M$, drawn from i.i.d. random variables distributed as $X$. The aim is to estimate the mean $\mathbb{E}[f]$ with the minimum number of evaluations of the function $f$. The simple MC estimator is
$$\hat{\mu}_N = \frac{1}{N} \sum_{i=1}^{N} f(\bm{x}_i). \tag{2}$$
The mean squared error of the estimator is
$$\mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) = \frac{\mathrm{Var}[f]}{N}. \tag{3}$$
It is therefore an unbiased estimator, with $\lim_{N \to \infty} \hat{\mu}_N = \mathbb{E}[f]$. With this method we require $N = \mathrm{Var}[f] / \mathrm{MSE}$ samples to obtain an estimation with a mean squared error of $\mathrm{MSE}$.
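The $1/N$ decay in (3) can be checked empirically by repeating the estimator over many independent trials; a small sketch with a hypothetical one-dimensional model:

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 100, 5000
f = lambda x: x**2        # hypothetical model; for X ~ N(0, 1) it has
                          # E[f] = 1 and Var[f] = 2
estimates = np.array([f(rng.normal(size=N)).mean() for _ in range(trials)])
print(np.mean((estimates - 1.0)**2), 2.0 / N)   # empirical MSE vs. Var[f]/N
```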
Now let $f^{(N)} \in L^2\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ be a surrogate model that was trained on $N$ evaluations of $f$, and is much cheaper to evaluate. Using this surrogate model to compute the sample mean, the estimator is
$$\hat{\mu}^{(N)}_M = \frac{1}{M} \sum_{i=1}^{M} f^{(N)}(\bm{z}_i). \tag{4}$$
Note that this estimator has the same cost as (2), since the number of evaluations of $f$ is the same. The error is
$$\mathrm{MSE}\left(\hat{\mu}^{(N)}_M, \mathbb{E}[f]\right) = \left( \mathbb{E}\left[f^{(N)}\right] - \mathbb{E}[f] \right)^2 + \frac{\mathrm{Var}\left[f^{(N)}\right]}{M}. \tag{5}$$
The first term is the bias, and the second term is the variance. The variance term quickly vanishes if we assume that $f^{(N)}$ has a negligible runtime, so that $M \to \infty$. However, the bias term can only be reduced by increasing the training set size $N$ (and hence improving the accuracy of the surrogate model), which might be impossible or computationally demanding. This is especially problematic in high-dimensional cases, since the volume of the input space to be sampled increases exponentially with $d$ [14], and unless the training set is large, i.e. $N \gg d$, the surrogate will be heavily biased. Additionally, even if the training set were large and $N \to \infty$, the bias would in general not decay to zero, due to model bias.
Let us now combine the surrogate model with $f$ into a two-level estimator. For this, assume that the set of $N$ samples is split into a training subset $\bm{x}_1, \bm{x}_2, \dots, \bm{x}_n$ of size $n$ with $1 \le n \le N$, and an evaluation subset $\bm{x}_{n+1}, \bm{x}_{n+2}, \dots, \bm{x}_N$ of size $N - n$. The training samples are used to fit a surrogate $f^{(n)}$. Then the two-level estimator reads
$$\hat{\mu}^{(n)}_{N-n,M} = \frac{1}{M} \sum_{i=1}^{M} f^{(n)}(\bm{z}_i) + \frac{1}{N-n} \sum_{i=n+1}^{N} \left( f(\bm{x}_i) - f^{(n)}(\bm{x}_i) \right) = \hat{\mu}^{(n)}_M + \hat{\mu}_{N-n} - \hat{\mu}^{(n)}_{N-n}. \tag{6}$$
As with the previous estimator, the cost is $N$, but in this case the estimate is unbiased, since the error
$$\mathrm{MSE}\left(\hat{\mu}^{(n)}_{N-n,M}, \mathbb{E}[f]\right) = \frac{\mathrm{Var}\left[f^{(n)}\right]}{M} + \frac{\mathrm{Var}\left[f - f^{(n)}\right]}{N-n} \tag{7}$$
has variance terms but no bias term. Here again we assume that $f^{(n)}$ has negligible runtime and that $M \to \infty$, so the first term vanishes. Then the following statement can be made regarding the MSE of simple MC and of the two-level estimator:
$$\lim_{M \to \infty} \mathrm{MSE}\left(\hat{\mu}^{(n)}_{N-n,M}, \mathbb{E}[f]\right) \le \mathrm{MSE}\left(\hat{\mu}_N, \mathbb{E}[f]\right) \iff \frac{N}{N-n} \le \frac{\mathrm{Var}[f]}{\mathrm{Var}\left[f - f^{(n)}\right]}. \tag{8}$$
Therefore, for a given computational budget $N$, the two-level estimator is at least as accurate as simple MC if and only if $n$ and $f^{(n)}$ are such that (8) is satisfied. Note that the fraction on the far right of (8) is guaranteed to be at least $1$ by Assumption (1a). Note also that the denominator of this fraction could in principle be zero; however, we assume that this never happens, due to the model bias of $f^{(n)}$.
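A minimal sketch of the estimator (6), with a hypothetical model and a Lasso surrogate trained on the first $n$ of the $N$ high-fidelity samples (both choices are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
d, N, n, M = 50, 60, 30, 100_000      # budget N, training split n, M >> N

def f(x):                             # hypothetical expensive model
    return 2.0 * x[0] - x[4] + 0.05 * np.sum(x**2)

xs = rng.normal(size=(N, d))          # the N high-fidelity input samples
ys = np.apply_along_axis(f, 1, xs)

surrogate = Lasso(alpha=0.1).fit(xs[:n], ys[:n])   # first n samples: training
zs = rng.normal(size=(M, d))                       # M cheap input samples

mu_hat = (surrogate.predict(zs).mean()        # mu^(n)_M
          + ys[n:].mean()                     # mu_{N-n}
          - surrogate.predict(xs[n:]).mean()) # mu^(n)_{N-n}
print(mu_hat)                         # unbiased two-level estimate of E[f]
```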
2.1.2 Variance Estimation
Let $f(\bm{x}) \in L^4\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$, and let there be two sets of input samples of size $N$ and $M$, as in Section 2.1.1. Then the simple MC estimator for the variance is
$$\hat{\sigma}^2_N = \frac{1}{N-1} \sum_{i=1}^{N} \left( f(\bm{x}_i) - \sum_{j=1}^{N} \frac{f(\bm{x}_j)}{N} \right)^2, \tag{9}$$
which is unbiased and has an error
$$\mathrm{MSE}\left(\hat{\sigma}^2_N, \mathrm{Var}[f]\right) = \frac{1}{N} \left( m_4[f] - \frac{N-3}{N-1} \mathrm{Var}^2[f] \right). \tag{10}$$
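Note that (9) is simply the standard unbiased sample variance; as a quick check against NumPy's implementation (with stand-in outputs):

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(size=100)              # stand-in for outputs f(x_1), ..., f(x_N)

manual = np.sum((y - y.mean())**2) / (len(y) - 1)   # eq. (9) written out
print(np.isclose(manual, y.var(ddof=1)))            # True
```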
Using a surrogate model $f^{(N)} \in L^4\left(\mathbb{R}^d, \phi(\bm{x})\,d\bm{x}\right)$ trained on $N$ evaluations of $f$, the variance estimator is
$$\hat{\sigma}^{2(N)}_M = \frac{1}{M-1} \sum_{i=1}^{M} \left( f^{(N)}(\bm{z}_i) - \sum_{j=1}^{M} \frac{f^{(N)}(\bm{z}_j)}{M} \right)^2, \tag{11}$$
which has an error
$$\mathrm{MSE}\left(\hat{\sigma}^{2(N)}_M, \mathrm{Var}[f]\right) = \left( \mathrm{Var}\left[f^{(N)}\right] - \mathrm{Var}[f] \right)^2 + \frac{1}{M} \left( m_4\left[f^{(N)}\right] - \frac{M-3}{M-1} \mathrm{Var}^2\left[f^{(N)}\right] \right).$$
The surrogate estimation of the variance is biased. The second term vanishes if $M \to \infty$, while the first term is affected by model bias and decays slowly due to the curse of dimensionality.
Now assume that the $N$ input samples are split into a subset of $n$ training samples, which are used to fit a surrogate $f^{(n)}$, and a subset of $N - n$ evaluation samples. Then the two-level estimator for the variance is
$$\hat{\sigma}^{2(n)}_{N-n,M} = \hat{\sigma}^{2(n)}_M + \hat{\sigma}^2_{N-n} - \hat{\sigma}^{2(n)}_{N-n}. \tag{12}$$
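In the same hypothetical setting as the mean-estimator sketch above, (12) can be assembled from three sample variances:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
d, N, n, M = 50, 60, 30, 100_000

def f(x):                             # hypothetical expensive model
    return 2.0 * x[0] - x[4] + 0.05 * np.sum(x**2)

xs = rng.normal(size=(N, d))
ys = np.apply_along_axis(f, 1, xs)
surrogate = Lasso(alpha=0.1).fit(xs[:n], ys[:n])
zs = rng.normal(size=(M, d))

var_hat = (surrogate.predict(zs).var(ddof=1)        # sigma^2(n)_M
           + ys[n:].var(ddof=1)                     # sigma^2_{N-n}
           - surrogate.predict(xs[n:]).var(ddof=1)) # sigma^2(n)_{N-n}
print(var_hat)                        # two-level estimate of Var[f]
```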
This estimator has a computational cost of $N$ and is unbiased. The error is