Higher-order Monte Carlo through cubic stratification Nicolas Chopin1 Mathieu Gerber2

2025-05-06 0 0 833.81KB 35 页 10玖币

侵权投诉

Higher-order Monte Carlo through cubic

stratiﬁcation

Nicolas Chopin(1), Mathieu Gerber(2)

(1) ENSAE, Institut Polytechnique de Paris, Paris, France

(2) School of Mathematics, University of Bristol, UK

We propose two novel unbiased estimators of the integral ´[0,1]sf(u)du for

a function f, which depend on a smoothness parameter r∈N. The ﬁrst

estimator integrates exactly the polynomials of degrees p<rand achieves

the optimal error n−1/2−r/s (where nis the number of evaluations of f) when

fis rtimes continuously diﬀerentiable. The second estimator is also optimal

in term of convergence rate and has the advantage to be computationally

cheaper, but it is restricted to functions that vanish on the boundary of

[0,1]s. The construction of the two estimators relies on a combination of

cubic stratiﬁcation and control variates based on numerical derivatives. We

provide numerical evidence that they show good performance even for mod-

erate values of n.

1. Introduction

1.1. Background

This paper is concerned with the construction of unbiased estimators of the integral

I(f) := ´[0,1]sf(u)du based on a certain number nof evaluations of f. The motivation

for this problem is well-known. Many quantities of interest in applied mathematics may

be expressed as such an integral. Providing random, unbiased approximations present

several practical advantages. First, it greatly facilitates the assessment of the numer-

ical error, through repeated runs. Second, such independent estimates may be gener-

ated in parallel, and then may be averaged to obtain a lower variance approximation of

I(f). Third, generating unbiased estimates as plug-in replacements is of interest in vari-

ous advanced Monte Carlo methodologies, such as pseudo-marginal sampling Andrieu

and Roberts (2009), stochastic approximation Robbins and Monro (1951) and stochastic

gradient descent. Finally, random integration algorithms converge at a faster rate than

deterministic ones Novak (1988) (but note that these convergence rates correspond to

diﬀerent criteria).

arXiv:2210.01554v2 [stat.CO] 9 Jun 2023

The most basic and well-known stochastic integration rule is the crude Monte Carlo

method, where one simulates uniformly nindependent and identically distributed variates

Ui, and returns n−1Pn

i=1 f(Ui)as an estimate of I(f). Assuming that f∈L2([0,1]s),

the root mean square error (RMSE) of this estimator converges to zero at rate n−1/2.

In this paper we consider the problem of estimating I(f)under the additional condition

that all the partial derivatives of fof order less or equal to rexist and are continuous,

or, in short, that f∈ Cr([0,1]s). Under this assumption on fit is well-known that we

can improve upon the crude Monte Carlo error rate. More precisely, for f∈ Cr([0,1]s)

the optimal convergence rate for the RMSE of an estimate b

I(f)of I(f)based on n

evaluations of fis n−1/2−r/s, in the sense that if g:N→[0,∞)is such that

∀f∈ Cr([0,1]s), n ≥1,Eh|b

I(f)− I(f)|2i1/2≤g(n)∥f∥r

(where ∥f∥ris a bound on the r-th order derivatives of f, see Section 1.4 for a proper

deﬁnition) then we must have n−1/2−r/s/g(n) = O(1) (this result can for instance be

obtained from Propositions 1-2 given in Section 2.2.4, page 55, of Novak (1988)).

Stochastic algorithms that achieve this optimal convergence rate for f∈ Cr([0,1]s)

have been proposed e.g. in Haber (1966) for r= 1 and in Haber (1967) for r∈ {1,2}. In

Haber (1969) it is shown that if b

I(·)is a stochastic quadrature (SQ) of degree r−1, that

is, if E[b

I(f)] = I(f)for all f∈L1([0,1]s)and P(b

I(f) = I(f)) = 1 if fis a polynomial of

degree p<r, then b

I(·)can be used to deﬁne an estimator of I(f)whose RMSE converges

to zero at rate n−1/2−r/s when f∈ Cr([0,1]s). In Haber (1969) a formula for a SQ of

degree r−1is given for r∈ {3,4}while, for s= 1, Siegel and O’Brien (1985) provides a

SQ of degree 2r+1 for all r≥1. For multivariate integration problems, and an arbitrary

value of r≥1, a SQ of degree r−1can be constructed from the integration method

proposed in Ermakov and Zolotukhin (1960). However, the algorithm proposed in this

reference requires to perform a sampling task which is so computationally expensive that

it is considered as almost intractable Patterson (1987).

A related approach is derived by Dick in Dick (2011), which achieves rate O(n−1/2−α+ε)

for ε > 0and a certain class of functions indexed by α(which diﬀers from Cr([0,1]s)

even when r=sα). We will go back to this point and compare our approach to Dick’s

in our numerical study.

1.2. Motivation and plan

The paper is structured as follows. We introduce in Section 2 an unbiased estimator of

I(f)which has the following three appealing properties when f∈ Cr([0,1]s). First, its

RMSE converges to zero at the optimal n−1/2−r/s rate. Second, it integrates exactly fif

fis a polynomial of degree p<r. Third, for some constant C < ∞and with probability

one, the absolute value of its estimation error is bounded by Cn−r/s, where n−r/s is the

optimal convergence rate for a deterministic integration rule (this result can for instance

be obtained from Proposition 1.3.5, page 28, of Novak (1988)). In addition, we establish

a central limit theorem (CLT) for a particular version of the proposed estimator. To the

best of our knowledge, a CLT for an estimator of I(f)having an RMSE that converges

at the optimal rate when f∈ Cr([0,1]s)exists only for r= 1 (see Bardenet and Hardy

(2020)).

In Section 3, we focus our attention on the estimation of I(f)when f∈ Cr

0([0,1]s),

where we deﬁne Cr

0([0,1]s)as the set of functions in Cr([0,1]s)whose partial derivatives of

order o≤rare all equal to zero on the boundary of [0,1]s. As we explain in that section,

this set-up is particularly relevant for solving integration problems on Rs. Restricting

our attention to Cr

0([0,1]s)⊂ Cr([0,1]s)allows us to derive an estimator of I(f), referred

to as the vanishing estimator in what follows, which is computationally cheaper than the

previous estimator, while retaining its convergence properties, namely an RMSE of size

O(n−1/2−r/s)and an actual error of size O(n−r/s)almost surely. We note that these

convergence rates are optimal for integrating a function in Cr

0([0,1]s)(again, see Sections

1.3.5 and 2.2.4 of Novak (1988)) and that an algorithm considering a similar class of

functions is proposed in Krieg and Novak (2017). The algorithm derived in this latter

reference has the advantage to achieve the optimal aforementioned convergence rates for

any r∈Nbut its implementation at reasonable computational cost remains an open

problem.

Section 4 discusses some practical details about the proposed estimators, regarding on

how their variance may be estimated and how the order of the vanishing estimator may

be selected automatically. Section 5 presents numerical experiments which conﬁrm that

the estimators converge at the expected rates, and show that they are already practical

for moderate values of n. Section 6 discusses future work. Proofs of certain technical

lemmas are deferred to Appendix C.

1.3. Connection with function approximation

As noted by e.g. Novak (2016), there is a strong connection between (unbiased) integra-

tion and function approximation. If one is able to construct an optimal approximation

An(f)of f∈ Cr([0,1]s), that is ∥f− An(f)∥∞=O(n−r/s)(see Novak (1988), page 36)

then one may derive the following unbiased estimate of I(f)

I(f) := I(An(f)) + 1

i=1

(f− An(f)) (Ui), Ui

iid

∼ U([0,1]s)(1)

which is also optimal, in the sense that its RMSE is O(n−1/2−r/s)for estimating I(f).

The (non-vanishing) estimator proposed in this paper for integrating a function f∈

Cr([0,1]s)is to some extent related to this idea, with An(f)a piecewise polynomial

approximation of fbased on local Taylor expansions in which the partial derivatives of

fare approximated using numerical diﬀerentiation techniques. Note however that we

use stratiﬁed random variables, rather than independent and identically distributed ones.

This makes the estimator easier to compute, and reduces its variance.

1.4. Notation regarding derivatives and Taylor expansions

Let Nbe the set of positive integers, and N0=N∪ {0}. For α∈Ns

0, let |α|0=

s−Ps

j=1 1{0}(αj),|α|=Ps

i=1 α,α! = Qs

i=1 αi!,uα=Qs

i=1 uαi

ifor u∈Rs. For

g∈ Cr([0,1]s)we let Dαg: [0,1]s→Rbe deﬁned by

Dαg(u) = ∂|α|

∂uα1

1. . . ∂uαs

g(u), u ∈[0,1]s,

with the convention Dαg=gif |α|= 0, and we let ∥g∥r:= maxα:|α|=r∥Dαg∥∞.

With this notation in place, we recall that if g∈ Cr([0,1]s)then, by Taylor’s theorem,

there exists a function Rg,r : [0,1]s×[0,1]s→Rsuch that (Loomis and Sternberg (1968),

Section 3.17, page 191)

g(v) =

r−1

l=0 X

α:|α|=l

(v−u)αDαg(u)

α!+Rg,r(u, v),∀u, v ∈[0,1]s(2)

where, for some τu,v ∈[0,1],

Rg,r(u, v) = X

α:|α|=r

Dαg(u+τu,v(v−u))

α!(v−u)α.(3)

1.5. Notation related to stratiﬁcation

Throughout the paper, f: [0,1]s→Rand s≥1. Our approach relies on stratifying

[0,1]sinto ksclosed hyper-cubes, k≥2, and performing a certain number lof evaluations

of finside each hyper-cube; see Figure 1. The total number of evaluations is therefore

something like n=lks, but with a value for lthat depends on the considered estimator

and other parameters such as r. Thus, we will index the proposed estimators by k, e.g

Ik(f)(or b

Ir,k(f)when it also depends on r) rather than n. We will provide the exact

expression of nalongside the deﬁnition of the considered estimator.

For c∈Rsand k≥1, we use the short-hand Bk(c)for the hyper-cube Qs

i=1[ci−

1/2k, ci+ 1/2k]; in other words, the ball with radius 1/2kand centre rwith respect to

the maximum norm.

For m∈N0let

Cm,k =2j1+ 1

2k,...,2js+ 1

2ks.t. (j1, . . . , js)∈ {−m, . . . , k +m−1}s(4)

be the set of the centres of the (k+2m)shypercubes Bk(c)whose union is equal to the set

Sm,k := [−m/k, 1+m/k]s. In Section 2, we will set m= 0 and recover the aforementioned

stratiﬁcation; in that case, we will use the short-hand Ck:= C0,k. However, in order to

deﬁne the second (vanishing) estimator in Section 3, we shall take m≥0.

To each c∈Cm,k (with m, again, ﬁxed and determined by the context), we associate

a random variable Ucsuch that

Uc∼ U −1

2k,1

2ks.

Notice that the support of c+Ucis Bk(c).

Figure 1: Stratiﬁcation of [0,1]swhen s= 2 and k= 5, and two evaluations are performed

in each of the ks= 25 squares. The location of the points are generated as in

Haber’s second estimator, which we discuss in Section 2.1.

2. Integration of functions in Cr([0,1]s)

2.1. Preliminaries: Haber’s estimators

In Haber (1966) Haber introduced the following estimator:

I1,k(f) := 1

ksX

c∈Ck

f(c+Uc), Uc∼ U −1

2k,1

2ks(5)

based on n=ksevaluations of f, which is optimal for r= 1; i.e. its RMSE is

O(n−1/2−1/s)provided f∈ C1([0,1]s). To establish this result, note that each term

f(c+Uc)has expectation ks´Bk(c)f(u)duand variance O(n−2/s), since |f(u)−f(v)|=

O(k−1) = O(n−1/s)for u, v ∈Bk(c).

We note in passing that an alternative, and closely related, estimator may be obtained

by approximating fwith the piecewise constant function fndeﬁned by

fn(u) = X

c∈Ck

f(c)1Bk(c)(u), u ∈[0,1]s

and using that particular fnin (1). Since ∥f−fn∥∞=O(n−1/s)when f∈ C1([0,1]s),

this alternative estimator is indeed optimal for r= 1. The estimator deﬁned in (5) is

however slightly more convenient to compute, and relies on only nevaluations (versus

2nfor the alternative estimator).

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

Higher-orderMonteCarlothroughcubicstratificationNicolasChopin(1),MathieuGerber(2)(1)ENSAE,InstitutPolytechniquedeParis,Paris,France(2)SchoolofMathematics,UniversityofBristol,UKWeproposetwonovelunbiasedestimatorsoftheintegral´[0,1]sf(u)duforafunctionf,whichdependonasmoothnessparameterr∈N.Thefirstesti...

展开>> 收起<<

Higher-order Monte Carlo through cubic stratification Nicolas Chopin1 Mathieu Gerber2.pdf

共35页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Higher-order Monte Carlo through cubic stratification Nicolas Chopin1 Mathieu Gerber2

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: