Bootstrap in High Dimension with Low Computation

Henry Lam 1  Zhenyuan Liu 1
Abstract
The bootstrap is a popular data-driven method to quantify statistical uncertainty, but for modern high-dimensional problems, it could suffer from huge computational costs due to the need to repeatedly generate resamples and refit models. We study the use of bootstraps in high-dimensional environments with a small number of resamples. In particular, we show that with a recent "cheap" bootstrap perspective, using a number of resamples as small as one could attain valid coverage even when the dimension grows closely with the sample size, thus strongly supporting the implementability of the bootstrap for large-scale problems. We validate our theoretical results and compare the performance of our approach with other benchmarks via a range of experiments.
1. Introduction
The bootstrap is a widely used method for statistical uncertainty quantification, notably confidence interval construction (Efron & Tibshirani, 1994; Davison & Hinkley, 1997; Shao & Tu, 2012; Hall & Martin, 1988). Its main idea is to resample data and use the distribution of resample estimates to approximate a sampling distribution. Typically, this approximation requires running many Monte Carlo replications to generate the resamples and refit models. This is affordable for classical problems, but for modern large-scale problems, this repeated fitting could impose tremendous computational concerns. This issue motivates an array of recent works to curb the computational effort, mostly through a "subsampling" perspective that fits models on smaller data sets in the bootstrap process, e.g., Kleiner et al. (2012); Lu et al. (2020); Giordano et al. (2019); Schulam & Saria (2019); Alaa & Van Der Schaar (2020).
In contrast to subsampling, we consider in this paper the reduction in bootstrap computation cost by using a smaller number of Monte Carlo replications or resamples. In particular, we target the following question: Is it possible to run a valid bootstrap for high-dimensional problems with very little Monte Carlo computation? While conventional bootstraps rely heavily on adequate resamples, recent work (Lam, 2022a;b) shows that it is possible to reduce the resampling effort dramatically, even down to one Monte Carlo replication. The rough idea of this "cheap" bootstrap is to exploit the approximate independence among the original and resample estimates, instead of their distributional closeness utilized in the conventional bootstraps. We will leverage this recent idea in this paper. However, since Lam (2022a;b) is based purely on asymptotic derivation, giving an affirmative answer to the above question requires the study of new finite-sample bounds to draw understanding on bootstrap behaviors jointly in terms of the problem dimension $p$, sample size $n$ and number of resamples $B$.

1 Department of Industrial Engineering and Operations Research, Columbia University, New York, NY, USA. Correspondence to: Henry Lam <khl2114@columbia.edu>.

Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
To this end, our main theoretical contribution in this paper is three-fold:

General Finite-Sample Bootstrap Bounds: We derive general finite-sample bounds on the coverage error of confidence intervals aggregated from $B$ resample estimates, where $B$ is small using the "cheap" bootstrap idea, and $B=\infty$ for traditional quantile-based bootstrap methods including the basic and percentile bootstraps (e.g., Davison & Hinkley (1997) Sections 5.2-5.3). Our bounds reveal that, given the same primitives on the approximate normality of the original and each resample estimate, the cheap bootstrap with fixed small $B$ achieves similar coverage error bounds as conventional bootstraps using infinite resamples. This also simultaneously recovers the main result in Lam (2022a), but is stronger in terms of the finite-sample guarantee.
Bootstrap Bounds on Function-of-Mean Models Explicit in $p$, $n$ and $B$: We specialize our general bounds above to the function-of-mean model that is customary in the high-dimensional Berry-Esseen and central limit theorem (CLT) literature (Pinelis & Molzon, 2016; Zhilova, 2020). In particular, our bounds, explicit in $p$, $n$ and $B$, conclude vanishing coverage errors for the cheap bootstrap when $p=o(n)$, for any given $B\ge 1$. Note that the function-of-mean model does not capture all interesting problems, but it has been commonly used (and in fact appears to be the only model used, for technical reasons) in deriving finite-sample CLT errors. Our bounds shed light that, at least for this wide class of models, using a small number of resamples can achieve good coverage even with a dimension $p$ growing closely with $n$.

arXiv:2210.10974v4 [stat.ME] 19 Jun 2023
Bootstrap Bounds on Linear Models Independent of $p$: We further specialize our bounds to linear functions with weaker tail conditions; these bounds have orders independent of $p$ under certain conditions on the $L_p$ norm or Orlicz norm of the linearly scaled random variable.
In addition to theoretical bounds, we investigate the empirical performance of bootstraps using few resamples on large-scale problems, including high-dimensional linear regression, high-dimensional logistic regression, computational simulation modeling, and a real-world data set RCV1-v2 (Lewis et al., 2004). To give a sense of our comparisons that support using the cheap bootstrap in high dimension, here is a general conclusion observed in our experiments: Figure 1(a) shows the coverage probabilities of $95\%$-level confidence intervals for three regression coefficients with corresponding true values $0, 2, -1$ in a 9000-dimensional linear regression (in Section 4). The cheap bootstrap coverage probabilities are close to the nominal level $95\%$ even with one resample, but the basic and percentile bootstraps only attain around $80\%$ coverage with ten resamples. In this example, one Monte Carlo replication to obtain each resample estimate takes around 4 minutes on the virtual machine e2-highmem-2 in Google Cloud Platform. Therefore, the cheap bootstrap requires only 4 minutes to obtain a statistically valid interval, whereas the standard bootstrap methods are still far from the nominal coverage even after more than a 40-minute run. Figure 1(b) shows the average interval widths. This reveals the price of a wider interval for the cheap bootstrap when the Monte Carlo budget is very small, but considering the low coverage of the other two methods and the fast decay of the cheap bootstrap width over the first few resamples, such a price appears secondary.
Notation. For a random vector $X$, we write $X^{\otimes k}$ for the $k$-th tensor power of $X$. The vector norm is taken as the usual Euclidean norm. The matrix and tensor norms are taken as the operator norm. For a square matrix $M$, $\mathrm{tr}(M)$ denotes the trace of $M$. $I_{p\times p}$ denotes the identity matrix in $\mathbb{R}^{p\times p}$ and $1_p$ denotes the vector in $\mathbb{R}^p$ whose components are all $1$. $\Phi$ denotes the cumulative distribution function of the standard normal. $C^2(\mathbb{R}^p)$ denotes the set of twice continuously differentiable functions on $\mathbb{R}^p$. Throughout the whole paper, we use $C>0$ (without subscripts) to denote a universal constant which may vary each time it appears. We use $C_1, C_2, \ldots$ to denote constants that could depend on other parameters, and we will clarify their dependence when using them.
2. Background on Bootstrap Methods

We briefly review standard bootstrap methods and, from there, the recent cheap bootstrap. Suppose we are interested in estimating a target statistical quantity $\psi := \psi(P_X)$, where $\psi(\cdot): \mathcal{P} \mapsto \mathbb{R}$ is a functional defined on the probability measure space $\mathcal{P}$. Given i.i.d. data $X_1, \ldots, X_n \in \mathbb{R}^p$ following the unknown distribution $P_X$, we denote the empirical distribution as $\hat P_{X,n}(\cdot) := (1/n)\sum_{i=1}^n I(X_i \in \cdot)$. A natural point estimator is $\hat\psi_n := \psi(\hat P_{X,n})$.
To construct a confidence interval from $\hat\psi_n$, a typical starting point is the distribution of $\hat\psi_n - \psi$, from which we can pivotize. As this distribution is unknown in general, the bootstrap idea is to approximate it using the resample counterpart, as if the empirical distribution were the true distribution. More concretely, conditional on $X_1, \ldots, X_n$, we repeatedly, say for $B$ times, resample (i.e., sample with replacement) the data $n$ times to obtain resamples $\{X^{*b}_1, \ldots, X^{*b}_n\}$, $b = 1, \ldots, B$. Denoting $\hat P^{*b}_{X,n}$ as the resample empirical distributions, we construct $B$ resample estimates $\hat\psi^{*b}_n := \psi(\hat P^{*b}_{X,n})$. Then we use the $\alpha/2$- and $(1-\alpha/2)$-th quantiles of $\hat\psi^{*b}_n - \hat\psi_n$, called $q_{\alpha/2}$ and $q_{1-\alpha/2}$, to construct $[\hat\psi_n - q_{1-\alpha/2},\ \hat\psi_n - q_{\alpha/2}]$ as a $(1-\alpha)$-level confidence interval, which is known as the basic bootstrap (Davison & Hinkley (1997) Section 5.2). Alternatively, we could also use the $\alpha/2$- and $(1-\alpha/2)$-th quantiles of $\hat\psi^{*b}_n$, say $q^*_{\alpha/2}$ and $q^*_{1-\alpha/2}$, to form $[q^*_{\alpha/2},\ q^*_{1-\alpha/2}]$, which is known as the percentile bootstrap (Davison & Hinkley (1997) Section 5.3). There are numerous other variants in the literature, such as studentization (Hall, 1988), calibration or the iterated bootstrap (Hall, 1986a; Beran, 1987), and bias correction and acceleration (Efron, 1987; DiCiccio et al., 1996; DiCiccio & Tibshirani, 1987), with the general goal of obtaining more accurate coverage.
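As a concrete illustration (a minimal sketch of ours, not code from the paper), the basic and percentile constructions above can be written as follows, with the sample mean playing the role of $\psi(\cdot)$; the function name and defaults are our own choices:

```python
import numpy as np

def basic_and_percentile_ci(data, psi=np.mean, B=1000, alpha=0.05, seed=0):
    """Basic and percentile bootstrap CIs for psi(P_X), with psi a plug-in
    functional applied to resamples drawn with replacement from the data."""
    rng = np.random.default_rng(seed)
    n = len(data)
    psi_hat = psi(data)
    # B resample estimates psi*_n^b = psi(resample empirical distribution)
    psi_star = np.array([psi(rng.choice(data, size=n, replace=True))
                         for _ in range(B)])
    # Quantiles of psi*_n^b - psi_hat give the basic interval ...
    q_lo, q_hi = np.quantile(psi_star - psi_hat, [alpha / 2, 1 - alpha / 2])
    basic = (psi_hat - q_hi, psi_hat - q_lo)
    # ... while quantiles of psi*_n^b itself give the percentile interval
    percentile = tuple(np.quantile(psi_star, [alpha / 2, 1 - alpha / 2]))
    return basic, percentile
```

Note that by construction the two intervals always have the same width; they differ only in how the quantiles are recentered around $\hat\psi_n$.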
All the above methods rely on the principle that $\hat\psi_n - \psi$ and $\hat\psi^*_n - \hat\psi_n$ (conditional on $X_1, \ldots, X_n$) are close in distribution. Typically, this means that, with a $\sqrt n$-scaling, they both converge to the same normal distribution. In contrast, the cheap bootstrap proposed in Lam (2022a;b) constructs a $(1-\alpha)$-level confidence interval via
$$\left[\hat\psi_n - t_{B,1-\alpha/2}\,S_{n,B},\ \hat\psi_n + t_{B,1-\alpha/2}\,S_{n,B}\right], \qquad (1)$$
where $S^2_{n,B} = (1/B)\sum_{b=1}^B (\hat\psi^{*b}_n - \hat\psi_n)^2$, and $t_{B,1-\alpha/2}$ is the $(1-\alpha/2)$-th quantile of $t_B$, the $t$-distribution with $B$ degrees of freedom. The quantity $S^2_{n,B}$ resembles the sample variance of the resample estimates $\hat\psi^{*b}_n$'s, in the sense that as $B \to \infty$, $S^2_{n,B}$ approaches the bootstrap variance $Var_*(\hat\psi^*_n)$ (where $Var_*(\cdot)$ denotes the variance of a resample conditional on the data). In this way, (1) reduces to the normality interval with a "plug-in" estimator of the standard error term when $B$ and $n$ are both large. However, intriguingly, $B$ does not need to be large, and $S^2_{n,B}$ is not necessarily close to
[Figure 1 here: panel (a) shows the estimated coverage probability and panel (b) the estimated confidence interval width, each versus the number of resamples, for the cheap, basic, and percentile bootstraps at coefficients βi = 0, 2, −1.]
Figure 1. Empirical coverage probabilities and confidence interval widths for different numbers of resamples in a linear regression.
the bootstrap variance. Instead, the idea is to consider the joint distribution of $\hat\psi_n - \psi,\ \hat\psi^{*1}_n - \hat\psi_n,\ \ldots,\ \hat\psi^{*B}_n - \hat\psi_n$, which is argued to be asymptotically independent as $n \to \infty$ with $B$ fixed; this subsequently allows the construction of a pivotal $t$-statistic and gives rise to (1) for fixed $B$. A more detailed explanation of the cheap bootstrap in the low-dimensional case (i.e., $p$ fixed) is given in Appendix A.
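Interval (1) is straightforward to implement. The following is a minimal sketch (ours, not from the paper) for a generic plug-in functional; the function name and defaults are illustrative:

```python
import numpy as np
from scipy import stats

def cheap_bootstrap_ci(data, psi=np.mean, B=1, alpha=0.05, seed=0):
    """Cheap bootstrap interval (1):
        [psi_hat - t_{B,1-alpha/2} S_{n,B}, psi_hat + t_{B,1-alpha/2} S_{n,B}],
    built from as few as B = 1 resample estimates."""
    rng = np.random.default_rng(seed)
    n = len(data)
    psi_hat = psi(data)
    # B resample estimates psi*_n^b from resampling the data with replacement
    psi_star = np.array([psi(rng.choice(data, size=n, replace=True))
                         for _ in range(B)])
    s_nB = np.sqrt(np.mean((psi_star - psi_hat) ** 2))   # S_{n,B}
    half = stats.t.ppf(1 - alpha / 2, df=B) * s_nB       # t_{B,1-alpha/2} S_{n,B}
    return psi_hat - half, psi_hat + half
```

With $B=1$ the quantile $t_{1,0.975} \approx 12.71$ makes the interval wide but still valid; the next section quantifies the coverage error for small fixed $B$.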
3. High-Dimensional Bootstrap Bounds

As explained in the introduction, we aim to study the coverage performance of bootstraps in high-dimensional problems, focusing on the cheap bootstrap approach that allows the use of small computational effort. We describe our results at three levels: first under general assumptions (Section 3.1), then more explicit bounds under the function-of-mean model and sub-Gaussianity of $X$ (Section 3.2), and finally bounds for a linear function under weaker tail assumptions on $X$ (Section 3.3).
3.1. General Finite-Sample Bounds

We have the following finite-sample bound for the cheap bootstrap:

Theorem 3.1. Suppose we have the finite-sample accuracy for the estimator $\hat\psi_n$:
$$\sup_{x\in\mathbb{R}}\left|P\big(\sqrt n(\hat\psi_n-\psi)\le x\big)-\Phi(x/\sigma)\right| \le \mathcal{E}_1, \qquad (2)$$
and, with probability at least $1-\beta$, the finite-sample accuracy for the bootstrap estimator $\hat\psi^*_n$:
$$\sup_{x\in\mathbb{R}}\left|P_*\big(\sqrt n(\hat\psi^*_n-\hat\psi_n)\le x\big)-\Phi(x/\sigma)\right| \le \mathcal{E}_2, \qquad (3)$$
where $\sigma>0$, $\mathcal{E}_1$ and $\mathcal{E}_2$ are deterministic quantities, and $P_*$ denotes the probability on a resample conditional on the data. Then the coverage error of (1) satisfies, for any $B\ge 1$,
$$\left|P\big(|\psi-\hat\psi_n|\le t_{B,1-\alpha/2}S_{n,B}\big)-(1-\alpha)\right| \le 2\mathcal{E}_1+2B\mathcal{E}_2+\beta. \qquad (4)$$
Condition (2) is a Berry-Esseen bound (Bentkus, 2003; Pinelis & Molzon, 2016) that gauges the normal approximation for the original estimate $\hat\psi_n$. Condition (3) manifests a similar normal approximation for the resample estimate $\hat\psi^*_n$, and has been a focus in the high-dimensional CLT literature (Zhilova, 2020; Lopes, 2022; Chernozhukov et al., 2020). Both conditions (in their asymptotic form) are commonly used to establish the validity of standard bootstrap methods. Theorem 3.1 shows that, under these conditions, the coverage error of the cheap bootstrap interval (1) with any $B\ge1$ can be controlled. Note that Theorem 3.1 is very general in the sense that there is no direct assumption on the form of $\psi(\cdot)$: all we assume is approximate normality in the sense of (2) and (3). Due to technical delicacies, in the bootstrap literature, finite-sample or higher-order coverage errors are typically obtainable only for specific models (Hall, 2013; Zhilova, 2020; Lopes, 2022; Chernozhukov et al., 2020), the most popular being the function-of-mean model (see Section 3.2) or even simply the sample mean. In contrast, the bound in Theorem 3.1, which concludes the sufficiency of using a very small $B$, is a general statement that does not depend on the delicacies of $\psi(\cdot)$. Moreover, by plugging in suitable bounds for $\mathcal{E}_1, \mathcal{E}_2, \beta$ under regularity conditions, Theorem 3.1 also recovers the main result (Theorem 1) in Lam (2022a). The detailed proof of Theorem 3.1 is in Appendix D (as are the proofs of all other theorems). Below we give a sketch of the main argument.
Proof sketch of Theorem 3.1. Step 1: We write the coverage probability as the expected value (with respect to the data) of a multiple integral with respect to the distribution of $\sqrt n(\hat\psi^*_n-\hat\psi_n)$ (denoted by $Q$, conditional on the data), i.e.,
$$P\big(|\psi-\hat\psi_n|\le t_{B,1-\alpha/2}S_{n,B}\big) = P\Bigg(\frac{|\sqrt n(\hat\psi_n-\psi)|}{\sqrt{\tfrac{1}{B}\sum_{b=1}^B\big(\sqrt n(\hat\psi^{*b}_n-\hat\psi_n)\big)^2}}\le t_{B,1-\alpha/2}\Bigg) = E\Bigg[\int I\Bigg(\frac{|\sqrt n(\hat\psi_n-\psi)|}{\sqrt{\sum_{b=1}^B z_b^2/B}}\le t_{B,1-\alpha/2}\Bigg)\,dQ(z_B)\cdots dQ(z_1)\Bigg]. \qquad (5)$$

Step 2: Suppose (3) holds and denote this event by $A$, which satisfies $P(A^c)\le\beta$. For each $b=1,\ldots,B$, given all other $z_{b'}$, $b'\ne b$, the integration region is of the form $z_b\in(-\infty,-q]\cup[q,\infty)$ for some $q\ge0$. Then we can replace the distribution $Q$ by the distribution of $N(0,\sigma^2)$ (denoted by $P_0$), with the error controlled via (3), and obtain
$$E\Bigg[\int I\Bigg(\frac{|\sqrt n(\hat\psi_n-\psi)|}{\sqrt{\sum_{b=1}^B z_b^2/B}}\le t_{B,1-\alpha/2}\Bigg)\,dQ(z_B)\cdots dQ(z_1)\Bigg] = E\Bigg[\int I\Bigg(\frac{|\sqrt n(\hat\psi_n-\psi)|}{\sqrt{\sum_{b=1}^B z_b^2/B}}\le t_{B,1-\alpha/2}\Bigg)\,dP_0(z_B)\cdots dP_0(z_1)\Bigg] + R_1, \qquad (6)$$
where $|R_1|\le 2B\mathcal{E}_2+\beta$ accounts for the error from (3) and the small-probability event $A^c$.

Step 3: Following the same logic as in Step 2, and noticing that the integration region for $\sqrt n(\hat\psi_n-\psi)$ is $[-q,q]$ for some $q\ge0$, we can also replace the distribution of $\sqrt n(\hat\psi_n-\psi)$ by the distribution $P_0$, with controlled error $|R_2|\le 2\mathcal{E}_1$ according to (2):
$$E\Bigg[\int I\Bigg(\frac{|\sqrt n(\hat\psi_n-\psi)|}{\sqrt{\sum_{b=1}^B z_b^2/B}}\le t_{B,1-\alpha/2}\Bigg)\,dP_0(z_B)\cdots dP_0(z_1)\Bigg] = \int I\Bigg(\frac{|z_0|}{\sqrt{\sum_{b=1}^B z_b^2/B}}\le t_{B,1-\alpha/2}\Bigg)\,dP_0(z_B)\cdots dP_0(z_1)\,dP_0(z_0) + R_2 = 1-\alpha+R_2, \qquad (7)$$
where the last equality holds because $z_0\big/\sqrt{\sum_{b=1}^B z_b^2/B}$, with $z_0, z_1, \ldots, z_B$ i.i.d. $N(0,\sigma^2)$, follows the $t_B$ distribution.

Step 4: Plugging (6) and (7) back into (5), we can express the coverage probability as the sum of the nominal level and a remainder term:
$$P\big(|\psi-\hat\psi_n|\le t_{B,1-\alpha/2}S_{n,B}\big) = 1-\alpha+R_1+R_2,$$
with error $|R_1+R_2|\le 2\mathcal{E}_1+2B\mathcal{E}_2+\beta$. This gives our conclusion.
Theorem 3.1 is designed to work well for small $B$ (our target scenario), but deteriorates when $B$ grows. However, in the latter case, we can strengthen the bound to cover the large-$B$ regime with additional conditions on the variance estimator (see Appendix B.1).

We compare with the standard basic and percentile bootstraps using $B=\infty$. Below is a generalization of Zhilova (2020), which focuses only on the basic bootstrap under the function-of-mean model.
Theorem 3.2. Suppose the conditions in Theorem 3.1 hold. If $q_{\alpha/2}$, $q_{1-\alpha/2}$ are the $\alpha/2$-th and $(1-\alpha/2)$-th quantiles of $\hat\psi^*_n-\hat\psi_n$ respectively, given $X_1,\ldots,X_n$, then a finite-sample bound on the basic bootstrap coverage error is
$$\big|P\big(\hat\psi_n-q_{1-\alpha/2}\le\psi\le\hat\psi_n-q_{\alpha/2}\big)-(1-\alpha)\big| \le 2\mathcal{E}_1+2\mathcal{E}_2+2\beta. \qquad (8)$$
If $q^*_{\alpha/2}$, $q^*_{1-\alpha/2}$ are the $\alpha/2$-th and $(1-\alpha/2)$-th quantiles of $\hat\psi^*_n$ respectively, given $X_1,\ldots,X_n$, then a finite-sample bound on the percentile bootstrap coverage error is
$$\big|P\big(q^*_{\alpha/2}\le\psi\le q^*_{1-\alpha/2}\big)-(1-\alpha)\big| \le 2\mathcal{E}_1+2\mathcal{E}_2+2\beta. \qquad (9)$$
In view of Theorems 3.1 and 3.2, the cheap bootstrap with any fixed $B$ can achieve the same order of coverage error bound as the basic and percentile bootstraps with $B=\infty$, in the sense that
$$(1/2)\,\mathrm{EB}_{\mathrm{Quantile}} \le \mathrm{EB}_{\mathrm{Cheap}} \le B\,\mathrm{EB}_{\mathrm{Quantile}}, \qquad (10)$$
where $\mathrm{EB}_{\mathrm{Cheap}}$ is the RHS error bound of (4) and $\mathrm{EB}_{\mathrm{Quantile}}$ is that of (8) or (9). This shows that, to attain a good coverage on par with the standard basic/percentile bootstraps, it suffices to use the cheap bootstrap with a small $B$, which could save computation dramatically.
Besides coverage, another important quality of a confidence interval is its width. To this end, note that for any fixed $B$, (3) ensures that $\sqrt n\,S_{n,B} \Rightarrow \sigma\sqrt{\chi^2_B/B}$ (unconditionally as $n\to\infty$, with proper model assumptions). Therefore, the half-width of (1) is approximately $t_{B,1-\alpha/2}\,\sigma\sqrt{\chi^2_B/(nB)}$, with expected value
$$E\left[t_{B,1-\alpha/2}\,\sigma\sqrt{\frac{\chi^2_B}{nB}}\right] = t_{B,1-\alpha/2}\,\sigma\sqrt{\frac{2}{Bn}}\,\frac{\Gamma((B+1)/2)}{\Gamma(B/2)}, \qquad (11)$$
where $\Gamma(\cdot)$ is the gamma function. Since the dimensional impact is hidden in $\sigma$, which is a common factor in the expected width as $B$ varies, we can see that $p$ does not affect the relative width behavior as $B$ changes. In particular, from (11) the inflation of the expected width relative to the case $B=\infty$ is 417.3% for $B=1$, and dramatically reduces to 94.6%, 24.8% and 10.9% for $B=2, 5, 10$, thus giving an interval with both correct coverage and short width using a small computation budget $B$.
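These inflation figures follow directly from (11): as $B\to\infty$, the expected half-width tends to $z_{1-\alpha/2}\,\sigma/\sqrt n$, so the relative inflation depends only on $B$ and $\alpha$. A quick numerical check (a sketch of ours using SciPy; the function names are illustrative):

```python
import math
from scipy import stats

def expected_halfwidth_factor(B, alpha=0.05):
    """t_{B,1-alpha/2} * sqrt(2/B) * Gamma((B+1)/2) / Gamma(B/2),
    the expected half-width of (1) in units of sigma / sqrt(n), per (11)."""
    return (stats.t.ppf(1 - alpha / 2, df=B)
            * math.sqrt(2.0 / B) * math.gamma((B + 1) / 2) / math.gamma(B / 2))

def inflation_vs_infinite_B(B, alpha=0.05):
    """Percentage width inflation relative to the B -> infinity limit,
    whose half-width factor is the normal quantile z_{1-alpha/2}."""
    z = stats.norm.ppf(1 - alpha / 2)
    return 100.0 * (expected_halfwidth_factor(B, alpha) / z - 1.0)
```

Evaluating `inflation_vs_infinite_B` at $B = 1, 2, 5, 10$ reproduces the 417.3%, 94.6%, 24.8% and 10.9% figures quoted above.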
In the next sections, we will apply Theorem 3.1 to obtain explicit bounds for specific high-dimensional models. Here, in relation to (10), we briefly comment that the order of the coverage error bounds for these models is $1/\sqrt n$, both for the cheap bootstrap (which we will derive) and for state-of-the-art high-dimensional bootstrap CLTs. This is in contrast to the typical $1/n$ coverage error of two-sided bootstrap confidence intervals in low dimension (see Hall (2013) Section 3.5 for quantile-based bootstraps and Lam (2022a) Section 3.2 for the cheap bootstrap).
3.2. Function-of-Mean Models

We now specialize to the function-of-mean model $\psi = g(\mu)$ for a mean vector $\mu = E[X] \in \mathbb{R}^p$ and a smooth function $g: \mathbb{R}^p \mapsto \mathbb{R}$, which allows us to construct more explicit bounds. The original estimate $\hat\psi_n$ and resample estimates $\hat\psi^{*b}_n$ are now given by $g(\bar X_n)$ and $g(\bar X^{*b}_n)$ respectively, where $\bar X_n$ denotes the sample mean of the data and $\bar X^{*b}_n$ denotes the resample mean of $X^{*b}_1, \ldots, X^{*b}_n$. We assume:
Assumption 3.3. The function $g \in C^2(\mathbb{R}^p)$ has Hessian matrix $H_g(x)$ with uniformly bounded eigenvalues, that is, there exists a constant $C_{H_g} > 0$ such that $\sup_{x\in\mathbb{R}^p} |a^\top H_g(x)a| \le C_{H_g}\|a\|^2$ for all $a \in \mathbb{R}^p$.

Assumption 3.4. $X$ is sub-Gaussian, i.e., there is a constant $\tau^2 > 0$ such that $E[\exp(a^\top(X-\mu))] \le \exp(\|a\|^2\tau^2/2)$ for all $a \in \mathbb{R}^p$. Furthermore, $X$ has a density bounded by a constant $C_X$, and its covariance matrix $\Sigma$ is positive definite with smallest eigenvalue $\lambda_\Sigma > 0$.
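To make the setting concrete, here is a small Monte Carlo sketch (our illustration, not an experiment from the paper) of the cheap bootstrap on a function-of-mean target satisfying both assumptions: $g(x)=\|x\|^2$ (so $H_g = 2I$, with bounded eigenvalues) and $X \sim N(1_p, I_p)$ (sub-Gaussian with bounded density); the choices of $p$, $n$, $B$ and the number of replications are arbitrary:

```python
import numpy as np
from scipy import stats

def cheap_bootstrap_coverage(p=40, n=400, B=1, alpha=0.05, reps=500, seed=0):
    """Monte Carlo coverage of the cheap bootstrap interval (1) for the
    function-of-mean target psi = g(mu) with g(x) = ||x||^2 and
    X ~ N(1_p, I_p), so that psi = ||mu||^2 = p."""
    rng = np.random.default_rng(seed)
    mu = np.ones(p)
    psi = mu @ mu                              # g(mu) = p
    t_q = stats.t.ppf(1 - alpha / 2, df=B)     # t_{B,1-alpha/2}
    hits = 0
    for _ in range(reps):
        X = rng.normal(loc=1.0, size=(n, p))
        xbar = X.mean(axis=0)
        psi_hat = xbar @ xbar                  # g(sample mean)
        # B resamples of the rows, each yielding g(resample mean)
        diffs2 = 0.0
        for _ in range(B):
            idx = rng.integers(0, n, size=n)
            rbar = X[idx].mean(axis=0)
            diffs2 += (rbar @ rbar - psi_hat) ** 2
        s_nB = np.sqrt(diffs2 / B)             # S_{n,B}
        hits += abs(psi - psi_hat) <= t_q * s_nB
    return hits / reps
```

Even with $B = 1$, the estimated coverage should sit near the nominal 95% level for moderate $p/n$, consistent with the theory developed below.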
Based on Theorem 3.1, we derive the following explicit bound:

Theorem 3.5. Suppose the function $g$ satisfies Assumption 3.3 and the random vector $X$ satisfies Assumption 3.4. Moreover, assume $\|\nabla g(\mu)\| > C_g\sqrt p$ for some constant $C_g > 0$. Then we have
$$
\begin{aligned}
&\big|P\big(|g(\mu)-g(\bar X_n)|\le t_{B,1-\alpha/2}S_{n,B}\big)-(1-\alpha)\big|\\
&\le \frac{6}{\sqrt n}+BC\Bigg(\frac{m_{31}}{\sigma^3\sqrt n}+\frac{C_{H_g}m_{31}^{1/3}\operatorname{tr}(\Sigma)}{\sigma^2\sqrt n}+\frac{C_{H_g}m_{32}^{2/3}}{n^{5/6}\sigma}+\frac{C_{H_g}m_{31}^{1/3}m_{32}^{2/3}}{n^{5/6}\sigma^2}+\frac{C_{H_g}\tau^2}{C_g\lambda_\Sigma}\Big(1+\frac{\log n}{p}\Big)\sqrt{\frac{p}{n}}\\
&\qquad+\frac{\|E[(X-\mu)^{\otimes 3}]\|}{\lambda_\Sigma^{3/2}}\frac{1}{\sqrt n}+\frac{\tau^3}{\lambda_\Sigma^{3/2}}\Big(1+\frac{\log n}{p}\Big)^{3/2}\frac{1}{\sqrt n}+\frac{\tau^4 p}{\lambda_\Sigma^2 n}\Big(1+\frac{\log n}{p}\Big)^{1/2}+\frac{\tau^2 p}{\lambda_\Sigma n}\Big(1+\frac{\log n}{p}\Big)^{1/2}+\frac{\tau^3 p}{\lambda_\Sigma^{3/2} n}\Big(1+\frac{\log n}{p}\Big)\Bigg)\\
&\quad+BC_1\Bigg(\frac{\tau^4(\log n)^{3/2}}{\lambda_\Sigma^2 n}+\frac{\tau^2(\log n)^{3/2}}{\lambda_\Sigma n}+\frac{\tau^3}{\lambda_\Sigma^{3/2} n}\Big(1+\frac{\log n}{p}\Big)^{1/2}(\log n+\log p)\sqrt{p\log n}\Bigg),
\end{aligned}
$$
where $m_{31} = E[|\nabla g(\mu)^\top(X-\mu)|^3]$, $m_{32} := E[\|X-\mu\|^3]$, $\sigma^2 = \nabla g(\mu)^\top\Sigma\nabla g(\mu)$, $C$ is a universal constant, and $C_1$ is a constant depending only on $C_X$.
Theorem 3.5 is obtained by tracing the implicit quantities in Theorem 3.1 for the function-of-mean model, via extracting the dependence on problem parameters in the Berry-Esseen theorems for the multivariate delta method (Pinelis & Molzon, 2016) and the standard bootstrap (Zhilova, 2020). In particular, the sub-Gaussian assumption is required to derive finite-sample concentration inequalities, in a similar spirit to the state-of-the-art high-dimensional CLTs (e.g., Chernozhukov et al. (2017); Lopes (2022)). On the other hand, the third moments such as $\|E[(X-\mu)^{\otimes 3}]\|$ (the operator norm of the third-order tensor $E[(X-\mu)^{\otimes 3}]$), $m_{31}$ and $m_{32}$ are due to the use of the Berry-Esseen theorem and a multivariate higher-order Berry-Esseen inequality in Zhilova (2020), which generally requires this order of moments. The bound in Theorem 3.5 can be simplified with reasonable assumptions on the involved quantities:
Corollary 3.6. Suppose the conditions in Theorem 3.5 hold. Moreover, suppose that $\tau, C_{H_g}, C_1 = O(1)$, $C_g, \lambda_\Sigma = \Theta(1)$, $\|\nabla g(\mu)\|^2 = O(p)$, $\sigma^2 = \Theta(p)$ and $\|E[(X-\mu)^{\otimes 3}]\| = O(1)$. Then as $p, n \to \infty$,
$$\big|P\big(|g(\mu)-g(\bar X_n)|\le t_{B,1-\alpha/2}S_{n,B}\big)-(1-\alpha)\big| = B\times O\Bigg(\Big(1+\frac{\log n}{p}\Big)\sqrt{\frac{p}{n}}+\frac{1}{n}\Big(1+\frac{\log n}{p}\Big)^{1/2}(\log n+\log p)\sqrt{p\log n}\Bigg).$$
Consequently, for any fixed $B \ge 1$, the cheap bootstrap confidence interval is asymptotically exact provided $p = o(n)$, i.e.,
$$\lim_{\substack{p,n\to\infty\\ p=o(n)}} P\big(|g(\mu)-g(\bar X_n)|\le t_{B,1-\alpha/2}S_{n,B}\big) = 1-\alpha.$$
In Corollary 3.6, the cheap bootstrap coverage error shrinks to $0$ as $n \to \infty$ if $p = o(n)$, i.e., the problem dimension grows more slowly than $n$ in any arbitrary fashion. Although there is no theoretical guarantee that the condition $p = o(n)$ is tight, we offer numerical evidence in Section 4, where the cheap bootstrap works with a small $B$ when $p/n = 0.09$ but fails (i.e., over-covers the target with a