Batch Multi-Fidelity Active Learning with Budget
Constraints
Shibo Li, Jeff M. Phillips, Xin Yu, Robert M. Kirby, and Shandian Zhe
School of Computing, University of Utah
Salt Lake City, UT 84112
{shibo, jeffp, xiny, kirby, zhe}@cs.utah.edu
Abstract
Learning functions with high-dimensional outputs is critical in many applications, such as physical simulation and engineering design. However, collecting training examples for these applications is often costly, e.g., by running numerical solvers. The recent work of Li et al. (2022) proposes the first multi-fidelity active learning approach for high-dimensional outputs, which can acquire examples at different fidelities to reduce the cost while improving the learning performance. However, this method only queries one pair of fidelity and input at a time, and hence risks bringing in strongly correlated examples that reduce the learning efficiency. In this paper, we propose Batch Multi-Fidelity Active Learning with Budget Constraints (BMFAL-BC), which promotes the diversity of training examples to improve the benefit-cost ratio, while respecting a given budget constraint for batch queries. Hence, our method can be more practically useful. Specifically, we propose a novel batch acquisition function that measures the mutual information between a batch of multi-fidelity queries and the target function, so as to penalize highly correlated queries and encourage diversity. The optimization of the batch acquisition function is challenging in that it involves a combinatorial search over many fidelities while being subject to the budget constraint. To address this challenge, we develop a weighted greedy algorithm that can sequentially identify each (fidelity, input) pair, while achieving a near $(1 - 1/e)$-approximation of the optimum. We show the advantage of our method in several computational physics and engineering applications.
1 Introduction
Applications, such as in computational physics and engineering design, often demand that we compute a complex mapping from low-dimensional inputs to high-dimensional outputs, such as finding the optimal material layout (output) given the design parameters (input), or solving for the solution field on a mesh (output) given the PDE parameters (input). Computing these mappings is often very expensive, e.g., requiring iterative runs of numerical solvers. Hence, learning a surrogate model to directly predict the mapping, which is much faster and cheaper, is of great practical interest and importance (Kennedy and O'Hagan, 2000; Conti and O'Hagan, 2010).
However, collecting training examples for the surrogate model becomes another bottleneck, since each example still requires a costly computation. To alleviate this issue, Li et al. (2022) developed DMFAL, the first deep multi-fidelity active learning algorithm, which can acquire examples at different fidelities to reduce the cost of data collection. Low-fidelity examples are cheap to compute (e.g., with coarse meshes) yet inaccurate; high-fidelity examples are accurate but much more expensive (e.g., calculated with dense grids). See Fig. 1 for an illustration. DMFAL uses an optimization-based acquisition method to dynamically identify the input and fidelity at which to query new examples, so as to improve the learning performance, lower the sample complexity, and reduce the cost.
Equal contribution. Correspondence to: Jeff M. Phillips, Shandian Zhe.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Despite its effectiveness, DMFAL can only optimize and query one pair of input and fidelity at a time, and hence ignores the correlation between consecutive queries. As a result, it risks bringing in strongly correlated examples, which can restrict the learning efficiency and lead to a suboptimal benefit-cost ratio. In addition, the sequential querying and training strategy makes it difficult to utilize the parallel computing resources that are common nowadays (e.g., multi-core CPUs/GPUs and computer clusters) to query concurrently and further speed up learning.
In this paper, we propose BMFAL-BC, a batch multi-fidelity active learning method with budget constraints. Our method acquires a batch of multi-fidelity examples at a time to inhibit correlations among the examples and promote diversity, so as to improve the learning efficiency and the benefit-cost ratio. Our method respects a given budget when issuing batch queries, and hence is more widely applicable and practically useful. Specifically, we first propose a novel acquisition function, which measures the mutual information between a batch of multi-fidelity queries and the target function. The acquisition function not only penalizes highly correlated queries to encourage diversity, but also can be efficiently computed by a Monte-Carlo approximation. However, optimizing the acquisition function is challenging because it incurs a combinatorial search over fidelities while needing to obey the budget constraint. To address this challenge, we develop a weighted greedy algorithm. At each step, we find one pair of fidelity and input by maximizing the increment of the mutual information weighted by the cost. In this way, we avoid enumerating the fidelity combinations and greatly improve the efficiency. We prove that our greedy algorithm nearly achieves a $(1 - \frac{1}{e})$-approximation of the optimum, with a few minor caveats.
For evaluation, we examined BMFAL-BC in five real-world applications, including three benchmark tasks in physical simulation (solving the Poisson, heat, and viscous Burgers equations), a topology structure design problem, and a computational fluid dynamics (CFD) task to predict the velocity field of boundary-driven flows. We compared with the budget-aware version of DMFAL, single multi-fidelity querying with our acquisition function, and several random querying strategies. Under the same budget constraint, our method consistently outperforms the competing methods throughout the learning process, often by a large margin.
2 Background
2.1 Problem Setting
Suppose we aim to learn a mapping $f: \Omega \subset \mathbb{R}^r \rightarrow \mathbb{R}^d$, where $r$ is small but $d$ is large, e.g., hundreds of thousands. To economically learn this mapping, we collect training examples at $M$ fidelities. Each fidelity $m$ corresponds to a mapping $f_m: \Omega \rightarrow \mathbb{R}^{d_m}$. The target mapping is computed at the highest fidelity, i.e., $f(\mathbf{x}) = f_M(\mathbf{x})$. The other $f_m$ can be viewed as (rough) approximations of $f$. Note that $d_m$ is not necessarily the same as $d$ for $m < M$. For example, solving a PDE on a coarse mesh will give a lower-dimensional output (on the mesh points). However, we can interpolate it to the $d$-dimensional space to match $f(\cdot)$ (this is standard in physical simulation (Zienkiewicz et al., 1977)). Denote by $\lambda_m$ the cost of computing $f_m(\cdot)$ at fidelity $m$. We have $\lambda_1 \leq \ldots \leq \lambda_M$.
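To make the interpolation step concrete, here is a minimal sketch (our own illustration, not from the paper; the helper name `lift_to_high_fidelity` and the grid sizes are hypothetical) that lifts a coarse-mesh 2D solution onto the fine grid, so that outputs from all fidelities live in the same $d$-dimensional space:

```python
# Align output dimensions across fidelities: interpolate a flattened
# coarse-mesh solution of dimension d_m onto the fine d-dimensional grid.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def lift_to_high_fidelity(u_coarse, coarse_shape, fine_shape):
    """Interpolate a flattened coarse 2D field onto a finer 2D grid."""
    u = u_coarse.reshape(coarse_shape)
    axes = [np.linspace(0.0, 1.0, n) for n in coarse_shape]
    interp = RegularGridInterpolator(axes, u)
    fine_axes = [np.linspace(0.0, 1.0, n) for n in fine_shape]
    grid = np.stack(np.meshgrid(*fine_axes, indexing="ij"), axis=-1)
    return interp(grid).ravel()  # now lives in the d-dimensional target space

# Example: a 16x16 low-fidelity field lifted to a 64x64 (d = 4096) grid.
u_lo = np.random.rand(16 * 16)
u_hi = lift_to_high_fidelity(u_lo, (16, 16), (64, 64))
assert u_hi.shape == (64 * 64,)
```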
2.2 Deep Multi-Fidelity Active Learning (DMFAL)
To effectively estimate $f$ while reducing the cost, Li et al. (2022) proposed DMFAL, a multi-fidelity deep active learning approach. Specifically, a neural network (NN) is introduced for each fidelity $m$, where a low-dimensional hidden output $\mathbf{h}_m(\mathbf{x})$ is first generated and then projected to the high-dimensional observation space. Each NN is parameterized by $(\mathbf{A}_m, \mathbf{W}_m, \boldsymbol{\theta}_m)$, where $\mathbf{A}_m$ is the projection matrix, $\mathbf{W}_m$ is the weight matrix of the last layer, and $\boldsymbol{\theta}_m$ consists of the remaining NN parameters. The model is defined as follows,
$$\mathbf{x}_m = [\mathbf{x}; \mathbf{h}_{m-1}(\mathbf{x})], \qquad \mathbf{h}_m(\mathbf{x}) = \mathbf{W}_m \phi_{\boldsymbol{\theta}_m}(\mathbf{x}_m), \qquad \mathbf{y}_m(\mathbf{x}) = \mathbf{A}_m \mathbf{h}_m(\mathbf{x}) + \boldsymbol{\xi}_m, \qquad (1)$$
where $\mathbf{x}_m$ is the input to the NN at fidelity $m$, $\mathbf{y}_m(\mathbf{x})$ is the observed $d_m$-dimensional output, $\boldsymbol{\xi}_m \sim \mathcal{N}(\cdot \,|\, \mathbf{0}, \tau_m \mathbf{I})$ is a random noise, and $\phi_{\boldsymbol{\theta}_m}(\mathbf{x}_m)$ is the output of the second-to-last layer, which can be viewed as a nonlinear feature transformation of $\mathbf{x}_m$. Since $\mathbf{x}_m$ includes not only the original input $\mathbf{x}$ but also the hidden output from the previous fidelity, i.e., $\mathbf{h}_{m-1}(\mathbf{x})$, the model can propagate information across fidelities and capture the complex relationships (e.g., nonlinear and nonstationary) between different fidelities. The whole model is visualized in Fig. 5 of the Appendix. To estimate the posterior of the model, DMFAL uses a structural variational inference algorithm.
[Figure 1 schematic: low-fidelity PDE solvers (cheap to solve) and high-fidelity PDE solvers (costly to solve) produce solution fields; a multi-fidelity surrogate predicts the solution fields directly from the PDE parameters, e.g., for the viscous Burgers equation $u_t + u u_x - \nu u_{xx} = 0$.]
Figure 1: Illustration of the motivation and goal with physical simulation as an example. It is costly to solve
every PDE from scratch. We aim to train a multi-fidelity surrogate model to directly predict high-fidelity solution
fields given the PDE parameters. To further reduce the cost of collecting the training data with numerical solvers,
we seek to develop multi-fidelity active learning algorithms.
A multi-variate Gaussian posterior is introduced for each weight matrix, $q(\mathbf{W}_m) = \mathcal{N}(\mathrm{vec}(\mathbf{W}_m) \,|\, \boldsymbol{\mu}_m, \boldsymbol{\Sigma}_m)$. A variational evidence lower bound (ELBO) is maximized via stochastic optimization and the reparameterization trick (Kingma and Welling, 2013).
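The following is a minimal PyTorch sketch of the forward pass in (1) (layer sizes, activations, and class names are our assumptions; the paper's actual architecture and its variational treatment of $\mathbf{W}_m$ are richer):

```python
# Each fidelity m consumes [x; h_{m-1}(x)], produces a low-dimensional hidden
# output h_m, and projects it to the d_m-dimensional observation space via A_m.
import torch
import torch.nn as nn

class FidelityBlock(nn.Module):
    def __init__(self, in_dim, hidden, h_dim, obs_dim):
        super().__init__()
        self.phi = nn.Sequential(              # feature transform phi_{theta_m}
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh())
        self.W = nn.Linear(hidden, h_dim, bias=False)   # last-layer weights W_m
        self.A = nn.Linear(h_dim, obs_dim, bias=False)  # projection matrix A_m

    def forward(self, xm):
        h = self.W(self.phi(xm))   # h_m(x) = W_m phi_{theta_m}(x_m)
        return h, self.A(h)        # (hidden output, noiseless mean of y_m(x))

class MultiFidelityNet(nn.Module):
    def __init__(self, r, h_dim, obs_dims, hidden=64):
        super().__init__()
        blocks, in_dim = [], r
        for d_m in obs_dims:
            blocks.append(FidelityBlock(in_dim, hidden, h_dim, d_m))
            in_dim = r + h_dim                 # next input is [x; h_m(x)]
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        ys, xm = [], x
        for block in self.blocks:
            h, y = block(xm)
            ys.append(y)
            xm = torch.cat([x, h], dim=-1)     # propagate info across fidelities
        return ys                              # predictions at every fidelity

net = MultiFidelityNet(r=5, h_dim=20, obs_dims=[256, 1024, 4096])
ys = net(torch.randn(8, 5))  # batch of 8 inputs, outputs at 3 fidelities
```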
To conduct active learning, DMFAL views the most valuable example at each fidelity $m$ as the one that can best help its prediction at the highest fidelity $M$ (i.e., the target function). Accordingly, the acquisition function is defined as
$$a(m, \mathbf{x}) = \frac{1}{\lambda_m} \mathbb{I}\big(\mathbf{y}_m(\mathbf{x}), \mathbf{y}_M(\mathbf{x}) \,\big|\, \mathcal{D}\big) = \frac{1}{\lambda_m} \big(H(\mathbf{y}_m | \mathcal{D}) + H(\mathbf{y}_M | \mathcal{D}) - H(\mathbf{y}_m, \mathbf{y}_M | \mathcal{D})\big), \qquad (2)$$
where $\mathbb{I}(\cdot, \cdot)$ is the mutual information, $H(\cdot)$ is the entropy, and $\mathcal{D}$ is the current training dataset.
The computation of the acquisition function is quite challenging because $\mathbf{y}_m$ and $\mathbf{y}_M$ are both high-dimensional. To address this issue, DMFAL takes advantage of the fact that each low-dimensional output $\mathbf{h}_m(\mathbf{x})$ is a nonlinear function of the random weight matrices $\{\mathbf{W}_1, \ldots, \mathbf{W}_m\}$. Based on the variational posterior $\{q(\mathbf{W}_j)\}$, DMFAL uses the multi-variate delta method (Oehlert, 1992; Bickel and Doksum, 2015) to estimate the mean and covariance of $\widehat{\mathbf{h}} = [\mathbf{h}_m; \mathbf{h}_M]$, with which it constructs a joint Gaussian posterior approximation for $\widehat{\mathbf{h}}$ by moment matching. Then, from the projection in (1), we can derive a Gaussian posterior for $[\mathbf{y}_m; \mathbf{y}_M]$. By further applying the Weinstein-Aronszajn identity (Kato, 2013), we can compute the entropy terms in (2) conveniently and efficiently.
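As a small numerical illustration of (2), the sketch below computes the mutual information of a joint Gaussian over $[\mathbf{y}_m; \mathbf{y}_M]$ directly from its covariance (a naive version for clarity, with a randomly generated covariance; DMFAL instead obtains the covariance via the delta method and evaluates the log-determinants through the Weinstein-Aronszajn identity, so it never factorizes the full high-dimensional matrices):

```python
# Gaussian entropies give I(y_m, y_M) = H(y_m) + H(y_M) - H(y_m, y_M).
import numpy as np

def gaussian_entropy(cov):
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

def acquisition(cov_joint, dm, cost):
    """a(m, x): mutual information of the joint Gaussian, divided by lambda_m."""
    cov_m = cov_joint[:dm, :dm]    # marginal covariance of y_m
    cov_M = cov_joint[dm:, dm:]    # marginal covariance of y_M
    mi = (gaussian_entropy(cov_m) + gaussian_entropy(cov_M)
          - gaussian_entropy(cov_joint))
    return mi / cost

# Example with a random positive-definite joint covariance:
# y_m has 10 dims, y_M has 20 dims, lambda_m = 2.
rng = np.random.default_rng(0)
L = rng.standard_normal((30, 30))
cov = L @ L.T + 1e-3 * np.eye(30)
print(acquisition(cov, dm=10, cost=2.0))
```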
Each time, DMFAL maximizes the acquisition function (2) to identify a pair of input and fidelity at which to query. The new example is added into $\mathcal{D}$, and the deep multi-fidelity model is retrained and updated. The process repeats until the maximum number of training examples has been queried or other stopping criteria are met.
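Schematically, the loop looks as follows (a runnable toy with stand-in stubs for the acquisition optimizer, solver query, and retraining step; none of these are the paper's actual components):

```python
# DMFAL outer loop: optimize the acquisition, query, retrain, repeat.
import random

def dmfal_loop(acquire, query, retrain, model, data, max_examples):
    while len(data) < max_examples:
        m, x = acquire(model, data)   # optimize Eq. (2): best (fidelity, input)
        y = query(m, x)               # run the numerical solver at fidelity m
        data.append((m, x, y))
        model = retrain(model, data)  # update the variational posterior
    return model, data

# Toy stand-ins so the loop runs end to end.
acquire = lambda model, data: (random.randint(0, 1), random.random())
query = lambda m, x: x ** (m + 1)
retrain = lambda model, data: model
model, data = dmfal_loop(acquire, query, retrain,
                         model=None, data=[], max_examples=5)
```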
3 Batch Multi-Fidelity Active Learning with Budget Constraints
While effective, DMFAL can only identify and query one input and fidelity at a time, hence ignoring the correlations between successive queries. As a result, highly correlated examples are more likely to be acquired and incorporated into the training set. Suppose we have found $(\mathbf{x}, m)$ that maximizes the acquisition function (2). It is often the case that a single example will not cause a significant change of the model posterior $\{q(\mathbf{W}_m)\}$ (especially when the current dataset $\mathcal{D}$ is not very small). When we optimize the acquisition function again, we are likely to obtain a new pair $(\widehat{\mathbf{x}}, \widehat{m})$ that is close to $(\mathbf{x}, m)$ (e.g., $\widehat{m} = m$ and $\widehat{\mathbf{x}}$ close to $\mathbf{x}$). Accordingly, the queried example will be highly correlated with the previous one. The redundant information within these examples can restrict the learning efficiency and demand more queries (and cost) to achieve satisfactory performance. Note that a similar issue has been raised in other single-fidelity, pool-based active learning works, e.g., (Geifman and El-Yaniv, 2017; Kirsch et al., 2019); see Sec. 4.
To overcome this problem, we propose BMFAL-BC, a batch multi-fidelity active learning approach that reduces the correlations and promotes the diversity of training examples, so that we can improve the learning performance, lower the sample complexity, and better save costs. In addition, we take into account the budget constraint in querying the batch examples, which is common in practice (e.g., in cloud or high-performance computing) (Mendes et al., 2020). Under the given budget, we aim to find a batch of multi-fidelity queries that improves the benefit-cost ratio as much as possible.
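To sketch the idea before the formal development, the toy code below implements a cost-weighted greedy selection under a budget (the function names and the mutual-information estimator `mi_of_batch` are placeholders, not the paper's implementation; in BMFAL-BC the batch mutual information is estimated by Monte-Carlo approximation):

```python
# Weighted greedy batch selection: repeatedly pick the (fidelity, input) pair
# whose mutual-information increment per unit cost is largest, until the
# budget is exhausted.
def greedy_batch(candidates, costs, mi_of_batch, budget):
    """candidates: list of (m, x) pairs; costs[m] = lambda_m."""
    batch, spent, base_mi = [], 0.0, 0.0
    while True:
        best, best_score = None, -float("inf")
        for pair in candidates:
            m, _ = pair
            if pair in batch or spent + costs[m] > budget:
                continue  # respect the budget constraint
            gain = mi_of_batch(batch + [pair]) - base_mi
            score = gain / costs[m]  # MI increment weighted by the cost
            if score > best_score:
                best, best_score = pair, score
        if best is None:
            return batch             # budget exhausted or no candidates left
        batch.append(best)
        spent += costs[best[0]]
        base_mi = mi_of_batch(batch)

# Toy usage: 2 fidelities, 4 candidate inputs each, diminishing-returns MI.
costs = {0: 1.0, 1: 3.0}
cands = [(m, i) for m in costs for i in range(4)]
mi = lambda batch: sum(costs[m] for m, _ in batch) ** 0.5
print(greedy_batch(cands, costs, mi, budget=5.0))
```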