
Despite its effectiveness, DMFAL can only optimize and query one input-fidelity pair at a time and hence ignores the correlation between consecutive queries. As a result, it risks bringing in strongly correlated examples, which can restrict the learning efficiency and lead to a suboptimal benefit-cost ratio. In addition, the sequential querying and training strategy makes it difficult to exploit the parallel computing resources that are common nowadays (e.g., multi-core CPUs/GPUs and computer clusters) to issue queries concurrently and further accelerate learning.
In this paper, we propose BMFAL-BC, a batch multi-fidelity active learning method with budget constraints. Our method acquires a batch of multi-fidelity examples at a time, suppressing correlations among the examples and promoting diversity so as to improve the learning efficiency and benefit-cost ratio. Our method respects a given budget in issuing batch queries, and is hence more widely applicable and practically useful. Specifically, we first propose a novel acquisition function, which measures the mutual information between a batch of multi-fidelity queries and the target function. The acquisition function not only penalizes highly correlated queries to encourage diversity, but also can be efficiently computed by a Monte-Carlo approximation. However, optimizing the acquisition function is challenging, because it incurs a combinatorial search over fidelities and meanwhile needs to obey the budget constraint. To address this challenge, we develop a weighted greedy algorithm. At each step, we find one pair of fidelity and input by maximizing the increment of the mutual information weighted by the cost. In this way, we avoid enumerating the fidelity combinations and greatly improve the efficiency. We prove that our greedy algorithm nearly achieves a $(1-\frac{1}{e})$-approximation of the optimum, with a few minor caveats.
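The cost-weighted greedy selection described above can be sketched in a few lines. The sketch below is illustrative only: `toy_information` is a hypothetical stand-in for the paper's mutual-information acquisition (here, the log-determinant of an RBF kernel matrix over the selected inputs, a common submodular proxy that likewise penalizes near-duplicate queries), and the candidate pool, costs, and budget are made-up values.

```python
import numpy as np

def toy_information(batch, lengthscale=0.5):
    """Hypothetical stand-in for the mutual-information acquisition:
    log-determinant of an RBF kernel matrix over the selected inputs,
    which is submodular and penalizes strongly correlated queries."""
    if not batch:
        return 0.0
    X = np.array([x for x, _ in batch])
    d2 = (X[:, None] - X[None, :]) ** 2
    K = np.exp(-d2 / (2 * lengthscale**2)) + 1e-6 * np.eye(len(batch))
    return np.linalg.slogdet(K)[1]

def weighted_greedy(candidates, costs, budget):
    """Pick (input, fidelity) pairs one at a time, maximizing the
    information increment per unit cost, until the budget is exhausted."""
    batch, spent = [], 0.0
    remaining = list(candidates)
    while remaining:
        base = toy_information(batch)
        best, best_score = None, -np.inf
        for cand in remaining:
            _, m = cand
            if spent + costs[m] > budget:
                continue  # this query would exceed the budget
            gain = (toy_information(batch + [cand]) - base) / costs[m]
            if gain > best_score:
                best, best_score = cand, gain
        if best is None:  # nothing affordable remains
            break
        batch.append(best)
        spent += costs[best[1]]
        remaining.remove(best)
    return batch, spent
```

Because each step only scans the affordable candidates, the combinatorial enumeration of fidelity assignments is avoided, mirroring the efficiency argument above.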
For evaluation, we examined BMFAL-BC in five real-world applications, including three benchmark
tasks in physical simulation (solving Poisson’s, Heat and viscous Burger’s equations), a topology
structure design problem, and a computational fluid dynamics (CFD) task to predict the velocity
field of boundary-driven flows. We compared with the budget-aware version of DMFAL, single
multi-fidelity querying with our acquisition function, and several random querying strategies. Under
the same budget constraint, our method consistently outperforms the competing methods throughout
the learning process, often by a large margin.
2 Background
2.1 Problem Setting
Suppose we aim to learn a mapping $f: \Omega \subseteq \mathbb{R}^r \rightarrow \mathbb{R}^d$, where $r$ is small but $d$ is large, e.g., hundreds of thousands. To economically learn this mapping, we collect training examples at $M$ fidelities. Each fidelity $m$ corresponds to a mapping $f_m: \Omega \rightarrow \mathbb{R}^{d_m}$. The target mapping is computed at the highest fidelity, i.e., $f(\mathbf{x}) = f_M(\mathbf{x})$. The other $f_m$ can be viewed as (rough) approximations of $f$. Note that $d_m$ is not necessarily the same as $d$ for $m < M$. For example, solving PDEs on a coarse mesh gives a lower-dimensional output (on the mesh points). However, we can interpolate it to the $d$-dimensional space to match $f(\cdot)$ (this is standard in physical simulation (Zienkiewicz et al., 1977)). Denote by $\lambda_m$ the cost of computing $f_m(\cdot)$ at fidelity $m$. We have $\lambda_1 \leq \dots \leq \lambda_M$.
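The interpolation step above can be illustrated with a one-dimensional toy example. All sizes and the "solution" below are illustrative assumptions, not from the paper; the point is only that a coarse-mesh output of dimension $d_m$ is mapped into the common $d$-dimensional space.

```python
import numpy as np

# A low-fidelity solver returns the solution on a coarse mesh (d_m points);
# we interpolate it onto the fine mesh (d points) so every fidelity lives
# in the same d-dimensional output space.
coarse_mesh = np.linspace(0.0, 1.0, 9)    # d_m = 9 mesh points
fine_mesh = np.linspace(0.0, 1.0, 65)     # d = 65 mesh points
u_coarse = np.sin(np.pi * coarse_mesh)    # stand-in coarse PDE solution

# Linear interpolation lifts the coarse output to the fine grid.
u_on_fine = np.interp(fine_mesh, coarse_mesh, u_coarse)  # shape (65,)
```

In practice one would use the interpolation scheme matching the discretization (e.g., bilinear on 2D grids), but the dimension-matching idea is the same.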
2.2 Deep Multi-Fidelity Active Learning (DMFAL)
To effectively estimate $f$ while reducing the cost, Li et al. (2022) proposed DMFAL, a multi-fidelity deep active learning approach. Specifically, a neural network (NN) is introduced for each fidelity $m$, where a low-dimensional hidden output $\mathbf{h}_m(\mathbf{x})$ is first generated and then projected to the high-dimensional observation space. Each NN is parameterized by $(\mathbf{A}_m, \mathbf{W}_m, \boldsymbol{\theta}_m)$, where $\mathbf{A}_m$ is the projection matrix, $\mathbf{W}_m$ is the weight matrix of the last layer, and $\boldsymbol{\theta}_m$ consists of the remaining NN parameters. The model is defined as follows,
$$\mathbf{x}_m = [\mathbf{x}; \mathbf{h}_{m-1}(\mathbf{x})], \quad \mathbf{h}_m(\mathbf{x}) = \mathbf{W}_m \phi_{\theta_m}(\mathbf{x}_m), \quad \mathbf{y}_m(\mathbf{x}) = \mathbf{A}_m \mathbf{h}_m(\mathbf{x}) + \boldsymbol{\xi}_m, \tag{1}$$
where $\mathbf{x}_m$ is the input to the NN at fidelity $m$, $\mathbf{y}_m(\mathbf{x})$ is the observed $d_m$-dimensional output, $\boldsymbol{\xi}_m \sim \mathcal{N}(\cdot|\mathbf{0}, \tau_m \mathbf{I})$ is a random noise, and $\phi_{\theta_m}(\mathbf{x}_m)$ is the output of the second-to-last layer, which can be viewed as a nonlinear feature transformation of $\mathbf{x}_m$. Since $\mathbf{x}_m$ includes not only the original input $\mathbf{x}$ but also the hidden output from the previous fidelity, i.e., $\mathbf{h}_{m-1}(\mathbf{x})$, the model can propagate information across fidelities and capture the complex relationships (e.g., nonlinear and nonstationary) between different fidelities. The whole model is visualized in Fig. 5 of the Appendix. To estimate the posterior of the model, DMFAL uses a structural variational inference algorithm. A multi-variate Gaussian
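The recursive structure of Eq. (1) can be sketched as a noise-free forward pass. Everything below is an illustrative assumption: the layer widths, output dimensions, random parameter values, and the tanh feature map $\phi_{\theta_m}$ are placeholders, and the first fidelity is taken to see only $\mathbf{x}$ (no $\mathbf{h}_0$); this is a sketch of the recursion, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
r, M = 2, 3                 # input dim, number of fidelities (illustrative)
dims_h = [4, 4, 4]          # low-dimensional hidden outputs h_m
dims_y = [16, 32, 64]       # observed output dims d_m per fidelity

def make_params(in_dim, h_dim, y_dim, width=8):
    """One NN per fidelity: theta_m -> features, W_m -> hidden output h_m,
    A_m -> projection to the high-dimensional observation space."""
    return {
        "theta": rng.standard_normal((width, in_dim)) / np.sqrt(in_dim),
        "W": rng.standard_normal((h_dim, width)) / np.sqrt(width),
        "A": rng.standard_normal((y_dim, h_dim)) / np.sqrt(h_dim),
    }

params, in_dim = [], r      # fidelity 1 sees only x
for m in range(M):
    params.append(make_params(in_dim, dims_h[m], dims_y[m]))
    in_dim = r + dims_h[m]  # fidelity m+1 sees x_m = [x; h_m(x)]

def forward(x):
    """Propagate x through all fidelities per Eq. (1), returning the
    noise-free mean of y_m(x) at every fidelity."""
    h_prev, outputs = None, []
    for m in range(M):
        xm = x if h_prev is None else np.concatenate([x, h_prev])
        feats = np.tanh(params[m]["theta"] @ xm)  # phi_{theta_m}(x_m)
        h = params[m]["W"] @ feats                # h_m(x) = W_m phi(x_m)
        outputs.append(params[m]["A"] @ h)        # A_m h_m(x)
        h_prev = h
    return outputs
```

The concatenation `[x, h_prev]` is what lets information propagate across fidelities: each fidelity's features depend nonlinearly on the previous fidelity's hidden output.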