
result, HTE platforms rely on the use of design of experiments (DoE) algorithms, which aim to
systematically explore the design space.
Screening is a simple DoE approach in which experiments are performed at points on a discretized grid of the design space [23]; this approach is intuitive but does not scale well with the number of design variables and can ultimately lead to a significant waste of resources (conducting experiments that do not provide significant information). The central aim of advanced DoE approaches is to maximize the value provided by each experiment and ultimately reduce the number of experiments and resources used (e.g., experiment time). The value of an experiment is usually measured either by its information content (e.g., it reduces model uncertainty) or by whether it delivers a desirable outcome (e.g., it improves an economic objective) [2]. A widely used DoE approach that aims to tackle this problem is response surface methodology (RSM) [3]; this approach is generally sample-efficient (requires few experiments) but uses second-degree polynomial surrogate models that can fail to accurately capture system trends. In addition, the parameters of the RSM surrogate model are subject to uncertainty, and this uncertainty is not resolved via further experiments [12] (i.e., RSM is an open-loop DoE technique).
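As a rough illustration of these two baselines, the sketch below (not taken from this work; the two-variable test response, grid resolution, and noise level are assumptions made for illustration) evaluates a full-factorial screening grid and then fits a second-degree polynomial surrogate by least squares, in the spirit of RSM.

# Minimal sketch (illustrative, not from this work): full-factorial screening on a
# discretized grid followed by a second-degree polynomial (RSM-style) surrogate fit.
# The two-variable test response, grid resolution, and noise level are assumptions.
import itertools
import numpy as np

def run_experiment(x):
    # Hypothetical noisy "experiment" over two design variables in [0, 1].
    return -(x[0] - 0.3) ** 2 - (x[1] - 0.7) ** 2 + 0.05 * np.random.randn()

# Screening: evaluate every point of the discretized grid. The number of runs grows
# as levels**n_vars, which is why grid screening scales poorly with dimension.
levels = np.linspace(0.0, 1.0, 5)                            # 5 levels per variable
grid = np.array(list(itertools.product(levels, repeat=2)))   # 5**2 = 25 runs
y = np.array([run_experiment(x) for x in grid])

# RSM-style surrogate: least-squares fit of the full quadratic model
# y ~ b0 + b1*x1 + b2*x2 + b3*x1^2 + b4*x2^2 + b5*x1*x2
X = np.column_stack([
    np.ones(len(grid)), grid[:, 0], grid[:, 1],
    grid[:, 0] ** 2, grid[:, 1] ** 2, grid[:, 0] * grid[:, 1],
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("fitted quadratic coefficients:", np.round(beta, 3))

The scaling issue is evident from the grid size: with 10 design variables at the same 5 levels, the same screening strategy would require 5^10 (roughly 9.8 million) runs.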
Another powerful approach to DoE that aims to maximize the value of experiments is Bayesian
experimental design [5]. Recently, the machine learning (ML) community has been using variants
of this paradigm to conduct closed-loop experimental design [7]. One of the most effective varia-
tions of this paradigm is the Bayesian optimization (BO) algorithm [1]; BO has been shown to be
sample-efficient and scalable (requires minimal experiments and can explore large design spaces)
[28]. BO is widely used in applications such as experimental design, hyper-parameter tuning, and
reinforcement learning. Of particular interest is the flexibility of the BO paradigm as it is capable
of accommodating both continuous and discrete (e.g., categorical) design variables as well as con-
straints (which help encode domain knowledge and restrict the design space) [4]. Additionally,
BO uses probabilistic surrogate models (e.g., Gaussian process models), which greatly facilitate
the quantification of uncertainty and information in different regions of the design space [9]; this
feature is particularly useful in guiding experiments where information gain can be as important
as performance. BO can also be tuned to emphasize exploration (by sampling regions with high
uncertainty) over exploitation (by sampling regions with high economic performance) [17];
this trade-off is achieved by tuning the so-called acquisition function (AF), which is a composite
function that captures uncertainty and performance.
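To make this loop concrete, the following sketch (an illustration, not the implementation used in this work) pairs a Gaussian process surrogate with an upper-confidence-bound AF on a hypothetical one-dimensional objective; the parameter kappa weights predictive uncertainty against predicted performance and thus tunes the exploration/exploitation trade-off described above. The objective, design bounds, and kappa value are assumptions made for illustration.

# Minimal sketch of a sequential BO loop (illustrative, not the implementation used
# in this work): Gaussian-process surrogate plus an upper-confidence-bound AF.
# The objective, design bounds, and kappa value are assumptions for illustration.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Hypothetical expensive experiment (one design variable for readability).
    return float(np.sin(3.0 * x) + 0.5 * np.cos(5.0 * x))

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(3, 1))               # initial designs
y = np.array([objective(x[0]) for x in X])

kappa = 2.0                                           # larger kappa -> more exploration
candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)

for _ in range(10):                                   # one new design per iteration (sequential)
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    acq = mu + kappa * sigma                          # UCB: predicted performance + uncertainty
    x_next = candidates[[np.argmax(acq)]]             # maximize the AF over the candidate set
    y_next = objective(x_next[0, 0])
    X = np.vstack([X, x_next])
    y = np.append(y, y_next)

print("best observed design:", float(X[np.argmax(y), 0]), "value:", float(y.max()))

In practice, the AF is typically maximized with a numerical optimizer over the continuous design space rather than over a finite candidate set; the fixed grid here simply keeps the sketch short.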
A fundamental caveat of BO is that it is inherently a sequential algorithm (samples a single
point in the design space at each iteration), limiting its ability to exploit HTE platforms. Modi-
fications to the BO algorithm have been proposed in the literature to overcome this limitation
[10,6,15]. Relevant variants include Hyperspace partitioning [33], batch Bayesian optimization
[31], NxMCMC [26], and AF optimization over a set of exploratory designs [11]. These parallel
BO approaches have been shown to perform better than sequential BO in terms of search time