undesirable event (a failure), and $g(x) = \mathrm{NaN}$ is an event of unspecified performance. An undefined value does not indicate that an undesirable event has occurred for a particular $x$, and therefore we wish to classify these $x$ differently to the failure events. The rate of failures is quantified using the probabilistic threshold robustness (PTR) of the system, which we define as
$$p_f = \int_{\mathcal{X}} \mathbb{1}\left[g(x) < 0 \cap g(x) \neq \mathrm{NaN}\right] p(x)\, dx, \quad (1)$$
where $\mathbb{1}$ is the indicator function and $p(x)$ is the probability density (mass) function of $x$ [12]. Eq. (1) represents the probability that the system is in the failure state, while disregarding the 'uninteresting' cases where the performance is undefined. Note that we are not attempting to model the distribution of environment variables and treat $p(x)$ as given. Estimating $p_f$ in Eq. (1) using a vanilla Monte Carlo simulation can be computationally expensive, since identifying a failure rate lower than $\epsilon$ will typically require at least $1/\epsilon$ tests [13]. In order to reduce the required number of samples, we use the Adaptive Kriging Monte Carlo Simulation (AK-MCS) algorithm, a simple active learning technique based on Gaussian processes which was shown to provide an extremely efficient evaluation of the PTR measure for previously studied problems [2], and extend it to partially undefined performance functions. We perform experiments comparing our proposed methodology to several naïve baselines on problems for which the results are known analytically. We find that our approach produces a more accurate estimation of $p_f$, and that the surrogate model is a more accurate representation of the true performance function.
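As a point of reference, the vanilla Monte Carlo baseline for Eq. (1) can be sketched as follows. The functions `g` and `sample_p` are hypothetical stand-ins (not code from this work); the NaN mask implements the $g(x) \neq \mathrm{NaN}$ condition of the indicator:

```python
import numpy as np

def mc_ptr_estimate(g, sample_p, n=100_000, seed=0):
    """Vanilla Monte Carlo estimate of Eq. (1):
    p_f = P(g(x) < 0 and g(x) is defined)."""
    rng = np.random.default_rng(seed)
    x = sample_p(rng, n)                     # draw x_1, ..., x_n ~ p(x)
    y = np.array([g(xi) for xi in x])
    # Indicator 1[g(x) < 0 AND g(x) != NaN]: comparisons with NaN are
    # False anyway, but the explicit mask makes the 'undefined' case visible.
    failures = (~np.isnan(y)) & (y < 0)
    return failures.mean()

# Toy problem with a known answer: g is undefined for x < 0 and the
# failure region is 0 <= x < 1, so for x ~ N(0.5, 1) the true
# p_f = Phi(0.5) - Phi(-0.5), roughly 0.383.
g = lambda x: x - 1.0 if x >= 0 else float("nan")
sample_p = lambda rng, n: rng.normal(0.5, 1.0, size=n)
p_hat = mc_ptr_estimate(g, sample_p)
```

The estimator's relative error scales as $\sqrt{(1-p_f)/(n\, p_f)}$, which is why small failure rates require $n \gtrsim 1/p_f$ samples and motivate the active learning approach below.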
2 Related Work
The PTR measure has recently become of interest in the robust optimisation literature [12]; for example, Inatsu et al. [4] show how system designs can be adjusted to optimise the measure. The measure has a much longer history in reliability engineering [14]. Moustapha et al. [6] and Teixeira et al. [15] provide reviews of active learning methods for estimating this measure. The Adaptive Kriging Monte Carlo methodology is perhaps the best known of these methods, and achieves close to state-of-the-art results [2, 3, 6]. Efficient methods of estimating the PTR measure also exist in reinforcement learning [13]. Beglerovic et al. [16] use a Bayesian optimisation approach to identify failure cases for an autonomous vehicle, but do not exhaustively search for all $x$ such that $f(x) < 0$, and also do not calculate the PTR measure, $p_f$. Similarly, Wang et al. [17] use a realistic LiDAR simulator to modify real-world LiDAR data, which can then be used to test end-to-end autonomous driving systems while searching for adversarial traffic scenarios with Bayesian optimisation. A related problem in the autonomous vehicle space is finding the most likely $x$, i.e. the $x$ with largest $p(x)$, leading to $f(x) < 0$ [18]. This is closely related to first-order methods for estimating the PTR measure [14].
Although there exists literature related to Gaussian process modelling for discontinuous targets [11], there is little literature on active learning specifically for the PTR measure for discontinuous targets.
3 Approach
We modify the AK-MCS active learning algorithm by using a different Gaussian process and acquisition function from those of Echard et al. [2]. We use a hierarchical Gaussian process model for the rule function, consisting of separate regression and classification Gaussian processes, and a modified acquisition function which minimises the catastrophic event classification error to yield an optimal surrogate model of the rule function. Otherwise, our proposed algorithm proceeds in the same way as the AK-MCS algorithm: an initial training set is chosen to train a Gaussian process, and subsequent evaluations of the performance function, $g$, are chosen iteratively by maximising a function of the Gaussian process known as the acquisition function; these evaluations are then used to retrain the Gaussian process. The algorithm terminates when the coefficient of variation (CoV) of the failure probability computed using the Gaussian process is below some threshold, and the predicted misclassification probability is also below some threshold. This algorithm is shown in Algorithm 1.
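To make the shape of this loop concrete, a minimal sketch follows. This is not our implementation: all names are hypothetical, scikit-learn's off-the-shelf GPs stand in for the hierarchical model, the acquisition is a simple ambiguity heuristic rather than the modified acquisition described above, and the stopping test is reduced to the CoV criterion alone:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import (GaussianProcessClassifier,
                                      GaussianProcessRegressor)

def ak_mcs_nan(g, x_pool, n_init=10, n_iter=15, cov_tol=0.1, seed=0):
    """Sketch of an AK-MCS-style loop for a partially undefined g.

    A classification GP predicts where g is defined, and a regression GP
    models g on the defined region. Each iteration evaluates g at the pool
    point whose failure status is most ambiguous under the surrogate.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(x_pool), n_init, replace=False)
    X, y = x_pool[idx], np.array([g(x) for x in x_pool[idx]])
    pf = 0.0
    for _ in range(n_iter):
        defined = ~np.isnan(y)               # assume the initial design hits the defined region
        reg = GaussianProcessRegressor(normalize_y=True).fit(X[defined], y[defined])
        mu, sigma = reg.predict(x_pool, return_std=True)
        if defined.all():                    # the classifier needs both classes to fit
            p_def = np.ones(len(x_pool))
        else:
            clf = GaussianProcessClassifier().fit(X, defined.astype(int))
            p_def = clf.predict_proba(x_pool)[:, 1]
        # P(failure at x) = P(g defined at x) * P(g(x) < 0 | defined)
        p_fail = p_def * norm.cdf(-mu / np.maximum(sigma, 1e-12))
        pf = p_fail.mean()
        cov = np.sqrt((1 - pf) / (pf * len(x_pool))) if pf > 0 else np.inf
        if cov < cov_tol:
            break
        new = np.argmax(np.minimum(p_fail, 1 - p_fail))  # most ambiguous point
        X = np.vstack([X, x_pool[new]])
        y = np.append(y, g(x_pool[new]))
    return pf

# Toy problem: g undefined for x < 0, failure region 0 <= x < 1,
# x uniform over the pool, so the true p_f is 0.25.
x_pool = np.linspace(-2.0, 2.0, 400).reshape(-1, 1)
g = lambda x: x[0] - 1.0 if x[0] >= 0 else float("nan")
pf = ak_mcs_nan(g, x_pool)
```

Factorising the failure probability as $P(\text{defined}) \times P(g < 0 \mid \text{defined})$ is what lets the surrogate exclude the NaN region from the indicator in Eq. (1), rather than treating undefined outputs as failures.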
Let $y^*$ be the predicted performance for the test input $x^* \in \mathcal{X}$, where $y \in \mathbb{R} \cup \{\mathrm{NaN}\}$, and let the dataset of training examples be $\mathcal{D} = \{(x_i, y_i) \mid i = 1, \ldots, n\}$. We model the predictive distribution