An adaptive multi-fidelity sampling framework for
safety analysis of connected and automated vehicles
Xianliang Gong, Shuo Feng, Yulin Pan
Abstract—Testing and evaluation are expensive but critical
steps in the development of connected and automated vehicles
(CAVs). In this paper, we develop an adaptive sampling frame-
work to efficiently evaluate the accident rate of CAVs, particu-
larly for scenario-based tests where the probability distribution
of input parameters is known from the Naturalistic Driving Data.
Our framework relies on a surrogate model to approximate the
CAV performance and on a novel acquisition function, formulated
from information-theoretic considerations, that maximizes the benefit
(information gained about the accident rate) of the next sample. In
addition to the standard application with only a single high-
fidelity model of CAV performance, we also extend our approach
to the bi-fidelity context where an additional low-fidelity model
can be used at a lower computational cost to approximate
the CAV performance. Accordingly, for the second case, our
approach is formulated such that it allows the choice of the
next sample in terms of both fidelity level (i.e., which model to
use) and sampling location to maximize the benefit per cost. Our
framework is tested on a widely considered two-dimensional cut-in
problem for CAVs, where the Intelligent Driver Model (IDM)
with different time resolutions is used to construct the high-
and low-fidelity models. We show that our single-fidelity method
outperforms the existing approach for the same problem, and
the bi-fidelity method can further halve the computational
cost of reaching a similar accuracy in estimating the accident rate.
Index Terms—Connected and Automated Vehicles, safety anal-
ysis, multi-fidelity model, active learning
I. INTRODUCTION
CONNECTED and automated vehicles (CAVs) have
attracted increasing attention due to their potential to
improve mobility and safety while reducing energy consumption.
One critical issue in the development of CAVs is their
safety testing and evaluation. In general, obtaining converged statistics
of the accident rate may require hundreds of millions of miles of testing
for each configuration of CAVs [1]. To reduce the testing cost,
scenario-based approaches have been developed [2]–[5], with
many of them testing certain simplified driving events, e.g., the
cut-in problem. In the scenario-based framework, the scenarios
(and their distribution) describing a certain traffic environment
are parameterized from Naturalistic Driving Data (NDD).
The performance of CAVs is then evaluated for given scenarios
as the input, and the accident rate of CAVs is quantified
considering the distribution of scenarios.
Xianliang Gong and Yulin Pan are with the Department of Naval Architecture
and Marine Engineering, University of Michigan, 48109, MI, USA
(e-mail: xlgong@umich.edu; yulinpan@umich.edu)
Shuo Feng is with the Department of Automation, Tsinghua University,
Beijing 100084, China (e-mail: fshuo@tsinghua.edu.cn)
The scenario-based safety analysis, however, is far from a
trivial task. The difficulties lie in the high cost of evaluating
the CAV performance given a scenario (say, using road tests
or high-fidelity simulators) and the rareness of accidents in the
scenario space [6]. The two factors result in a large number
of required CAV performance evaluations that can become
prohibitively expensive (either computationally or financially)
if a standard Monte Carlo method is used. To address
this problem, many methods have been developed to reduce the
number of scenario evaluations in safety analysis. One category
of methods relies on importance sampling, where samples
are selected from a proposal distribution to emphasize the critical
input regions (those leading to most accidents). Different ways to
construct the proposal distribution have been developed in
[7]–[11], leading to significant acceleration compared to the
Monte Carlo method. In particular, the proposal distribution
is constructed in [7] from a parametric distribution with its
parameters determined from the cross-entropy method [12].
The method in [7] is further improved in [8] by employing
piece-wise parametric distributions (i.e., different parameters
used for different parts of input space). The proposal dis-
tribution can also be constructed via a low-fidelity model
(assumed to be associated with negligible cost), either from
the low-fidelity model evaluation itself [9], [10] or additionally
leveraging a small number of adaptively selected high-fidelity
model evaluations [11].
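The benefit of sampling from a proposal distribution can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from [7]–[11]: a one-dimensional standard normal input, a hypothetical accident criterion `x > 4`, and a proposal obtained by simply shifting the input distribution to the accident threshold.

```python
import math
import random

THRESHOLD = 4.0  # hypothetical: an "accident" occurs when the input exceeds this value


def is_accident(x):
    """Toy stand-in for an expensive CAV performance evaluation."""
    return x > THRESHOLD


def normal_pdf(x, mean=0.0, std=1.0):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def monte_carlo(n, rng):
    """Sample scenarios from the input distribution N(0, 1) and count accidents."""
    hits = sum(is_accident(rng.gauss(0.0, 1.0)) for _ in range(n))
    return hits / n


def importance_sampling(n, rng, shift=THRESHOLD):
    """Sample from the proposal N(shift, 1) and reweight by p(x) / q(x)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if is_accident(x):
            total += normal_pdf(x) / normal_pdf(x, mean=shift)
    return total / n


# Exact tail probability of the standard normal beyond THRESHOLD, for comparison.
exact = 0.5 * math.erfc(THRESHOLD / math.sqrt(2.0))

rng = random.Random(0)
mc_estimate = monte_carlo(20000, rng)
is_estimate = importance_sampling(20000, rng)

print(f"exact               : {exact:.3e}")
print(f"Monte Carlo         : {mc_estimate:.3e}")
print(f"importance sampling : {is_estimate:.3e}")
```

With an exact accident probability of roughly 3.2e-5, 20,000 plain Monte Carlo samples contain fewer than one accident on average, so that estimate is dominated by noise. The shifted proposal places about half of its samples in the accident region, and the likelihood ratio corrects for the change of distribution.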
Another category of methods in safety analysis is based on
adaptive sampling enabled by active learning methods. Under
this approach, a proposal distribution is not needed, and one
directly computes the accident rate according to the input
scenario probability with a surrogate model approximating the
CAV performance. The surrogate model can be established
through a supervised learning approach, say Gaussian process
regression, together with an adaptive sampling algorithm
that chooses the next-best sample by optimizing a predefined
acquisition function. Such a choice of the next sample
is expected to accelerate the convergence of the accident rate
computed from the updated surrogate. This class of methods
was first developed for structural reliability analysis [13]–[18]
and has recently been introduced to the CAV field [11],
[19]–[22].
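The surrogate-plus-acquisition loop described above can be sketched in a few dozen lines of standard-library Python. This is a generic illustration under stated assumptions, not the acquisition function developed in this paper or in [19]–[22]: the "expensive" model `simulate` is a hypothetical one-dimensional safety margin, the Gaussian process uses a squared-exponential kernel with fixed, hand-picked hyperparameters, and the acquisition is a U-function-style criterion of the kind used in structural reliability analysis, which selects the candidate whose predicted sign (accident or safe) is most uncertain.

```python
import math
import random


def simulate(x):
    """Hypothetical stand-in for an expensive CAV simulation.
    Returns a safety margin; a negative value counts as an accident."""
    return 2.5 - x


def kernel(a, b, length=1.0, variance=4.0):
    """Squared-exponential covariance (hyperparameters are fixed assumptions)."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular factor L with A = L L^T (pivots clamped for safety)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_sub(L, y):
    """Solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit(xs, ys, jitter=1e-6):
    """Factor the kernel matrix and precompute the weights alpha = K^{-1} y."""
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, backward_sub(L, forward_sub(L, ys))


def predict(x, xs, L, alpha):
    """Posterior mean and standard deviation of the zero-mean GP at x."""
    k = [kernel(x, a) for a in xs]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward_sub(L, k)
    var = max(kernel(x, x) - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)


rng = random.Random(1)
pool = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # scenarios drawn from the input distribution
xs = [-3.0, 0.0, 3.0]                              # small initial design
ys = [simulate(x) for x in xs]
remaining = list(pool)

for _ in range(10):                                # ten adaptive samples
    L, alpha = fit(xs, ys)

    def u_value(x):
        mean, std = predict(x, xs, L, alpha)
        return abs(mean) / std                     # small value = uncertain sign

    x_next = min(remaining, key=u_value)
    remaining.remove(x_next)
    xs.append(x_next)
    ys.append(simulate(x_next))

L, alpha = fit(xs, ys)
surrogate_rate = sum(predict(x, xs, L, alpha)[0] < 0.0 for x in pool) / len(pool)
reference_rate = sum(simulate(x) < 0.0 for x in pool) / len(pool)
print(f"surrogate estimate ({len(xs)} model runs): {surrogate_rate:.4f}")
print(f"reference ({len(pool)} model runs)      : {reference_rate:.4f}")
```

The accident rate is computed directly from scenarios drawn from the input distribution, so no proposal distribution is involved; the expensive model is called only at the 13 adaptively chosen points, while the reference value requires evaluating it on the whole pool.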
In more detail, two acquisition functions are proposed
in [19], designed respectively for Gaussian process regression
and k-nearest neighbors as surrogate models, in order
to better resolve the performance boundaries between accident
and safe scenarios. These acquisition functions combine exploration
and exploitation based on heuristic considerations of
the surrogate models. The approach in [19] is extended in [21]
by clustering samples into groups, which allows a parallel
search for optimal samples to accelerate the overall algorithm.
arXiv:2210.14114v2 [cs.RO] 31 May 2023