An adaptive multi-fidelity sampling framework for
safety analysis of connected and automated vehicles
Xianliang Gong, Shuo Feng, Yulin Pan
Abstract—Testing and evaluation are expensive but critical steps in the development of connected and automated vehicles (CAVs). In this paper, we develop an adaptive sampling framework to efficiently evaluate the accident rate of CAVs, particularly for scenario-based tests where the probability distribution of input parameters is known from the Naturalistic Driving Data. Our framework relies on a surrogate model to approximate the CAV performance and a novel acquisition function to maximize the benefit (information gain on the accident rate) of the next sample, formulated through an information-theoretic consideration. In addition to the standard application with only a single high-fidelity model of CAV performance, we also extend our approach to the bi-fidelity context, where an additional low-fidelity model can be used at a lower computational cost to approximate the CAV performance. Accordingly, for the second case, our approach is formulated such that it allows the choice of the next sample in terms of both fidelity level (i.e., which model to use) and sampling location to maximize the benefit per cost. Our framework is tested in a widely considered two-dimensional cut-in problem for CAVs, where Intelligent Driving Models (IDM) with different time resolutions are used to construct the high- and low-fidelity models. We show that our single-fidelity method outperforms the existing approach for the same problem, and that the bi-fidelity method can further save half of the computational cost to reach a similar accuracy in estimating the accident rate.

Index Terms—Connected and Automated Vehicles, safety analysis, multi-fidelity model, active learning
I. INTRODUCTION
Connected and automated vehicles (CAVs) have attracted increasing attention due to their potential to improve mobility and safety while reducing energy consumption. One critical issue for the development of CAVs is their
safety testing and evaluation. In general, converged statistics
of accident rate may require hundreds of millions of miles for
each configuration of CAVs [1]. To reduce the testing cost,
scenario-based approaches have been developed [2]–[5], with many of them testing certain simplified driving events, e.g., the cut-in problem. In the scenario-based framework, the scenarios (and their distribution) describing a certain traffic environment are parameterized from the Naturalistic Driving Data (NDD).
The performance of CAVs is then evaluated for given scenarios
as the input, and the accident rate of CAVs is quantified
considering the distribution of scenarios.
Xianliang Gong and Yulin Pan are with the Department of Naval Architecture and Marine Engineering, University of Michigan, MI 48109, USA (e-mail: xlgong@umich.edu; yulinpan@umich.edu). Shuo Feng is with the Department of Automation, Tsinghua University, Beijing 100084, China (e-mail: fshuo@tsinghua.edu.cn).

The scenario-based safety analysis, however, is far from a trivial task. The difficulties lie in the high cost of evaluating
the CAV performance given a scenario (say, using road tests
or high-fidelity simulators) and the rareness of accidents in the
scenario space [6]. The two factors result in a large number
of required CAV performance evaluations that can become
prohibitively expensive (either computationally or financially)
if a standard Monte Carlo method is used. In order to address
the problem, many methods have been developed to reduce the
number of scenario evaluations in safety analysis. One category of methods relies on importance sampling, where samples
are selected from a proposal distribution to stress the critical
input regions (leading to most accidents). Different ways to
construct the proposal distribution have been developed in
[7]–[11], leading to significant acceleration compared to the
Monte Carlo method. In particular, the proposal distribution
is constructed in [7] from a parametric distribution with its
parameters determined from the cross-entropy method [12].
The method in [7] is further improved in [8] by employing
piece-wise parametric distributions (i.e., different parameters
used for different parts of the input space). The proposal distribution can also be constructed via a low-fidelity model
(assumed to be associated with negligible cost), either from
the low-fidelity model evaluation itself [9], [10] or additionally
leveraging a small number of adaptively selected high-fidelity
model evaluations [11].
Another category of methods in safety analysis is based on adaptive sampling enabled by active learning. Under this approach, a proposal distribution is not needed, and one directly computes the accident rate according to the input scenario probability with a surrogate model approximating the CAV performance. The surrogate model can be established through a supervised learning approach, say Gaussian process regression, together with an adaptive sampling algorithm to choose the next-best sample through optimization of a predefined acquisition function. Such a choice of the next sample
is expected to accelerate the convergence of the accident rate
computed from the updated surrogate. This class of methods was first developed for structural reliability analysis [13]–[18] and has recently been introduced to the CAV field [11], [19]–[22].
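The adaptive-sampling procedure described above (fit a surrogate, maximize an acquisition function, query the expensive model, repeat) can be sketched as a generic skeleton. The interfaces below — `fit_surrogate`, `acquisition`, and the random candidate-pool maximization — are illustrative placeholders under assumed signatures, not the specific algorithms of any cited work.

```python
import numpy as np

def active_learning_loop(f_high, fit_surrogate, acquisition, x_init, n_iter, bounds, rng):
    """Generic active-learning skeleton: fit surrogate, pick next sample, query model.

    f_high: expensive model mapping a scenario vector to a performance value.
    fit_surrogate: callable (X, Y) -> surrogate object (placeholder interface).
    acquisition: callable (surrogate, candidate) -> score to maximize (placeholder).
    bounds: (d, 2) array of lower/upper bounds of the scenario space.
    """
    X = list(x_init)
    Y = [f_high(x) for x in X]
    for _ in range(n_iter):
        surrogate = fit_surrogate(np.array(X), np.array(Y))
        # Maximize the acquisition over a random candidate pool (for illustration;
        # a real implementation would use a proper optimizer).
        candidates = rng.uniform(bounds[:, 0], bounds[:, 1], size=(256, bounds.shape[0]))
        scores = np.array([acquisition(surrogate, c) for c in candidates])
        x_next = candidates[int(np.argmax(scores))]
        X.append(x_next)
        Y.append(f_high(x_next))
    return np.array(X), np.array(Y)
```

The loop accumulates one expensive evaluation per iteration; the quality of the acquisition function determines how quickly the estimate of interest converges.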
To provide more details, two acquisition functions are proposed in [19], respectively designed for Gaussian process regression and k-nearest neighbors as surrogate models, in order
to better resolve performance boundaries between accidents
and safe scenarios. These acquisition functions combine exploration and exploitation under some heuristic consideration of
the surrogate models. The approach in [19] is extended in [21]
by clustering samples into different groups that allow a parallel
search of optimal samples to accelerate the overall algorithm.
arXiv:2210.14114v2 [cs.RO] 31 May 2023
TABLE I
SUMMARY OF EXISTING ADAPTIVE SAMPLING METHODS AND CURRENT WORK

Work         | Surrogate¹          | Acquisition                                                                 | Objective
[19], [21]   | GPR                 | Heuristic combination of exploitation (large gradient of GPR mean) and exploration (large uncertainty of GPR) | Performance boundary
             | KNN                 | Heuristic combination of exploitation (large variance among neighboring samples) and exploration (large distance to neighboring samples) | Performance boundary
[20]         | GPR, KNN, XGB, etc. | Heuristic combination of exploitation (poor performance) and exploration (large distance to all existing samples) | Accident scenarios
             |                     | Heuristic combination of exploitation (performance close to accident threshold) and exploration (large distance to all existing samples) | Performance boundary
[11]         | GPR                 | Reducing variance of the proposal distribution                              | Optimal proposal distribution for importance sampling
[22], [23]   | GPR                 | Squared change of accident rate after adding a hypothetical sample          | Accident rate
Current work | GPR                 | Approximation of the expected K-L divergence between current and next-step accident rate distributions after adding a hypothetical sample | Accident rate

¹ GPR, KNN, and XGB respectively stand for Gaussian process regression, k-nearest neighbors, and extreme gradient boosting.
In [20], the authors develop two acquisition functions applicable to six different surrogate models, which favor samples expected to respectively produce (i) poor performance, and (ii) performance close to the accident threshold, while remaining far from existing samples (for exploration). In these works, the proposed acquisition functions are rather empirical and cannot guarantee optimal convergence of the accident rate.
In addition, an acquisition function that directly targets the accident rate (in this sense similar to what we develop) is proposed in [22], [23], but their method is not sufficiently supported by the numerical tests provided in those papers. A summary of these existing methods is provided in Table I
(also see [5] for a more comprehensive review). A review of the state-of-the-art methods in the field shows that substantial room remains for further improvement of the sampling efficiency (i.e., reduction of the required number of samples) through a more rigorous information-theoretic approach to developing the acquisition function. Such developments are desirable not only to reduce the cost of CAV safety evaluation but also for the general field of reliability analysis.
The cost of evaluating the CAV accident rate can also be reduced by leveraging low-fidelity models applied in conjunction with the high-fidelity model. In principle, the low-
fidelity models can provide useful information on the surrogate
model (e.g., the general trend of the function) although their
own predictions may be associated with considerable errors.
For example, low-fidelity models have been used to generate
the proposal distribution for importance sampling [9], [11].
It needs to be emphasized that almost all existing works
(in the CAV field) assume that the low-fidelity models are
associated with negligible cost, i.e., the low-fidelity map from
scenario space to CAV performance can be considered as a
known function. However, in practical situations, the cost ratio
between high and low-fidelity models may not be that drastic.
Typical cases include (i) the CARLA simulator [24] versus the SUMO simulator [25], and (ii) the same simulator with fine versus coarse time resolutions. For these cases, a new adaptive-sampling
algorithm considering the cost ratio is needed, which is
expected to be able to select both the model (i.e., fidelity level)
and scenario for the next-best sample to reduce the overall cost
in the evaluation of the accident rate. Such methods are not
yet available for CAV testing.
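The per-cost selection idea sketched in this paragraph amounts to searching, over both fidelity levels and candidate scenarios, for the pair that maximizes estimated benefit per unit cost. The benefit estimators below are hypothetical placeholders (the paper's actual acquisition is information-theoretic); only the benefit-over-cost selection structure is illustrated.

```python
def select_next(candidates, benefit_h, benefit_l, c_h, c_l):
    """Pick a (fidelity, scenario) pair maximizing benefit per unit cost.

    benefit_h / benefit_l: callables estimating the benefit of sampling a
    candidate with the high/low-fidelity model (hypothetical placeholders).
    c_h / c_l: costs of one high/low-fidelity evaluation.
    """
    best = None
    for x in candidates:
        for fidelity, benefit, cost in (("high", benefit_h, c_h),
                                        ("low", benefit_l, c_l)):
            score = benefit(x) / cost  # benefit per cost for this choice
            if best is None or score > best[0]:
                best = (score, fidelity, x)
    return best[1], best[2]
```

With a finite cost ratio, a cheap low-fidelity sample can win even when its raw benefit is smaller than that of a high-fidelity sample, which is exactly the trade-off such an acquisition must capture.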
In this work, we develop an adaptive sampling algorithm in the active learning framework for safety testing and evaluation of CAVs. The novelty of our method lies in the development of an information-theoretic acquisition function that leads to very high sampling efficiency and can be extended to bi-fidelity contexts in a relatively straightforward manner.
In particular, our method is applied to two situations: (i) the
single-fidelity context where only a high-fidelity model is
available; and (ii) the bi-fidelity context where the high-to-
low model cost ratio is finite and fixed. We note that for case
(ii), our method is established using a bi-fidelity Gaussian process as the surrogate model and an acquisition function that selects the next sample (in terms of both model fidelity and traffic scenario) to maximize the information gain per cost. Both applications of our method are tested
in a widely-considered two-dimensional cut-in problem for
CAVs, with the high-fidelity model taken as the Intelligent
Driving Model (IDM) with fine time resolution. The low-
fidelity model is constructed by a coarser-time-resolution IDM
model in application (ii). We compare the performance of
our method with the state-of-the-art approaches in the CAV
field for the same problem and find that even the single-
fidelity approach can considerably outperform the existing
approaches. The method in application (ii) can further reduce
the computational cost by at least a factor of 2.
We finally remark that, to our knowledge, the method we develop here is new to the entire field of reliability analysis, and its application to other fields may prove
equally fruitful. For example, it may be applied to evaluate
the ship capsizing probability in ocean engineering [26], [27],
structural safety analysis [15], [18], the probability of extreme pandemic spikes in public health [28], and many other physical, engineering, and societal problems. Within the CAV field, our method is certainly not limited to the IDM models used in this paper as demonstrations. It can be connected to a broad range of CAV evaluation tools across on-road tests, closed-facility tests, and simulations based on various kinds of simulators (e.g., Google/Waymo's CarCraft [29], Intel's CARLA [30], Microsoft's AirSim [31], NVIDIA's Drive Constellation [32]). Among these examples, we would like to emphasize the
Fig. 1. Illustration of the cut-in scenario [11]. R and Ṙ respectively denote the range and range rate between the CAV and BV.
possible benefit of our method to the augmented-reality test environment combining a real test vehicle on the road and simulated background vehicles [33]. Due to the bi-fidelity capability
of our method, it also becomes beneficial to combine two
different tools in the above list to further improve the testing
efficiency. The extension of our method to high-dimensional
problems is also possible (see [27] for another sampling
purpose) but will not be considered in this paper.
The Python code for the algorithm, named MFGPreliability, is available on GitHub1.
TABLE II
NOTATIONS OF VARIABLES

Variable     | Notation
x ∼ p_x(x)   | Decision variables with their probability distribution
f_h, f_l     | High- and low-fidelity models mapping from decision variables to a measure of CAV performance
R_0, Ṙ_0     | Range and range rate at the cut-in moment
P_a          | Accident rate
δ            | Threshold to define an accident
D = {X, Y}   | Existing dataset with inputs X and corresponding outputs Y
k, θ         | Kernel function and its hyperparameters
U            | An upper bound of the uncertainty in estimating P_a
B            | Benefit of adding a sequential sample
c_h, c_l     | Costs of the high- and low-fidelity models
t            | Time step to simulate the IDM model
II. PROBLEM SETUP
We consider a black-box function f_h(x): ℝ^d → ℝ, with the input x a d-dimensional decision variable of a driving scenario and the output a measure of the CAV performance. A subscript h is used here to denote that the function needs to be evaluated by an expensive high-fidelity model. Taking the cut-in problem (Fig. 1) as an example, the input can be formulated as x = (R_0, Ṙ_0), where R_0 and Ṙ_0 denote the initial range and range rate between the CAV and the background vehicle (BV) at the cut-in moment t = 0 (more details in Sec. IV-A). The output is the minimum range between the two vehicles during their speed adjustment process for t ≥ 0.
The probability distribution of the input x ∼ p_x(x) is assumed to be known from the naturalistic driving data (NDD). Our objective is the evaluation of the accident rate, i.e., the probability of the output being smaller than some threshold δ (or the range between the CAV and BV being smaller than δ):

\[ P_a = \int \mathbf{1}_\delta\big(f_h(\mathbf{x})\big)\, p_x(\mathbf{x})\, \mathrm{d}\mathbf{x}, \tag{1} \]

where

\[ \mathbf{1}_\delta\big(f_h(\mathbf{x})\big) = \begin{cases} 1, & \text{if } f_h(\mathbf{x}) < \delta \\ 0, & \text{otherwise}. \end{cases} \tag{2} \]

1 https://github.com/umbrellagong/MFGPreliability
A brute-force computation of P_a calls for a large number of Monte Carlo samples in the space of x, which may become computationally prohibitive (considering the expensive evaluation of f_h and the small P_a). In this work, we seek to develop an adaptive sampling framework based on active learning, where samples are selected optimally to accelerate the convergence of the computed value of P_a. We will present algorithms for (i) single-fidelity cases, where only one model f_h is available, and (ii) bi-fidelity cases. For case (ii), we consider the practical situation that a low-fidelity model f_l with a lower but finite cost is also available to us and can provide a certain level of approximation to f_h. Making use of f_l, as will be demonstrated, can further reduce the cost of computing P_a.
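As a point of reference, the brute-force estimate of Eq. (1) simply averages the indicator of Eq. (2) over samples drawn from p_x. The Gaussian p_x and the analytical stand-in for f_h below are illustrative assumptions for a self-contained sketch, not the paper's IDM model or NDD distribution.

```python
import numpy as np

def mc_accident_rate(f_h, sample_px, delta, n, rng):
    """Brute-force Monte Carlo estimate of P_a = E[1{f_h(x) < delta}], Eq. (1)."""
    x = sample_px(n, rng)                       # draw n scenarios from p_x
    outputs = np.apply_along_axis(f_h, 1, x)    # evaluate the (expensive) model
    return float(np.mean(outputs < delta))      # fraction of "accidents", Eq. (2)

# Illustrative stand-in: the output grows with ||x||, so "accidents"
# (small output) concentrate near the origin of the scenario space.
f_h = lambda x: float(np.linalg.norm(x))
sample_px = lambda n, rng: rng.normal(0.0, 1.0, size=(n, 2))

rng = np.random.default_rng(0)
p_a = mc_accident_rate(f_h, sample_px, delta=0.5, n=100_000, rng=rng)
```

The estimator's relative error scales like 1/sqrt(n · P_a), which is exactly why a small P_a combined with an expensive f_h makes brute-force Monte Carlo prohibitive and motivates the adaptive framework of Section III.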
III. METHOD
A. Single fidelity method
We consider the single-fidelity context where only the model f_h is available. Two basic components of our active learning method are presented below: (i) an inexpensive surrogate model based on the standard Gaussian process; (ii) a new acquisition function to select the next-best sample.
1) Surrogate model by GPR: Gaussian process regression (GPR) is a probabilistic machine learning approach [34] widely used for active learning. Consider the task of inferring f_h from D = {X, Y}, which consists of n inputs X = {x_i ∈ ℝ^d}_{i=1}^{n} and the corresponding outputs Y = {f_h(x_i) ∈ ℝ}_{i=1}^{n}. In GPR, a prior, representing our beliefs over all possible functions we expect to observe, is placed on f_h as a Gaussian process f_h(x) ∼ GP(0, k(x, x′)), with zero mean and covariance function k (usually defined by a radial-basis-function kernel):

\[ k(\mathbf{x}, \mathbf{x}') = \tau^2 \exp\left( -\frac{1}{2} \sum_{j=1}^{d} \frac{(x_j - x'_j)^2}{s_j^2} \right), \tag{3} \]

where the amplitude τ² and length scales s_j are hyperparameters θ = {τ, s_j}.
Following Bayes' theorem, the posterior prediction for f_h given the dataset D can be derived to be another Gaussian:

\[ p\big(f_h(\mathbf{x}) \mid D\big) = \frac{p\big(f_h(\mathbf{x}), \mathbf{Y}\big)}{p(\mathbf{Y})} = \mathcal{N}\Big( \mathbb{E}\big(f_h(\mathbf{x}) \mid D\big),\ \mathrm{cov}\big(f_h(\mathbf{x}), f_h(\mathbf{x}') \mid D\big) \Big), \tag{4} \]

with mean and covariance respectively:

\[ \mathbb{E}\big(f_h(\mathbf{x}) \mid D\big) = k(\mathbf{x}, \mathbf{X})\, K(\mathbf{X}, \mathbf{X})^{-1}\, \mathbf{Y}, \tag{5} \]

\[ \mathrm{cov}\big(f_h(\mathbf{x}), f_h(\mathbf{x}') \mid D\big) = k(\mathbf{x}, \mathbf{x}') - k(\mathbf{x}, \mathbf{X})\, K(\mathbf{X}, \mathbf{X})^{-1}\, k(\mathbf{X}, \mathbf{x}'), \tag{6} \]
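The kernel of Eq. (3) and the posterior of Eqs. (5)–(6) can be implemented in a few lines. The data and hyperparameters below are arbitrary choices for illustration, and a small jitter is added to the training covariance for numerical stability (an implementation convention, not part of the equations).

```python
import numpy as np

def rbf_kernel(A, B, tau, s):
    """Eq. (3): k(x, x') = tau^2 exp(-0.5 * sum_j (x_j - x'_j)^2 / s_j^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) / s) ** 2
    return tau**2 * np.exp(-0.5 * d2.sum(axis=-1))

def gpr_posterior(X, Y, Xs, tau, s, jitter=1e-10):
    """Eqs. (5)-(6): posterior mean and covariance at test points Xs."""
    K = rbf_kernel(X, X, tau, s) + jitter * np.eye(len(X))  # K(X, X)
    k_sX = rbf_kernel(Xs, X, tau, s)                        # k(x, X)
    K_inv = np.linalg.inv(K)
    mean = k_sX @ K_inv @ Y                                 # Eq. (5)
    cov = rbf_kernel(Xs, Xs, tau, s) - k_sX @ K_inv @ k_sX.T  # Eq. (6)
    return mean, cov
```

In the noise-free setting the posterior mean interpolates the observed outputs, and the posterior variance vanishes at the training inputs, which is the uncertainty information the acquisition function exploits.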