An adaptive multi-fidelity sampling framework for safety analysis of connected and automated vehicles

Xianliang Gong, Shuo Feng, Yulin Pan

Abstract—Testing and evaluation are expensive but critical steps in the development of connected and automated vehicles (CAVs). In this paper, we develop an adaptive sampling framework to efficiently evaluate the accident rate of CAVs, particularly for scenario-based tests where the probability distribution of input parameters is known from the Naturalistic Driving Data. Our framework relies on a surrogate model to approximate the CAV performance and a novel acquisition function, formulated through an information-theoretic consideration, to maximize the benefit (information gained about the accident rate) of the next sample. In addition to the standard application with only a single high-fidelity model of CAV performance, we also extend our approach to the bi-fidelity context where an additional low-fidelity model can be used at a lower computational cost to approximate the CAV performance. Accordingly, for the second case, our approach is formulated such that it allows the choice of the next sample in terms of both fidelity level (i.e., which model to use) and sampling location to maximize the benefit per cost. Our framework is tested in a widely-considered two-dimensional cut-in problem for CAVs, where the Intelligent Driving Model (IDM) with different time resolutions is used to construct the high and low-fidelity models. We show that our single-fidelity method outperforms the existing approach for the same problem, and the bi-fidelity method can further save half of the computational cost to reach a similar accuracy in estimating the accident rate.

Index Terms—Connected and Automated Vehicles, safety analysis, multi-fidelity model, active learning

I. INTRODUCTION

Connected and automated vehicles (CAVs) have attracted increasing attention due to their potential to improve mobility and safety while reducing energy consumption. One critical issue for the development of CAVs is their safety testing and evaluation. In general, converged statistics of the accident rate may require hundreds of millions of miles for each configuration of CAVs [1]. To reduce the testing cost, scenario-based approaches have been developed [2]–[5], with many of them testing certain simplified driving events, e.g., the cut-in problem. In the scenario-based framework, the scenarios (and their distribution) describing a certain traffic environment are parameterized from the Naturalistic Driving Data (NDD). The performance of CAVs is then evaluated for given scenarios as the input, and the accident rate of CAVs is quantified considering the distribution of scenarios.

Xianliang Gong and Yulin Pan are with the Department of Naval Architecture and Marine Engineering, University of Michigan, 48109, MI, USA (e-mail: xlgong@umich.edu; yulinpan@umich.edu).

Shuo Feng is with the Department of Automation, Tsinghua University, Beijing 100084, China (e-mail: fshuo@tsinghua.edu.cn).

The scenario-based safety analysis, however, is far from a trivial task. The difficulties lie in the high cost of evaluating the CAV performance given a scenario (say, using road tests or high-fidelity simulators) and the rareness of accidents in the scenario space [6]. The two factors result in a large number of required CAV performance evaluations that can become prohibitively expensive (either computationally or financially) if a standard Monte Carlo method is used. To address the problem, many methods have been developed to reduce the number of scenario evaluations in safety analysis. One category of methods relies on importance sampling, where samples are selected from a proposal distribution to stress the critical input regions (those leading to most accidents). Different ways to construct the proposal distribution have been developed in [7]–[11], leading to significant acceleration compared to the Monte Carlo method. In particular, the proposal distribution is constructed in [7] from a parametric distribution with its parameters determined from the cross-entropy method [12]. The method in [7] is further improved in [8] by employing piece-wise parametric distributions (i.e., different parameters used for different parts of the input space). The proposal distribution can also be constructed via a low-fidelity model (assumed to be associated with negligible cost), either from the low-fidelity model evaluation itself [9], [10] or by additionally leveraging a small number of adaptively selected high-fidelity model evaluations [11].

Another category of methods in safety analysis is based on adaptive sampling enabled by active learning. Under this approach, a proposal distribution is not needed, and one directly computes the accident rate according to the input scenario probability with a surrogate model approximating the CAV performance. The surrogate model can be established through a supervised learning approach, say a Gaussian process regression, together with an adaptive sampling algorithm to choose the next-best sample through optimization of a predefined acquisition function. Such a choice of the next sample is expected to accelerate the convergence of the accident rate computed from the updated surrogate. This class of methods was first developed for structural reliability analysis [13]–[18] and has recently been introduced to the CAV field [11], [19]–[22].

To provide more details, two acquisition functions are proposed in [19], respectively designed for Gaussian process regression and k-nearest neighbors as surrogate models, in order to better resolve performance boundaries between accidents and safe scenarios. These acquisition functions combine exploration and exploitation under some heuristic consideration of the surrogate models. The approach in [19] is extended in [21] by clustering samples into different groups that allow a parallel search of optimal samples to accelerate the overall algorithm.

arXiv:2210.14114v2 [cs.RO] 31 May 2023

TABLE I

SUMMARY OF EXISTING ADAPTIVE SAMPLING METHODS AND CURRENT WORK

| Work | Surrogate¹ | Acquisition | Objective |
|---|---|---|---|
| [19], [21] | GPR | Heuristic combination of exploitation (large gradient of GPR mean) and exploration (large uncertainty of GPR) | Performance boundary |
| [19], [21] | KNN | Heuristic combination of exploitation (large variance among neighboring samples) and exploration (large distance to neighboring samples) | Performance boundary |
| [20] | GPR, KNN, XGB, etc. | Heuristic combination of exploitation (poor performance) and exploration (large distance to all existing samples) | Accident scenarios |
| [20] | GPR, KNN, XGB, etc. | Heuristic combination of exploitation (performance close to accident threshold) and exploration (large distance to all existing samples) | Performance boundary |
| [11] | GPR | Reducing variance of the proposal distribution | Optimal proposal distribution for importance sampling |
| [22], [23] | GPR | Squared change of accident rate after adding a hypothetical sample | Accident rate |
| Current work | GPR | Approximation of the expected K-L divergence between current and next-step accident rate distribution after adding a hypothetical sample | Accident rate |

¹ GPR, KNN, and XGB respectively stand for Gaussian process regression, k-nearest neighbors, and extreme gradient boosting.

In [20], the authors develop two acquisition functions applicable to six different surrogate models, which favor samples expected to respectively produce (i) poor performance, and (ii) performance close to the accident threshold, while at the same time being far from existing samples (for exploration). In these works, the proposed acquisition functions are rather empirical and cannot guarantee optimal convergence of the accident rate. In addition, an acquisition function that directly targets the accident rate (in this sense similar to what we develop) is proposed in [22], [23], but their method is not sufficiently supported by the numerical tests provided in their papers. A summary of these existing methods is provided in Table I (also see [5] for a more comprehensive review). In view of the state-of-the-art methods in the field, it is clear that considerable room exists for further improvement of the sampling efficiency (i.e., reduction of the required number of samples) through a more rigorous information-theoretic approach to developing the acquisition function. Such developments are not only desired to reduce the cost of CAV safety evaluation but are also valuable to the general field of reliability analysis.

The cost of evaluating the CAV accident rate can also be reduced by leveraging low-fidelity models applied in conjunction with the high-fidelity model. In principle, the low-fidelity models can provide useful information on the surrogate model (e.g., the general trend of the function) although their own predictions may be associated with considerable errors. For example, low-fidelity models have been used to generate the proposal distribution for importance sampling [9], [11]. It needs to be emphasized that almost all existing works (in the CAV field) assume that the low-fidelity models are associated with negligible cost, i.e., the low-fidelity map from scenario space to CAV performance can be considered as a known function. However, in practical situations, the cost ratio between high and low-fidelity models may not be that drastic. Typical cases include (i) the CARLA simulator [24] versus the SUMO simulator [25], and (ii) the same simulator with fine versus coarse time resolutions. For these cases, a new adaptive sampling algorithm that accounts for the cost ratio is needed, one that can select both the model (i.e., fidelity level) and the scenario for the next-best sample to reduce the overall cost of evaluating the accident rate. Such methods are not yet available for CAV testing.

In this work, we develop an adaptive sampling algorithm in the active learning framework for safety testing and evaluation of CAVs. The novelty of our method lies in the development of an information-theoretic acquisition function that leads to very high sampling efficiency and can be extended to bi-fidelity contexts in a relatively straightforward manner. In particular, our method is applied to two situations: (i) the single-fidelity context where only a high-fidelity model is available; and (ii) the bi-fidelity context where the high-to-low model cost ratio is finite and fixed. We note that for case (ii), our method needs to be established by using a bi-fidelity Gaussian process as the surrogate model and an acquisition function to select the next sample (in terms of both model fidelity and traffic scenario) which maximizes information gain per cost. Both applications of our method are tested in a widely-considered two-dimensional cut-in problem for CAVs, with the high-fidelity model taken as the Intelligent Driving Model (IDM) with fine time resolution. The low-fidelity model in application (ii) is constructed from an IDM with coarser time resolution. We compare the performance of our method with the state-of-the-art approaches in the CAV field for the same problem and find that even the single-fidelity approach can considerably outperform the existing approaches. The method in application (ii) can further reduce the computational cost by at least a factor of 2.

We finally remark that the method we develop here is, to our knowledge, new to the entire field of reliability analysis, and its application to other fields may prove equally fruitful. For example, it may be applied to evaluate the ship capsizing probability in ocean engineering [26], [27], structural safety analysis [15], [18], the probability of extreme pandemic spikes for public health [28], and many other physical, engineering and societal problems. Within the CAV field, our method is certainly not limited to the IDM models used in this paper as demonstrations. It can be connected to a broad range of CAV evaluation tools across on-road tests, closed-facility tests, and simulations based on various kinds of simulators (e.g., Google/Waymo's Car-Craft [29], Intel's CARLA [30], Microsoft's Air-Sim [31], NVIDIA's Drive Constellation [32]). Among these examples, we would like to emphasize the possible benefit of our method to the augmented-reality test environment combining a real test vehicle on road and simulated background vehicles [33]. Due to the bi-fidelity capability of our method, it also becomes beneficial to combine two different tools in the above list to further improve the testing efficiency. The extension of our method to high-dimensional problems is also possible (see [27] for another sampling purpose) but will not be considered in this paper.

Fig. 1. Illustration of the cut-in scenario [11]. $R$ and $\dot{R}$ respectively denote the range and range rate between CAV and BV.

The Python code for the algorithm, named MFGPreliability, is available on GitHub¹.

TABLE II

NOTATIONS OF VARIABLES

| Variable | Notation |
|---|---|
| $\mathbf{x} \sim p_{\mathbf{x}}(\mathbf{x})$ | Decision variables with its probability distribution |
| $f_h, f_l$ | High and low-fidelity models mapping from decision variables to a measure of CAV performance |
| $R_0, \dot{R}_0$ | Range and range rate at the cut-in moment |
| $P_a$ | Accident rate |
| $\delta$ | Threshold to define an accident |
| $\mathcal{D} = \{\mathbf{X}, \mathbf{Y}\}$ | Existing dataset with inputs $\mathbf{X}$ and corresponding outputs $\mathbf{Y}$ |
| $k, \theta$ | Kernel function and its hyperparameter |
| $U$ | An upper bound of the uncertainty in estimating $P_a$ |
| $B$ | Benefit of adding a sequential sample |
| $c_h, c_l$ | Cost of high and low-fidelity models |
| $\Delta t$ | Time step to simulate the IDM model |

II. PROBLEM SETUP

We consider a black-box function $f_h(\mathbf{x}): \mathbb{R}^d \to \mathbb{R}$ with input $\mathbf{x}$ a $d$-dimensional decision variable of a driving scenario and output a measure of the CAV performance. A subscript $h$ is used here to denote that the function needs to be evaluated by an expensive high-fidelity model. Taking the cut-in problem (Fig. 1) as an example, the input can be formulated as $\mathbf{x} = (R_0, \dot{R}_0)$, where $R_0$ and $\dot{R}_0$ denote the initial range and range rate between the CAV and background vehicle (BV) at the cut-in moment $t = 0$ (more details in Sec. IV-A). The output is the minimum range between the two vehicles during their speed adjustment process for $t \geq 0$.

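The probability of the input $\mathbf{x} \sim p_{\mathbf{x}}(\mathbf{x})$ is assumed to be known from the naturalistic driving data (NDD). Our objective is the evaluation of the accident rate, i.e., the probability of the output being smaller than some threshold $\delta$ (or the range between CAV and BV being smaller than $\delta$):

$$P_a = \int \mathbf{1}_\delta(f_h(\mathbf{x}))\, p_{\mathbf{x}}(\mathbf{x})\, \mathrm{d}\mathbf{x}, \tag{1}$$

where

$$\mathbf{1}_\delta(f_h(\mathbf{x})) = \begin{cases} 1, & \text{if } f_h(\mathbf{x}) < \delta \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$

¹ https://github.com/umbrellagong/MFGPreliability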

A brute-force computation of $P_a$ calls for a large number of Monte Carlo samples in the space of $\mathbf{x}$, which may become computationally prohibitive (considering the expensive evaluation of $f_h$ and the small $P_a$). In this work, we seek to develop an adaptive sampling framework based on active learning, where samples are selected optimally to accelerate the convergence of the computed value of $P_a$. We will present algorithms for (i) single-fidelity cases, where only one model $f_h$ is available, and (ii) bi-fidelity cases. For case (ii), we consider a practical situation in which a low-fidelity model $f_l$ with a lower but finite cost is also available to us that can provide a certain level of approximation to $f_h$. Making use of $f_l$, as will be demonstrated, can further reduce the cost of computing $P_a$.
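To make the cost of the brute-force route concrete, the following is a minimal sketch of a plain Monte Carlo estimate of Eq. (1) using the indicator of Eq. (2). It is not taken from MFGPreliability: the function names, the toy stand-in for $f_h$, and the uniform input distribution are all hypothetical and chosen only for illustration. The helper `samples_for_relative_error` uses the standard binomial result that the relative standard deviation of the estimate is $\sqrt{(1 - P_a)/(N P_a)}$ for $N$ independent samples.

```python
import math
import random


def indicator(f_value, delta):
    """Eq. (2): 1 if the performance measure is below the threshold, else 0."""
    return 1 if f_value < delta else 0


def monte_carlo_accident_rate(f_h, sample_x, delta, n_samples, rng):
    """Estimate P_a in Eq. (1) by averaging the indicator over draws of x."""
    hits = sum(indicator(f_h(sample_x(rng)), delta) for _ in range(n_samples))
    return hits / n_samples


def samples_for_relative_error(p_a, rel_err):
    """Samples N so that sqrt((1 - p_a) / (N * p_a)) is about rel_err."""
    return math.ceil((1.0 - p_a) / (p_a * rel_err ** 2))


# Hypothetical stand-ins (assumptions, not the IDM or the NDD distribution).
def toy_f_h(x):
    r0, r0_dot = x
    return r0 + 2.0 * r0_dot  # toy proxy for the minimum range


def toy_sample_x(rng):
    # Toy p_x: independent uniform range and range rate.
    return (rng.uniform(0.0, 10.0), rng.uniform(-5.0, 5.0))


rng = random.Random(0)
estimate = monte_carlo_accident_rate(toy_f_h, toy_sample_x, 0.0, 20000, rng)
print("toy accident rate estimate:", estimate)
print("samples for 10% relative error at P_a = 1e-4:",
      samples_for_relative_error(1e-4, 0.1))
```

Because the required $N$ grows like $1/P_a$, and each sample costs one evaluation of $f_h$, rare accidents combined with an expensive model are exactly the regime where an adaptive choice of samples pays off.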

III. METHOD

A. Single-fidelity method

We consider the single-fidelity context where only the model $f_h$ is available. Two basic components of our active learning method are presented below: (i) an inexpensive surrogate model based on the standard Gaussian process; (ii) a new acquisition function to select the next-best sample.

1) Surrogate model by GPR: Gaussian process regression (GPR) is a probabilistic machine learning approach [34] widely used for active learning. Consider the task of inferring $f_h$ from $\mathcal{D} = \{\mathbf{X}, \mathbf{Y}\}$, which consists of $n$ inputs $\mathbf{X} = \{\mathbf{x}^i \in \mathbb{R}^d\}_{i=1}^{i=n}$ and the corresponding outputs $\mathbf{Y} = \{f_h(\mathbf{x}^i) \in \mathbb{R}\}_{i=1}^{i=n}$. In GPR, a prior, representing our beliefs over all possible functions we expect to observe, is placed on $f_h$ as a Gaussian process $f_h(\mathbf{x}) \sim \mathcal{GP}(0, k(\mathbf{x}, \mathbf{x}'))$ with zero mean and covariance function $k$ (usually defined by a radial-basis-function kernel):

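$$k(\mathbf{x}, \mathbf{x}') = \tau^2 \exp\left(-\frac{1}{2} \sum_{j=1}^{j=d} \frac{(x_j - x'_j)^2}{s_j^2}\right), \tag{3}$$

where the amplitude $\tau^2$ and length scales $s_j$ are hyperparameters $\theta = \{\tau, s_j\}$.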
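As an illustration of the kernel in Eq. (3), the sketch below evaluates a radial-basis-function kernel with amplitude $\tau^2$ and one length scale $s_j$ per input dimension. It assumes the common form in which each squared difference is divided by $s_j^2$ and the sum is multiplied by $-1/2$; the function names are hypothetical and are not taken from the authors' code.

```python
import math


def rbf_kernel(x, x_prime, tau, length_scales):
    """tau^2 * exp(-0.5 * sum_j ((x_j - x'_j) / s_j)^2), assumed form of Eq. (3)."""
    scaled_sq_dist = sum(
        ((a - b) / s) ** 2 for a, b, s in zip(x, x_prime, length_scales)
    )
    return tau ** 2 * math.exp(-0.5 * scaled_sq_dist)


def kernel_matrix(xs_a, xs_b, tau, length_scales):
    """Matrix of kernel values between every point in xs_a and every point in xs_b."""
    return [[rbf_kernel(a, b, tau, length_scales) for b in xs_b] for a in xs_a]


# Two-dimensional inputs, e.g. (range, range rate); the values are illustrative only.
points = [(10.0, -2.0), (12.0, -1.0), (30.0, 0.5)]
for row in kernel_matrix(points, points, tau=1.0, length_scales=(5.0, 1.0)):
    print(["%.3f" % v for v in row])
```

A larger $s_j$ makes the kernel less sensitive to differences along dimension $j$, so the surrogate varies more slowly in that direction.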

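Following Bayes' theorem, the posterior prediction for $f_h$ given the dataset $\mathcal{D}$ can be derived to be another Gaussian:

$$p(f_h(\mathbf{x})|\mathcal{D}) = \frac{p(f_h(\mathbf{x}), \mathbf{Y})}{p(\mathbf{Y})} = \mathcal{N}\big(\mathbb{E}(f_h(\mathbf{x})|\mathcal{D}),\ \mathrm{cov}(f_h(\mathbf{x}), f_h(\mathbf{x}')|\mathcal{D})\big), \tag{4}$$

with mean and covariance respectively:

$$\mathbb{E}(f_h(\mathbf{x})|\mathcal{D}) = k(\mathbf{x}, \mathbf{X})\, K(\mathbf{X}, \mathbf{X})^{-1}\, \mathbf{Y}, \tag{5}$$

$$\mathrm{cov}(f_h(\mathbf{x}), f_h(\mathbf{x}')|\mathcal{D}) = k(\mathbf{x}, \mathbf{x}') - k(\mathbf{x}, \mathbf{X})\, K(\mathbf{X}, \mathbf{X})^{-1}\, k(\mathbf{X}, \mathbf{x}'), \tag{6}$$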
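The posterior formulas in Eqs. (5) and (6) can be sketched in a few lines. The example below is a one-dimensional, noise-free illustration written with the standard library only; it is not the authors' implementation. The helper names, the fixed hyperparameters ($\tau = 1$, $s = 1$), and the small diagonal jitter added for numerical stability are assumptions made for this sketch. The inverse $K(\mathbf{X}, \mathbf{X})^{-1}$ is never formed explicitly; a Cholesky factorization is used to solve the linear systems instead.

```python
import math


def rbf(x, x_prime, tau=1.0, s=1.0):
    """One-dimensional RBF kernel with amplitude tau^2 and length scale s."""
    return tau ** 2 * math.exp(-0.5 * ((x - x_prime) / s) ** 2)


def cholesky(a):
    """Lower-triangular L with L L^T = a, for symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def solve_spd(low, b):
    """Solve (L L^T) z = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(low[k][i] * z[k] for k in range(i + 1, n))) / low[i][i]
    return z


def gp_posterior(x, x_prime, xs, ys, jitter=1e-10):
    """Posterior mean at x (Eq. 5) and covariance between x and x' (Eq. 6)."""
    big_k = [[rbf(a, b) + (jitter if i == j else 0.0)
              for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    low = cholesky(big_k)
    k_x = [rbf(x, a) for a in xs]          # k(x, X)
    k_xp = [rbf(a, x_prime) for a in xs]   # k(X, x')
    mean = sum(k * w for k, w in zip(k_x, solve_spd(low, ys)))
    cov = rbf(x, x_prime) - sum(k * w for k, w in zip(k_x, solve_spd(low, k_xp)))
    return mean, cov


xs = [0.0, 1.0, 2.5]    # illustrative inputs X
ys = [1.0, -0.5, 0.3]   # illustrative outputs Y
print(gp_posterior(1.0, 1.0, xs, ys))   # at a training input
print(gp_posterior(1.7, 1.7, xs, ys))   # between training inputs
```

At a training input the posterior mean reproduces the observed output and the variance collapses to nearly zero, while far from the data the prediction reverts to the zero-mean prior with variance $\tau^2$.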
