
(HJ) reachability in Section III, and formally state our
safety concept learning problem in Section IV. Then we
describe the details of our key contributions: (i) We propose
a data-driven approach to learn humans’ collision avoidance
behaviors in the control space to capture “reasonable driving
behaviors” (Section V). Specifically, we learn safe control
sets from demonstrations via a high order control barrier
function (HOCBF) [13] framework. (ii) We develop a con-
strained game-theoretic optimization problem derived from a
HJ reachability formulation to synthesize novel data-driven
safety concepts that are robust to other agents’ behaviors
while respecting the learned collision avoidance behaviors
(Section VI). (iii) We demonstrate our proposed learning
framework using highway driving data and show that the
resulting data-driven safety concept is less conservative than
other common safety concepts (due to the way it captures
constraints on reasonable agent behavior) and thus is useful
as a “responsibility-aware” evaluation metric for the safety
of AV interactions (Section VII).
II. RELATED WORK
We use the term safety concept to help unify existing
safety theory prevalent in various robot planning and control
algorithms, such as velocity obstacles [14], [15], forward
reachable sets [16], [17], contingency planning [18], [19],
backward reachability [8], and other methods that make static
assumptions on agent behavior [20], [21]. The differences
between various safety concepts stem from the assumptions
about the behavior of other interacting agents, ranging from
worst-case assumptions [8] to presuming agents follow fixed
open-loop trajectories (e.g., braking [21], constant velocity
[15]). Indeed, a core challenge lies in selecting behavioral
assumptions that balance conservatism, tractability, inter-
pretability, and compatibility with real-world interactions.
Recent works propose dynamically changing the conser-
vatism of the safety concept based on online estimates of
the confidence of the robot’s human behavior prediction
model. If a human agent is behaving as expected (i.e., high
model confidence), then the worst-case assumptions in the
safety concept can be relaxed, and vice versa [22], [23].
However, the integrity of the adaptive safety concept depends
on the quality of the prediction model; obtaining an
accurate human behavior prediction model is in general quite
challenging and, indeed, is an active research field [24].
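For concreteness, one simple way such confidence-based relaxation could look in practice is sketched below; the linear interval-scaling rule and the name scaled_control_bounds are illustrative assumptions on our part, not the formulation of [22], [23].

import numpy as np

def scaled_control_bounds(u_pred, u_min, u_max, confidence):
    # Illustrative (hypothetical) rule: confidence = 0 keeps the full
    # worst-case interval [u_min, u_max]; confidence = 1 shrinks it to
    # the predicted control u_pred.
    u_pred = np.clip(u_pred, u_min, u_max)
    lo = (1.0 - confidence) * u_min + confidence * u_pred
    hi = (1.0 - confidence) * u_max + confidence * u_pred
    return lo, hi

# Contender acceleration bounds of +/-4 m/s^2, predicted -1 m/s^2, confidence 0.7.
print(scaled_control_bounds(-1.0, -4.0, 4.0, 0.7))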
Another data-driven safe control technique is to use expert
demonstrations and learn a control barrier function (CBF)
[25] to describe unsafe regions in the state space [26], [27],
[28], [29]. The learned CBF is then directly used as the core
safety mechanism in synthesizing a safe policy. However,
CBFs are not well-suited for interactive settings where there
is uncertainty in how other interacting agents may behave. In
our work, we too consider learning CBFs (specifically, high
order CBFs (HOCBFs) [13]), but instead use the learned
HOCBF as an intermediate step towards formulating a more
rigorous notion of safety rooted in robust control theory,
which enjoys interpretability and verifiability benefits.
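To illustrate how a (learned) CBF is typically deployed as the core safety mechanism, consider a minimal sketch for a toy 1D single integrator ẋ = u with barrier h(x) = x (safe set h(x) ≥ 0); the dynamics, barrier, and class-K gain below are illustrative assumptions, not the HOCBF formulation used later in this paper.

# Toy single integrator x_dot = u with safe set {x : h(x) >= 0}, h(x) = x.
# The CBF condition dh/dx * u >= -alpha * h(x) reduces to u >= -alpha * x,
# so the minimally invasive safety filter admits a closed-form solution.
def cbf_safety_filter(x, u_desired, alpha=1.0):
    u_lower = -alpha * x          # CBF constraint on the control
    return max(u_desired, u_lower)

# The filter leaves u_desired untouched far from the safety boundary and
# overrides it (brakes less aggressively) as h(x) = x approaches zero.
for x in (2.0, 0.5, 0.1):
    print(x, cbf_safety_filter(x, u_desired=-1.0))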
III. SAFETY CONCEPT VIA HAMILTON-JACOBI
REACHABILITY
We define a safety concept as a combination of two
functions mapping the world state to (i) a scalar measure of
safety, and (ii) a set of allowable actions for each agent
within which safety is preserved. A family of safety concepts can be
described via a HJ reachability formulation [2].
HJ reachability is a mathematical formalism used for
characterizing the safety properties of dynamical systems [8],
[30]. The outputs of a HJ reachability computation are (i)
a HJ value function, a scalar-valued function that measures
“distance” to collision, and (ii) a set of controls that prevents
the safety measure from decreasing further—precisely the
components needed for a safety concept.
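Purely for intuition, these two components could be packaged behind an interface like the following; the class name and signatures are hypothetical and serve only to make the definition concrete.

from typing import Callable
import numpy as np

class SafetyConcept:
    # Illustrative container for the two components of a safety concept:
    # (i) a scalar safety measure over joint states, and
    # (ii) a map from joint state to the set of safety-preserving ego controls.
    def __init__(self,
                 safety_value: Callable[[np.ndarray], float],
                 safe_controls: Callable[[np.ndarray], set]):
        self.safety_value = safety_value    # e.g., the HJ value function V(x, t)
        self.safe_controls = safe_controls  # e.g., U^A_safe(x) defined later in this section

    def is_safe(self, x: np.ndarray) -> bool:
        # Nonnegative safety value <=> state is outside the unsafe region.
        return self.safety_value(x) >= 0.0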
Consider a target set T, which is the set of collision states
between agents A (ego agent) and B (contender). The HJ
reachability formulation describes a two-player differential
game to determine whether it is possible for the ego agent
to avoid entering T under any family of closed-loop policies
of the contender, as well as the ego agent’s appropriate
control policy for ensuring safety. It is assumed that the con-
tender follows an adversarial policy and has the advantage
with respect to the information pattern. Using the principle
of dynamic programming, the collision avoidance problem
reduces to solving the Hamilton-Jacobi-Isaacs (HJI) partial
differential equation (PDE) [8],

\[
\frac{\partial V(x, t)}{\partial t} + \min\Big\{ 0,\ \max_{u_A \in \mathcal{U}_A} \min_{u_B \in \mathcal{U}_B} \nabla_x V(x, t)^\top f(x, u_A, u_B) \Big\} = 0, \qquad V(x, 0) = \ell(x) \tag{1}
\]
where x ∈ X denotes the joint state of agents A and B,
u_A ∈ U_A and u_B ∈ U_B are the available (bounded)
controls¹ of agents A and B, respectively, and f(·, ·, ·) is the
joint dynamics assumed to be measurable in u_A and u_B for
each x, and uniformly continuous, bounded, and Lipschitz
continuous in x for fixed u_A and u_B.² The boundary
condition is defined by a function ℓ : X → R whose zero
sub-level set encodes the target set, i.e., T = {x | ℓ(x) < 0}.
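To ground the ingredients of (1), the following sketch instantiates a toy example in relative coordinates x = (p, v) (gap and closing speed), with accelerations as controls, ℓ(x) penalizing small gaps, and the inner max-min term of (1) evaluated by brute force over discretized control sets; all modeling choices (dynamics, bounds, margin) are illustrative assumptions, not the highway model used in our experiments.

import numpy as np

# Toy joint dynamics in relative coordinates x = (p, v):
#   p_dot = v,  v_dot = u_A - u_B  (ego and contender accelerations).
def f(x, u_A, u_B):
    p, v = x
    return np.array([v, u_A - u_B])

# Margin function: negative (i.e., in the target set) when the gap p < d_min.
def l(x, d_min=2.0):
    return x[0] - d_min

# Bounded control sets, discretized for brute-force evaluation.
U_A = np.linspace(-3.0, 3.0, 31)   # ego acceleration [m/s^2]
U_B = np.linspace(-3.0, 3.0, 31)   # contender acceleration [m/s^2]

def hamiltonian(x, grad_V):
    # Inner max-min term of (1): max over u_A of min over u_B of grad_V . f.
    return max(min(grad_V @ f(x, uA, uB) for uB in U_B) for uA in U_A)

# Evaluate at a state with a closing gap; the gradient of l stands in for
# grad_x V at t = 0, where V(x, 0) = l(x) and grad l(x) = (1, 0).
x0 = np.array([5.0, -2.0])
print(l(x0), hamiltonian(x0, grad_V=np.array([1.0, 0.0])))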
The solution V(x, t), t ∈ [−T, 0], called the HJ value
function, captures the lowest value of ℓ(·) along the system
trajectory within |t| seconds if the system starts at x and
both agents A and B act optimally, that is,
u*_A(x), u*_B(x) = arg max_{u_A∈U_A} arg min_{u_B∈U_B} ∇_x V(x, t)^⊤ f(x, u_A, u_B).
Thus the HJ value function fulfills the first aspect of a
safety concept. After obtaining the HJ value function,
we can also consider the set of controls that prevent
the HJ value function from decreasing over time. That
is, we can compute the safety-preserving control set,
U^A_safe(x) = {u_A ∈ U_A | min_{u_B∈U_B} dV(x, t)/dt ≥ 0}, thus
fulfilling the second aspect of a safety concept. By varying
the problem parameters, i.e., control sets, behavior type,
¹The control sets U_A and U_B are typically chosen to reflect the physically
feasible limits of the system.
²This assumption ensures that trajectories are generated by a unique
control sequence.