Learning Autonomous Vehicle Safety Concepts from Demonstrations
Karen Leung⋆†, Sushant Veer⋆, Edward Schmerling⋆‡, Marco Pavone⋆‡
Abstract— Evaluating the safety of an autonomous vehicle
(AV) depends on the behavior of surrounding agents which can
be heavily influenced by factors such as environmental context
and informally-defined driving etiquette. A key challenge is in
determining a minimum set of assumptions on what constitutes
reasonable foreseeable behaviors of other road users for the
development of AV safety models and techniques. In this
paper, we propose a data-driven AV safety design methodology
that first learns “reasonable” behavioral assumptions from
data, and then synthesizes an AV safety concept using these
learned behavioral assumptions. We borrow techniques from
control theory, namely high order control barrier functions
and Hamilton-Jacobi reachability, to provide inductive bias
to aid interpretability, verifiability, and tractability of our
approach. In our experiments, we learn an AV safety concept
using demonstrations collected from a highway traffic-weaving
scenario, compare our learned concept to existing baselines,
and showcase its efficacy in evaluating real-world driving logs.
I. INTRODUCTION
As autonomous vehicle (AV) operations grow, developing
appropriate methods for evaluating AV safety becomes ever
more imperative. The question of “is a vehicle in an unsafe
state?” is relevant for AV system developers, policymakers,
and the general public alike (see Figure 1). While guar-
anteeing safety may not be practical in the face of the
myriad uncertainties and complexities that come with real-
world driving, there is still a broad desire to codify, to
some extent, collectively agreed-upon notions of safety [1].
Should safety be defined using data-driven methods that
can account for the complexities of the AV’s environment
but lack interpretability and formal guarantees, or leverage
control theoretic techniques derived from first principles
which are interpretable and rigorous but not as scalable or
expressive as their learned counterparts? In this work, we
strike a middle ground by learning a safety-critical driving
behavior model and integrating it within a robust control
framework to develop an interpretable and rigorous AV safety
model that is informed by data.
Towards this goal of building safe and trustworthy AVs,
various stakeholders have advanced safety concepts consist-
ing, in general, of two functions mapping world state (e.g.,
joint state of all agents and environmental context) to (i) a
scalar measure of safety, and (ii) a set of allowable (safe)
agent actions [2]. Numerous uses of such safety concepts
have been proposed throughout AV pipelines, e.g., a criterion
to prune away unsafe plans [3], a safety monitor to determine
when evasive action must be taken [4], a component in the
⋆NVIDIA, †University of Washington, ‡Stanford University.
{kymleung@uw.edu, kaleung@nvidia.com}
Fig. 1: Evaluating the safety of an autonomous vehicle (AV)
depends on what constitutes reasonable foreseeable behaviors of
other road users. For example, determining whether the autonomous
car (blue) is currently in a safe state depends on how the human
driver (red) may behave (e.g., speed up or swerve away). In this
work, we develop a data-driven methodology to model reasonable
foreseeable behaviors and leverage the learned model for AV safety
evaluation.
planning objective [5], or for perception safety evaluation
metrics [6], [7]. Critically, different behavioral assumptions
lead to different safety concepts, which in turn affect realized
AV safety and performance. To mitigate overconservatism
and enable maximum flexibility, we contend that the design
of an effective safety concept hinges upon characterizing
reasonable foreseeable behaviors of other agents, in a way
conducive to efficiently evaluating the safety of driving
scenes and producing associated safe controls.
To address this challenge, we propose designing novel
safety concepts by learning from data the controls, specifically
the control sets, that humans operate with when their
safety is threatened, and then using these learned control sets
to inform AVs of the reasonable foreseeable behaviors
of other agents in safety-critical scenarios. Equipped with
such a learned “human behavior collision avoidance model,”
we perform safety concept synthesis by using robust control
theory, specifically Hamilton-Jacobi reachability [8], as a
powerful inductive bias for interpretability, verifiability, and
tractability. Our control set learning approach differs from
reward learning paradigms (i.e., inverse reinforcement learn-
ing / inverse optimal control [9], [10], [11]) which strive to
learn high-level human intentions from demonstrations, often
in the absence of constraints [3], [12]. Reward learning ap-
proaches aim to model a human’s high-level planning objec-
tive which encapsulates more than just safety considerations.
In this work, by contrast, we are interested in learning control
constraint boundaries associated with collision avoidance
behaviors as opposed to nuanced behaviors (e.g., aggressive
versus passive).
Structure and Contributions. We provide a literature re-
view in Section II, give an overview of Hamilton-Jacobi
(HJ) reachability in Section III, and formally state our
safety concept learning problem in Section IV. Then we
describe the details of our key contributions: (i) We propose
a data-driven approach to learn humans’ collision avoidance
behaviors in the control space to capture “reasonable driving
behaviors” (Section V). Specifically, we learn safe control
sets from demonstrations via a high order control barrier
function (HOCBF) [13] framework. (ii) We develop a con-
strained game-theoretic optimization problem derived from a
HJ reachability formulation to synthesize novel data-driven
safety concepts that are robust to other agents’ behaviors
while respecting the learned collision avoidance behaviors
(Section VI). (iii) We demonstrate our proposed learning
framework using highway driving data and show that the
resulting data-driven safety concept is less conservative than
other common safety concepts (due to the way it captures
constraints on reasonable agent behavior) and thus is useful
as a “responsibility-aware” evaluation metric for the safety
of AV interactions (Section VII).
II. RELATED WORK
We use the term safety concept to help unify existing
safety theory prevalent in various robot planning and control
algorithms, such as velocity obstacles [14], [15], forward
reachable sets [16], [17], contingency planning [18], [19],
backward reachability [8], and other methods that make static
assumptions on agent behavior [20], [21]. The differences
between various safety concepts stem from the assumptions
about the behavior of other interacting agents, ranging from
worst-case assumptions [8] to presuming agents follow fixed
open-loop trajectories (e.g., braking [21], constant velocity
[15]). Indeed, a core challenge is in selecting behavioral
assumptions that balance conservatism, tractability, inter-
pretability, and compatibility with real-world interactions.
Recent works propose dynamically changing the conser-
vatism of the safety concept based on online estimates of
how confident the robot’s human behavior prediction model
is. If a human agent is behaving as expected (i.e., high
model confidence), then the worst-case assumptions in the
safety concept can be relaxed, and vice versa [22], [23].
However, the integrity of the adaptive safety concept depends
on the quality of the prediction model, and obtaining an
accurate human behavior prediction model is in general quite
challenging and, indeed, an active research field [24].
Another data-driven safe control technique is to use expert
demonstrations and learn a control barrier function (CBF)
[25] to describe unsafe regions in the state space [26], [27],
[28], [29]. The learned CBF is then directly used as the core
safety mechanism in synthesizing a safe policy. However,
CBFs are not well-suited for interactive settings where there
is uncertainty in how other interacting agents may behave. In
our work, we too consider learning CBFs (specifically, high
order CBFs (HOCBFs) [13]), but instead use the learned
HOCBF as an intermediate step towards formulating a more
rigorous notion of safety rooted in robust control theory
which enjoys interpretability and verifiability benefits.
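
To make the notion of a CBF-induced safe control set concrete, the following is a minimal sketch, not the learned model of this paper and only loosely following the HOCBF construction of [13]: for a one-dimensional double integrator approaching a stopped obstacle, a distance barrier of relative degree two with linear class-K functions yields an upper bound on the admissible acceleration. The dynamics, barrier, gains, and limits below are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions only): the safe control set induced
# by a high order CBF for a 1-D double integrator approaching a stopped
# obstacle. State (p, v) = position and velocity, control u = acceleration,
# with dynamics p' = v, v' = u.

P_OBS, MARGIN = 30.0, 2.0        # obstacle position and safety margin [m] (assumed)
A1, A2 = 1.0, 1.0                # class-K gains for the HOCBF chain (assumed)
U_MIN, U_MAX = -6.0, 3.0         # physical acceleration limits [m/s^2] (assumed)

def barrier(p):
    """h(x) >= 0 on the safe set: distance to the obstacle minus a margin."""
    return P_OBS - p - MARGIN

def hocbf_safe_control_set(p, v):
    """HOCBF condition for a relative-degree-2 barrier with linear class-K
    functions: psi1 = h' + A1*h, and psi1' + A2*psi1 >= 0 reduces to an
    upper bound on the acceleration u."""
    h = barrier(p)
    u_upper = A1 * A2 * h - (A1 + A2) * v   # u <= u_upper keeps the condition
    lo, hi = U_MIN, min(U_MAX, u_upper)
    return (lo, hi) if lo <= hi else None   # None: no admissible control left

# Example: 10 m from the obstacle margin, closing at 5 m/s.
print(hocbf_safe_control_set(p=20.0, v=5.0))   # -> (-6.0, -2.0): must brake
```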
III. SAFETY CONCEPT VIA HAMILTON-JACOBI
REACHABILITY
We define a safety concept as a combination of two
functions mapping world state to (i) a scalar measure of
safety, and (ii) a set of allowable actions for each agent that
preserve safety. A family of safety concepts can be
described via an HJ reachability formulation [2].
HJ reachability is a mathematical formalism used for
characterizing the safety properties of dynamical systems [8],
[30]. The outputs of an HJ reachability computation are (i)
an HJ value function, a scalar-valued function that measures
“distance” to collision, and (ii) a set of controls that prevents
the safety measure from decreasing further—precisely the
components needed for a safety concept.
Consider a target set T, which is the set of collision states
between agents A (ego agent) and B (contender). The HJ
reachability formulation describes a two-player differential
game to determine whether it is possible for the ego agent
to avoid entering T under any family of closed-loop policies
of the contender, as well as the ego agent's appropriate
control policy for ensuring safety. It is assumed that the
contender follows an adversarial policy and has the advantage
with respect to the information pattern. Using the principle
of dynamic programming, the collision avoidance problem
reduces to solving the Hamilton-Jacobi-Isaacs (HJI) partial
differential equation (PDE) [8]:
\[
\frac{\partial V(x,t)}{\partial t} + \min\Big\{0,\; \max_{u_A \in \mathcal{U}_A}\, \min_{u_B \in \mathcal{U}_B} \nabla_x V(x,t)^{\top} f(x, u_A, u_B)\Big\} = 0,
\qquad V(x,0) = \ell(x)
\tag{1}
\]
where x ∈ X denotes the joint state of agents A and
B, u_A ∈ U_A and u_B ∈ U_B are the available (bounded)
controls¹ of agents A and B, respectively, and f(·,·,·) is the
joint dynamics, assumed to be measurable in u_A and u_B for
each x, and uniformly continuous, bounded, and Lipschitz
continuous in x for fixed u_A and u_B.² The boundary
condition is defined by a function ℓ : X → R whose zero
sub-level set encodes the target set, i.e., T = {x | ℓ(x) < 0}.
The solution V(x, t), t ∈ [−T, 0], called the HJ value
function, captures the lowest value of ℓ(·) along the system
trajectory within |t| seconds if the system starts at x and
both agents A and B act optimally, that is, u*_A(x), u*_B(x) =
arg max_{u_A ∈ U_A} arg min_{u_B ∈ U_B} ∇_x V(x,t)^⊤ f(x, u_A, u_B).
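
To make Eq. (1) concrete, the sketch below runs a discrete-time dynamic-programming approximation of the avoid game on a grid for a toy one-dimensional relative-motion model (relative position p and relative velocity v between two vehicles, each choosing an acceleration). The dynamics, grids, control bounds, collision radius, and horizon are illustrative assumptions rather than the vehicle models used in this paper, and production HJ solvers use more careful level-set numerics.

```python
# Minimal sketch (illustrative assumptions only): discrete-time
# dynamic-programming approximation of the HJ avoid game in Eq. (1)
# for a toy 1-D model. State x = (relative position p, relative
# velocity v); agent A picks acceleration a_A, agent B picks a_B
# adversarially.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

p_grid = np.linspace(-20.0, 20.0, 81)     # relative position [m]
v_grid = np.linspace(-10.0, 10.0, 41)     # relative velocity [m/s]
P, Vel = np.meshgrid(p_grid, v_grid, indexing="ij")

U_A = np.linspace(-4.0, 2.0, 7)           # ego accelerations [m/s^2] (assumed)
U_B = np.linspace(-4.0, 2.0, 7)           # contender accelerations (assumed)
dt, T = 0.1, 3.0                          # time step and horizon [s]

def signed_distance(P, Vel, radius=5.0):
    """l(x): negative inside the collision (target) set T."""
    return np.abs(P) - radius

def step(P, Vel, a_A, a_B):
    """Euler step of the joint relative dynamics f(x, u_A, u_B)."""
    return P + dt * Vel, Vel + dt * (a_B - a_A)

V = signed_distance(P, Vel)               # terminal condition V(x, 0) = l(x)
for _ in range(int(T / dt)):              # march backward in time
    interp = RegularGridInterpolator((p_grid, v_grid), V,
                                     bounds_error=False, fill_value=None)
    best_over_A = np.full_like(V, -np.inf)
    for a_A in U_A:                       # ego maximizes ...
        worst_over_B = np.full_like(V, np.inf)
        for a_B in U_B:                   # ... the contender's minimum
            Pn, Vn = step(P, Vel, a_A, a_B)
            worst_over_B = np.minimum(
                worst_over_B, interp(np.stack([Pn, Vn], axis=-1)))
        best_over_A = np.maximum(best_over_A, worst_over_B)
    # variational inequality: the value never exceeds l(x) at the current state
    V = np.minimum(signed_distance(P, Vel), best_over_A)

print("fraction of states with V < 0 (collision unavoidable within horizon):",
      np.mean(V < 0.0))
```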
Thus the HJ value function fulfills the first aspect of a
safety concept. After obtaining the HJ value function,
we can also consider the set of controls that prevent
the HJ value function from decreasing over time. That
is, we can compute the safety-preserving control set,
U^A_safe(x) = {u_A ∈ U_A | min_{u_B ∈ U_B} dV(x,t)/dt ≥ 0}, thus
fulfilling the second aspect of a safety concept. By varying
the problem parameters, i.e., control sets, behavior type,
¹The control sets U_A and U_B are typically chosen to reflect the physically
feasible limits of the system.
²This assumption ensures that trajectories are generated by a unique
control sequence.
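
Continuing the toy example above (same assumed dynamics, grids, and value function), the safety-preserving control set U^A_safe(x) can be approximated at a queried state by keeping only the ego controls whose worst-case one-step change in the value function, over the contender's controls, is non-negative. The helper below is again an illustrative sketch, not the construction used later in the paper.

```python
# Continuation of the sketch above: approximate U_safe^A(x) by thresholding
# the worst-case one-step change of the value function.
def safety_preserving_controls(x, value_interp, U_A, U_B, dt):
    """x = (p, v) relative state; value_interp interpolates the converged
    value function V(., -T) from the previous sketch."""
    p, v = x
    val_here = float(value_interp([[p, v]])[0])
    safe = []
    for a_A in U_A:
        # worst case over contender controls of the next-step value
        val_worst = min(
            float(value_interp([[p + dt * v, v + dt * (a_B - a_A)]])[0])
            for a_B in U_B
        )
        if val_worst - val_here >= 0.0:   # min_{u_B} dV/dt >= 0 (discretized)
            safe.append(a_A)
    return safe

# Example query, reusing V, p_grid, v_grid, U_A, U_B, dt from the sketch above:
# interp = RegularGridInterpolator((p_grid, v_grid), V,
#                                  bounds_error=False, fill_value=None)
# print(safety_preserving_controls((8.0, -3.0), interp, U_A, U_B, dt))
```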