Planning with Occluded Traffic Agents using
Bi-Level Variational Occlusion Models
Filippos Christianos1,3, Peter Karkus1, Boris Ivanovic1, Stefano V. Albrecht2,3 and Marco Pavone1,4
Abstract: Reasoning with occluded traffic agents is a significant
open challenge for planning for autonomous vehicles.
Recent deep learning models have shown impressive results for
predicting occluded agents based on the behaviour of nearby
visible agents; however, as we show in experiments, these models
are difficult to integrate into downstream planning. To this end,
we propose Bi-level Variational Occlusion Models (BiVO), a
two-step generative model that first predicts likely locations of
occluded agents, and then generates likely trajectories for the
occluded agents. In contrast to existing methods, BiVO outputs
a trajectory distribution which can then be sampled from and
integrated into standard downstream planning. We evaluate
the method in closed-loop replay simulation using the real-
world nuScenes dataset. Our results suggest that BiVO can
successfully learn to predict occluded agent trajectories, and
these predictions lead to better subsequent motion plans in
critical scenarios.
I. INTRODUCTION
Reasoning with occluded traffic agents is an important
open challenge for planning for autonomous vehicles. Plan-
ning under occlusions has an extensive literature in robotics;
however, many prior works assume static occluded ob-
jects [13, 7], or objects that are already detected and become
occluded only temporarily [4, 16]. Urban driving requires
reasoning with the most challenging type of occlusions, those
involving dynamic and previously undetected objects, because
traffic agents, such as vehicles, cyclists, and pedestrians, may
emerge from occluded areas, potentially at high velocity.
An example is shown in Fig. 1.
Classical planning approaches that reason with dynamic
undetected traffic agents are often based on maintaining
bounds on a worst case scenario [12, 21]; however, the worst
case scenario results in prohibitively conservative plans for
driving in dense urban traffic. More recently, data-driven
methods have been proposed that learn to predict likely
occluded traffic agents from data. In particular, Itkina et al.
[9] proposed a data-driven method that uses “people as
sensors”, that is, it trains a variational model to predict
possible occluded areas given the past trajectory of visible
traffic agents. While the method showed promising results
on real-world data, it only predicts occluded space likely
occupied by an agent, but not the agents’ dynamic state or
possible future trajectory. For this reason, the model cannot
be easily integrated into downstream planning with dynamic
agents, which we will further highlight in our experiments.
1 NVIDIA Research, NVIDIA, Santa Clara, CA. {pkarkus, bivanovic, mpavone}@nvidia.com
2 Five AI / Bosch. stefano.albrecht@five.ai
3 School of Informatics, University of Edinburgh. {f.christianos, s.albrecht}@ed.ac.uk
4 Department of Aeronautics and Astronautics, Stanford University. pavone@stanford.edu
This work was done during an internship at NVIDIA.
Fig. 1: The ego vehicle is travelling with a constant speed and
can observe a stopped truck on its right. The area behind the
truck is occluded, and is hiding a bicycle that is attempting
to cross the road. Both the ego vehicle and the bicycle are
unaware of each other, leading to a dangerous situation.
We introduce the Bi-level Variational Occlusion model
(BiVO), a data-driven occlusion prediction model that allows
downstream planning with dynamic, previously undetected
traffic agents. In its first step BiVO follows Itkina et al. [9]:
it predicts a probabilistic occupancy grid map (OGM) [5]
that captures possible occluded agents using a Conditional
Variational Autoencoder (CVAE) [19], that is conditioned
on the past trajectory of a visible traffic agent. The model
predicts an OGM for each visible agent and then fuses them
into a single global OGM. In the second, critical stage, BiVO
predicts a distribution over future trajectories of possibly
occluded agents using a second CVAE model, which is
conditioned on the global OGM and other features of the
environment.
We integrate BiVO into a sampling-based planning algo-
rithm [10] for autonomous driving. The planner samples a set
of dynamically feasible trajectories for the ego-vehicle, and
selects the most promising trajectory given a hand-crafted
cost function. BiVO predictions enter the planner through a
collision avoidance term in the cost function. Specifically,
we sample a large number of trajectories from BiVO along
with their probabilities, and calculate the expected collision
cost, treating the predicted occluded agent trajectories the
same way as predictions for visible traffic agents.
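The expected collision cost described above can be sketched as a probability-weighted sum over sampled occluded-agent trajectories. The following is a minimal illustration and not the planner of [10]: the function name `expected_collision_cost`, the circular collision check with a fixed `radius`, and the normalisation of the sample probabilities are all assumptions made for this example.

```python
import math

def expected_collision_cost(ego_traj, agent_trajs, probs, radius=2.0):
    """Probability-weighted collision indicator for one candidate ego trajectory.

    ego_traj:    list of (x, y) ego positions, one per timestep
    agent_trajs: list of sampled agent trajectories, each a list of (x, y)
    probs:       probability (weight) attached to each sampled trajectory
    radius:      assumed collision distance in metres (hypothetical value)
    """
    total = 0.0
    for traj, p in zip(agent_trajs, probs):
        # A sample counts as a collision if the two positions come closer
        # than `radius` at any shared timestep.
        collides = any(
            math.hypot(ex - ax, ey - ay) < radius
            for (ex, ey), (ax, ay) in zip(ego_traj, traj)
        )
        total += p * float(collides)
    # Normalise so the result is an expectation even if probs do not sum to 1.
    return total / sum(probs)
```

A sampling-based planner could evaluate this term for each candidate ego trajectory and add it, suitably weighted, to the remaining cost terms before selecting the lowest-cost candidate.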
To validate BiVO, we use real-world trajectory data from
the nuScenes Prediction dataset [1] to compute direct prediction
metrics and open-loop planning metrics, and to run a closed-loop
replay simulation. As one would expect, occluded objects
rarely affect the desired trajectories of our planner, but when
they do, reasoning about occlusions significantly improves
the plan quality, and BiVO is significantly more effective
than alternative learned models that were not designed for
planning (see Section VI-B).
arXiv:2210.14584v1 [cs.LG] 26 Oct 2022
In summary, the contributions of this paper are as follows:
• We introduce a generative model, BiVO, based on variational
  autoencoders that is able to produce trajectories
  of occluded vehicles.
• We integrate BiVO into a fast sampling-based planning
  algorithm and evaluate it in open and closed-loop replay
  simulation with the real-world nuScenes dataset.
• We demonstrate that BiVO predictions integrated into
  planning lead to better motion plans in critical scenarios.
To the best of our knowledge, we are the first to integrate
a learned occlusion model with a planning algorithm for
autonomous driving.
II. RELATED WORK
Detecting and reasoning with occluded objects in robotics
has an extensive literature [2, 6]. In the context of au-
tonomous vehicles, occlusions can be of critical importance.
Indeed, prior work has proposed various methods that predict
and/or plan with occluded traffic agents.
Planning with occluded agents: Planning algorithms that
reason with occluded agents typically rely on handcrafted
occlusion models. For example, Orzechowski, Meyer, and
Lauer [15] propose an approach that predicts the presence
of a vehicle coming out of an occluded region and ensures the
existence of a fail-safe manoeuvre. Wang, Burger, and Stiller
[20] extend this work by eliminating some of the occluded
traffic by reasoning about the history of occlusions. Zhang
and Fisac [21] propose a method for navigating through
traffic with occluded regions by ensuring that a potentially
hidden pursuer never intersects the set of possible
inevitable collision states. Hanna et al. [8] use a model-driven
approach that infers a joint distribution over the state of the
occluded areas and the goals of other vehicles, using the
observed trajectories of the vehicles.
In contrast to these hand-crafted approaches, we propose a
data-driven approach that learns a model of occluded agents
from real-world data.
Data-driven occlusion models: Learning-based models
for occluded object prediction include Schulter et al. [18],
Purkait, Zach, and Reid [17], and Han, Banfi, and Campbell
[7]. However, these models assume static
objects or environments, assumptions that do not hold in urban
driving. Some learned models can handle dynamic traffic
agents. Notably, Itkina et al. [9] use an autoencoder ar-
chitecture to infer the surroundings of visible objects and
later reconstruct them into occupancy grid maps that encode
the probability of occupied areas in 2D space. However,
it is not straightforward to integrate approaches that make
occupancy predictions of areas with existing planners, since
the predictions lack the information on how agents might
emerge out of occlusions and interfere with the ego vehicle.
In our work, we predict dynamic agents together with
their possible future trajectories instead of only occluded
areas. Our model’s predictions are key to integration with
existing downstream planners that make use of probabilistic
predictions of future trajectories.
III. TECHNICAL PRELIMINARIES
Variational Autoencoders: Variational autoencoders [11]
(VAEs) are generative models that aim to learn a density
function over some unobserved latent variables $Z$ given a
dataset input $x \in X$. Given an unknown true posterior
$p(z|x)$, VAEs approximate it with a parametric distribution
$q_\theta(z|x)$. The KL-divergence from the parametric distribution
to the true posterior can be computed using:
$$\begin{aligned}
D_{KL}(q_\theta(z|x) \,\|\, p(z|x)) = {} & \log p(x) \\
& - \mathbb{E}_{z \sim q_\theta(z|x)}[\log p_u(x|z)] + D_{KL}(q_\theta(z|x) \,\|\, p(z)),
\end{aligned}$$
where $D_{KL}$ is the KL divergence between two distributions,
$p_u(x|z)$ is the parametric likelihood of the data given the
latent variables, $p(z)$ is the prior over the latent variables,
and the log-evidence term $\log p(x)$ is constant. The
expectation and the KL-divergence (second line) are commonly
called the negative evidence lower bound (ELBO).
Because $\log p(x)$ is constant, minimising the negative ELBO
is equivalent to minimising the KL-divergence between the
parametric and the true posterior.
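As a concrete illustration of the two terms on the second line, the sketch below evaluates a single-sample estimate of the negative ELBO. It rests on assumptions that are common for VAEs but are not stated in this section: a diagonal Gaussian approximate posterior, a standard normal prior, and a unit-variance Gaussian likelihood, so the reconstruction term reduces to a squared error up to an additive constant. The function names and the `decode` callable are hypothetical.

```python
import math
import random

def gaussian_kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def reparameterise(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def negative_elbo(x, mu, log_var, decode, rng):
    """Single-sample estimate of -E_q[log p(x|z)] + KL(q(z|x) || prior).

    The reconstruction term assumes a unit-variance Gaussian likelihood and
    drops the additive constant, leaving half the squared error.
    """
    z = reparameterise(mu, log_var, rng)
    x_hat = decode(z)
    recon_nll = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon_nll + gaussian_kl_to_standard_normal(mu, log_var)
```

In a conditional VAE the encoder outputs `mu` and `log_var` and the decoder would additionally receive the conditioning input; that conditioning is omitted here to keep the example short.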
States and trajectories: A state $s^i_t$ for a vehicle $i$ is
defined as the location, heading, velocity, and acceleration at
the current timestep $t$. A trajectory $x^i_{t:t+T}$ is a sequence of
states $s^i_t, s^i_{t+1}, \ldots, s^i_{t+T}$ that defines how an agent $i$ moved
in time $T$.
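These definitions map directly onto a small data structure. The sketch below is one possible representation, with assumptions flagged: the location is taken to be a planar `(x, y)` pair, and the names `State` and `trajectory` are hypothetical, chosen for this example only.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class State:
    """State of one agent at one timestep (field layout is an assumption)."""
    x: float
    y: float
    heading: float
    velocity: float
    acceleration: float

def trajectory(states: List[State], t: int, T: int) -> List[State]:
    """Return the states s_t, ..., s_{t+T} (T + 1 entries) from a per-timestep log."""
    if t < 0 or T < 0 or t + T >= len(states):
        raise IndexError("requested window exceeds the recorded states")
    return states[t : t + T + 1]
```

Note that a trajectory over horizon `T` contains `T + 1` states because both endpoints are included.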
Agents: We will refer to the controlled vehicle as “ego”.
Other vehicles that are not controlled by the planner, pedes-
trians, or other road users will be referred to as “agents”.
Agents can be visible to the ego if they are in its line of
sight, or occluded if there is an obstacle blocking the view
(further details of this calculation are in Section VI-A).
Occupancy grid maps: OGMs encode the occupancy of
an area. $M^{obs}_i \in [0,1]^{H \times W}$ is the $H \times W$ area surrounding
agent $i$ in a $1 \times 1$ meter resolution, and each grid cell contains
1 if it is occupied or 0 if free. Locations that are not visible
with a direct line of sight from the position of vehicle $i$ are
marked as occluded with a value 0.5. $M^{gt}_i$ is the ground truth
occupancy map of the same area.
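The relation between the ground truth map and the observed map can be illustrated with a simple ray-stepping visibility test. This is only a sketch under assumptions: the actual visibility calculation is described in Section VI-A and may differ, and the function names `visible` and `observed_ogm` as well as the nearest-cell ray sampling are choices made for this example.

```python
def visible(gt, r0, c0, r1, c1):
    """True if no occupied cell lies strictly between (r0, c0) and (r1, c1)."""
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for k in range(1, steps):
        # Sample the straight line between the two cells at the nearest cell.
        r = round(r0 + (r1 - r0) * k / steps)
        c = round(c0 + (c1 - c0) * k / steps)
        if gt[r][c] == 1:
            return False
    return True

def observed_ogm(gt, origin):
    """Build the observed map from a ground truth map of 0/1 cells.

    Cells with a clear line of sight from `origin` (row, col) copy their
    ground truth value; all other cells are marked occluded with 0.5.
    """
    height, width = len(gt), len(gt[0])
    r0, c0 = origin
    obs = [[0.5] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if visible(gt, r0, c0, r, c):
                obs[r][c] = float(gt[r][c])
    return obs
```

With this convention the first occupied cell along a ray is itself observed as occupied, while every cell behind it receives the value 0.5.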
IV. BI-LEVEL VARIATIONAL OCCLUSION MODELS
The objective of our occlusion model is to generate likely
trajectories for agents emerging from occluded regions, given
a known map of occluded regions, the past and present state
of visible agents, and a lane graph.
Our approach, BiVO, is shown in Fig. 2. We break
down the problem into two subproblems and train separate
CVAE models for each. Intuitively, the first step locates
the subspace of occluded areas that are likely to contain
hidden objects, and the second step infers how these hidden
objects may emerge from the occluded space. Overall, BiVO
parameterizes a distribution over trajectories that start from
known occluded regions, and allows fast sampling from this
distribution for subsequent planning.
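The two-step structure can be sketched as ancestral sampling: first draw a location from the predicted occupancy, then draw a trajectory that starts there. The code below is a toy stand-in and not the BiVO model. Both learned CVAEs are replaced by placeholders, and the explicit start-cell sampling, the constant-velocity `toy_decoder`, and all function names are assumptions made purely to show the control flow.

```python
import math
import random

def sample_start_cell(ogm, rng):
    """Step 1 stand-in: pick a cell with probability proportional to its
    predicted occupancy (ogm holds probabilities, zero outside occlusions)."""
    cells = [(r, c) for r, row in enumerate(ogm) for c, p in enumerate(row) if p > 0.0]
    if not cells:
        raise ValueError("OGM assigns zero occupancy everywhere")
    weights = [ogm[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]

def toy_decoder(start, z, horizon, dt=0.5):
    """Step 2 stand-in: constant-velocity rollout whose speed and heading are
    derived from the latent sample z (a learned decoder would go here)."""
    speed = 5.0 * abs(z[0])
    heading = math.pi * z[1]
    # Cell (row, col) maps to (x, y) in metres at the 1 m grid resolution.
    x, y = float(start[1]), float(start[0])
    traj = [(x, y)]
    for _ in range(horizon):
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        traj.append((x, y))
    return traj

def sample_occluded_trajectories(ogm, n, horizon, rng, decoder=toy_decoder):
    """Draw n trajectories: a location first, then a trajectory from a latent z."""
    samples = []
    for _ in range(n):
        start = sample_start_cell(ogm, rng)
        z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        samples.append(decoder(start, z, horizon))
    return samples
```

Because each sample only requires one draw per step, a large batch of trajectories can be generated cheaply, which is the property the planner relies on.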