GLAD: Grounded Layered Autonomous Driving
for Complex Service Tasks
Yan Ding, Cheng Cui, Xiaohan Zhang, and Shiqi Zhang
Abstract: Given the current point-to-point navigation capabilities of
autonomous vehicles, researchers are looking into complex service
requests that require the vehicles to visit multiple points of interest.
In this paper, we develop a layered planning framework, called GLAD, for
complex service requests in autonomous urban driving. There are three
layers for service-level, behavior-level, and motion-level planning. The
layered framework is unique in its tight coupling, where the different
layers communicate user preferences, safety estimates, and motion costs
for system optimization. GLAD is visually grounded by perceptual learning
from a dataset of 13.8k instances collected from driving behaviors. GLAD
enables autonomous vehicles to efficiently and safely fulfill complex
service requests. Experimental results from abstract and full simulation
show that our system outperforms a few competitive baselines from the
literature.
I. INTRODUCTION
Self-driving cars are changing people’s everyday lives.
Narrowly defined, autonomous driving technology is concerned
with point-to-point navigation and obstacle avoidance [1],
where recent advances in perception and machine learning
have led to significant progress. In this paper,
we are concerned with urban driving scenarios, where vehi-
cles must follow traffic rules and social norms to perform
driving behaviors, such as merging lanes and parking on the
right. At the same time, the vehicles need to fulfill service
requests from end users. Consider the following scenario:
Emma asks her autonomous car to drive her home
after work. On her way home, Emma needs to pick
up her kid Lucas from school, stop at a gas station,
and visit a grocery store. In rush hour, driving in
some areas can be difficult. Lucas does not like the
gas smell, but he likes shopping with Emma.
The goal of Emma’s autonomous car is to efficiently and
safely fulfill her requests while respecting the preferences of
Emma (and her family). We say a service request is complex,
if fulfilling it requires the vehicle to visit two or more
points of interest (POIs), such as a gas station and a grocery
store, each corresponding to a driving task. Facing such a
service request, a straightforward idea is to first sequence
the driving tasks of visiting different POIs, and then perform
behavioral and motion planning to complete those tasks.
However, this idea is less effective in practice, because
traffic conditions can change in unforeseen ways at execution time.
For instance, the vehicle might find it difficult to merge right
and park at a gas station because of unanticipated heavy
traffic. This observation motivates this work, which leverages
visual perception to bridge the communication gap between
different decision-making layers for urban driving.
The authors are with the Department of Computer Science, SUNY Binghamton,
Binghamton NY 13902. {yding25, ccui7, xzhan244, zhangs}@binghamton.edu
In this paper, we develop Grounded Layered Autonomous
Driving (GLAD), a planning framework for urban driving
that includes three decision-making layers for service, be-
havior, and motion respectively. The service (top) layer is
for sequencing POIs to be visited in order to fulfill users’
service requests. User preferences, such as “Lucas likes
shopping with Emma”, can be incorporated into this layer.
The behavior (middle) layer plans driving behaviors, such as
“merge left” and “drive straight”. The motion (bottom) layer
aims to compute trajectories and follow them to realize the
middle-layer behaviors. GLAD is novel in its bidirectional
communication mechanism between different layers. For
example, the bottom layer reports motion cost estimates up to
the top two layers for plan optimization. The safety estimates
of different driving behaviors (middle layer) are reported to
the top layer, and the safety estimation is conditioned on
the motion trajectories in the bottom layer. An overview of
GLAD is presented in Fig. 1.
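The upward flow of estimates described above can be illustrated with a minimal sketch. It is not the authors' implementation: the map coordinates, the safety values, the `risk_weight` parameter, and all function names are hypothetical stand-ins chosen only to show how a top layer might combine motion costs from the bottom layer with safety estimates from the middle layer when sequencing POIs.

```python
from itertools import permutations

# Illustrative map and numbers; none of these values come from the paper.
POSITIONS = {
    "work": (0, 0), "school": (2, 1), "gas": (1, 3),
    "grocery": (3, 3), "home": (5, 0),
}
# Hypothetical safety estimates for a leg, keyed by (origin, destination).
DEFAULT_SAFETY = 0.9
RISKY_LEGS = {("school", "gas"): 0.5}  # e.g. a hard merge in heavy traffic


def motion_cost(origin, dest):
    """Bottom-layer stand-in: Manhattan distance as a motion cost estimate."""
    (ox, oy), (dx, dy) = POSITIONS[origin], POSITIONS[dest]
    return abs(ox - dx) + abs(oy - dy)


def safety_estimate(origin, dest):
    """Middle-layer stand-in: probability that the leg's behaviors are safe."""
    return RISKY_LEGS.get((origin, dest), DEFAULT_SAFETY)


def plan_cost(route, risk_weight):
    """Top layer combines the estimates reported by the two lower layers."""
    legs = list(zip(route, route[1:]))
    travel = sum(motion_cost(a, b) for a, b in legs)
    risk = sum(1.0 - safety_estimate(a, b) for a, b in legs)
    return travel + risk_weight * risk


def service_plan(start, pois, goal, risk_weight=10.0):
    """Enumerate POI orderings and keep the one with the lowest combined cost."""
    candidates = [(start, *order, goal) for order in permutations(pois)]
    return min(candidates, key=lambda route: plan_cost(route, risk_weight))


if __name__ == "__main__":
    print(service_plan("work", ["school", "gas", "grocery"], "home"))
```

With these made-up numbers, the low safety estimate on the school-to-gas leg steers the planner toward an ordering that avoids that leg, even though a route containing it has the same travel cost. Exhaustive enumeration is used only because the example has three POIs.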
“Grounding” is a concept that was initially developed in
the literature of symbolic reasoning [2]. In this work, the
vehicle’s behavioral planner (middle layer) relies on symbolic
rules, such as “If the current situation is safe, a merge
left behavior will move a vehicle to the lane on its left.”
While classical planning methods assume perfect information
about “X is safe,” an autonomous vehicle needs its perception
algorithms to visually ground such symbols in the real world.
We used the CARLA simulator [3] to collect a dataset of
13.8k instances, each including 16 images, for evaluating
the safety levels of driving behaviors. Learning from our
gathered dataset enables GLAD to visually ground symbolic
predicates for planning driving behaviors. We have compared
GLAD with baseline methods [4], [5] that support decision-
making at behavioral and motion levels. Results show that
GLAD produced the highest overall utility compared to the
baseline methods.
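As a rough illustration of visual grounding, the sketch below turns a safety probability into the truth value of the symbol "is safe" and then applies the merge left rule quoted above. The threshold of 0.5, the lane indexing, and the function names are assumptions made for this example; in GLAD the probability would come from the learned safety estimator, which is not reproduced here.

```python
def ground_is_safe(p_safe, threshold=0.5):
    """Turn a classifier's safety probability into a truth value for 'is safe'.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return p_safe >= threshold


def merge_left(lane, is_safe):
    """Symbolic rule: if the situation is safe, merge left moves one lane left.

    Lanes are indexed from 0 (leftmost); this indexing is an assumption.
    """
    if is_safe and lane > 0:
        return lane - 1
    return lane  # precondition not met: the vehicle stays in its lane


if __name__ == "__main__":
    # The probabilities below are made up; a perception model would supply them.
    print(merge_left(2, ground_is_safe(0.8)))
    print(merge_left(2, ground_is_safe(0.2)))
```

The point of the sketch is the separation of concerns: the symbolic rule never inspects images, and the perception side never reasons about lanes.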
II. BACKGROUND AND RELATED WORK
[Figure 1 diagram: three planning layers, service-level (top), behavior-level (middle), and motion-level (bottom), exchanging behavior planning goals, motion planning goals, safety estimates, and navigation cost. Inputs: service requests, user preferences (soft constraints), traffic conditions, vehicle localization (current positions), and visual perception. Safety estimator: front, back, left, and right images (16*288*288*3) pass through a pretrained encoder (4*8192) into a binary classifier trained on the image dataset; output labeled [0.0, 0.1].]
Fig. 1. An overview of the GLAD planning framework for complex driving tasks in urban scenarios. GLAD consists of three decision-making layers
for fulfilling service requests, sequencing driving behaviors, and computing motion trajectories, respectively. GLAD is a visually grounded planning
framework, because the safety levels of driving behaviors are evaluated using computer vision.
arXiv:2210.02302v1 [cs.RO] 5 Oct 2022
Service agents sometimes need more than one action
to fulfill service requests. Task planning methods aim to
sequence symbolic actions to complete such complex tasks.
There are at least two types of task planning, namely automated
planning [6], [7] and planning under uncertainty [8],
that can be distinguished based on their assumptions about
the determinism of action outcomes. Automated planning
systems compute a sequence of actions that leads
to a goal state. Assuming non-deterministic action outcomes,
planning under uncertainty algorithms model state transitions
using Markov decision processes (MDPs) and compute
a policy that maps the current state to an action. Other than
task planning, autonomous vehicles need motion planning
algorithms [9], [10] to compute motion trajectories and
generate control signals to follow those trajectories. This
work involves both task planning and motion planning.
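To make the notion of a policy concrete, the following sketch runs value iteration, a standard MDP solution method, on a toy three-state driving problem. The states, actions, transition probabilities, rewards, and discount factor are all invented for illustration and do not come from the paper or from the cited planners.

```python
def q_value(state, action, values, transitions, rewards, gamma):
    """Expected return of taking `action` in `state` under current `values`."""
    expected_next = sum(p * values[nxt] for p, nxt in transitions[(state, action)])
    return rewards[(state, action)] + gamma * expected_next


def value_iteration(states, transitions, rewards, gamma=0.95, tol=1e-6):
    """Return (values, policy); states without outgoing actions are terminal."""
    actions = {s: [a for (s2, a) in transitions if s2 == s] for s in states}
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if not actions[s]:
                continue  # terminal state keeps value 0
            best = max(q_value(s, a, values, transitions, rewards, gamma)
                       for a in actions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # The policy maps each non-terminal state to its greedy action.
    policy = {
        s: max(actions[s],
               key=lambda a: q_value(s, a, values, transitions, rewards, gamma))
        for s in states if actions[s]
    }
    return values, policy


# Toy problem (all numbers are assumptions): merging and parking may fail.
STATES = ["left_lane", "right_lane", "parked"]
TRANSITIONS = {
    ("left_lane", "merge_right"): [(0.7, "right_lane"), (0.3, "left_lane")],
    ("left_lane", "keep_lane"): [(1.0, "left_lane")],
    ("right_lane", "park"): [(0.9, "parked"), (0.1, "right_lane")],
    ("right_lane", "keep_lane"): [(1.0, "right_lane")],
}
REWARDS = {key: -1.0 for key in TRANSITIONS}  # unit cost per action

if __name__ == "__main__":
    values, policy = value_iteration(STATES, TRANSITIONS, REWARDS)
    print(policy)
```

Unlike a fixed action sequence, the resulting policy still prescribes an action when a merge fails and the vehicle remains in the left lane.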
Next, we discuss how those different planning paradigms
have been used in driving and robotics literature.
A. Planning for Autonomous Driving
Within the context of planning in autonomous driving,
the majority of research has been focusing on comput-
ing trajectories connecting a vehicle’s current position to
its desired goal [11]–[15]. For instance, Hu et al. (2018)
developed an approach that generates an optimal path in
real time as well as the appropriate acceleration and speed
for the autonomous vehicle in order to avoid both static
and moving obstacles [11]. Those works concentrated on
the motion planning and control problems in autonomous
driving, whereas we further include task-level planning for
driving behaviors and fulfilling service requests.
Driving behaviors, such as merging lanes and turning,
have been modeled as a set of symbolic actions [16]–[19].
Those works focused on task-level, discrete-space behavioral
planning, and did not consider safety, cost, or both at the
motion level. Another difference is that those works did not
consider user preferences, whereas we do in this work.
Recent research has incorporated state estimation into
behavioral planning in autonomous urban driving. For in-
stance, the work of Phiquepal and Toussaint (2019) modeled
urban driving as a partially observable Markov decision
process to enable active information gathering for decision-
making [20]. Another example is the work of Ding et
al. (2020) that quantitatively estimated safety levels for
behavioral planning [5]. In line with those methods, we
consider partially observable worlds, and compute plans
under uncertainty. Beyond that, GLAD has a perception com-
ponent, and leverages egocentric vision for safety evaluation,
which improves the real-world applicability of our approach
compared with those methods.
B. Task and Motion Planning in Robotics
Researchers in robotics have developed a variety of task
and motion planning (TAMP) algorithms to enable robots to
plan to fulfill task-level goals while maintaining motion-level
feasibility, as summarized in review articles [21], [22]. The
majority of TAMP research concentrated on manipulation do-
mains, where ensuring plan feasibility is key, and efficiency
in task completion is secondary [23]–[26]. Urban driving
tasks tend to be time-consuming, and sub-optimal plans
might result in a very long execution time. Different from
most TAMP methods, our work incorporates task-completion
efficiency in addition to plan feasibility.
There are a few TAMP works that incorporated efficiency
into plan optimization [27]–[29]. For instance, the work
of Zhang et al. (2022) leveraged computer vision towards
computing efficient and feasible task-motion plans for a
mobile manipulator in indoor environments [29]. In line with
those methods, we consider both task-completion efficiency
and plan feasibility. Different from those TAMP methods
for indoor scenarios, GLAD focuses on urban driving, and
further considers user preferences and road safety.
III. PROBLEM STATEMENT
Planning-time Input: A service request is specified as
hard constraints, denoted as csthard, that can be satisfied
using multiple driving behaviors. User preferences are formulated
using soft constraints, denoted as cstsoft. Violating such a soft
constraint introduces a penalty. In our domains, service task
specifications and user preferences are provided as part of
the problem definition. The hard and soft constraints can
be encoded in different ways depending on the syntax of
planning languages and systems.
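One possible encoding of these constraints, using the scenario from the introduction, is sketched below. It is only an illustration: the variable names `cst_hard` and `cst_soft` mirror the symbols above, while the tuple format, the penalty values, and the reading of Lucas's preferences as ordering constraints are assumptions of this example rather than the encoding used by GLAD.

```python
from itertools import permutations

# Hard constraints: POIs that must all be visited.
cst_hard = {"school", "gas", "grocery", "home"}

# Soft constraints as (earlier, later, penalty): a penalty is paid when
# `earlier` is not visited before `later`. Penalty values are made up.
cst_soft = [
    ("gas", "school", 5.0),      # assumed reading: refuel before picking up Lucas
    ("school", "grocery", 3.0),  # assumed reading: Lucas joins the shopping trip
]


def satisfies_hard(plan, hard):
    """A plan is feasible only if it visits every required POI."""
    return hard <= set(plan)


def soft_penalty(plan, soft):
    """Sum the penalties of all violated ordering preferences."""
    return sum(penalty for earlier, later, penalty in soft
               if plan.index(earlier) > plan.index(later))


def best_plan(hard, soft, final_stop="home"):
    """Pick the feasible plan ending at `final_stop` with the lowest penalty."""
    stops = sorted(hard - {final_stop})
    candidates = [(*order, final_stop) for order in permutations(stops)]
    feasible = [plan for plan in candidates if satisfies_hard(plan, hard)]
    return min(feasible, key=lambda plan: soft_penalty(plan, soft))


if __name__ == "__main__":
    print(best_plan(cst_hard, cst_soft))
```

Hard constraints filter candidate plans, whereas soft constraints only rank them, so a plan that violates a preference remains executable at a cost.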