Traffic-Aware Autonomous Driving
with Differentiable Traffic Simulation
Laura Zheng, Sanghyun Son, and Ming C. Lin
gamma.umd.edu/trafficdriving
Abstract—While there have been advancements in autonomous
driving control and in traffic simulation, little to no work has
explored unifying the two with deep learning. Research in each
area tends to focus on entirely separate problems, yet traffic and
driving are inherently related in the real world. In this paper,
we present Traffic-Aware Autonomous Driving (TrAAD), a
generalizable distillation-style method for traffic-informed
imitation learning that directly optimizes for faster traffic flow
and lower energy consumption. TrAAD focuses on supervising
speed control in imitation learning systems, since most driving
research concentrates on perception and steering. Moreover, our
method addresses the lack of co-simulation between traffic and
driving simulators and provides a basis for directly involving
traffic simulation in autonomous driving in future work. Our
results show that, when information from traffic simulation is
involved in the supervision of imitation learning methods, an
autonomous vehicle can learn to accelerate in a fashion that
benefits traffic flow and overall energy consumption for all
nearby vehicles.
I. INTRODUCTION
The ideal autonomous vehicle (AV) should be able to
minimize the travel time of a route, maximize the energy
efficiency of the vehicle, and provide a smooth and safe
experience for the riders. These objectives are not only
important to the passenger’s experience, but also for greater
global benefits. Annually in the US, traffic congestion
accounts for 29 billion dollars in costs [1], transportation
accounts for 27% of carbon emissions [2], and motor vehicle
accidents are the leading cause of unnatural death as of
2022 [3]. From a learning perspective, the extent an AV can
improve on these objectives is affected by the environment
around it, e.g. the surrounding vehicle traffic and the road
networks. Similarly to other multi-agent problems, improving
traffic flow and energy efficiency for all vehicles in the system
can also benefit each vehicle's individual objectives. Conversely,
the vehicle's motion also impacts the traffic around it. An
AV influences the flow of human-driven traffic by acting as a
pace car to those behind it. A “pace car” is commonly used in
racing to control the speeds of competing vehicles; this same
notion can be applied to traffic control in the physical world.
In a driving policy, we can emulate this effect by defining
an individual vehicle’s objective in terms of global metrics.
As long as a model of the traffic and its dynamics on a road
network are accessible, an autonomous driving policy can
directly optimize for traffic flow, energy efficiency, and smooth
acceleration. Decades of traffic engineering research present
The authors are with the Department of Computer Science, University of
Maryland at College Park, MD, U.S.A. E-mail: {lyzheng,shh1295,lin}@umd.edu
[Fig. 1 diagram: a co-simulated environment couples CARLA with a
differentiable traffic simulator. In Phase 1 ("Learn to Accel"), an
acceleration agent is trained via reward updates from the traffic
simulator. In Phase 2 ("Learn to do Everything Else"), an imitation
network controller is trained from offline expert and privileged-agent
data via supervised-loss updates, with the acceleration agent supervising
acceleration control.]
Fig. 1: Training for Traffic-Aware Autonomous Driving.
Our method can be adapted to most existing imitation learning
frameworks for driving by adding one extra training phase, in
which we isolate "learning to accelerate". An agent whose action
space spans only acceleration actions navigates a co-simulated
environment and is rewarded when it improves overall traffic
flow and fuel consumption. In Phase 2, where the agent "learns
to do everything else", the acceleration agent is frozen and
supervises the speed control of an imitation learning agent via
distillation.
sophisticated mathematical models of traffic, including
car-following models [4]–[7], traffic flow models [8]–[11],
and traffic flow theory [12]. These mathematical models, though
often too simplified to account for the uncertainty of driver
behavior, are computationally efficient, differentiable, and
require no data. One alternative is to model the traffic
environment with a neural network, as in some of the latest
work [13]–[16]. Although such methods are more accurate than
the ODE models, they require large amounts of data to generalize
and are subject to the usual problems of deep learning, such as
bias and distributional shift. In this paper, we couple a
learning-based traffic control algorithm with differentiable
ODE traffic models. We use the gradients of the forward traffic
dynamics to guide the learning of a driving policy, so long as
the policy has access to traffic information. At minimum, this
traffic information consists of simulated position and velocity
states over time, as further explained in Section III-A.
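As a concrete illustration of such a differentiable ODE traffic model, the sketch below implements the Intelligent Driver Model (a standard car-following model of the kind cited above), together with an analytical partial derivative of its acceleration with respect to the gap to the lead vehicle. The parameter values are illustrative assumptions, not those used in our experiments.

```python
import math

# Illustrative IDM parameters (assumed, not the paper's values):
# desired speed, time headway, max accel, comfortable decel,
# minimum gap, and acceleration exponent.
V0, T, A_MAX, B, S0, DELTA = 30.0, 1.5, 1.0, 2.0, 2.0, 4.0

def idm_accel(v, dv, s):
    """IDM acceleration for ego speed v, approach rate dv, and gap s."""
    s_star = S0 + v * T + v * dv / (2.0 * math.sqrt(A_MAX * B))
    return A_MAX * (1.0 - (v / V0) ** DELTA - (s_star / s) ** 2)

def idm_accel_grad_s(v, dv, s):
    """Analytical partial derivative of the acceleration w.r.t. the gap s,
    the kind of closed-form gradient a differentiable simulator exposes."""
    s_star = S0 + v * T + v * dv / (2.0 * math.sqrt(A_MAX * B))
    return 2.0 * A_MAX * s_star ** 2 / s ** 3
```

Because the model is a smooth closed-form expression, its gradients can be defined analytically rather than traced through an autodiff library, which is cheap enough to evaluate at every simulation step.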
We present a generalizable method for traffic-aware au-
tonomous driving (TrAAD) which takes advantage of differ-
entiable traffic simulation. By coupling a driving environment
with traffic simulation, the driving policy can retrieve traffic
information during training and learn behaviors that are both
beneficial to individual and global goals.
arXiv:2210.03772v5 [cs.RO] 7 Apr 2023
In TrAAD, we add a phase of training in addition to
traditional imitation learning for driving, where the vehicle
“learns to accelerate”. This phase involves maximizing the
overall traffic flow of a vehicle’s local lane, minimizing
the fuel consumption of all vehicles, and discouraging the
acceleration actions from being too jerky. Because our method
supervises acceleration via distillation, it is generalizable to
nearly any standard imitation learning framework, regard-
less of architecture or design. Our results show that our
method, when implemented on top of existing state-of-the-art
driving frameworks, improves traffic flow, minimizes energy
consumption for the AV, and enhances the passenger’s ride
experience.
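The Phase-1 objective described above might be combined into a scalar reward as in the following sketch. The specific proxies (mean lane speed for flow, mean fuel rate, finite-difference jerk) and the weights are our assumptions for illustration, not the exact reward used by TrAAD.

```python
def phase1_reward(lane_speeds, fuel_rates, accel, prev_accel,
                  w_flow=1.0, w_fuel=0.1, w_jerk=0.05, dt=0.1):
    """Illustrative Phase-1 ("learn to accelerate") reward: reward the
    local lane's traffic flow, penalize fleet-wide fuel consumption,
    and penalize jerky acceleration changes between steps."""
    flow = sum(lane_speeds) / len(lane_speeds)   # proxy for lane flow
    fuel = sum(fuel_rates) / len(fuel_rates)     # mean fuel rate, all vehicles
    jerk = abs(accel - prev_accel) / dt          # discourage abrupt changes
    return w_flow * flow - w_fuel * fuel - w_jerk * jerk
```

Note that the flow and fuel terms depend on every nearby vehicle's state, which is exactly the global information a coupled traffic simulator makes available during training.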
In summary, we present the following key contributions:
1) A simulated traffic-annotated driving dataset for imitation
learning for self-driving cars;
2) Use of gradients from differentiable traffic simulation to
improve sample efficiency for autonomous vehicles;
3) A generalizable method for traffic-aware autonomous driving,
which learns to control the vehicle via rewards based on societal
traffic-based objectives.
Additional results, materials, code, datasets, and information
can be found on our project website.
II. RELATED WORKS
A. Autonomous Driving with Traffic Information
Zhu et al. recently proposed a method for safe, efficient,
and comfortable velocity control using RL [17]. Similarly to
one of our objectives, they aim to learn acceleration behavior
that exceeds the safety and comfort of human expert drivers.
One major difference is that our work complements existing
end-to-end autonomous driving systems that use multi-modal
sensor data: the learned acceleration behavior cooperates with
control behavior learned through imitation, rather than being
learned in a pure traffic-simulation setting. In addition, our
objective is to optimize directly over the entire traffic state,
not just the objectives of the autonomous vehicle itself, whereas
the reward objectives of [17] are inferred from a
partially-observed point of view. Other works have
considered learning driving behavior with passenger comfort
and safety in mind, but many do not directly involve traffic
state information beyond partially-observed settings [18]–[20].
Wegener et al. present a method for energy-efficient urban
driving via RL [21] in a partially-observed setting purely within
traffic simulation; however, it does not address integration with
current work on more complex vehicle control. In short, our
method provides a broader approach for learning a policy
beneficial to both individual and societal traffic objectives,
while remaining easy to integrate into existing state-of-the-art
end-to-end driving control methods.
B. Differentiable Microscopic Traffic Simulation
While differentiable physics simulation has been gaining
popularity in recent years, differentiable traffic simulation
is under-explored, especially in applications for autonomous
driving. In 2021, Andelfinger first introduced the potential of
differentiable agent-based traffic simulation, as well as tech-
niques to address discontinuities of control flow [22]. In his
work, Andelfinger highlights continuous solutions for discrete
or discontinuous operations such as conditional branching,
iteration, time-dependent behavior, or stochasticity in forward
simulation, ultimately enabling the use of automatic differen-
tiation (autodiff) libraries for applications such as traffic light
control. One key difference between our work and [22] is that
our implementation of differentiable simulation accounts for
learning agents acting independently from agents following a
car-following model, and is compatible with existing learning
frameworks. In addition, we optimize traffic-related learning
by defining analytical gradients rather than relying solely
on auto-differentiation. Most recently, Son et al. proposed
a novel differentiable hybrid traffic simulator that computes
gradients for both macroscopic, or fluid-like, representations
and agent-based microscopic representations, as well as the
transitions between them [23]. In our work, we focus solely
on microscopic agent-based simulation to maintain relevance
to autonomous driving frameworks.
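One of the techniques from [22] discussed above, replacing discontinuous control flow with continuous surrogates, can be illustrated with a sigmoid-smoothed conditional. The sharpness constant `k` below is an arbitrary assumption; larger values approach the hard branch more closely.

```python
import math

def sigmoid(x, k=10.0):
    """Logistic function with sharpness k; approaches a step as k grows."""
    return 1.0 / (1.0 + math.exp(-k * x))

def smooth_branch(cond_val, if_true, if_false, k=10.0):
    """Continuous surrogate for `if cond_val > 0: if_true else if_false`.
    Blending the two branches with a sigmoid keeps the expression
    differentiable in cond_val, so autodiff can propagate gradients
    through what would otherwise be a hard, non-differentiable branch."""
    w = sigmoid(cond_val, k)
    return w * if_true + (1.0 - w) * if_false
```

The same smoothing idea extends to the other discrete operations named above (iteration bounds, time-dependent switching, stochastic choices) by replacing each hard decision with a weighted blend.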
C. Deep Learning with Traffic Simulation
Deep reinforcement learning has been used to address
futuristic and complex problems for control of autonomous
vehicles in traffic. One survey on Deep RL for motion
planning for autonomous vehicles by Aradi [24] delineates
challenges facing the application of DRL to traffic problems,
one of which is the long and potentially unsuccessful learning
process. This has been addressed in several ways through
curriculum learning [25]–[27], adversarial learning [28], [29],
or model-based action choice. In our work, we address
this issue via sample enhancement for on-policy deep re-
inforcement learning. With differentiable traffic simulation
and access to gradients of reward with respect to policy
action, we can artificially generate “helpful” samples during
learning with respect to reward. “FLOW” by Wu et al. [30]
presents a deep reinforcement learning (DRL) benchmarking
framework, built on the popular microscopic traffic simulator
SUMO [31]. Wu et al. provide motivation for integrating
traffic dynamics into autonomous driving objectives with
DRL, defining the problem as "mixed autonomy". Novel
objectives for driving include reducing congestion, carbon
emissions, and other societal costs, all in anticipation of
future mixed-autonomy traffic. Based on FLOW,
Vinitsky et al. published a series of benchmarks highlighting
4 main scenarios regarding traffic light control, bottleneck
throughput, optimizing intersection capacity, and controlling
merge on-ramp shock waves [32]. We extend the environments
from FLOW’s DRL framework to be differentiable and show
benchmark results for enhanced DRL algorithms utilizing
traffic flow gradients for optimization.
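The gradient-based "sample enhancement" described above can be sketched on a toy differentiable reward: a sampled action is nudged along the reward gradient to produce a higher-reward auxiliary sample. The quadratic reward and the step size `alpha` are illustrative assumptions, not the reward or hyperparameters from our benchmarks.

```python
def reward(a, a_opt=2.0):
    """Toy differentiable reward, peaked at the action a_opt."""
    return -(a - a_opt) ** 2

def reward_grad(a, a_opt=2.0):
    """Analytical gradient of the toy reward w.r.t. the action."""
    return -2.0 * (a - a_opt)

def enhance_sample(a, alpha=0.1):
    """Nudge a sampled action along the reward gradient, yielding a
    'helpful' auxiliary sample for on-policy reinforcement learning."""
    return a + alpha * reward_grad(a)
```

In the full setting, `reward_grad` would come from backpropagating the traffic-flow objective through the differentiable simulator rather than from a closed-form toy reward.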
III. BACKGROUND
A. Simulation-related Notation and Definitions
To integrate traffic simulation into learning and optimiza-
tion frameworks for autonomous driving, we need differen-
tiable forward simulation. Agent-based traffic simulation is