In TrAAD, we add a phase of training in addition to
traditional imitation learning for driving, where the vehicle
“learns to accelerate”. This phase involves maximizing the
overall traffic flow of a vehicle’s local lane, minimizing
the fuel consumption of all vehicles, and discouraging the
acceleration actions from being too jerky. Because our method
supervises acceleration via distillation, it is generalizable to
nearly any standard imitation learning framework, regard-
less of architecture or design. Our results show that our
method, when implemented on top of existing state-of-the-art
driving frameworks, improves traffic flow, reduces energy
consumption for the AV, and enhances the passenger’s ride
experience.
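As a concrete illustration only (the exact reward terms, fuel model, and weights used by TrAAD are specified later and may differ), such an acceleration objective can be sketched as a per-step reward

r_t \;=\; w_{\text{flow}}\,\frac{1}{|\mathcal{N}_t|}\sum_{i\in\mathcal{N}_t} v_i^{(t)} \;-\; w_{\text{fuel}}\sum_{i} E\!\big(v_i^{(t)}, a_i^{(t)}\big) \;-\; w_{\text{jerk}}\,\Big|\tfrac{a^{(t)}-a^{(t-1)}}{\Delta t}\Big|,

where \mathcal{N}_t denotes the vehicles in the ego vehicle’s local lane (mean speed serving as a proxy for flow), E(\cdot) is a fuel-consumption model, a^{(t)} is the ego acceleration, and the weights w are placeholders.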
In summary, we present the following key contributions:
1) A simulated traffic-annotated driving dataset for imitation learning for self-driving cars;
2) Use of gradients from differentiable traffic simulation to improve sample efficiency for autonomous vehicles;
3) A generalizable method for traffic-aware autonomous driving, which learns to control the vehicle via rewards based on societal traffic-based objectives.
Additional results, materials, code, datasets, and information
can be found on our project website.
II. RELATED WORKS
A. Autonomous Driving with Traffic Information
Zhu et al. recently proposed a method for safe, efficient,
and comfortable velocity control using RL [17]. Similarly to
one of our objectives, they aim to learn acceleration behavior
that exceeds the safety and comfort of human expert drivers.
One major difference is that our work complements existing
end-to-end autonomous driving systems that use multi-modal
sensor data: the learned acceleration behavior cooperates
with control behavior learned via imitation learning, rather
than being learned in a pure traffic simulation setting.
In addition, our objective directly optimizes over the
entire traffic state, not just the objectives for the autonomous
vehicle itself. The reward objectives of [17] are also inferred
from a partially-observed point of view. Other works have
considered learning driving behavior with passenger comfort
and safety in mind, but many do not directly involve traffic
state information beyond partially-observed settings [18]–[20].
Wegener et al. present a method for energy-efficient urban
driving via RL [21] in a partially-observed setting purely within
traffic simulation; however, their approach does not address
integration with current methods for more complex vehicle
control. In short, our method offers a more general approach for
learning a policy beneficial to both individual and societal
traffic objectives, while being easily integrated into existing
state-of-the-art end-to-end driving control methods.
B. Differentiable Microscopic Traffic Simulation
While differentiable physics simulation has been gaining
popularity in recent years, differentiable traffic simulation
is under-explored, especially in applications for autonomous
driving. In 2021, Andelfinger first introduced the potential of
differentiable agent-based traffic simulation, as well as tech-
niques to address discontinuities of control flow [22]. In his
work, Andelfinger highlights continuous solutions for discrete
or discontinuous operations such as conditional branching,
iteration, time-dependent behavior, or stochasticity in forward
simulation, ultimately enabling the use of automatic differen-
tiation (autodiff) libraries for applications such as traffic light
control. One key difference between our work and [22] is that
our implementation of differentiable simulation accounts for
learning agents acting independently from agents following a
car-following model, and is compatible with existing learning
frameworks. In addition, we optimize traffic-related learning
by defining analytical gradients rather than relying solely
on auto-differentiation. Most recently, Son et al. proposed
a novel differentiable hybrid traffic simulator that computes
gradients for both macroscopic, or fluid-like, representations
and agent-based microscopic representations, as well as the
transitions between them [23]. In our work, we focus solely
on microscopic agent-based simulation to maintain relevance
to autonomous driving frameworks.
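As an illustration of what such analytical gradients can look like (a minimal sketch under assumed parameters; TrAAD’s actual car-following model and gradient derivations are presented later and may differ), the Intelligent Driver Model (IDM) admits closed-form partial derivatives of the commanded acceleration with respect to the follower’s gap and the two speeds:

```python
import math

# Minimal sketch: closed-form IDM acceleration and its analytical partial
# derivatives. The parameters (v0, T, a_max, b, delta, s0) are standard IDM
# constants; this is illustrative, not TrAAD's exact formulation.
def idm_accel_and_grads(s, v, v_lead, v0=30.0, T=1.5, a_max=1.0, b=1.5,
                        delta=4.0, s0=2.0):
    """Return IDM acceleration and d(accel)/d(s, v, v_lead).

    s      : bumper-to-bumper gap to the leader [m]
    v      : follower (ego) speed [m/s]
    v_lead : leader speed [m/s]
    """
    dv = v - v_lead                                  # approach rate
    sqrt_ab = math.sqrt(a_max * b)
    s_star = s0 + v * T + v * dv / (2.0 * sqrt_ab)   # desired gap
    accel = a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)

    # Analytical partials (chain rule through s_star where needed).
    da_ds_star = -2.0 * a_max * s_star / s ** 2
    da_ds = 2.0 * a_max * s_star ** 2 / s ** 3
    ds_star_dv = T + (2.0 * v - v_lead) / (2.0 * sqrt_ab)
    da_dv = (-a_max * delta * v ** (delta - 1) / v0 ** delta
             + da_ds_star * ds_star_dv)
    da_dv_lead = da_ds_star * (-v / (2.0 * sqrt_ab))
    return accel, (da_ds, da_dv, da_dv_lead)


if __name__ == "__main__":
    a, grads = idm_accel_and_grads(s=20.0, v=25.0, v_lead=22.0)
    print(a, grads)
```

Closed-form partials like these can be evaluated per vehicle and per step without tracing the whole rollout through an autodiff library, which is one motivation for preferring them over pure auto-differentiation.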
C. Deep Learning with Traffic Simulation
Deep reinforcement learning has been used to address
futuristic and complex problems for control of autonomous
vehicles in traffic. One survey on Deep RL for motion
planning for autonomous vehicles by Aradi [24] delineates
challenges facing the application of DRL to traffic problems,
one of which is the long and potentially unsuccessful learning
process. This has been addressed in several ways through
curriculum learning [25]–[27], adversarial learning [28], [29],
or model-based action choice. In our work, we address this
issue via sample enhancement for on-policy deep reinforcement
learning: with differentiable traffic simulation and access to
gradients of the reward with respect to policy actions, we can
synthesize additional “helpful” samples during learning (see
the sketch at the end of this subsection). “FLOW” by Wu et al. [30]
presents a deep reinforcement learning (DRL) benchmarking
framework, built on the popular microscopic traffic simulator
SUMO [31]. Wu et al. provide motivation for integrating
traffic dynamics into autonomous driving objectives with
DRL, defining the problem/task as “mixed autonomy”. Novel
objectives for driving include reducing congestion, carbon
emissions, and other societal costs, all in anticipation of
future mixed autonomy traffic. Based on FLOW, Vinitsky et al.
published a series of benchmarks covering four main scenarios:
traffic light control, bottleneck throughput, intersection
capacity optimization, and control of merge on-ramp shock
waves [32]. We extend the environments
from FLOW’s DRL framework to be differentiable and show
benchmark results for enhanced DRL algorithms utilizing
traffic flow gradients for optimization.
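Below is a minimal sketch of the gradient-informed sample enhancement referenced earlier in this subsection; `traffic_step`, its toy reward, and the single-gradient-step perturbation are illustrative assumptions rather than TrAAD’s actual implementation. The point is only that, because the simulator is differentiable, a logged action can be nudged along the reward gradient and relabeled as an additional sample for the on-policy learner.

```python
import torch

# Hypothetical stand-in for one step of a differentiable traffic simulator
# that returns a scalar reward (higher mean speed, lower fuel use).
def traffic_step(state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
    mean_speed = state.mean() + 0.1 * action
    fuel = 0.05 * action ** 2
    return mean_speed - fuel

def enhance_sample(state, action, step_size=0.1):
    """Perturb an action along d(reward)/d(action) to create an extra,
    higher-reward sample for the on-policy learner."""
    action = action.clone().detach().requires_grad_(True)
    reward = traffic_step(state, action)
    reward.backward()                       # gradient through the simulator
    with torch.no_grad():
        better_action = action + step_size * action.grad
        better_reward = traffic_step(state, better_action)
    return better_action.detach(), better_reward.detach()

state = torch.tensor([12.0, 13.5, 11.8])    # e.g., speeds of nearby vehicles
action = torch.tensor(0.5)                  # logged ego acceleration [m/s^2]
new_action, new_reward = enhance_sample(state, action)
```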
III. BACKGROUND
A. Simulation-related Notation and Definitions
To integrate traffic simulation into learning and optimiza-
tion frameworks for autonomous driving, we need differen-
tiable forward simulation. Agent-based traffic simulation is