Repainting and Imitating Learning for Lane Detection
Yue He
heyue04@baidu.com
Baidu Inc.
Beijing, China
Minyue Jiang
jiangminyue@baidu.com
Baidu Inc.
Beijing, China
Xiaoqing Ye
yexiaoqing@baidu.com
Baidu Inc.
Shanghai, China
Liang Du
duliang@mail.ustc.edu.cn
Fudan University
Shanghai, China
Zhikang Zou
zouzhikang@baidu.com
Baidu Inc.
Shenzhen, China
Wei Zhang
zhangwei99@baidu.com
Baidu Inc.
Shenzhen, China
Xiao Tan
tanxiao01@baidu.com
Baidu Inc.
Shenzhen, China
Errui Ding
dingerrui@baidu.com
Baidu Inc.
Beijing, China
ABSTRACT
Current lane detection methods are struggling with the invisibility
lane issue caused by heavy shadows, severe road mark degradation,
and serious vehicle occlusion. As a result, discriminative lane fea-
tures can be barely learned by the network despite elaborate designs
due to the inherent invisibility of lanes in the wild. In this paper, we
aim to find an enhanced feature space in which the lane features
are distinctive while maintaining a similar distribution of lanes in
the wild. To achieve this, we propose a novel Repainting and Imitating
Learning (RIL) framework containing a pair of teacher and
student models, without any extra data or laborious labeling. Specifically,
in the repainting step, an enhanced ideal virtual lane dataset
is built in which only the lane regions are repainted while non-lane
regions are kept unchanged, maintaining the similar distribution
of lanes in the wild. The teacher model learns enhanced discrimi-
native representation based on the virtual data and serves as the
guidance for a student model to imitate. In the imitating learning
step, through the scale-fusing distillation module, the student net-
work is encouraged to generate features that mimic the teacher
model both on the same scale and cross scales. Furthermore, the
coupled adversarial module builds the bridge to connect not only
teacher and student models but also virtual and real data, adjusting
the imitating learning process dynamically. Note that our method
introduces no extra time cost during inference and can be plug-and-play
in various cutting-edge lane detection networks. Experimental
results prove the effectiveness of the RIL framework on both
CULane and TuSimple for four modern lane detection methods. The
code and model will be available soon.
∗Equal contribution.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
MM ’22, October 10–14, 2022, Lisboa, Portugal
©2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9203-7/22/10. . . $15.00
https://doi.org/10.1145/3503161.3548042
CCS CONCEPTS
• Computing methodologies → Interest point and salient region detections;
KEYWORDS
lane detection, image repainting, scale-fusing distillation, coupled
adversarial module
ACM Reference Format:
Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei Zhang,
Xiao Tan, and Errui Ding. 2022. Repainting and Imitating Learning for
Lane Detection. In Proceedings of the 30th ACM International Conference on
Multimedia (MM ’22), October 10–14, 2022, Lisboa, Portugal. ACM, New York,
NY, USA, 9 pages. https://doi.org/10.1145/3503161.3548042
1 INTRODUCTION
Lane detection is a crucial task in autonomous driving [5], as detected
lanes serve as visual cues for advanced driver assistance systems
(ADAS) to keep vehicles stably following lane markings. Thus,
autonomous vehicles need to locate each lane's precise position.
With the development of deep learning, neural networks [32]
have been widely used in lane detection for their compelling performance.
Early deep-learning-based methods detect lanes through
pixel-wise segmentation frameworks [22, 23, 27], where each
pixel is assigned a binary label to indicate whether it belongs to a
lane or not. More recently, various anchor-based methods [3, 13, 30]
have been proposed; different forms of anchors, such as line anchors
and box anchors, are used to let the networks focus the optimization
on the line shape by regressing relative coordinates. Besides,
row-wise classification methods [14, 24, 26, 37] rely on the shape
prior of lanes and predict the lane location for each row. Parametric
prediction methods [15, 31] directly output the parameters of a
curve equation for each line. Multi-task learning has further been
combined to improve lane detection accuracy in complicated
environments. For example, VPGNet [12] integrates road marking
detection and vanishing point prediction to obtain auxiliary
information for lane detection. However, the additional manual
labeling is time-consuming and laborious.
Figure 1: Samples of complex scenarios in the CULane dataset:
(a) Crowded (b) Dazzle (c) Shadow (d) Night. The first column
shows the ground truth, and we highlight the elliptical region
for comparison between lines before repainting (second
column) and after repainting (third column).
Although both combining the shape prior of lanes and designing
auxiliary tasks have shown comprehensive consideration and competitive
results for lane detection, it remains a challenging task due to many
factors, such as the wide variety of lane marking appearances, including
solid or dashed, white or yellow. Moreover, complex road and
light conditions, which lead to occlusion and low illumination, increase
the invisibility of lanes. All these difficulties require a method with
the ability to extract lanes in complicated environments.
Taking the CULane dataset [23] as an example in Fig. 1, we present
four representative scenarios: crowded, dazzle, shadow,
and night. In these scenarios, complete lanes cannot be visually
detected, as can be seen by comparing lanes in the wild (the second
column) with the ground-truth annotations (the first column). The
inherent invisibility of lanes hinders the progress of the algorithms.
In this paper, we focus on finding a more discriminative lane feature
space while maintaining a similar distribution of lanes in the wild. The
Repainting and Imitating Learning (RIL) framework is proposed to
increase the visibility of lanes through a repainting module, without
extra data or labor-intensive labeling, and simultaneously to improve
feature discrimination while transferring lane knowledge from teacher
to student via imitating learning.
First and foremost, the inherent invisibility of lanes in the wild,
which current lane detection methods cannot handle, is alleviated by a
simple but effective repainting module in our repainting step. Through
this module, virtual data is generated based on the lane locations
annotated in the ground truth, without requiring extra data
or laborious labeling. This module highlights the fuzzy lanes and
makes lines more prominent and continuous. As shown in Fig. 1,
the lane regions in the third column become more distinctive and
continuous while the non-lane regions remain unchanged.
Based on these ideal lanes, the teacher is trained in advance
and achieves upper-bound performance.
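The repainting idea can be sketched as follows. This is a hypothetical minimal implementation (not the authors' code): pixels inside the annotated lane mask are overwritten with a bright, solid color (the choice of solid white here is our assumption), while every non-lane pixel is copied through untouched, which is what preserves the background distribution.

```python
# Minimal sketch of lane repainting: repaint only masked lane pixels,
# leave all non-lane pixels unchanged.

LANE_COLOR = (255, 255, 255)  # assumed repaint color (solid white)

def repaint(image, lane_mask):
    """image: HxW list of (r, g, b) tuples; lane_mask: HxW list of 0/1."""
    repainted = []
    for row, mask_row in zip(image, lane_mask):
        repainted.append([
            LANE_COLOR if m else pixel  # repaint lane pixels only
            for pixel, m in zip(row, mask_row)
        ])
    return repainted

# A 2x3 toy "image" with one faint lane pixel in the middle of row 0.
img = [[(30, 30, 30), (90, 90, 80), (30, 30, 30)],
       [(30, 30, 30), (30, 30, 30), (30, 30, 30)]]
mask = [[0, 1, 0],
        [0, 0, 0]]
out = repaint(img, mask)
```

In practice the repainting would operate on the polygonal lane regions rasterized from the ground-truth annotations, but the invariant is the same: only lane pixels change.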
A simple solution of directly adding the virtual data as data
augmentation does not work, since there exists a distribution gap
between virtual data and real data. Thus, to better utilize the virtual data,
an imitating learning step is introduced, including a scale-fusing
distillation module and a coupled adversarial module. In the scale-fusing
distillation module, feature representations from different stages of
the teacher are treated as the ideal enhanced feature space. Note
that the teacher model has the same architecture as the student.
The teacher's feature maps of the same size as the student's feature
maps are distilled directly. Besides, the teacher's larger feature maps
are down-sampled and simultaneously distilled into the student's semantic
feature maps for finer lane details. Both same-scale and cross-scale
information are distilled, helping the student imitate the teacher.
To further eliminate distribution gaps, not only between different
networks but also between different input data, a coupled adversarial
module is proposed to build a bridge connecting the networks as well
as the data. A pair of discriminators is coupled through the
student's output on virtual data. The first, net-sensitive discriminator
distinguishes the teacher and student networks when both are
fed virtual data. The second, data-sensitive discriminator
distinguishes between virtual data and real data fed to the
student network. By coupling the two discriminators, the student can
better imitate the enhanced teacher features dynamically through
the learning process.
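The coupling can be illustrated schematically. This is our assumption of the data flow, not the paper's code: the same student-on-virtual features feed both discriminators, which is what couples them. Features are stand-in 1-D vectors and the discriminators are trivial threshold classifiers (the thresholds are arbitrary), purely to show which pairs each discriminator sees.

```python
# Schematic of the coupled adversarial module: two discriminators
# sharing the student's virtual-data features.

def mean(v):
    return sum(v) / len(v)

def net_discriminator(feat):
    """Guess 'teacher' vs 'student' from a feature on VIRTUAL data."""
    return "teacher" if mean(feat) > 0.5 else "student"

def data_discriminator(feat):
    """Guess 'virtual' vs 'real' from a STUDENT feature."""
    return "virtual" if mean(feat) > 0.15 else "real"

teacher_virtual = [0.9, 0.8, 0.7]  # enhanced teacher features
student_virtual = [0.2, 0.3, 0.1]  # shared input to BOTH discriminators
student_real    = [0.1, 0.1, 0.1]

# Discriminator 1 (net-sensitive): teacher vs student, both on virtual data.
d1 = (net_discriminator(teacher_virtual), net_discriminator(student_virtual))
# Discriminator 2 (data-sensitive): virtual vs real, both from the student.
d2 = (data_discriminator(student_virtual), data_discriminator(student_real))
```

During training the student would be updated adversarially so that both discriminators fail, pulling its features toward the teacher's and its virtual-data behavior toward its real-data behavior.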
Our main contributions are summarized as follows:
• We introduce a simple yet effective Repainting and Imitating Learning
framework (RIL) for lane detection, focusing on discriminating
lane features while maintaining a similar distribution of
lanes in the wild by finding an enhanced feature space.
• We repaint the real lane data into ideal virtual data in the
repainting step, achieving an enhanced representation under
complicated environments.
• We combine the scale-fusing distillation module with the
coupled adversarial module in the imitating step, building a
bridge between networks and data to weaken the learning
gap.
The proposed RIL framework can be easily plugged into most
cutting-edge methods without any extra inference cost. Experimental
results prove the effectiveness of the RIL framework on both
CULane [23] and TuSimple [33] for four modern lane detection
methods: UFAST [26], ERFNet [27], ESA [11], and CondLaneNet [14].
2 RELATED WORK
Lane detection. Recent approaches [4, 16, 19, 25] focus on deep
neural networks and significantly boost lane detection performance
thanks to their powerful representation learning ability. Some
methods [11, 23, 27] treat lane detection as a semantic segmentation
task. For instance, SCNN [23] designs slice-by-slice convolutions
within feature maps to exchange information between pixels
across rows and columns in a layer. Inspired by network architecture
search (NAS), CurveLanes-NAS [36] designs a lane-sensitive
architecture to incorporate both long-ranged coherent lane information
and short-ranged local lane information. Despite the promising
results, the computational complexity of these methods brings
heavy inference overhead. Therefore, row-wise classification based
methods [9, 14, 26, 37] have been proposed for efficient lane detection.
These approaches divide the input image into grids and predict
the most probable cell to contain a part of a lane, realizing a
trade-off between speed and accuracy. More recently, SGNet [28]
introduces a structure-guided framework to accurately classify, locate,
and restore the shape of an unlimited number of lanes. Unlike the systematic