LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations
Zhaoqi Leng1, Guowang Li1, Chenxi Liu1, Ekin Dogus Cubuk2, Pei Sun1, Tong He1,
Dragomir Anguelov1 and Mingxing Tan1
Abstract: Data augmentations are important in training
high-performance 3D object detectors for point clouds. Despite
recent efforts on designing new data augmentations, perhaps
surprisingly, most state-of-the-art 3D detectors only use a
few simple data augmentations. In particular, different from
2D image data augmentations, 3D data augmentations need
to account for different representations of input data and
must be customized for different models, which introduces
significant overhead. In this paper, we resort to a search-based
approach, and propose LidarAugment, a practical and effective
data augmentation strategy for 3D object detection. Unlike
previous approaches where all augmentation policies are tuned
in an exponentially large search space, we propose to factorize
and align the search space of each data augmentation, which
cuts down the 20+ hyperparameters to 2, and significantly
reduces the search complexity. We show LidarAugment can
be customized for different model architectures with different
input representations by a simple 2D grid search, and consistently
improves both convolution-based UPillars/StarNet/RSN
and transformer-based SWFormer. Furthermore, LidarAugment
mitigates overfitting and allows us to scale up 3D detectors
to much larger capacity. In particular, when combined with the latest
3D detectors, our LidarAugment achieves a new state-of-the-art
74.8 mAPH L2 on the Waymo Open Dataset.
I. INTRODUCTION
Data augmentations are widely used in training deep
neural networks. In particular, for autonomous driving, many
data augmentations are developed to improve data efficiency
and model generalization. However, most recent 3D object
detectors only use a few basic data augmentation operations
such as rotation, flip and ground-truth sampling [1], [2], [3],
[4], [5], [6], [7]. This is in surprising contrast to 2D image
recognition and detection, where much more sophisticated
2D data augmentations are commonly used in modern image-
based models [8], [9], [10], [11], [12], [13]. In this paper,
we aim to answer: is it practical to adopt more advanced 3D
data augmentations to improve modern 3D object detectors,
especially for high-capacity models?
The main challenge of adopting advanced 3D data augmentations
is that 3D augmentations are often sensitive to
input representations and model capacity. For example, range
image based models and point cloud based models require
different types of data augmentation due to different input
representations. High capacity 3D detectors are typically
prone to overfitting and require stronger overall data augmentation
compared to lightweight models with fewer parameters.
Therefore, tailoring each 3D augmentation for different models is
necessary. However, the search space scales exponentially
with respect to the number of hyperparameters, which leads
to significant search cost.
1Waymo Research, 2Google Brain, lengzhaoqi@waymo.com
[Figure 1: bar chart of 3D mAPH L2 for UPillar and UPillar-L, comparing Baseline and LidarAugment (bar values: 57.8, 60.0, 63.7, 71.0).]
Fig. 1: Model scaling with LidarAugment on Waymo Open Dataset. Baseline augmentations are from the prior art of [14]. When scaling up UPillars to UPillars-L, our LidarAugment improves both models, and the gains are more significant for the larger model, thanks to its customizable regularization. More results in Table IV.
Recent studies [15], [16] attempt to
address these challenges by using efficient search algorithms.
Those approaches typically construct a fixed search space,
and run a complex search algorithm (such as population-
based search [17]) to find a data augmentation strategy for
a model. However, our studies reveal that the search spaces
used in prior works are suboptimal. Despite having complex
search algorithms, without a systematic way to define a good
search space, we cannot unleash the potential of a model.
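To make the exponential search cost discussed above concrete, consider an illustrative back-of-the-envelope comparison (the 20+ hyperparameter count and the reduction to two shared hyperparameters come from the paper; the choice of 5 candidate values per hyperparameter is an assumption):

```latex
% Exhaustive grid over 20 independent augmentation hyperparameters with 5 values each,
% versus a grid over the two shared LidarAugment hyperparameters:
5^{20} \approx 9.5 \times 10^{13} \ \text{configurations}
\qquad \text{vs.} \qquad
5^{2} = 25 .
```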
In this paper, we propose LidarAugment, a simplified
search-based approach for 3D data augmentations. Unlike
previous methods that rely on complex search algorithms to
explore an exponentially large search space, our approach
aims to define a simplified search space that contains a
variety of data augmentations but has minimal (i.e. two)
hyperparameters, such that users can easily customize a
diverse set of 3D data augmentations for different models.
Specifically, we construct the LidarAugment search space
by first factorizing a large search space based on operations
and exploring each sub search space with a per-operation
search. Then, we normalize and align the sub search space
for each data augmentation to form the LidarAugment search
space. The final LidarAugment search space contains only
two shared hyperparameters: m ∈ [0, ∞) controls the normalized
magnitude and p ∈ [0, 1] controls the probability of
applying each data augmentation policy. Our LidarAugment
search space significantly simplifies prior work [15]
by cutting down the number of hyperparameters to two, a
15× reduction in the number of hyperparameters.
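To make the two shared hyperparameters concrete, the sketch below (a minimal illustration rather than the authors' released code; the operation names and per-operation maximum magnitudes are hypothetical placeholders) applies each augmentation independently with probability p and scales its strength by the normalized magnitude m:

```python
import random

# Hypothetical per-operation maximum magnitudes. In LidarAugment, each operation's
# magnitude range is normalized and aligned so that one shared m rescales all of them;
# the concrete values below are placeholders, not the paper's settings.
MAX_MAGNITUDE = {
    "global_rotate": 3.14159 / 4,  # max |rotation angle| in radians (assumed)
    "global_scale": 0.05,          # max relative scale jitter (assumed)
    "frustum_drop": 0.2,           # max fraction of points dropped (assumed)
}


def lidar_augment(points, boxes, ops, m, p):
    """Apply every augmentation op independently with probability p,
    with its strength scaled by the shared normalized magnitude m >= 0.

    `ops` maps an operation name to a callable op(points, boxes, magnitude)
    that returns the augmented (points, boxes).
    """
    for name, op in ops.items():
        if random.random() < p:
            magnitude = m * MAX_MAGNITUDE[name]
            points, boxes = op(points, boxes, magnitude)
    return points, boxes
```

Customizing LidarAugment for a new detector then reduces to the simple 2D grid search over (m, p) mentioned in the abstract, instead of tuning 20+ per-operation hyperparameters.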
Despite only having two hyperparameters, our LidarAugment
search space contains a variety of existing 3D data
augmentations, such as dropping/pasting 3D bounding boxes,
rotating/scaling/dropping points, and copy-pasting objects and
backgrounds. In addition, LidarAugment supports coherent
augmentation across both point and range view representations,
which generalizes to multi-view 3D detectors.
[Figure 2(a) panels: (1) Original, (2) Global Rotate, (3) Global Scale, (4) Global Translate, (5) Global Flip, (6) Global Drop, (7) Frustum Drop, (8) Frustum Noise, (9) Drop Box, (10) Paste Box, (11) Swap Background.]
Fig. 2: Visualizing LidarAugment. (a) All data augmentation operations used in LidarAugment. For non-global operations, we highlight the augmented parts in red (boxes). (b) Occlusion introduced by data augmentation, e.g., pasting a car object, is handled by removing overlapping rays in the range view based on distance. We show point clouds and the corresponding range images with (bottom)/without (top) removing overlapping rays in the range view.
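As a concrete sketch of one of the global operations shown in Figure 2(a) (an illustrative implementation, not the paper's code; the array layouts, with points as an N×3 xyz array and boxes as M×7 rows of center, size, and heading, are assumptions), the function below rotates the point cloud and its box annotations around the z-axis by the same angle, so the labels stay consistent with the augmented scene:

```python
import numpy as np


def global_rotate(points, boxes, angle):
    """Rotate a LiDAR scene around the z (up) axis by `angle` radians.

    points: (N, 3) xyz coordinates; boxes: (M, 7) rows of
    [cx, cy, cz, length, width, height, heading]. Applying the same
    rotation to points, box centers, and headings keeps the labels
    coherent with the augmented point cloud.
    """
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    rotated_points = points @ rot.T
    rotated_boxes = boxes.copy()
    rotated_boxes[:, :3] = boxes[:, :3] @ rot.T
    rotated_boxes[:, 6] = boxes[:, 6] + angle  # heading rotates by the same angle
    return rotated_points, rotated_boxes
```

For a multi-view detector, the range image can then be re-rendered from the rotated points (for a pure z-rotation this is a consistent shift along the azimuth axis), so the point view and range view describe the same augmented scene.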
We perform extensive experiments on the Waymo Open
Dataset [18] and demonstrate that LidarAugment is effective
and generalizes well to different model architectures
(convolution-based and transformer-based), different input
views (3D point view and range image), and different
temporal scales (single- and multi-frame). Notably,
LidarAugment advances the state-of-the-art (SOTA) transformer-
based SWFormer by 1.4 mAPH on the test set. Furthermore,
LidarAugment provides customizable regularization, which
allows us to scale up 3D object detectors to much higher
capacity without overfitting. As summarized in Figure 1,
LidarAugment consistently improves UPillars models, and
the performance gains are particularly large for high-capacity
models. Our contributions can be summarized as:
1) New insight: we reveal that common 3D data augmentation
search spaces are suboptimal and should be
tailored for different models.
2) LidarAugment: we propose the LidarAugment search
space, which supports jointly optimizing 10 augmentation
policies with only two hyperparameters (a 15×
reduction compared to prior work), offering diverse
yet practical augmentations. In addition, we develop
a new method to coherently augment both point and
range-view input representations.
3) State-of-the-art performance: LidarAugment consistently
improves both convolution-based UPillars/StarNet/RSN
and attention-based SWFormer. With LidarAugment,
we achieve new state-of-the-art results
on the Waymo Open Dataset. In addition, LidarAugment
enables model scaling to achieve much better quality
for high-capacity 3D detectors.
II. RELATED WORKS
Data augmentation. Data augmentation is widely used
in training deep neural networks. In particular, for 3D object
detection from point clouds, several global and local data
augmentations, such as rotation, flip, pasting objects, and
frustum noise, are used to improve model performance [19],
[1], [20], [2], [4], [21], [15], [22], [23], [24]. However, as
3D data augmentations are sensitive to model architectures
and capacity, using them often requires extensive manual
tuning. Therefore, most existing 3D object
detectors [2], [6], [25], [26], [14] only adopt a few simple
augmentations, such as flipping and pixel shifting.
Several recent works attempt to use range images for
multi-view 3D detection, but very few augmentations are
developed for range images. [5] attempts to paste objects in
the range image without handling occlusions. Our Paste Box
augmentation supports coherently augmenting both range-
view and point-view input data while handling occluded
objects in a simple way (more details in Figure 2), which
enables more realistic augmented scenes and enriches the
data augmentations for multi-view 3D detectors.
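To illustrate the distance-based occlusion handling described above and in Figure 2(b), here is a simplified sketch (assumptions: xyz arrays in the sensor frame, a uniform azimuth/inclination-to-pixel mapping, and an assumed vertical field of view; this is not the paper's implementation). Pasted object points and original scene points are rasterized into a shared range image, and for each pixel only the closest return is kept:

```python
import numpy as np


def merge_with_occlusion(scene_points, pasted_points, height=64, width=2650):
    """Merge pasted object points into a scene, keeping only the closest
    return per range-image pixel (distance-based occlusion handling).

    Both inputs are (N, 3) xyz arrays in the sensor frame.
    Returns the merged point cloud and the resulting range image.
    """
    points = np.concatenate([scene_points, pasted_points], axis=0)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.linalg.norm(points, axis=1)

    azimuth = np.arctan2(y, x)                        # in [-pi, pi]
    inclination = np.arcsin(z / np.maximum(rng, 1e-6))
    col = ((azimuth + np.pi) / (2 * np.pi) * width).astype(int) % width
    # Assumed vertical field of view of [-0.3, 0.1] rad for the pixel mapping.
    row = np.clip(((0.1 - inclination) / 0.4 * height).astype(int), 0, height - 1)

    range_image = np.full((height, width), np.inf)
    keep_index = np.full((height, width), -1, dtype=int)
    for i in range(points.shape[0]):
        r, c = row[i], col[i]
        if rng[i] < range_image[r, c]:                # keep the closer return
            range_image[r, c] = rng[i]
            keep_index[r, c] = i

    kept = keep_index[keep_index >= 0]
    return points[kept], range_image
```

This mirrors Figure 2(b): without the distance test, pasted points and the background points they should occlude coexist along the same rays; keeping only the closer return per pixel removes the overlapping rays and yields a more realistic augmented scene in both views.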
Learning data augmentation policies. Designing good
data augmentation normally requires manual tuning and
domain expertise. Several search-based approaches have
been proposed for 2D images, such as AutoAugment [9],
RandAugment [12], and Fast AutoAugment [27]. Our
LidarAugment is inspired by RandAugment in the sense that
we both try to construct a simplified search space. However,
unlike 2D image augmentations, where one search space works
well for many models, we reveal that existing search spaces
for 3D detection tasks are suboptimal, which motivates us to
propose the first systematic method to define search spaces
for 3D detection tasks.
On the other hand, for 3D detection, PPBA [15] and
PointAugment [16] propose efficient learning-based data
augmentation frameworks for 3D point clouds. However,
both works require users to run a complex algorithm on
an exponentially large but not well-designed search space.
In contrast, our work provides a systematic framework to