PCGen: Point Cloud Generator for LiDAR Simulation
Chenqi Li, Yuan Ren and Bingbing Liu
Huawei Noah’s Ark Lab, Toronto, Canada
{chenqi.li, yuan.ren3, liu.bingbing}@huawei.com
Abstract—Data is a fundamental building block for LiDAR
perception systems. Unfortunately, real-world data collection
and annotation are extremely costly and laborious. Recently, real
data based LiDAR simulators have shown tremendous potential
to complement real data, due to their scalability and high-
fidelity compared to graphics engine based methods. Before
simulation can be deployed in the real-world, two shortcomings
need to be addressed. First, existing methods usually generate
data that are noisier and more complete than real point
clouds, due to 3D reconstruction error and purely geometry-
based raycasting. Second, prior works on simulation
for object detection focus solely on rigid objects, such as cars,
but Vulnerable Road Users (VRUs), such as pedestrians, are important
road participants. To tackle the first challenge, we propose
First Peak Averaging (FPA) raycasting and surrogate model
raydrop. FPA enables the simulation of both point cloud
coordinates and sensor features, while taking into account
reconstruction noise. The ray-wise surrogate raydrop model
mimics the physical properties of LiDAR’s laser receiver to
determine whether a simulated point would be recorded by a
real LiDAR. With minimal training data, the surrogate model
can generalize to different geographies and scenes, closing
the domain gap between raycasted and real point clouds. To
tackle the simulation of deformable VRU simulation, we employ
Skinned Multi-Person Linear model (SMPL) dataset to provide
a pedestrian simulation baseline and compare the domain gap
between CAD and reconstructed objects. Applying our pipeline
to perform novel sensor synthesis, we show that object
detection models trained on simulation data achieve results
similar to models trained on real data.
I. INTRODUCTION
The success of deep learning is deeply rooted in the
availability of large-scale, high-fidelity datasets. Pioneering
datasets [1]–[13] facilitate the development of cutting-edge visual
recognition systems, providing challenging benchmarks for
the community. However, collecting and annotating data
in the real world is inefficient, slow, and uneconomical.
Simulation, on the other hand, gives users the flexibility to
generate diverse scenarios with ease, as well as providing
automatically generated ground truth annotations. For LiDAR
simulation, two distinct approaches have been explored:
graphics engine based and real data based. Results show that
real data based methods produce simulation data with a lower
domain gap compared to graphics engine based methods. This
paper makes the following contributions:
• We present FPA raycasting to simulate LiDAR point
clouds and sensor features, accounting for noise in the
reconstructed scenes.
• We develop a surrogate model of a single laser head
and use it for raydrop. Compared to the UNet-based
raydrop method, the proposed method is scene-independent:
the surrogate model of a specific LiDAR can be trained
once and used in different scenes.
• We perform novel sensor synthesis with our simulation
pipeline. The test results show that it provides high-fidelity
data for the new sensor configuration, achieving results
similar to the model trained on the real data.
• We propose the Learned Point Cloud Similarity (LPCS)
metric to measure the domain gap between real and simulated
point clouds from the perspective of perception models.
• We provide a baseline pedestrian simulation result using
SMPL and reconstructed human models.
II. RELATED WORK
A. Graphics Engine Based LiDAR Simulator
Initial attempts at LiDAR simulation were spearheaded
by Car Learning to Act (CARLA) [14]. Built on top of
Unreal Engine 4 (UE4) [15], CARLA's simulation platform
allows the user to customize scenarios, including agent model,
density, interaction with the world, weather conditions, and
sensor suite. Similarly, Yue et al. leveraged the popular,
high-fidelity simulation of Grand Theft Auto V (GTA V)
to automatically extract point clouds with ground truth labels
[16]. The framework also enables users to construct diverse,
customized scenarios interactively to test neural network
performance in corner cases. Experiments have shown that
retraining with additional synthetic point clouds significantly
improves a model's performance on the KITTI dataset [17], [18].
This work is further extended by the Precise Synthetic
Image and LiDAR (PreSIL) dataset, which improves the
raycasting functionality within GTA V to address the issues
of approximating humans with cylinders and missed ray-scene
collisions [19]. PreSIL provides a large simulated dataset in
KITTI format and demonstrates that it can boost the performance
of state-of-the-art models on the KITTI 3D Object
Detection benchmark.
B. Real Data Based LiDAR Simulator
Generating CAD model assets and complex scenarios is
labor-intensive, making simulation difficult and costly to scale.
Furthermore, the domain gap between noiseless simulation and
real-world data leads to poor model performance when models are
trained only on simulation data, prompting the development of
domain adaptation techniques [18], [20], [21]. Fang et al.
investigated a hybrid, data-driven point cloud generation
framework, which combines real-world scanned background
point clouds and synthetic foreground objects
[22]. They show that by augmenting the real dataset with
synthetic frames, instance segmentation and object detection
performance is improved. Concurrently, LiDARsim employed
a similar approach, leveraging real data to reconstruct both
background and foreground objects [23]. LiDARsim further
extended the simulation with the addition of a learning
system to model the physics of LiDAR raydrop, closing
the gap between real and simulated point clouds. They show
that models trained on simulation data obtain performance
in object detection and semantic segmentation similar to
models trained on real data, without domain adaptation
techniques. Similar to LiDARsim, Langer et al. developed
a simulation pipeline for domain transfer in the context of
semantic segmentation [24]. Their results show that closest-point
raycasting, along with geodesic correlation alignment,
successfully generates simulation data to adapt a model
trained on the source domain (Velodyne-64) to the target
domain (Velodyne-32).
Fig. 1: Overview of Simulation Pipeline
III. METHODOLOGY
A. 3D Scene Reconstruction
Given a sequence of single-frame point clouds, ground truth
bounding box annotations are used to crop out foreground
object points. If an instance moves more than 0.5 meters
in the global frame, the instance is treated as a dynamic
object. Since annotations for dynamic objects are less accurate,
the bounding box dimensions are slightly enlarged before
cropping, to ensure complete removal of all foreground
points. Using odometry, single-frame point clouds without
foreground objects are transformed into the global coordinate
frame and subsequently accumulated to obtain a dense 3D
reconstruction of the sequence. Voxel downsampling and
point cloud radius outlier removal are performed as post-processing
steps, in order to reduce memory requirements
and remove noisy points. Where odometry is inaccurate,
Simultaneous Localization and Mapping (SLAM) and Iterative
Closest Point (ICP) [25] can be used to improve mapping quality.
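The cropping, accumulation, and downsampling steps above can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: boxes are simplified to axis-aligned, and the 0.2 m crop margin and 0.1 m voxel size are illustrative choices.

```python
import numpy as np

def is_dynamic(track_centers, thresh=0.5):
    """An instance is dynamic if its box center moves more than
    `thresh` meters (0.5 m in the paper) over the sequence."""
    c = np.asarray(track_centers)
    return np.linalg.norm(c.max(0) - c.min(0)) > thresh

def crop_foreground(points, boxes, margin=0.2):
    """Remove points inside (slightly enlarged) axis-aligned boxes.
    `boxes` is (M, 6): [cx, cy, cz, dx, dy, dz]; `margin` enlarges
    each box to compensate for annotation error (illustrative value)."""
    keep = np.ones(len(points), dtype=bool)
    for cx, cy, cz, dx, dy, dz in boxes:
        half = np.array([dx, dy, dz]) / 2 + margin
        inside = np.all(np.abs(points - [cx, cy, cz]) <= half, axis=1)
        keep &= ~inside
    return points[keep]

def accumulate(frames, poses, voxel=0.1):
    """Transform each background frame into the global frame with its
    odometry pose (4x4 matrix), stack all frames, then voxel-downsample
    by keeping one representative point per occupied voxel."""
    world = []
    for pts, T in zip(frames, poses):
        homo = np.c_[pts, np.ones(len(pts))]
        world.append((homo @ T.T)[:, :3])
    world = np.vstack(world)
    keys = np.floor(world / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return world[np.sort(idx)]
```

In practice, libraries such as Open3D provide voxel downsampling and radius outlier removal directly; the sketch only illustrates the data flow.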
For object reconstruction, all bounding boxes belonging to
the same object throughout the sequence can be identified.
The cropped point cloud cluster from each frame can be
returned to an x-axis-aligned origin using the bounding box
location and orientation. To reduce the impact of human
annotation and odometry imperfections, generalized ICP can
be used to improve the alignment. However, if the source
and/or target point cloud contains very few points, minimizing
the point cloud distance using ICP often leads to unreasonable
reconstructions. Thus, we adaptively employ ICP only if both
the source and target point clouds contain more than a
threshold number of points; otherwise, object clusters are
simply accumulated. The foreground objects can then be
inserted into the background to simulate a variety of scenarios.
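The canonicalization and adaptive accumulation can be sketched as below. This is a minimal NumPy sketch under stated assumptions: the point-count threshold of 50 is illustrative (the paper does not give its value here), and the generalized-ICP refinement itself is abstracted behind a caller-supplied function rather than implemented.

```python
import numpy as np

def to_canonical(points, box_center, yaw):
    """Undo the box pose: translate the cluster to the origin and
    rotate by -yaw so the object's heading aligns with the x-axis."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
    return (points - box_center) @ R.T

def accumulate_object(clusters, refine_icp=None, min_points=50):
    """Accumulate canonicalized clusters of one object.  ICP refinement
    (`refine_icp(source, target) -> aligned source`) is applied only
    when both clouds exceed `min_points`; with too few points, ICP
    tends to produce unreasonable alignments, so sparse clusters are
    simply stacked onto the running model."""
    model = clusters[0]
    for cluster in clusters[1:]:
        if (refine_icp is not None
                and len(cluster) > min_points and len(model) > min_points):
            cluster = refine_icp(cluster, model)
        model = np.vstack([model, cluster])
    return model
```

A generalized-ICP implementation (e.g. from Open3D's registration module) would be passed in as `refine_icp`.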
B. Raycasting Method
Theoretically, the reconstructed 3D scene is a 2D surface
embedded in 3D space, represented in the form of a dense
point cloud. Through raycasting, intersections between the
laser beams and the point cloud can be computed. LiDARsim
rendered the dense point cloud as surfels [26] and used the Intel
Embree engine to compute ray-disc intersections [27]. Langer
et al. used Closest Point (CP) raycasting, which projects the
dense point cloud into a range image and extracts the closest
point from each pixel to render the raycasted point cloud [24].
However, because localization, calibration, and sensor
synchronization are subject to error, points in the reconstructed
point cloud do not strictly lie on the 2D surface of the scene.
Both raycasting methods suffer from noisy reconstructions,
leading to noisy raycasted point clouds. Moreover, CP
raycasting only works with spinning-scan LiDARs. For LiDARs
with irregular scan patterns, such as the DJI Livox, multiple
rays might land in the same pixel, leading to redundant points.
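The closest-point projection into a range image can be sketched as follows. This is a minimal NumPy sketch, not the cited implementation; the image resolution and vertical field of view are illustrative choices, roughly matching a 64-beam spinning LiDAR.

```python
import numpy as np

def closest_point_range_image(points, h=64, w=1024,
                              fov_up=np.deg2rad(2.0),
                              fov_down=np.deg2rad(-24.8)):
    """Project a point cloud into an (h, w) range image and keep the
    closest point per pixel, as in closest-point (CP) raycasting.
    Pixels hit by no point remain at infinity."""
    x, y, z = points.T
    d = np.linalg.norm(points, axis=1)          # range (depth)
    yaw = np.arctan2(y, x)                      # azimuth in (-pi, pi]
    pitch = np.arcsin(np.clip(z / d, -1, 1))    # elevation
    u = ((0.5 * (1 - yaw / np.pi)) * w).astype(int) % w
    v = ((fov_up - pitch) / (fov_up - fov_down) * h).astype(int)
    v = np.clip(v, 0, h - 1)
    # write far points first so near points overwrite them per pixel
    img = np.full((h, w), np.inf)
    order = np.argsort(-d)
    img[v[order], u[order]] = d[order]
    return img
```

The redundancy problem noted above for irregular scan patterns is visible here: any two rays mapping to the same (v, u) pixel collapse to a single range value.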
To solve this problem, we propose First Peak Averaging
(FPA) raycasting. First, the reconstructed point cloud is
projected into a range image. Each pixel of the range image
corresponds to a frustum in the 3D space, as shown in Figure
2(a). Each point within the frustum can be defined using the
spherical coordinates (d, λ, φ) or the Cartesian coordinates
(x, y, z). Typically, the depth distribution of all points within
the frustum forms multiple peaks, due to occlusion and
observation of the scene from multiple angles. The intuition
behind FPA raycasting is to average points from the closest
peak, in order to estimate the true 2D surface from the noisy
3D point cloud. We are interested in the closest peak, because