PCGen: Point Cloud Generator for LiDAR Simulation
Chenqi Li, Yuan Ren and Bingbing Liu
Huawei Noah’s Ark Lab, Toronto, Canada
{chenqi.li, yuan.ren3, liu.bingbing}@huawei.com
Abstract— Data is a fundamental building block for LiDAR
perception systems. Unfortunately, real-world data collection
and annotation is extremely costly and laborious. Recently, real-data-based
LiDAR simulators have shown tremendous potential
to complement real data, owing to their scalability and high
fidelity compared to graphics-engine-based methods. Before
simulation can be deployed in the real world, two shortcomings
need to be addressed. First, existing methods usually generate
data that are noisier yet more complete than real point
clouds, due to 3D reconstruction error and purely geometry-based
raycasting. Second, prior works on simulation
for object detection focus solely on rigid objects, like cars, even though
Vulnerable Road Users (VRUs), like pedestrians, are important
road participants. To tackle the first challenge, we propose
First Peak Averaging (FPA) raycasting and surrogate model
raydrop. FPA enables the simulation of both point cloud
coordinates and sensor features, while taking into account
reconstruction noise. The ray-wise surrogate raydrop model
mimics the physical properties of LiDAR’s laser receiver to
determine whether a simulated point would be recorded by a
real LiDAR. With minimal training data, the surrogate model
can generalize to different geographies and scenes, closing
the domain gap between raycasted and real point clouds. To
tackle the simulation of deformable VRU simulation, we employ
Skinned Multi-Person Linear model (SMPL) dataset to provide
a pedestrian simulation baseline and compare the domain gap
between CAD and reconstructed objects. Applying our pipeline
to perform novel sensor synthesis, results show that object
detection models trained by simulation data can achieve similar
result as the real data trained model.
I. INTRODUCTION
The success of deep learning is deeply rooted in the
availability of large-scale, high-fidelity datasets. Pioneering
datasets [1]–[13] facilitate the development of cutting-edge visual
recognition systems, providing challenging benchmarks for
the community. However, collection and annotation of data
in the real world is inefficient, slow, and uneconomical.
Simulation, on the other hand, gives users the flexibility to
generate diverse scenarios with ease, as well as providing
automatically generated ground truth annotations. For LiDAR
simulation, two distinct approaches have been explored:
graphics-engine-based and real-data-based. Results show that
real-data-based methods produce simulation data with a lower
domain gap than graphics-engine-based methods. This
paper makes the following contributions:
• We present FPA raycasting to simulate LiDAR point
clouds and sensor features, accounting for noise in the
reconstructed scenes.
• We develop the surrogate model of a single laser
head and use it for raydrop. Compared to the UNet-based
raydrop method, the proposed method is scene-independent:
the surrogate model of a specific LiDAR
can be trained once and used in different scenes.
• We perform novel sensor synthesis with our simulation
pipeline. The test results show that it provides high-fidelity
data for the new sensor configuration, achieving
results similar to the model trained on real data.
• We propose the Learned Point Cloud Similarity (LPCS)
metric to measure the domain gap between real and simulated
point clouds from the perspective of perception models.
• We provide a baseline pedestrian simulation result, using
SMPL and reconstructed human models.
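To make the ray-wise raydrop idea above concrete, the following minimal sketch (our illustration, not the paper's actual model) scores each simulated ray with a logistic surrogate over hand-crafted per-ray features and drops low-scoring rays; the feature set, weights, and function name are hypothetical stand-ins for a learned model:

```python
import numpy as np

def raydrop_surrogate(features, w, b):
    """Score each simulated ray with a logistic surrogate of the
    laser receiver; rays scoring below 0.5 are dropped."""
    logits = features @ w + b                   # (N,) linear score per ray
    keep_prob = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    return keep_prob >= 0.5                     # boolean keep mask, shape (N,)

# Toy example: 4 rays described by (range [m], intensity, cos(incidence angle)).
rays = np.array([
    [10.0, 0.80, 0.95],   # close, bright, near-normal hit -> likely kept
    [90.0, 0.05, 0.10],   # far, dim, grazing hit -> likely dropped
    [35.0, 0.40, 0.60],
    [120.0, 0.02, 0.05],
])
# Hand-picked illustrative weights; a trained surrogate would learn these
# from real LiDAR returns.
w = np.array([-0.05, 4.0, 2.0])
b = 0.5
mask = raydrop_surrogate(rays, w, b)
print(mask)  # -> [ True False  True False]
```

Because the surrogate conditions only on per-ray physical features rather than a whole range image, it needs no scene-specific retraining, which is the key difference from UNet-style raydrop.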
II. RELATED WORK
A. Graphics Engine Based LiDAR Simulator
Initial attempts at LiDAR simulation were spearheaded
by Car Learning to Act (CARLA) [14]. Building on top of
Unreal Engine 4 (UE4) [15], CARLA’s simulation platform
allows the user to customize scenarios, including agent models,
density, interaction with the world, weather conditions, and
the sensor suite. Similarly, Yue et al. leveraged the popular,
high-fidelity simulation of Grand Theft Auto V (GTA V)
to automatically extract point clouds with ground-truth labels
[16]. The framework also enables users to construct diverse,
customized scenarios interactively to test neural network
performance in corner cases. Experiments have shown that
retraining with additional synthetic point clouds significantly
improves model performance on the KITTI dataset [17], [18].
This work is further extended by the Precise Synthetic
Image and LiDAR (PreSIL) dataset, which improves the
raycasting functionality within GTA V to address the issues
of approximating humans with cylinders and missed ray-scene
collisions [19]. PreSIL provides a large simulated
dataset in KITTI format and demonstrates that it can boost
the performance of a state-of-the-art model on the KITTI 3D Object
Detection benchmark.
B. Real Data Based LiDAR Simulator
Generating CAD model assets and complex scenarios is
labor-intensive, making simulation difficult and costly to scale.
Furthermore, the domain gap between noiseless simulated and
real-world data leads to poor performance for models
trained only on simulation data, prompting the development of
domain adaptation techniques [18], [20], [21]. Fang et al.
investigated a hybrid, data-driven point cloud
generation framework, which combines real-world scanned
background point clouds and synthetic foreground objects
arXiv:2210.08738v1 [cs.RO] 17 Oct 2022