PCGen: Point Cloud Generator for LiDAR Simulation
Chenqi Li, Yuan Ren and Bingbing Liu
Huawei Noah’s Ark Lab, Toronto, Canada
{chenqi.li, yuan.ren3, liu.bingbing}@huawei.com
Abstract—Data is a fundamental building block for LiDAR
perception systems. Unfortunately, real-world data collection
and annotation are extremely costly and laborious. Recently, real
data based LiDAR simulators have shown tremendous potential
to complement real data, due to their scalability and high-
fidelity compared to graphics engine based methods. Before
simulation can be deployed in the real-world, two shortcomings
need to be addressed. First, existing methods usually generate
data that are noisier and more complete than real point
clouds, due to 3D reconstruction error and purely geometry-
based raycasting. Second, prior works on simulation
for object detection focus solely on rigid objects, such as cars,
but Vulnerable Road Users (VRUs), such as pedestrians, are important
road participants. To tackle the first challenge, we propose
First Peak Averaging (FPA) raycasting and surrogate model
raydrop. FPA enables the simulation of both point cloud
coordinates and sensor features, while taking into account
reconstruction noise. The ray-wise surrogate raydrop model
mimics the physical properties of LiDAR’s laser receiver to
determine whether a simulated point would be recorded by a
real LiDAR. With minimal training data, the surrogate model
can generalize to different geographies and scenes, closing
the domain gap between raycasted and real point clouds. To
tackle the simulation of deformable VRU simulation, we employ
Skinned Multi-Person Linear model (SMPL) dataset to provide
a pedestrian simulation baseline and compare the domain gap
between CAD and reconstructed objects. Applying our pipeline
to perform novel sensor synthesis, we show that object
detection models trained on simulation data achieve results
similar to models trained on real data.
I. INTRODUCTION
The success of deep learning is deeply rooted in the
availability of large-scale, high-fidelity datasets. Pioneering
datasets [1]–[13] facilitate the development of cutting-edge visual
recognition systems, providing challenging benchmarks for
the community. However, collecting and annotating data
in the real world is inefficient, slow, and uneconomical.
Simulation, on the other hand, gives users the flexibility to
generate diverse scenarios with ease, as well as providing
automatically generated ground truth annotations. For LiDAR
simulation, two distinct approaches have been explored:
graphics engine based and real data based. Results show that
real data based methods produce simulation data with a lower
domain gap compared to graphics engine based methods. This
paper makes the following contributions:
• We present FPA raycasting to simulate LiDAR point
clouds and sensor features, accounting for noise in the
reconstructed scenes.
• We develop a surrogate model of a single laser head
and use it for raydrop. Compared to the UNet-based
raydrop method, the proposed method is scene-independent:
the surrogate model of a specific LiDAR can be trained
once and used in different scenes.
• We perform novel sensor synthesis with our simulation
pipeline. The test results show that it provides high-fidelity
data for the new sensor configuration, achieving results
similar to the model trained on the real data.
• We propose the Learned Point Cloud Similarity (LPCS)
metric to measure the domain gap between real and simulated
point clouds from the perspective of perception models.
• We provide a baseline pedestrian simulation result using
SMPL and reconstructed human models.
II. RELATED WORK
A. Graphics Engine Based LiDAR Simulator
Initial attempts at LiDAR simulation were spearheaded
by Car Learning to Act (CARLA) [14]. Built on top of
Unreal Engine 4 (UE4) [15], CARLA's simulation platform
allows the user to customize scenarios, including agent model,
density, interaction with the world, weather conditions, and
sensor suite. Similarly, Yue et al. leveraged the popular,
high-fidelity simulation of Grand Theft Auto V (GTA V)
to automatically extract point clouds with ground truth labels
[16]. The framework also enables users to construct diverse,
customized scenarios interactively to test neural network
performance in corner cases. Experiments have shown that
retraining with additional synthetic point clouds significantly
improves a model's performance on the KITTI dataset [17], [18].
This work is further extended by the Precise Synthetic
Image and LiDAR (PreSIL) dataset, which improves the
raycasting functionality within GTA V to address the issues
of approximating humans with cylinders and missed ray-scene
collisions [19]. PreSIL provides a large simulated dataset in
KITTI format and demonstrates that it can boost the performance
of state-of-the-art models on the KITTI 3D Object
Detection benchmark.
B. Real Data Based LiDAR Simulator
Generating CAD model assets and complex scenarios is
labor-intensive, making simulation difficult and costly to scale.
Furthermore, the domain gap between noiseless simulation and
real-world data leads to poor model performance when models are
trained only on simulation data, prompting the development of
domain adaptation techniques [18], [20], [21]. Fang et al.
investigated a hybrid, data-driven point cloud generation
framework, which combines real-world scanned background
point clouds and synthetic foreground objects
[22]. They show that by augmenting the real dataset with
synthetic frames, instance segmentation and object detection
performance is improved. Concurrently, LiDARsim employed
a similar approach, leveraging real data to reconstruct both
background and foreground objects [23]. LiDARsim further
extended the simulation with the addition of a learning
system to model the physics of LiDAR raydrop, closing
the gap between real and simulated point clouds. They show
that models trained on simulation data obtain performance
in object detection and semantic segmentation similar to
models trained on real data, without domain adaptation
techniques. Similar to LiDARsim, Langer et al. developed
a simulation pipeline for domain transfer in the context of
semantic segmentation [24]. Their results show that closest-point
raycasting, along with geodesic correlation alignment,
successfully generates simulation data to adapt a model
trained on the source domain (Velodyne-64) to the target
domain (Velodyne-32).
Fig. 1: Overview of Simulation Pipeline
III. METHODOLOGY
A. 3D Scene Reconstruction
Given a sequence of single-frame point clouds, ground truth
bounding box annotations are used to crop out foreground
object points. If an instance moves more than 0.5 meters
in the global frame, the instance is treated as a dynamic
object. Since annotations for dynamic objects are less accurate,
the bounding box dimensions are slightly enlarged before
cropping, to ensure complete removal of all foreground
points. Using odometry, single-frame point clouds without
foreground objects are transformed into the global coordinate
frame and subsequently accumulated to obtain a dense 3D
reconstruction of the sequence. Voxel downsampling and
point cloud radius outlier removal are performed as post-processing
steps, in order to reduce memory requirements
and remove noisy points. Where odometry is inaccurate,
Simultaneous Localization and Mapping (SLAM) and Iterative
Closest Point (ICP) [25] can be used to improve mapping quality.
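The cropping, accumulation, and downsampling steps above can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: boxes are simplified to axis-aligned, and the 0.2 m crop margin and 0.1 m voxel size are illustrative choices.

```python
import numpy as np

def is_dynamic(track_centers, thresh=0.5):
    """An instance is dynamic if its box center moves more than
    `thresh` meters (0.5 m in the paper) over the sequence."""
    c = np.asarray(track_centers)
    return np.linalg.norm(c.max(0) - c.min(0)) > thresh

def crop_foreground(points, boxes, margin=0.2):
    """Remove points inside (slightly enlarged) axis-aligned boxes.
    `boxes` is (M, 6): [cx, cy, cz, dx, dy, dz]; `margin` enlarges
    each box to compensate for annotation error (illustrative value)."""
    keep = np.ones(len(points), dtype=bool)
    for cx, cy, cz, dx, dy, dz in boxes:
        half = np.array([dx, dy, dz]) / 2 + margin
        inside = np.all(np.abs(points - [cx, cy, cz]) <= half, axis=1)
        keep &= ~inside
    return points[keep]

def accumulate(frames, poses, voxel=0.1):
    """Transform each background frame into the global frame with its
    odometry pose (4x4 matrix), stack all frames, then voxel-downsample
    by keeping one representative point per occupied voxel."""
    world = []
    for pts, T in zip(frames, poses):
        homo = np.c_[pts, np.ones(len(pts))]
        world.append((homo @ T.T)[:, :3])
    world = np.vstack(world)
    keys = np.floor(world / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return world[np.sort(idx)]
```

In practice, libraries such as Open3D provide voxel downsampling and radius outlier removal directly; the sketch only illustrates the data flow.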
For object reconstruction, all bounding boxes belonging to
the same object throughout the sequence can be identified.
The cropped point cloud cluster from each frame can be
returned to an x-axis-aligned origin using the bounding box
location and orientation. To reduce the impact of human
annotation and odometry imperfections, generalized ICP can
be used to improve the alignment. However, if the source
and/or target point cloud contains very few points, minimizing
the point cloud distance using ICP often leads to unreasonable
reconstructions. Thus, we adaptively employ ICP only if both
the source and target point clouds contain more than a
threshold number of points; otherwise, object clusters are
simply accumulated. The foreground objects can then be
inserted into the background to simulate a variety of scenarios.
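The canonicalization and adaptive accumulation can be sketched as below. This is a minimal NumPy sketch under stated assumptions: the point-count threshold of 50 is illustrative (the paper does not give its value here), and the generalized-ICP refinement itself is abstracted behind a caller-supplied function rather than implemented.

```python
import numpy as np

def to_canonical(points, box_center, yaw):
    """Undo the box pose: translate the cluster to the origin and
    rotate by -yaw so the object's heading aligns with the x-axis."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
    return (points - box_center) @ R.T

def accumulate_object(clusters, refine_icp=None, min_points=50):
    """Accumulate canonicalized clusters of one object.  ICP refinement
    (`refine_icp(source, target) -> aligned source`) is applied only
    when both clouds exceed `min_points`; with too few points, ICP
    tends to produce unreasonable alignments, so sparse clusters are
    simply stacked onto the running model."""
    model = clusters[0]
    for cluster in clusters[1:]:
        if (refine_icp is not None
                and len(cluster) > min_points and len(model) > min_points):
            cluster = refine_icp(cluster, model)
        model = np.vstack([model, cluster])
    return model
```

A generalized-ICP implementation (e.g. from Open3D's registration module) would be passed in as `refine_icp`.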
B. Raycasting Method
Theoretically, the reconstructed 3D scene is a 2D surface
embedded in 3D space, represented in the form of a dense
point cloud. Through raycasting, intersections between the
laser beams and the point cloud can be computed. LiDARsim
rendered the dense point cloud as surfels [26] and used the Intel
Embree engine to compute ray-disc intersections [27]. Langer
et al. used Closest Point (CP) raycasting, which projects the
dense point cloud into a range image and extracts the closest
point from each pixel to render the raycasted point cloud [24].
However, because localization, calibration, and sensor
synchronization are subject to error, points in the reconstructed
point cloud do not strictly lie on the 2D surface of the scene.
Both raycasting methods suffer from noisy reconstructions,
leading to noisy raycasted point clouds. Moreover, CP
raycasting only works with spinning-scan LiDARs. For LiDARs
with irregular scan patterns, such as the DJI Livox, multiple
rays might land in the same pixel, leading to redundant points.
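The closest-point projection into a range image can be sketched as follows. This is a minimal NumPy sketch, not the cited implementation; the image resolution and vertical field of view are illustrative choices, roughly matching a 64-beam spinning LiDAR.

```python
import numpy as np

def closest_point_range_image(points, h=64, w=1024,
                              fov_up=np.deg2rad(2.0),
                              fov_down=np.deg2rad(-24.8)):
    """Project a point cloud into an (h, w) range image and keep the
    closest point per pixel, as in closest-point (CP) raycasting.
    Pixels hit by no point remain at infinity."""
    x, y, z = points.T
    d = np.linalg.norm(points, axis=1)          # range (depth)
    yaw = np.arctan2(y, x)                      # azimuth in (-pi, pi]
    pitch = np.arcsin(np.clip(z / d, -1, 1))    # elevation
    u = ((0.5 * (1 - yaw / np.pi)) * w).astype(int) % w
    v = ((fov_up - pitch) / (fov_up - fov_down) * h).astype(int)
    v = np.clip(v, 0, h - 1)
    # write far points first so near points overwrite them per pixel
    img = np.full((h, w), np.inf)
    order = np.argsort(-d)
    img[v[order], u[order]] = d[order]
    return img
```

The redundancy problem noted above for irregular scan patterns is visible here: any two rays mapping to the same (v, u) pixel collapse to a single range value.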
To solve this problem, we propose First Peak Averaging
(FPA) raycasting. First, the reconstructed point cloud is
projected into a range image. Each pixel of the range image
corresponds to a frustum in the 3D space, as shown in Figure
2(a). Each point within the frustum can be defined using the
spherical coordinates (d, λ, φ) or the Cartesian coordinates
(x, y, z). Typically, the depth distribution of all points within
the frustum forms multiple peaks, due to occlusion and
observation of the scene from multiple angles. The intuition
behind FPA raycasting is to average points from the closest
peak, in order to estimate the true 2D surface from the noisy
3D point cloud. We are interested in the closest peak, because