arXiv:2210.04472v1 [cs.CV] 10 Oct 2022
Uncertainty-aware LiDAR Panoptic Segmentation
Kshitij Sirohi1, Sajad Marvi1, Daniel Büscher1 and Wolfram Burgard2
Abstract— Modern autonomous systems often rely on LiDAR
scanners, in particular for autonomous driving scenarios. In
this context, reliable scene understanding is indispensable.
Current learning-based methods typically try to achieve max-
imum performance for this task, while neglecting a proper
estimation of the associated uncertainties. In this work, we
introduce a novel approach for solving the task of uncertainty-
aware panoptic segmentation using LiDAR point clouds. Our
proposed EvLPSNet network is the first to solve this task
efficiently in a sampling-free manner. It aims to predict per-
point semantic and instance segmentations, together with per-
point uncertainty estimates. Moreover, it incorporates methods
for improving the performance by employing the predicted
uncertainties. We provide several strong baselines combining
state-of-the-art panoptic segmentation networks with sampling-
free uncertainty estimation techniques. Extensive evaluations
show that we achieve the best performance on uncertainty-
aware panoptic segmentation quality and calibration compared
to these baselines. We make our code available at: https://github.com/kshitij3112/EvLPSNet
I. INTRODUCTION
A perception system capable of providing comprehensive
and reliable scene understanding is crucial for the safe
operation of an autonomous vehicle. The recently introduced
panoptic segmentation [1] unifies the semantic segmentation
of stuff and instance segmentation of thing classes into a
single task. This facilitates the evaluation of the overall
accuracy, which is crucial for a holistic scene understanding.
In practice, however, the performance can only be evalu-
ated on a limited dataset, while the real world consists of
scenarios and objects possibly not present in the dataset.
Therefore, in addition to an evidence signal, a reliable
uncertainty estimate is crucial for safety-critical applications,
such as autonomous driving. Hence, the task of uncertainty-
aware panoptic segmentation [2] for a unified evaluation of
panoptic segmentation and uncertainty estimation offers a
better potential for deployment. Our method aims to solve
this task for LiDAR point clouds, as illustrated in Fig. 1.
The regular grid structure of images allows a number of
works on panoptic segmentation to take advantage of
recent advances in deep learning, in particular using convo-
lutional neural networks (CNNs) [3], [4]. On the other hand,
the irregular, sparse and unordered structure of LiDAR point
clouds poses unique challenges. However, LiDARs provide
an illumination-independent accurate geometric description
of the environment, yielding a great advantage over images.
This motivated recent works for panoptic segmentation of
LiDAR point clouds, represented in various ways, such
1Department of Computer Science at University of Freiburg, Germany.
2Department of Engineering at Technical University Nürnberg, Germany.
This work was financed by the Baden-Württemberg Stiftung gGmbH.
(a) LiDAR panoptic segmentation
(b) LiDAR panoptic uncertainties
Fig. 1: Panoptic segmentation and associated uncertainties as pre-
dicted by our EvLPSNet for the SemanticKITTI validation dataset.
as range images [5], [6], [7], 3D voxels [8], bird’s-eye
views (BEVs) [9], or direct points [10]. These methods are
generally classified into proposal-based [5] and proposal-free
[11].
Conventional CNN-based methods, utilizing the softmax
operation, typically show overconfidence in their predictions
[12]. On the other hand, popular sampling-based methods
for uncertainty estimation, such as Monte Carlo
dropout [13] and Bayesian neural networks (BNNs) [14],
are time- and memory-intensive and hence not suitable for real-
time applications. Therefore, there is a recent interest in
sampling-free methods for uncertainty estimation, such as
evidential deep learning [12], predicting uncertainties in a
single pass. However, most of these works for classification
or segmentation deal with the image domain. Hence, to the
best of our knowledge, there is still no existing approach to
provide sampling-free point-wise uncertainty estimates for
the panoptic segmentation of LiDAR point clouds.
In this work, we present the novel Evidential LiDAR
Panoptic Segmentation Network (EvLPSNet), the first net-
work to tackle this task, by utilizing evidential deep learning.
We use the 2D polar BEV grid representation [11] for our
network, facilitating fast inference times and better sepa-
rability of instances. However, the projection into the grid
structure leads to discretization errors, as all points in a grid
cell are assigned the same prediction. We approach this issue
using the 3D point information, as well as our uncertainty
estimates, proposing a novel learnable uncertainty-based
Query and Refinement (uQR) module. This module employs
a simple point-based convolution layer to achieve point-wise
predictions for points selected based on their uncertainty. We
also propose to utilize the predicted probabilities to create
an efficient version of the k nearest neighbors algorithm
(pKNN). Furthermore, we provide several baselines and
evaluate their results on the task of uncertainty-aware LiDAR
panoptic segmentation. In summary, our contributions are as
follows:
• The novel proposal-free EvLPSNet architecture for
uncertainty-aware LiDAR panoptic segmentation.
• The uQR module for refining the prediction for the most
uncertain points.
• The efficient pKNN algorithm utilizing the predicted
class probabilities.
• Several baselines for comparison with EvLPSNet.
II. RELATED WORK
A. Segmentation of LiDAR Point Clouds
The release of the SemanticKITTI dataset [15] led to
the emergence of many works, initially for the semantic
segmentation of LiDAR point clouds. These can generally
be classified based on the point cloud representations they
employ, such as projected range images [16], [17], [18], 3D
voxels [19], point-based [10], and BEV polar coordinates
[9]. Most panoptic approaches utilize these representations
as well.
Panoptic segmentation approaches can be classified as
proposal-based and proposal-free. While both employ sep-
arate semantic and instance segmentation branches, the dis-
tinction lies in the latter. Proposal-based methods typically
employ bounding box regression for discovering instances,
such as Mask-RCNN [20] in the case of EfficientLPS [5].
On the other hand, proposal-free approaches perform clus-
tering on the semantic prediction to obtain instance ids for
objects belonging to separate instances. Panoptic-PolarNet
[11] utilizes a Panoptic Deeplab-based [4] instance head to
regress offsets and centers for different instances. DS-Net
[21] proposes a dynamic shifting module to move instance
points towards their respective center. Panoptic-PHNet [22]
utilizes two different encoders, BEV and voxel-based, to
encode point cloud features, followed by a KNN-transformer
module to model interaction among voxels belonging to thing
classes.
B. Uncertainty Estimation
Many works for estimating uncertainty in segmentation
tasks employ sampling-based methods, such as Bayesian
Neural Networks [14] or Monte Carlo dropout [13], [23].
However, such methods are time and memory-intensive,
requiring multiple passes or sampling operations. For LiDAR
point clouds, SalsaNext [17] is an uncertainty-aware semantic
segmentation network utilizing BNNs. Even though the network
output is quick to evaluate, the uncertainty is slow to obtain
due to the sampling required by the BNN approach. Further,
no metric is presented to quantify the calibration of the
predicted uncertainty for this approach. We believe these
are severe limitations for safety-critical real-time applications
like autonomous driving. The need for single-pass sampling-
free uncertainty estimation motivates many works in the field.
Classical neural networks utilize a softmax operation on the
final logits to predict a per-class score or probability, which
is not a reliable estimate of the network’s confidence in the
prediction, as shown by [12]. Guo et al. [24] propose the
Temperature Scaling (TS) method to learn a logit scaling fac-
tor on the softmax operation to provide calibrated probability
predictions. Other methods, such as [25], learn to separate
different classes in a latent space and, based on the distance
of the predicted to the nearest class feature, calculate the
uncertainty.
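Temperature Scaling amounts to dividing the logits by a single learned scalar T before the softmax; T is typically fit on a held-out validation set by minimizing the negative log-likelihood. A minimal NumPy sketch, using an illustrative value of T rather than a fitted one:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def temperature_scale(logits, T):
    # Temperature scaling: divide logits by a scalar T > 0 before softmax.
    # T > 1 softens the distribution (less confident), T < 1 sharpens it;
    # the argmax prediction is unchanged, only the confidence is calibrated.
    return softmax(logits / T)

logits = np.array([[4.0, 1.0, 0.5]])
p_raw = temperature_scale(logits, T=1.0)  # ordinary softmax
p_cal = temperature_scale(logits, T=2.0)  # softened probabilities
```

Since scaling by a positive constant is monotonic, calibration never alters the predicted class, only how peaked the probability vector is.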
Sensoy et al. [12] proposed evidential deep learning to
provide reliable and fast uncertainty estimation with minimal
changes to a network. Petek et al. [26] utilize this method to
simultaneously predict semantic segmentation and bounding
box regression uncertainty. Sirohi et al. [2] introduce the
uncertainty-aware panoptic segmentation task and provide a
sampling-free network for a unified panoptic segmentation
and uncertainty estimation for images. In the present work,
we build upon this approach, extending it to LiDAR point
clouds, and provide a comprehensive quantitative analysis.
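In the evidential formulation of Sensoy et al. [12], the network outputs non-negative evidence that parameterizes a Dirichlet distribution over the K class probabilities; the predicted probabilities are the Dirichlet mean, and the (vacuity) uncertainty shrinks as total evidence grows. A minimal sketch of this standard computation (not the exact head used by any particular network here):

```python
import numpy as np

def evidential_prediction(evidence):
    # evidence: non-negative network outputs (e.g. ReLU of logits), shape (K,).
    alpha = evidence + 1.0                  # Dirichlet concentration parameters
    S = alpha.sum(axis=-1, keepdims=True)   # total Dirichlet strength
    probs = alpha / S                       # expected class probabilities
    K = evidence.shape[-1]
    u = K / S.squeeze(-1)                   # vacuity: high when evidence is low
    return probs, u

# Little total evidence -> high uncertainty; strong evidence -> low uncertainty.
p_lo, u_lo = evidential_prediction(np.array([0.1, 0.1, 0.1]))
p_hi, u_hi = evidential_prediction(np.array([20.0, 0.5, 0.5]))
```

A single forward pass thus yields both a calibrated class distribution and a scalar uncertainty, which is what makes the approach sampling-free.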
III. TECHNICAL APPROACH
An overview of our network architecture is shown in
Fig. 2. It is based on the proposal-free Panoptic-PolarNet
network [11]. Our evidential semantic segmentation head
and Panoptic-Deeplab based [4] instance segmentation head
utilize the learned features to predict per-point semantic
segmentation, semantic uncertainty, instance center and off-
sets. The predictions from both heads are fused to provide
panoptic segmentation results. Leveraging the segmentation
uncertainties, our proposed uQR module helps to
improve the prediction for points within uncertain voxels.
Moreover, post-processing using our efficient probability-
based KNN improves the results further.
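As a rough illustration of probability-based KNN post-processing in general (not the exact pKNN algorithm proposed here), one can average the predicted class distributions of each point's nearest neighbours instead of majority-voting hard labels. A brute-force sketch, without the efficiency measures of the actual method:

```python
import numpy as np

def knn_probability_refine(points, probs, k=5):
    # Illustrative sketch only, NOT the paper's pKNN: refine each point's
    # class distribution by averaging the predicted probabilities of its
    # k nearest neighbours in 3D (the point itself is among them), then
    # take the argmax as the refined hard label.
    n = len(points)
    refined = np.empty_like(probs)
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        nn = np.argsort(d)[:k]              # indices of the k closest points
        refined[i] = probs[nn].mean(axis=0)
    return refined.argmax(axis=1)
```

Averaging soft probabilities lets confident neighbours outweigh a single low-confidence outlier, which hard-label voting cannot express.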
A. Network Architecture
We project the LiDAR points into a polar BEV grid
utilizing the encoder design proposed by PolarNet [9]. First,
the points (represented in 3D polar coordinates) are grouped
according to their location in a 2D polar BEV grid. The
grid has the dimensions of H × W = 480 × 360, where H
corresponds to the range and W to the heading angle. Then,
for each grid cell, the corresponding points are encoded using
a simplified PointNet [27]. This is followed by a max pooling
operation to calculate the feature vector for every 2D grid cell
and to create a fixed-size grid representation of W × H × F,
where F = 512 is our number of feature channels.
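The grouping and pooling steps above can be sketched as follows. This is an illustration, not the exact PolarNet encoder: the simplified PointNet MLP is omitted (raw per-point features stand in for learned ones), and the maximum range r_max is an assumed, dataset-dependent parameter:

```python
import numpy as np

def polar_bev_features(xyz, point_feats, H=480, W=360, r_max=50.0):
    # Assign each point to a polar BEV cell by range and heading angle,
    # then max-pool the per-point features within each cell. H x W follows
    # the paper (480 x 360); r_max is an assumed clipping range, and the
    # simplified-PointNet feature MLP is omitted for brevity.
    r = np.hypot(xyz[:, 0], xyz[:, 1])                  # range in the BEV plane
    theta = np.arctan2(xyz[:, 1], xyz[:, 0])            # heading in [-pi, pi)
    ri = np.clip((r / r_max * H).astype(int), 0, H - 1)
    ti = np.clip(((theta + np.pi) / (2 * np.pi) * W).astype(int), 0, W - 1)
    F = point_feats.shape[1]
    grid = np.full((H, W, F), -np.inf, dtype=point_feats.dtype)
    for i in range(len(xyz)):                           # scatter-max per cell
        cell = grid[ri[i], ti[i]]                       # view into the grid
        np.maximum(cell, point_feats[i], out=cell)
    grid[np.isinf(grid)] = 0.0                          # empty cells -> zeros
    return grid
```

Max pooling makes the cell feature invariant to the number and order of points falling into it, which is what yields a fixed-size grid from a variable-size point cloud.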
The subsequent encoder-decoder network utilizes the U-
net [28] architecture. Its first three decoder layers are shared
by the semantic and instance segmentation branches, while
the remaining layers are separate. The instance segmentation branch
regresses the instance center heatmap and the instance offsets
on the BEV grid.
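Panoptic-DeepLab-style grouping, which consumes such regressed centers and offsets, can be sketched as follows: every thing point is shifted by its predicted offset and assigned to the nearest predicted center. This is an illustrative sketch of the general scheme, not the exact fusion procedure of our network:

```python
import numpy as np

def group_instances(coords, offsets, centers, thing_mask):
    # Illustrative Panoptic-DeepLab-style grouping: shift every "thing"
    # point by its predicted offset toward its instance center, then
    # assign it to the nearest of the predicted centers. Stuff points
    # keep the id -1 (no instance).
    shifted = coords[thing_mask] + offsets[thing_mask]
    d = np.linalg.norm(shifted[:, None, :] - centers[None, :, :], axis=-1)
    ids = np.full(len(coords), -1)
    ids[thing_mask] = d.argmin(axis=1)
    return ids
```

Because the offsets point toward the instance center, well-regressed points of one object collapse onto the same center, so nearest-center assignment separates adjacent instances.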