Number-Adaptive Prototype Learning for
3D Point Cloud Semantic Segmentation
Yangheng Zhao1, Jun Wang2, Xiaolong Li3, Yue Hu1, Ce Zhang3,
Yanfeng Wang1,4, and Siheng Chen1,4
1Shanghai Jiao Tong University 2University of Maryland
3Virginia Tech 4Shanghai AI Laboratory
{zhaoyangheng-sjtu,18671129361,wangyanfeng,sihengc}@sjtu.edu.cn
junwang@umiacs.umd.edu {lxiaol9,zce}@vt.edu
Abstract. 3D point cloud semantic segmentation is one of the fundamental
tasks for 3D scene understanding and has been widely used in metaverse
applications. Many recent 3D semantic segmentation methods learn a single
prototype (classifier weights) for each semantic class and classify 3D points
according to their nearest prototype. However, learning only one prototype
for each class limits the model’s ability to describe high-variance patterns
within a class. Instead of learning a single prototype for each class, in this
paper we propose to use an adaptive number of prototypes to dynamically
describe the different point patterns within a semantic class. Building on
the powerful capability of vision transformers, we design a Number-Adaptive
Prototype Learning (NAPL) model for point cloud semantic segmentation. To
train our NAPL model, we propose a simple yet effective prototype dropout
training strategy, which enables our model to adaptively produce prototypes
for each class. Experimental results on the SemanticKITTI dataset demonstrate
that our method achieves a 2.3% mIoU improvement over the baseline model
based on the point-wise classification paradigm.
Keywords: Point Cloud, Semantic Segmentation, Prototype Learning
1 Introduction
3D scene understanding is critical for numerous applications, including the
metaverse, digital twins, and robotics [3]. As one of the most important tasks
for 3D scene understanding, point cloud semantic segmentation provides a
point-level understanding of the surrounding 3D environment and has received
increasing attention.
arXiv:2210.09948v1 [cs.CV] 18 Oct 2022
[Figure 1. Panel (a), point-wise classification: point cloud, point
encoder-decoder, point-wise feature, classifier weights, point-wise prediction.
Panel (b), number-adaptive prototype learning: point cloud, point
encoder-decoder, point-wise feature, prototype learning module, adaptive number
of prototypes, point-wise prediction.]
Fig. 1: Difference between point-wise classification (PWC) and number-adaptive
prototype learning (NAPL). The PWC paradigm [8,9,17] learns a single prototype
(classifier weights) for each class, while our proposed NAPL uses a prototype
learning module to adaptively produce multiple prototypes for each class.
A popular paradigm for 3D point cloud semantic segmentation follows point-wise
classification, where an encoder-decoder network extracts point-wise features
and feeds them into a classifier that predicts labels, as shown in Fig. 1 (a).
Following the spirit of prototype learning in image semantic segmentation [16],
the point-wise classification model can be viewed as learning one prototype
(classifier weights) for each semantic category and assigning each point the
label of the nearest prototype. However, the common single-prototype-per-class
design in point-wise classification models limits the model’s capacity on
semantic categories with high intra-class variance. More critically, the 3D
point cloud data we are interested in is sparse and non-uniform. Distance
variation and occlusion in 3D point clouds can make the geometric
characteristics of objects of the same category very different, and this
challenge is even more significant in large-scale 3D data. Experiments show
that one prototype per class is usually insufficient to describe patterns with
high variation; see Fig. 3.
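To make the prototype view of point-wise classification concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bias-free linear classifier and L2-normalized features and weights, so that the highest-scoring class coincides with the nearest prototype in Euclidean distance. The names `pointwise_classify` and `l2_normalize`, the random toy data, and the shapes are all hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit Euclidean norm (assumes no all-zero rows)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def pointwise_classify(features, prototypes):
    """Label each point with the class whose prototype gives the highest score.

    features:   (N, D) point-wise features from an encoder-decoder.
    prototypes: (C, D) classifier weights, one row (prototype) per class.
    """
    logits = features @ prototypes.T  # (N, C) dot-product scores
    return logits.argmax(axis=1)

# Toy data (assumption): 6 points, 4-dim features, 3 classes.
rng = np.random.default_rng(0)
features = l2_normalize(rng.normal(size=(6, 4)))
prototypes = l2_normalize(rng.normal(size=(3, 4)))

labels = pointwise_classify(features, prototypes)

# For unit-norm vectors, ||f - p||^2 = 2 - 2 * (f . p), so the largest
# dot product corresponds to the smallest Euclidean distance.
dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
nearest = dists.argmin(axis=1)
```

Under these assumptions `labels` and `nearest` agree, which is the sense in which a classifier's weight rows act as one prototype per class.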
To better handle the data variance, an intuitive idea is to use more than one
prototype for each category. However, we have no prior knowledge about how
many prototypes each category needs, and too many prototypes per category
may increase the computational cost and lead to overfitting. The question is:
can we find a smarter way to identify the necessary prototypes and effectively
increase the capacity of existing models? In this work, we propose to set the
number of prototypes per semantic category adaptively, as shown in Fig. 1 (b).
We call this paradigm Number-Adaptive Prototype Learning (NAPL). To instantiate
the proposed NAPL model, inspired by recent work [5,11], we use a transformer
decoder to learn an adaptive number of prototypes for each category. Unlike
previous work [5,11], which is limited to learning one prototype for each
semantic category, we design a novel prototype dropout training strategy that
enables the model to adaptively produce prototypes for each class. Experimental
results on the SemanticKITTI [1] dataset show that by plugging our design into
a common encoder-decoder network, our method achieves a 2.3% mIoU gain over
the baseline point-wise classification model.
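The idea of letting each category own a different number of prototypes can be illustrated with a small nearest-prototype classifier. This is only a sketch under stated assumptions: the prototypes are set by hand rather than produced by a transformer decoder, the prototype dropout training strategy is not modeled, and the names `classify_with_prototypes` and `proto_class` are hypothetical.

```python
import numpy as np

def classify_with_prototypes(features, prototypes, proto_class):
    """Assign each point the class of its nearest prototype.

    features:    (N, D) point-wise features.
    prototypes:  (K, D) prototypes; K may exceed the number of classes.
    proto_class: (K,) class index owned by each prototype, so a class
                 can have one prototype or several.
    """
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
    return proto_class[dists.argmin(axis=1)]

# Toy setup (assumption): class 0 has two distinct point patterns and
# therefore two prototypes; class 1 has a single prototype.
prototypes = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
proto_class = np.array([0, 0, 1])

features = np.array([[0.5, -0.2],   # close to the first class-0 prototype
                     [9.5, 0.3],    # close to the second class-0 prototype
                     [5.2, 4.6],    # close to the class-1 prototype
                     [6.0, 4.0]])   # also closest to the class-1 prototype

pred = classify_with_prototypes(features, prototypes, proto_class)
print(pred)  # [0 0 1 1]
```

The mapping `proto_class` is what lets the number of prototypes differ per class; how many prototypes each class receives is the quantity the paper proposes to determine adaptively.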
2 Related work
2.1 3D point cloud semantic segmentation
3D point cloud semantic segmentation has been widely used in the metaverse,
digital twins, robotics, and autonomous driving [4,14]. Based on different
representations, existing 3D semantic segmentation methods can be divided into
three categories: projection-based, point-based, and voxel-based. The projection-based