Number-Adaptive Prototype Learning for 3D Point Cloud Semantic Segmentation

Yangheng Zhao1, Jun Wang2, Xiaolong Li3, Yue Hu1, Ce Zhang3, Yanfeng Wang1,4, and Siheng Chen1,4

1Shanghai Jiao Tong University 2University of Maryland 3Virginia Tech 4Shanghai AI Laboratory

{zhaoyangheng-sjtu,18671129361,wangyanfeng,sihengc}@sjtu.edu.cn junwang@umiacs.umd.edu {lxiaol9,zce}@vt.edu

Abstract. 3D point cloud semantic segmentation is one of the fundamental tasks for 3D scene understanding and has been widely used in metaverse applications. Many recent 3D semantic segmentation methods learn a single prototype (classifier weights) for each semantic class and classify 3D points according to their nearest prototype. However, learning only one prototype per class limits the model's ability to describe high-variance patterns within a class. Instead, in this paper we propose to use an adaptive number of prototypes to dynamically describe the different point patterns within a semantic class. Leveraging the capability of the vision transformer, we design a Number-Adaptive Prototype Learning (NAPL) model for point cloud semantic segmentation. To train our NAPL model, we propose a simple yet effective prototype dropout training strategy, which enables the model to adaptively produce prototypes for each class. Experimental results on the SemanticKITTI dataset demonstrate that our method achieves a 2.3% mIoU improvement over the baseline model based on the point-wise classification paradigm.

Keywords: Point Cloud, Semantic Segmentation, Prototype Learning

1 Introduction

3D scene understanding is critical for numerous applications, including the metaverse, digital twins, and robotics [3]. As one of the most important tasks for 3D scene understanding, point cloud semantic segmentation provides a point-level understanding of the surrounding 3D environment and is receiving increasing attention.

arXiv:2210.09948v1 [cs.CV] 18 Oct 2022

[Figure 1, two panels. (a) Point-wise classification: point cloud, point encoder-decoder, point-wise feature, classifier weights, point-wise prediction. (b) Number-adaptive prototype learning: point cloud, point encoder-decoder, point-wise feature, prototype learning module, adaptive number of prototypes, point-wise prediction.]

Fig. 1: Difference between point-wise classification (PWC) and number-adaptive prototype learning (NAPL). The PWC paradigm [8,9,17] learns a single prototype (classifier weights) for each class, while our proposed NAPL uses a prototype learning module to adaptively produce multiple prototypes for each class.

A popular paradigm for 3D point cloud semantic segmentation is point-wise classification, where an encoder-decoder network extracts point-wise features and feeds them into a classifier that predicts labels, as shown in Fig. 1 (a). Following the spirit of prototype learning in image semantic segmentation [16], the point-wise classification model can be viewed as learning one prototype (classifier weights) for each semantic category and assigning each point the label of its nearest prototype. However, the common single-prototype-per-class design in point-wise classification models limits the model's capacity for semantic categories with high intra-class variance. More critically, the 3D point cloud data we are interested in is sparse and non-uniform. Distance variation and occlusion in 3D point clouds can make the geometric characteristics of objects of the same category very different, and this challenge is even more significant in large-scale 3D data. Experiments show that one prototype per class is usually insufficient to describe patterns with high variation; see Fig. 3.
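The prototype view of point-wise classification can be illustrated with a minimal sketch. It assumes that "nearest" means the largest dot-product similarity, which is what a bias-free linear classifier computes; the function name, the toy 2-D features, and the prototype values are hypothetical and are not taken from the paper.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify_single_prototype(features, prototypes):
    """Label each point with the class whose single prototype scores highest.

    prototypes[c] plays the role of the classifier weight vector of class c.
    Assumption: similarity is a plain dot product (no bias term).
    """
    labels = []
    for f in features:
        scores = [dot(f, p) for p in prototypes]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Toy setup (hypothetical): one prototype per class.
prototypes = [[1.0, 0.0],   # class 0
              [0.0, 1.0]]   # class 1

# Suppose class 0 has two distinct patterns, near [1, 0] and near [-1, 0].
features = [[0.9, 0.1], [0.2, 0.8], [-0.9, 0.1]]

labels = classify_single_prototype(features, prototypes)
print(labels)  # [0, 1, 1]: the second class-0 pattern is assigned to class 1
```

In this toy case the third point belongs to a second pattern of class 0, but the single class-0 prototype cannot represent both patterns, so the point is assigned to class 1.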

To better handle the data variance, an intuitive idea is to use more than one prototype for each category. However, we have no prior knowledge of how many prototypes each category needs, and too many prototypes per category may increase computational cost and lead to overfitting. The question is: can we find a smarter way to identify the necessary prototypes and effectively increase existing models' capacity? In this work, we propose to set the number of prototypes per semantic category adaptively, as shown in Fig. 1 (b). We call this paradigm Number-Adaptive Prototype Learning (NAPL). To instantiate the proposed NAPL model, inspired by recent work [5,11], we use a transformer decoder to learn an adaptive number of prototypes for each category. Unlike previous work [5,11], which is limited to learning one prototype for each semantic category, we design a novel prototype dropout training strategy that enables the model to adaptively produce prototypes for each class. Experimental results on the SemanticKITTI [1] dataset show that, by plugging our design into a common encoder-decoder network, our method achieves a 2.3% mIoU gain over the baseline point-wise classification model.
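To make the idea of a per-class variable number of prototypes concrete, the following is a minimal sketch of classification with several prototypes per class. It is not the NAPL model: here the prototypes are set by hand rather than produced by a transformer decoder, the class score is assumed to be the maximum dot-product similarity over that class's prototypes, and all names and values are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify_multi_prototype(features, class_prototypes):
    """Label each point with the class owning its most similar prototype.

    class_prototypes maps a class id to a list of prototypes; the list
    length may differ from class to class.
    Assumption: a class's score is the max dot product over its prototypes.
    """
    labels = []
    for f in features:
        best_class, best_score = None, float("-inf")
        for cls, protos in class_prototypes.items():
            score = max(dot(f, p) for p in protos)
            if score > best_score:
                best_class, best_score = cls, score
        labels.append(best_class)
    return labels


# Toy setup (hypothetical): class 0 gets two prototypes, class 1 gets one.
class_prototypes = {
    0: [[1.0, 0.0], [-1.0, 0.0]],
    1: [[0.0, 1.0]],
}

features = [[0.9, 0.1], [0.2, 0.8], [-0.9, 0.1]]

multi_labels = classify_multi_prototype(features, class_prototypes)
print(multi_labels)  # [0, 1, 0]
```

With a second prototype for class 0, the point near [-1, 0] is assigned to class 0, while class 1 still needs only one prototype. Deciding how many prototypes each class receives is the part that the paper addresses with its learned, number-adaptive design.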

2 Related work

2.1 3D point cloud semantic segmentation

3D point cloud semantic segmentation has been widely used in the metaverse, digital twins, robotics, and autonomous driving [4,14]. Based on different representations, existing 3D semantic segmentation methods can be divided into three categories: projection-based, point-based, and voxel-based. The projection-based
