Learning Inter-Superpoint Affinity for Weakly Supervised 3D Instance Segmentation

Linghua Tang, Le Hui, and Jin Xie

Nanjing University of Science and Technology, Nanjing, China

{tanglinghua, le.hui, csjxie}@njust.edu.cn

Abstract. Because annotated labels for 3D point clouds are scarce, learning discriminative point cloud features to segment object instances is a challenging problem. In this paper, we propose a simple yet effective 3D instance segmentation framework that achieves good performance with only one annotated point per instance. Specifically, to cope with extremely few labels, we first oversegment the point cloud into superpoints in an unsupervised manner and extend the point-level annotations to the superpoint level. Then, based on the superpoint graph, we propose an inter-superpoint affinity mining module that considers semantic and spatial relations to adaptively learn inter-superpoint affinity, which is used to generate high-quality pseudo labels via semantic-aware random walk. Finally, we propose a volume-aware instance refinement module that segments high-quality instances by applying object volume constraints during clustering on the superpoint graph. Extensive experiments on the ScanNet-v2 and S3DIS datasets demonstrate that our method achieves state-of-the-art performance in weakly supervised point cloud instance segmentation and even outperforms some fully supervised methods. Source code is available at https://github.com/fpthink/3D-WSIS.

1 Introduction

Point cloud instance segmentation is a classic task in 3D computer vision with applications in many fields, including indoor navigation systems, augmented reality, and robotics. Fully supervised instance segmentation methods [18,2,13,10] have achieved impressive results, but they rely on large amounts of manually labeled data, and annotating a large number of point clouds is extremely time-consuming and expensive. It is therefore meaningful to segment point clouds in a semi-/weakly supervised manner that requires only a small number of annotations. How to fully exploit the limited labels to improve instance segmentation performance, however, remains a challenging problem.

Few efforts have been dedicated to semi-/weakly supervised point cloud instance segmentation. As a pioneer, Liao et al. [19] proposed a semi-supervised point cloud instance segmentation method that uses bounding boxes as supervision: a network generates bounding box proposals, and instance segmentation is achieved by refining the point cloud within each proposal. Tao et al. [25] proposed a two-stage 3D instance and semantic segmentation method with seg-level supervision, which first leverages a segment grouping network to generate pseudo labels for whole scenes and then uses the generated point-level pseudo labels as ground truth to train the network. However, these simple pseudo label generation strategies cannot effectively generate high-quality pseudo labels, resulting in poor 3D instance segmentation results.

arXiv:2210.05534v1 [cs.CV] 11 Oct 2022

In this paper, we propose a simple yet effective weakly supervised 3D instance segmentation framework that achieves impressive results with one point annotation per instance. For weakly supervised point cloud instance segmentation with few annotated labels, our intuition is twofold: (1) With so few annotations, effective label propagation is essential to produce high-quality pseudo labels, especially in 3D instance segmentation. (2) Weakly supervised 3D instance segmentation is more challenging than weakly supervised 3D semantic segmentation, so we introduce an object volume constraint to improve the instance segmentation results. Specifically, we first use an unsupervised method [15] to oversegment the point cloud into superpoints and build the superpoint graph. In this way, point-level labels can be extended to superpoint-level labels. Then, we propose an inter-superpoint affinity mining module to generate high-quality pseudo labels from the few annotated superpoint-level labels. Based on the superpoint graph, we leverage the semantic and spatial information of adjacent superpoints to adaptively learn inter-superpoint affinity, which is used to propagate superpoint labels along the superpoint graph via semantic-aware random walk. Finally, we propose a volume-aware instance refinement module to improve instance segmentation performance. With the model trained using superpoint-level propagation, we obtain coarse instance segmentation results through superpoint clustering and then infer the object volume information from these results. The object volume information consists of the number of voxels and the radius of the object. The inferred object volume information is treated as the ground truth of the corresponding instance to retrain the network. In the test phase, we use the predicted object volume information in a volume-aware instance clustering algorithm to segment high-quality instances. Extensive experiments on the ScanNet-v2 [6] and S3DIS [1] datasets demonstrate the effectiveness of our method.
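As an illustration of how point-level labels can be extended to the superpoint level, the following minimal sketch assigns each superpoint the label of an annotated point it contains. The function name, the `UNLABELED` marker, and the rule that a later annotation overwrites an earlier one within the same superpoint are assumptions made for this example, not details taken from the paper.

```python
from typing import List

UNLABELED = -1  # assumed marker for "no annotation"


def extend_labels_to_superpoints(point_to_superpoint: List[int],
                                 point_labels: List[int],
                                 num_superpoints: int) -> List[int]:
    """Give each superpoint the label of an annotated point it contains.

    Superpoints without any annotated point stay UNLABELED. If several
    annotated points fall into one superpoint, the last one wins
    (an assumption of this sketch).
    """
    superpoint_labels = [UNLABELED] * num_superpoints
    for sp_id, label in zip(point_to_superpoint, point_labels):
        if label != UNLABELED:
            superpoint_labels[sp_id] = label
    return superpoint_labels


# Toy scene: 6 points grouped into 3 superpoints, two clicked points.
point_to_superpoint = [0, 0, 1, 1, 2, 2]
point_labels = [UNLABELED, 7, UNLABELED, UNLABELED, 3, UNLABELED]
superpoint_labels = extend_labels_to_superpoints(
    point_to_superpoint, point_labels, num_superpoints=3)
```

In this toy scene, superpoints 0 and 2 inherit the clicked labels, while superpoint 1 remains unlabeled and would have to receive a pseudo label through propagation.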

The main contributions of our paper are as follows:

– We present an inter-superpoint affinity mining module that considers the semantic and spatial relations to adaptively learn inter-superpoint affinity for random-walk based label propagation.

– We present a volume-aware instance refinement module, which guides the superpoint clustering on the superpoint graph to segment instances by using the object volume information.

– Our simple yet effective framework achieves state-of-the-art weakly supervised 3D instance segmentation performance on the popular ScanNet-v2 and S3DIS datasets.
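To make random-walk based label propagation concrete, here is a minimal sketch of a generic random walk with restart on a small graph. It is not the paper's semantic-aware random walk: the affinity values are fixed by hand rather than learned, and the names `random_walk_propagate`, `alpha`, and `steps` are assumptions of this example.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(affinity: Matrix) -> Matrix:
    """Turn non-negative affinities into random-walk transition probabilities."""
    normalized = []
    for row in affinity:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


def random_walk_propagate(affinity: Matrix, seed_scores: Matrix,
                          alpha: float = 0.8, steps: int = 50) -> Matrix:
    """Iterate scores <- alpha * P @ scores + (1 - alpha) * seed_scores."""
    transition = row_normalize(affinity)
    n, k = len(seed_scores), len(seed_scores[0])
    scores = [row[:] for row in seed_scores]
    for _ in range(steps):
        scores = [
            [alpha * sum(transition[i][j] * scores[j][c] for j in range(n))
             + (1 - alpha) * seed_scores[i][c]
             for c in range(k)]
            for i in range(n)
        ]
    return scores


# Chain of 4 superpoints: strong affinity 0-1 and 2-3, weak affinity 1-2.
affinity = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]
# Superpoint 0 is annotated as instance 0, superpoint 3 as instance 1.
seed_scores = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
scores = random_walk_propagate(affinity, seed_scores)
pseudo_labels = [max(range(len(row)), key=lambda c: row[c]) for row in scores]
```

Because the affinity between superpoints 1 and 2 is weak, each unlabeled superpoint takes the label of the seed on its own side of that edge, which is the behavior a well-learned affinity is meant to encourage.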


2 Related Work

2.1 3D Semantic Segmentation

Fully supervised 3D semantic segmentation. Many methods have been proposed for point cloud semantic segmentation. Some methods [16,26,12] project point clouds into a series of regular 2D images from different views and then fuse the features extracted by 2D convolutional neural networks (CNNs). To apply 3D CNNs to irregular point clouds while alleviating large memory costs, many efforts [8,5] first voxelize the point cloud and then use sparse convolutional neural networks to extract its features. PointNet [21] directly extracts features from points with shared multi-layer perceptrons and a max-pooling layer. Inspired by PointNet, different local feature aggregation operators [22,27,33,4] have been proposed that directly consume point clouds. Besides, various methods [31,11] capture intrinsic spatial and geometric features by constructing a graph on the point cloud. These approaches exploit different local feature aggregation networks to extract discriminative point features and use multi-layer perceptrons to achieve 3D semantic segmentation.

Semi-/Weakly supervised 3D semantic segmentation. Inspired by class activation maps in 2D images, Wei et al. [32] introduce a multi-path region mining module to generate pseudo labels, which requires only cloud-level weak labels. Xu et al. [34] use three additional losses to constrain unlabeled points, achieving impressive performance with 10% of the labels. Cheng et al. [3] use a dynamic label propagation strategy to generate pseudo labels and learn discriminative features with a coupled attention module. Zhang et al. [37] exploit the consistency generated by perturbation to obtain additional supervision and propagate implicit labels by constructing the graph topology of the point cloud. Liu et al. [20] first build a supervoxel graph on the point cloud and then conduct label propagation by learning the similarity among graph nodes. Li et al. [17] utilize a hybrid contrastive regularization strategy with point cloud augmentation to provide additional constraints for network training. To generate pseudo labels for outdoor point cloud scenes, Shi et al. [23] design a matching module to propagate pseudo labels in both the temporal and spatial domains.

2.2 3D Instance Segmentation

Fully supervised 3D instance segmentation. Compared with point cloud semantic segmentation, instance segmentation is more challenging because it requires not only predicting semantic scores but also distinguishing instances of the same class. According to how instances are generated, instance segmentation methods can be mainly divided into clustering-based methods and proposal-based methods. Given point clouds as input, clustering-based methods treat instance segmentation as a post-processing task after network inference, and the result is obtained by clustering the points using the predicted features. As a pioneer, Wang et al. [29] introduce a similarity matrix that measures the distances between the features of all point pairs, which guides the clustering of points into proposals. Wang et al. [30] integrate semantic and instance segmentation into a parallel framework so that the two tasks benefit from each other. Lahoud et al. [14] design a multi-task neural network architecture in which instances are simultaneously separated in the feature vector space and the direction vector space by a discriminative loss [7] and a directional loss. Jiang et al. [13] generate proposals by clustering points in both the original and offset-shifted coordinate spaces, combining the advantages of both. Hou et al. [9] jointly learn color and geometry features from different modalities for instance segmentation. More recently, Chen et al. [2] introduce a hierarchical aggregation method that iteratively clusters point clouds into instance proposals. Liang et al. [18] propose a semantic superpoint tree structure and achieve instance segmentation by tree traversal and splitting. Vu et al. [28] design a soft grouping algorithm that reduces semantic prediction errors and significantly boosts segmentation performance.
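The idea of a similarity matrix over all point pairs can be sketched as follows. This is only an illustration: it assumes Euclidean distance between per-point feature vectors and a simple distance threshold for grouping, and the helper names `pairwise_feature_distances` and `group_around` are hypothetical rather than taken from any cited method.

```python
import math
from typing import List, Sequence


def pairwise_feature_distances(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Entry [i][j] is the Euclidean distance between features i and j."""
    return [[math.dist(fi, fj) for fj in features] for fi in features]


def group_around(distances: List[List[float]], anchor: int,
                 threshold: float) -> List[int]:
    """Indices of points whose feature distance to the anchor is below threshold."""
    return [j for j, d in enumerate(distances[anchor]) if d < threshold]


# Three points with 2-D features; points 0 and 2 are close in feature space.
features = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.5)]
distances = pairwise_feature_distances(features)
proposal = group_around(distances, anchor=0, threshold=1.0)
```

Each row of the matrix yields one candidate group of points, so small feature distances translate directly into points being proposed as the same instance.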

For proposal-based methods, instance segmentation consists of two steps: first generating rough proposals and then predicting precise instance masks. Yang et al. [35] propose an end-to-end trainable network that directly generates 3D bounding boxes as proposals and infers point-wise instance masks for the points inside them. Instead of obtaining proposals via 3D bounding box regression, Yi et al. [36] obtain proposals by object generation and then predict instance masks within the proposals.

Semi-/Weakly supervised 3D instance segmentation. Few efforts have been made on semi-/weakly supervised point cloud instance segmentation. Tao et al. [25] propose a method in which one point per instance is clicked as the weak label; pseudo labels are generated for the whole training scene, and the generated point-level pseudo labels are used to train existing fully supervised methods for point cloud instance segmentation. Nonetheless, the quality of the pseudo labels is limited because discriminative instance features are not learned. With bounding boxes as weak labels, Liao et al. [19] propose a semi-supervised point cloud instance segmentation method in which a network generates bounding box proposals and instance segmentation is achieved by refining the points within them.

3 Method

The overall architecture of our method is depicted in Fig. 1. The backbone network (Sec. 3.1) first takes the point cloud and superpoint graph as input and predicts superpoint-wise semantic labels and offset vectors. Then, the inter-superpoint affinity mining module (Sec. 3.2) propagates labels on the superpoint graph via semantic-aware random walk. Finally, the volume-aware instance refinement module (Sec. 3.3) learns object volume information to improve instance segmentation performance.
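The introduction describes the object volume information as the number of voxels and the radius of the object. A minimal sketch of how these two quantities could be computed from the points of one coarse instance is given below. The exact definitions are assumptions of this example: the voxel count is taken as the number of distinct occupied voxels at a chosen `voxel_size`, and the radius as the largest distance from the instance centroid to any of its points.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def object_volume_info(points: List[Point], voxel_size: float) -> Tuple[int, float]:
    """Return (number of occupied voxels, radius) for one instance.

    Both definitions are assumptions of this sketch, not the paper's.
    """
    # Distinct voxel indices touched by the instance's points.
    occupied = {tuple(math.floor(c / voxel_size) for c in p) for p in points}
    n = len(points)
    centroid = tuple(sum(p[d] for p in points) / n for d in range(3))
    # Radius: distance from the centroid to the farthest point.
    radius = max(math.dist(p, centroid) for p in points)
    return len(occupied), radius


# Toy instance: four points along the x-axis.
instance_points = [(0.2, 0.2, 0.2), (0.4, 0.2, 0.2),
                   (2.2, 0.2, 0.2), (-1.6, 0.2, 0.2)]
num_voxels, radius = object_volume_info(instance_points, voxel_size=0.5)
```

Quantities of this kind, inferred from coarse instances, are what the framework treats as ground truth when retraining the network to predict object volume.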
