Learning Inter-Superpoint Affinity for Weakly
Supervised 3D Instance Segmentation
Linghua Tang, Le Hui, and Jin Xie
Nanjing University of Science and Technology, Nanjing, China
{tanglinghua, le.hui, csjxie}@njust.edu.cn
Abstract. Because annotated labels for 3D point clouds are scarce, learning
discriminative point cloud features for segmenting object instances is a
challenging problem. In this paper, we propose a simple
yet effective 3D instance segmentation framework that can achieve good
performance by annotating only one point for each instance. Specifically,
to cope with extremely sparse labels for instance segmentation, we first
oversegment the point cloud into superpoints in an unsupervised manner
and extend the point-level annotations to the superpoint level. Then,
based on the superpoint graph, we propose an inter-superpoint affin-
ity mining module that considers the semantic and spatial relations to
adaptively learn inter-superpoint affinity to generate high-quality pseudo
labels via semantic-aware random walk. Finally, we propose a volume-
aware instance refinement module to segment high-quality instances by
applying volume constraints of objects in clustering on the superpoint
graph. Extensive experiments on the ScanNet-v2 and S3DIS datasets
demonstrate that our method achieves state-of-the-art performance in
the weakly supervised point cloud instance segmentation task, and even
outperforms some fully supervised methods. Source code is available at
https://github.com/fpthink/3D-WSIS.
1 Introduction
Point cloud instance segmentation is a classic task in 3D computer vision, with
applications in many fields, including indoor navigation systems, augmented
reality, and robotics. Fully supervised instance segmentation methods
[18,2,13,10] have achieved impressive results, but they rely on large amounts
of manually labeled data, and annotating a large number of point clouds is
extremely time-consuming and expensive. Thus, it is meaningful to segment
point clouds in a semi-/weakly supervised manner that requires only a small
number of annotations. However, fully exploiting the limited labels to improve
instance segmentation performance remains a challenging problem.
arXiv:2210.05534v1 [cs.CV] 11 Oct 2022
Few efforts have been dedicated to semi-/weakly supervised point cloud instance
segmentation. As a pioneer, Liao et al. [19] proposed a semi-supervised point
cloud instance segmentation method using bounding boxes as supervision, where a
network generates bounding box proposals and instance segmentation is achieved
by refining the point cloud within the bounding box proposals. Besides, Tao et
al. [25] proposed a two-stage 3D instance and semantic segmentation method with
seg-level supervision, which first leverages a segment grouping network to
generate pseudo labels for whole scenes and then uses the generated pseudo
point-level labels as the ground truth to train the network. However, these
simple pseudo label generation strategies cannot effectively generate
high-quality pseudo labels, resulting in poor 3D instance segmentation results.
In this paper, we propose a simple yet effective weakly supervised 3D instance
segmentation framework, which can achieve impressive results with one point
annotation per instance. For weakly supervised point cloud instance
segmentation with few annotated labels, our intuition is twofold: (1) With
scarce annotations, effective label propagation is essential to produce
high-quality pseudo labels, especially in 3D instance segmentation. (2) Weakly
supervised 3D instance segmentation is more challenging than weakly supervised
3D semantic segmentation, so we introduce an object volume constraint to
improve the instance segmentation results. Specifically, we first use an
unsupervised method [15] to oversegment the point cloud into superpoints and
build the superpoint graph. In this way, point-level labels can be extended to
superpoint-level labels. Then, we propose an inter-superpoint affinity mining
module to generate high-quality pseudo labels from the few annotated
superpoint-level labels. Based on the superpoint graph, we leverage the
semantic and spatial information of adjacent superpoints to adaptively learn
inter-superpoint affinity, which is used to propagate superpoint labels along
the superpoint graph via semantic-aware random walk. Finally, we propose a
volume-aware instance refinement module to improve instance segmentation
performance. Using the model trained with superpoint-level propagation, we
obtain coarse instance segmentation results through superpoint clustering and
then infer the object volume information from these results. The object volume
information consists of the number of voxels and the radius of the object. The
inferred object volume information is regarded as the ground truth of the
corresponding instance to retrain the network. In the test phase, we use the
predicted object volume information in a volume-aware instance clustering
algorithm to segment high-quality instances. Extensive experiments on the
ScanNet-v2 [6] and S3DIS [1] datasets demonstrate the effectiveness of our
method.
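As a rough illustration of the label extension and propagation steps described above, the following minimal sketch assigns each superpoint the label of the annotated point it contains and then spreads those seed labels over a superpoint graph. It rests on several assumptions that are not taken from the paper: the affinity matrix is given rather than learned, the propagation is a plain random walk with restart rather than the semantic-aware random walk, and the function names and the parameters `alpha` and `num_steps` are hypothetical.

```python
import numpy as np

def extend_labels_to_superpoints(superpoint_ids, point_labels, num_superpoints):
    """Give each superpoint the label of the annotated point it contains (-1 if none)."""
    sp_labels = np.full(num_superpoints, -1, dtype=np.int64)
    annotated = point_labels >= 0
    sp_labels[superpoint_ids[annotated]] = point_labels[annotated]
    return sp_labels

def random_walk_propagate(affinity, sp_labels, num_instances, alpha=0.9, num_steps=50):
    """Propagate seed labels over the graph (illustrative random walk with restart)."""
    # Row-normalise the (assumed given) affinity matrix into a transition matrix.
    row_sums = affinity.sum(axis=1, keepdims=True)
    transition = affinity / np.maximum(row_sums, 1e-12)
    # One-hot seed matrix; rows of unlabeled superpoints stay zero.
    seeds = np.zeros((len(sp_labels), num_instances))
    labeled = sp_labels >= 0
    seeds[labeled, sp_labels[labeled]] = 1.0
    scores = seeds.copy()
    for _ in range(num_steps):
        scores = alpha * (transition @ scores) + (1.0 - alpha) * seeds
    pseudo = scores.argmax(axis=1)
    pseudo[scores.sum(axis=1) == 0] = -1   # superpoints no seed can reach
    pseudo[labeled] = sp_labels[labeled]   # annotated superpoints keep their label
    return pseudo

# Toy scene: 8 points in 4 superpoints, one clicked point for each of 2 instances.
superpoint_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
point_labels = np.array([0, -1, -1, -1, -1, -1, -1, 1])
sp_labels = extend_labels_to_superpoints(superpoint_ids, point_labels, 4)

# Chain graph 0-1-2-3 with a weak link between superpoints 1 and 2.
affinity = np.array([[0.0, 1.0, 0.0, 0.0],
                     [1.0, 0.0, 0.1, 0.0],
                     [0.0, 0.1, 0.0, 1.0],
                     [0.0, 0.0, 1.0, 0.0]])
pseudo = random_walk_propagate(affinity, sp_labels, num_instances=2)
```

In this toy graph the weak edge between superpoints 1 and 2 acts as an instance boundary, so each unlabeled superpoint takes the label of the seed on its own side.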
The main contributions of our paper are as follows:
- We present an inter-superpoint affinity mining module that considers the
  semantic and spatial relation to adaptively learn inter-superpoint affinity
  for random-walk based label propagation.
- We present a volume-aware instance refinement module, which guides the
  superpoint clustering on the superpoint graph to segment instances by using
  the object volume information.
- Our simple yet effective framework achieves state-of-the-art weakly
  supervised 3D instance segmentation performance on the popular ScanNet-v2
  and S3DIS datasets.
2 Related Work
2.1 3D Semantic Segmentation
Fully supervised 3D semantic segmentation. Many methods have been pro-
posed to achieve point cloud semantic segmentation. Some methods [16,26,12]
project point clouds into a series of regular 2D images from different views, and
then fuse features extracted through 2D convolutional neural networks (CNNs).
To apply 3D CNNs on the irregular point cloud and alleviate large memory
costs, many efforts [8,5] first voxelize the point cloud into voxels and then utilize
the sparse convolutional neural network to extract features of the point cloud.
PointNet [21] directly extracts features from points with shared multi-layer
perceptrons and a max-pooling layer. Inspired by PointNet, different local
feature aggregation operators [22,27,33,4] have been proposed that directly
consume point clouds. Besides, various methods [31,11] capture intrinsic
spatial and geometric features by constructing a graph on the point cloud.
These approaches exploit different local feature aggregation networks to
extract discriminative point features and use multi-layer perceptrons to
achieve 3D semantic segmentation.
Semi-/Weakly supervised 3D semantic segmentation. Inspired by
class activation maps in 2D images, Wei et al. [32] introduce a multi-path
region mining module to generate pseudo labels, which only requires cloud-level
weak labels. Xu et al. [34] use three additional losses to constrain unlabeled
points, achieving impressive performance with 10% labels. Cheng et al. [3] use
a dynamic label propagation strategy to generate pseudo labels, and learn dis-
criminative features with a coupled attention module. Zhang et al. [37] exploit
the consistency generated by perturbation to obtain additional supervision and
propagate implicit labels by constructing the graph topology of the point cloud.
Liu et al. [20] first build a supervoxel graph on the point cloud and then conduct
label propagation by learning the similarity among graph nodes. Li et al. [17]
utilize a hybrid contrastive regularization strategy with point cloud augmenta-
tion to provide additional constraints for network training. To generate pseudo
labels for outdoor point cloud scenes, Shi et al. [23] design a matching module
to propagate pseudo labels in both temporal and spatial spaces.
2.2 3D Instance Segmentation
Fully supervised 3D instance segmentation. Compared with point cloud
semantic segmentation, instance segmentation is more challenging because it not
only requires predicting semantic scores but also distinguishing instances of the
same class. According to the different manners of generating instances, instance
segmentation methods can be mainly divided into clustering-based methods and
proposal-based methods. Given point clouds as input, clustering-based methods
regard instance segmentation as the post-processing task after network infer-
ence, and the result is obtained by clustering on point clouds with the predicted
features. As a pioneer, Wang et al. [29] introduce a similarity matrix to measure
the distances between the features of all point pairs, which guides the
clustering of points into proposals. Wang et al. [30] integrate semantic and
instance segmentation into a parallel framework in which the two tasks benefit
from each other. Lahoud et al. [14] design a multi-task neural network
architecture, where instances are simultaneously separated in the feature
vector space and the direction vector space by a discriminative loss [7] and a
directional loss. Jiang et al. [13] generate proposals by clustering points in
both the original and the offset-shifted coordinate spaces, combining the
advantages of the two. Hou et al. [9] jointly learn color and geometry features
from different modalities for instance segmentation. Lately, Chen et al. [2]
introduce a hierarchical aggregation method that iteratively clusters point
clouds into instance proposals. Liang et al. [18] propose a semantic superpoint
tree structure and achieve instance segmentation by tree traversal and
splitting. Vu et al. [28] design a soft grouping algorithm to reduce semantic
prediction errors and significantly boost segmentation performance.
For proposal-based methods, instance segmentation consists of two proce-
dures, first generating rough proposals and then predicting precise instance
masks. Yang et al. [35] propose an end-to-end trainable network which directly
generates 3D bounding boxes as proposals and infers point-wise instance masks
for points inside proposals. Instead of getting proposals via 3D bounding box
regression, Yi et al. [36] introduce an approach to obtain proposals by object
generation, and then predict instance masks within proposals.
Semi-/Weakly supervised 3D instance segmentation. Few efforts have
been made on semi-/weakly supervised point cloud instance segmentation. Tao et
al. [25] propose a method that generates pseudo labels for the whole training
scene, where one point per instance is clicked as the weak label, and the
generated pseudo point-level labels are used to train existing fully supervised
methods for point cloud instance segmentation. Nonetheless, the quality of the
pseudo labels is limited because discriminative instance features are not
learned. With bounding boxes as weak labels, Liao et al. [19] propose a
semi-supervised point cloud instance segmentation method, where a network is
leveraged to generate bounding box proposals and instance segmentation is
achieved by refining points within the bounding box proposals.
3 Method
The overall architecture of our method is depicted in Fig. 1. The backbone
network (Sec. 3.1) first takes the point cloud and superpoint graph as input
and predicts superpoint-wise semantic labels and offset vectors. Then, the inter-
superpoint affinity mining module (Sec. 3.2) propagates labels on the superpoint
graph via semantic-aware random walk. Finally, the volume-aware instance re-
finement module (Sec. 3.3) learns object volume information to improve instance
segmentation performance.
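To make the notion of object volume information concrete (described earlier as the number of voxels and the radius of the object), the sketch below computes both quantities for the points of one instance. This is only an illustration under assumptions not stated in this section: the voxel size is a hypothetical parameter, and the radius is taken here as the largest distance from the instance centroid, which may differ from the definition the method actually uses.

```python
import numpy as np

def object_volume_info(points, voxel_size=0.5):
    """Return (number of occupied voxels, radius) for one instance's points.

    Assumptions for illustration only: `voxel_size` is a hypothetical
    parameter, and the radius is the maximum distance to the centroid.
    """
    # Map each point to its integer voxel index and count distinct voxels.
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    num_voxels = len(np.unique(voxel_idx, axis=0))
    # Radius as the farthest point from the instance centroid.
    centroid = points.mean(axis=0)
    radius = float(np.linalg.norm(points - centroid, axis=1).max())
    return num_voxels, radius

# Two points one unit either side of the origin along the x-axis.
instance_points = np.array([[-1.0, 0.0, 0.0],
                            [1.0, 0.0, 0.0]])
num_voxels, radius = object_volume_info(instance_points)
```

Quantities of this kind, inferred from coarse instance predictions, are what the refinement module treats as targets when retraining the network.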