Overlap-guided Gaussian Mixture Models for Point Cloud Registration
Guofeng Mei
University of Technology Sydney
Sydney, Australia
guofeng.mei@student.uts.edu.au
Fabio Poiesi
Fondazione Bruno Kessler
Trento, Italy
poiesi@fbk.eu
Cristiano Saltori
University of Trento
Trento, Italy
cristiano.saltori@unitn.it
Jian Zhang
University of Technology Sydney
Sydney, Australia
jian.zhang@uts.edu.au
Elisa Ricci
University of Trento & FBK
Trento, Italy
e.ricci@unitn.it
Nicu Sebe
University of Trento
Trento, Italy
niculae.sebe@unitn.it
Abstract
Probabilistic 3D point cloud registration methods have
shown competitive performance in overcoming noise, out-
liers, and density variations. However, registering point
cloud pairs in the case of partial overlap is still a challenge.
This paper proposes a novel overlap-guided probabilistic
registration approach that computes the optimal transfor-
mation from matched Gaussian Mixture Model (GMM) pa-
rameters. We reformulate the registration problem as the
problem of aligning two Gaussian mixtures such that a sta-
tistical discrepancy measure between the two corresponding
mixtures is minimized. We introduce a Transformer-based
detection module to detect overlapping regions, and repre-
sent the input point clouds using GMMs by guiding their
alignment through overlap scores computed by this detection
module. Experiments show that our method achieves higher
registration accuracy and efficiency than state-of-the-art
methods when handling point clouds with partial overlap
and different densities on synthetic and real-world datasets.
https://github.com/gfmei/ogmm
1. Introduction
With the rise of inexpensive 3D sensors for both indoor
(e.g. Microsoft Kinect) and outdoor (e.g. LiDAR) scenes,
the point cloud has become an important data source, which
presents rich 3D spatial information efficiently. To produce
large-scale point clouds, 3D point cloud registration has been
widely investigated. Point cloud registration refers to
the problem of finding a rigid relative pose transformation
that aligns a pair of point clouds into the same coordinate
frame [18, 28]. The registration quality can directly impact
applications such as robotics [38, 55], augmented reality [3],
autonomous driving [30, 32], and radiotherapy [26, 27].
However, sensor noise, varying point densities, outliers,
occlusions, and partial views are challenges that still affect the
performance of these applications in the real world [21].
Point cloud registration approaches can be broadly
categorized into correspondence-free and correspondence-based
[28]. Correspondence-free registration approaches
aim at minimizing the difference between the global features
extracted from two input point clouds [28, 18, 1]. These
global features are typically computed based on all the points
of a point cloud, making correspondence-free approaches
inadequate to handle scenes with partial overlaps, such as those
captured in the real world [51, 6]. Correspondence-based
registration approaches rely on point-level correspondences
between two input point clouds [10, 39, 2]. Despite showing
promising results, these approaches suffer from two major
challenges: i) real-world point clouds do not contain exact
point-level correspondences due to sensor noise and density
variations [17, 49, 36]; ii) the size of the correspondence
search space increases quadratically with the number of
points of the two point clouds [49]. An alternative to finding
point-to-point correspondences is using distribution-to-distribution
matching through probabilistic models [17, 49].
These probabilistic registration techniques have shown greater
robustness to noise and density variations than their
point-to-point counterparts [36]. However, they typically require
their inputs to share the same distribution parameters (e.g.,
Gaussian Mixture Model parameters). Due to this, they can only handle
complete-to-complete [49] or partial-to-complete [36] point
cloud registration setups. Partial-to-partial setups, which are
typical in real-world applications, may have disjoint distribution
parameters. Therefore, state-of-the-art approaches
are highly likely to underperform in these setups.
arXiv:2210.09836v1 [cs.CV] 17 Oct 2022
In this paper, we propose an overlap-guided GMM-based
registration method, named OGMM, to mitigate the limitations
of partial-to-partial setups without using exact
point-level correspondences. We reformulate the problem of
registering point cloud pairs as the alignment of two Gaussian
mixtures by minimizing a statistical discrepancy measure
between the two corresponding mixtures. We introduce an
overlap score that measures how likely each point is to lie in the
overlapping area between the source and target point clouds,
estimated via a Transformer-based deep neural network. The input
point clouds are modeled through GMMs under the guidance
of this overlap score. The self-attention and cross-attention of
Transformer networks come with computational and memory
requirements that scale quadratically with the size of the point
clouds ($N^2$), hindering their applicability to large-scale point
cloud datasets. Therefore, we introduce the idea of clustered
attention, which is a fast approximation of self-attention.
Clustered attention groups a set of points into $J$ clusters and
computes the attention for these clusters only, making the
complexity linear in the number of clusters, i.e., $N \cdot J$,
where $J \ll N$. OGMM is inspired by DeepGMR [49],
but it differs from it in two ways. First, our probabilistic
paradigm can handle partial-to-partial point cloud registration
problems through the overlap score constraint. Second,
our network learns a consistent GMM representation across
feature and geometric space rather than fitting a GMM in a
single feature space. We evaluate our approach on ModelNet40
[41], 7Scene [35], and ICL-NUIM [14], comparing
it against traditional and deep learning-based
point cloud registration approaches. We use the typical evaluation
protocol for point cloud registration adopted in
the literature [10]. OGMM achieves state-of-the-art results
and largely outperforms DeepGMR on all the benchmarks.
In summary, the main contributions of this work are:
• We propose a learning-based probabilistic registration
  framework under the guidance of overlap scores;
• We propose a cluster-based overlap Transformer module
  to embed cross point cloud information that enables
  the detection of overlap regions;
• We introduce a cluster-based loss to ensure that our
  network learns a consistent GMM representation across
  feature and geometric space rather than fitting a GMM
  in a single feature space;
• We achieve state-of-the-art accuracy and efficiency on a
  comprehensive set of experiments, including synthetic
  and real-world datasets.
2. Related Work
We review both correspondence-free and correspondence-based
point cloud registration methods, as well as Transformer-related
works, since the Transformer is a major component of our approach.
2.1. Correspondence-free Registration
The core idea of the correspondence-free registration
approaches, such as PointNetLK [1] and FMR [18], is to first
extract the global features from the source and target point
clouds, and then regress the rigid motion parameters by
minimizing the difference between the global features of the two input
point clouds. Such methods do not require point-to-point
correspondences and are robust to density variations. These
methods use the extracted global features to reduce the point
cloud’s dimension so that the algorithm’s time complexity
does not increase as the number of points increases. However,
they depend on fairly high overlaps (more than
90%) between the two point clouds and suffer from performance
degradation in the case of partially-overlapping point clouds,
which are typical in real-world scenarios [51, 6].
2.2. Correspondence-based Registration
2.2.1 Point-level Correspondence-Based Registration.
The most popular point-to-point registration algorithm is
ICP [2], which alternates between rigid motion estimation
and correspondence search [39, 19] by solving an $L_2$
optimization. However, ICP can converge to spurious local minima
due to the non-convex nature of the problem. Based
on this algorithm, many variants have been proposed. For example,
the Levenberg-Marquardt ICP [9] uses a Levenberg-Marquardt
algorithm to yield the transformation by replacing
the singular value decomposition with gradient descent and
Gauss-Newton approaches, accelerating the convergence
while ensuring high accuracy. Go-ICP [46] solves the point
cloud alignment problem globally by using a Branch-and-Bound
(BnB) optimization framework without prior information
on correspondence or transformation. RANSAC-like
algorithms are widely used for robustly finding the correct
correspondences for registration [24]. FGR [53] optimizes a
Geman-McClure cost-induced correspondence-based objective
function with a graduated non-convexity strategy and achieves
high performance. TEASER [45] reformulates the registration
problem using a truncated least squares cost and provides
readily checkable conditions to verify the optimality of the solution.
These methods still face challenges when point clouds are
affected by noise, outliers, and density variations [51].
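To make the alternation at the core of ICP concrete, the following is a minimal NumPy sketch rather than the implementation of any cited method. It assumes brute-force nearest-neighbour search for the correspondence step and the closed-form SVD solution for the rigid motion step; the function names `best_rigid_transform` and `icp` are illustrative.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~= dst[i] (SVD solution)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, n_iters=30):
    """Alternate nearest-neighbour matching and rigid motion estimation."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(n_iters):
        # Correspondence step: closest target point for every current point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)   # (N, M)
        matched = dst[d2.argmin(axis=1)]
        # Motion step: best rigid transform for the current matches.
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        # Compose with the transform accumulated so far.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Each iteration cannot increase the mean squared nearest-neighbour distance, which is also why the procedure can stall in a local minimum when the initial pose is far from the true one. The (N, M) distance matrix illustrates the quadratic correspondence search cost.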
Recently, several deep features [18, 51] have been proposed to
accurately estimate the point-level correspondences. For
instance, DCP [39] employs DGCNN [40] for feature extraction
and an attention module to generate soft matching
pairs. RPMNet [47] addresses partial visibility of the point
clouds by integrating the Sinkhorn algorithm
into a network to obtain soft correspondences from local features.
Soft correspondences can increase the robustness of
the registration. RGM [10] utilizes a Transformer to
aggregate information by generating soft graph edges for
point-wise matching. IDAM [25] incorporates both geometric
and distance features into the iterative matching process.
RIENet [34] proposes a method to identify inliers based on
the graph-structure difference between the neighborhoods.
Although achieving remarkable performance, most of these
approaches rely on point-to-point correspondences, and thus
they are still sensitive to noise and density variations [17].
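As a rough illustration of how Sinkhorn normalization turns a matrix of matching scores into soft correspondences, here is a minimal NumPy sketch. It is a generic version under simplifying assumptions, not the RPMNet implementation: the score matrix is square, the function name `sinkhorn` is illustrative, and the sketch omits the handling of unmatched points that partial visibility requires.

```python
import numpy as np

def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))

def sinkhorn(scores, n_iters=50):
    """Alternately normalise rows and columns of exp(scores) in log space.

    scores: (N, N) matching scores, higher means more similar.
    Returns an approximately doubly stochastic (N, N) matrix.
    """
    log_p = np.asarray(scores, dtype=float)
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)   # rows sum to 1
        log_p = log_p - _logsumexp(log_p, axis=0)   # columns sum to 1
    return np.exp(log_p)
```

With a matrix `P = sinkhorn(scores)` and target points `Q` of shape (N, 3), `P @ Q` gives one soft target location per source point, which can replace a hard nearest-neighbour match.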
2.2.2 Probabilistic Registration.
Probabilistic registration methods model the distribution
of point clouds as a density function, often via
GMMs, and perform alignment either by employing a
correlation-based approach or by using an EM-based optimization
framework [49, 23]. A commonly used formulation,
adopted by CPD [29] and FilterReg [12], represents the geometry
of the target point cloud using a GMM distribution over
3D Euclidean space. The source point cloud is then fitted
to the GMM distribution under the maximum likelihood
estimation (MLE) framework. Another statistical approach,
including GMMReg [20], JRMPC [8], and DeepGMR [49],
builds GMM probability distributions on both the source and
the target point clouds. These methods show robustness to
outliers, noise, and density variation [49]. These models usually
assume that both source and target point clouds share
the same GMM parameters or that one of the point clouds
is supposed to be “perfect”, thus leading to biased solutions
whenever both point clouds contain noise and outliers (such
as in partial-to-partial registration) [8]. Considering the above
factors, we propose an overlap-guided 3D point cloud registration
algorithm based on Gaussian Mixture Models (GMMs)
that can be applied to partial-to-partial registration problems.
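The following NumPy sketch shows how GMM parameters can be obtained in closed form from soft point-to-component assignments, which is the standard weighted-moment computation. The multiplication by a per-point overlap score is only an assumption about how such a score could guide the fit; this section does not give the exact OGMM weighting. The function name `gmm_from_soft_assignments` and its arguments are illustrative.

```python
import numpy as np

def gmm_from_soft_assignments(points, gamma, overlap=None, eps=1e-8):
    """Closed-form GMM parameters from soft assignments.

    points:  (N, 3) point coordinates.
    gamma:   (N, J) soft assignments, each row sums to 1.
    overlap: optional (N,) scores in [0, 1]; ASSUMED here to simply
             rescale each point's contribution to every component.
    Returns mixing weights (J,), means (J, 3), covariances (J, 3, 3).
    """
    w = gamma if overlap is None else gamma * overlap[:, None]
    mass = w.sum(axis=0) + eps                    # total weight per component
    pi = mass / mass.sum()                        # mixing weights
    mu = (w.T @ points) / mass[:, None]           # weighted means
    diff = points[:, None, :] - mu[None, :, :]    # (N, J, 3)
    sigma = np.einsum('nj,nja,njb->jab', w, diff, diff) / mass[:, None, None]
    return pi, mu, sigma
```

Setting a point's overlap score to zero removes it from every component, so points judged to lie outside the shared region no longer bias the means or covariances.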
2.3. Transformers in 3D Point Clouds
Due to their inherent permutation invariance and strong
global feature learning ability, 3D Transformers are well
suited for point cloud processing and analysis. They have
surpassed the performance of state-of-the-art non-Transformer
algorithms. In particular, A-SCN [42] is the first
example of a Transformer-based approach for point cloud learning.
Later, PCT [13], which is a pure global Transformer network,
generates positional embeddings using the 3D coordinates
of points and adopts a Transformer with an offset attention
module to enrich the features of points from their local
neighborhood. PointTransformer [52] constructs self-attention
networks for general 3D tasks with nearest neighbor search.
However, these methods suffer from the fact that as the size of the
feature map increases, the computing and memory overheads
of the original Transformer increase quadratically. Efforts to
reduce the quadratic complexity of attention have mainly focused
on self-attention. For instance, PatchFormer [50] reduces
the size of the attention map by first splitting the raw point
cloud into patches, and then aggregating the local features in
each patch to generate an attention matrix. FastPointTransformer
proposes centroid-aware voxelization and devoxelization
techniques to reduce space complexity. However, these
works are less suitable for feature matching, where we are
required to perform self- and cross-attention on features within
and between point clouds, respectively. To this end, we
propose clustered attention, which groups a set of points into $J$
clusters and computes the attention only for these clusters,
resulting in linear complexity for a fixed number of clusters.
3. Our approach
Rigid point cloud registration aims to recover a rigid
transformation matrix $T \in SE(3)$, which consists of a rotation
$R \in SO(3)$ and a translation $t \in \mathbb{R}^3$, that optimally aligns
the source point cloud $P = \{p_i \in \mathbb{R}^3 \mid i = 1, 2, \dots, N\}$ to
the target point cloud $Q = \{q_j \in \mathbb{R}^3 \mid j = 1, 2, \dots, M\}$,
where $N$ and $M$ represent the number of points in $P$ and $Q$,
respectively. Fig. 1 illustrates our framework, which consists
of three modules: feature extraction, overlap region detection,
and overlap-guided GMM for registration. The shared-weight
encoder first extracts point-wise features $F_p$ and
$F_q$ from point clouds $P$ and $Q$, respectively. The clustered
self-attention module then updates the point-wise features
$F_p$ and $F_q$ to capture global context. Next, the overlap
region detection module projects the updated features $F_p$ and
$F_q$ to overlap scores $o_p$ and $o_q$, respectively. $F_p$, $F_q$, $o_p$, and
$o_q$ are then used to estimate the distributions (GMMs) of
$P$ and $Q$. Finally, weighted SVD is adopted to estimate the
rigid transformation $T$ based on the estimated distributions.
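The final step of the pipeline can be illustrated with the standard weighted SVD solution for a rigid transformation. The sketch below assumes that matched 3D locations and non-negative weights are already available (for example, corresponding GMM component means and their mixing weights); how OGMM builds these matches is not specified in this section, and the function name `weighted_svd` is illustrative.

```python
import numpy as np

def weighted_svd(src, dst, weights):
    """Rigid T in SE(3) minimising sum_j w_j * ||R @ src[j] + t - dst[j]||^2.

    src, dst: (J, 3) matched 3D locations (assumed given).
    weights:  (J,) non-negative weights, not all zero.
    Returns a 4x4 homogeneous transformation matrix.
    """
    w = weights / weights.sum()
    src_c = w @ src                                # weighted centroids
    dst_c = w @ dst
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))   # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T
```

Because every match enters the objective through its weight, matches with small weights have little influence on the estimated transformation.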
3.1. Feature Extraction
The feature extraction network consists of a Dynamic
Graph Convolutional Neural Network (DGCNN), positional
encoding, and a clustered self-attention network. Given a
point cloud pair $P$ and $Q$, DGCNN extracts their associated
features $F_p = \{f_{p_i} \in \mathbb{R}^d \mid i = 1, 2, \dots, N\}$ and
$F_q = \{f_{q_j} \in \mathbb{R}^d \mid j = 1, 2, \dots, M\}$. Here, $d = 512$.
3.1.1 Attention module
Transformer training and inference in previous works can
be computationally expensive due to the quadratic complexity
of self-attention over a long sequence of representations,
especially for high-resolution correspondence prediction
tasks. To mitigate this limitation, our novel cluster-based
Transformer architecture operates after local feature extraction:
$F_p$ and $F_q$ are passed through the attention module
to extract context-dependent point-wise features. Intuitively,
the self-attention module transforms the DGCNN features
into more informative representations to facilitate matching.
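To illustrate why attending to clusters is cheaper than full self-attention, here is a minimal NumPy sketch of one possible reading of clustered attention. It assumes a few k-means iterations in feature space to form $J$ clusters and lets every point attend to the $J$ cluster centroids instead of all $N$ points; learned query, key, and value projections are omitted. The clustering rule and the function name `clustered_attention` are assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def clustered_attention(feats, num_clusters, n_kmeans_iters=5, seed=0):
    """feats: (N, d). Returns updated (N, d) features and the (N, J) attention map."""
    rng = np.random.default_rng(seed)
    n, d = feats.shape
    # Initialise centroids from randomly chosen points (indexing makes a copy).
    centroids = feats[rng.choice(n, size=num_clusters, replace=False)]
    for _ in range(n_kmeans_iters):
        d2 = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)   # (N, J)
        labels = d2.argmin(axis=1)
        for j in range(num_clusters):
            members = feats[labels == j]
            if len(members) > 0:          # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Queries are the N points; keys and values are the J centroids,
    # so the attention map is (N, J) rather than (N, N).
    attn = softmax(feats @ centroids.T / np.sqrt(d), axis=1)
    return attn @ centroids, attn
```

For a fixed number of clusters, both the attention map and the matrix products grow linearly with the number of points, matching the $N \cdot J$ cost stated in the introduction.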
Spherical positional encoding.
Transformers are typically
fed with only high-level features, which may not explicitly
encode the geometric structure of the point cloud [39, 16].
This makes the learned features geometrically less discriminative,
causing severe matching ambiguity and numerous
outlier matches, especially in low-overlap cases [33]. A
straightforward recipe is to explicitly inject positional encoding
of 3D point coordinates, which assigns intrinsic geometric
properties to the per-point feature by adding unique positional
information that enhances distinctions among point
features in indistinctive regions [52]. However, the resulting
coordinate-based attentions are naturally transformation-variant
[33], while registration requires transformation invari-