2.2.2 Probabilistic Registration.
Probabilistic registration methods model the distribution of a point cloud as a density function, often via GMMs, and perform alignment either through a correlation-based approach or within an EM-based optimization framework [49, 23]. One common formulation, exemplified by CPD [29] and FilterReg [12], represents the geometry of the target point cloud as a GMM distribution over 3D Euclidean space; the source point cloud is then fitted to this distribution under the maximum likelihood estimation (MLE) framework. Another line of work, including GMMReg [20], JRMPC [8], and DeepGMR [49], builds GMM probability distributions on both the source and the target point clouds. These methods are robust to outliers, noise, and density variation [49]. However, they usually assume that both point clouds share the same GMM parameters or that one of the point clouds is "perfect", leading to biased solutions whenever both point clouds contain noise and outliers, as in partial-to-partial registration [8]. Motivated by these observations, we propose an overlap-guided 3D point cloud registration algorithm based on Gaussian Mixture Models (GMMs) that can be applied to partial-to-partial registration problems.
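To make the MLE objective concrete, the following is a minimal sketch (our illustration, not code from CPD [29] or FilterReg [12]) that evaluates the log-likelihood of a transformed source cloud under an isotropic target GMM; an EM-style registration method alternates such evaluations with updates of the rigid transform. All names and the shared isotropic-variance assumption are ours.

```python
import math
import torch

def gmm_log_likelihood(src, mu, sigma2, weights):
    """Log-likelihood of (transformed) source points under an isotropic target GMM.

    src:     (N, 3) source points after applying a candidate rotation/translation
    mu:      (J, 3) component means, e.g. target points or cluster centroids
    sigma2:  shared isotropic variance of all components (scalar)
    weights: (J,) mixing weights summing to one
    """
    # Squared distance from every source point to every component mean.
    d2 = torch.cdist(src, mu) ** 2                      # (N, J)
    # Log-density of each point under each isotropic 3D Gaussian component.
    log_comp = (weights.log()[None, :]
                - 1.5 * math.log(2.0 * math.pi * sigma2)
                - d2 / (2.0 * sigma2))                  # (N, J)
    # MLE objective: sum over points of log p(x_i), logsumexp over components.
    return torch.logsumexp(log_comp, dim=1).sum()
```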
2.3. Transformers in 3D Point Clouds
Due to their inherent permutation invariance and strong global feature learning ability, 3D Transformers are well suited for point cloud processing and analysis, and they have surpassed the performance of state-of-the-art non-Transformer algorithms. A-SCN [42] is the first example of a Transformer-based approach for point cloud learning. Later, PCT [13], a pure global Transformer network, generates positional embeddings from the 3D coordinates of points and adopts an offset-attention module to enrich each point's features with information from its local neighborhood. PointTransformer [52] constructs self-attention networks for general 3D tasks using nearest-neighbor search.
However, the computation and memory overheads of the original Transformer grow quadratically with the size of the feature map. Efforts to reduce this quadratic complexity have mainly focused on self-attention. For instance, PatchFormer [50] reduces the size of the attention map by first splitting the raw point cloud into patches and then aggregating the local features in each patch to generate the attention matrix. FastPointTransformer proposes centroid-aware voxelization and devoxelization techniques to reduce space complexity. However, these works are less suitable for feature matching, which requires self- and cross-attention on features within and between point clouds, respectively. To this end, we propose clustered attention, which groups a set of points into $J$ clusters and computes attention only over these clusters, resulting in linear complexity for a fixed number of clusters.
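To illustrate the idea (a simplified sketch, not the exact formulation developed in Sec. 3): attention is computed against $J$ cluster summaries rather than all $M$ points, so the cost scales as $O(NJ)$. The hard assignment `assign` (e.g., from k-means or farthest point sampling) and all names are hypothetical.

```python
import torch

def clustered_attention(q_feats, kv_feats, assign):
    """Attention against J cluster summaries instead of all key/value points.

    q_feats:  (N, d) query features
    kv_feats: (M, d) key/value features
    assign:   (M,) long tensor of cluster ids in [0, J)
    Returns (N, d) attended features at O(N * J) cost instead of O(N * M).
    """
    J = int(assign.max()) + 1
    M, d = kv_feats.shape
    # Summarize each cluster by the mean feature of its members.
    counts = torch.zeros(J).index_add_(0, assign, torch.ones(M))
    sums = torch.zeros(J, d).index_add_(0, assign, kv_feats)
    centroids = sums / counts.clamp(min=1)[:, None]                  # (J, d)
    # Standard scaled dot-product attention, but over only J centroids.
    attn = torch.softmax(q_feats @ centroids.T / d ** 0.5, dim=-1)   # (N, J)
    return attn @ centroids                                          # (N, d)
```

Using the same features for queries and keys/values gives clustered self-attention; feeding the other point cloud's features as `kv_feats` gives the cross-attention variant.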
3. Our Approach
Rigid point cloud registration aims to recover the rigid transformation $\mathbf{T} \in SE(3)$, consisting of a rotation $\mathbf{R} \in SO(3)$ and a translation $\mathbf{t} \in \mathbb{R}^3$, that optimally aligns the source point cloud $\mathbf{P} = \{\mathbf{p}_i \in \mathbb{R}^3 \mid i = 1, 2, \dots, N\}$ to the target point cloud $\mathbf{Q} = \{\mathbf{q}_j \in \mathbb{R}^3 \mid j = 1, 2, \dots, M\}$, where $N$ and $M$ denote the numbers of points in $\mathbf{P}$ and $\mathbf{Q}$, respectively. Fig. 1 illustrates our framework, which consists of three modules: feature extraction, overlap region detection, and overlap-guided GMM registration. The shared-weight encoder first extracts point-wise features $\mathbf{F}^p$ and $\mathbf{F}^q$ from the point clouds $\mathbf{P}$ and $\mathbf{Q}$, respectively. The clustered self-attention module then updates $\mathbf{F}^p$ and $\mathbf{F}^q$ to capture global context. Next, the overlap region detection module projects the updated features of $\mathbf{P}$ and $\mathbf{Q}$ to overlap scores $\mathbf{o}^p$ and $\mathbf{o}^q$, respectively. $\mathbf{F}^p$, $\mathbf{F}^q$, $\mathbf{o}^p$, and $\mathbf{o}^q$ are then used to estimate the distributions (GMMs) of $\mathbf{P}$ and $\mathbf{Q}$. Finally, weighted SVD is adopted to estimate the rigid transformation $\mathbf{T}$ from the estimated distributions.
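For the last step, a generic weighted Kabsch solver illustrates what "weighted SVD" computes; this is a minimal sketch under our own naming, not the exact implementation, where `src`, `tgt`, and `w` stand for corresponding points and weights derived from the GMMs and overlap scores.

```python
import torch

def weighted_svd(src, tgt, w):
    """Closed-form weighted rigid alignment (weighted Kabsch) via SVD.

    src: (K, 3) source points; tgt: (K, 3) corresponding target points
    w:   (K,) non-negative weights
    Returns R (3, 3), t (3,) minimizing sum_k w_k ||R src_k + t - tgt_k||^2.
    """
    w = w / w.sum()
    mu_src = (w[:, None] * src).sum(0)      # weighted source centroid
    mu_tgt = (w[:, None] * tgt).sum(0)      # weighted target centroid
    # Weighted cross-covariance between the centered point sets.
    H = (src - mu_src).T @ (w[:, None] * (tgt - mu_tgt))
    U, _, Vt = torch.linalg.svd(H)
    # Guard against a reflection so that R stays in SO(3).
    if torch.linalg.det(Vt.T @ U.T) < 0:
        Vt = torch.cat([Vt[:2], -Vt[2:]], dim=0)
    R = Vt.T @ U.T
    t = mu_tgt - R @ mu_src
    return R, t
```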
3.1. Feature Extraction
The feature extraction network consists of a Dynamic Graph Convolutional Neural Network (DGCNN), positional encoding, and a clustered self-attention network. Given a point cloud pair $\mathbf{P}$ and $\mathbf{Q}$, DGCNN extracts their associated features $\mathbf{F}^p = \{\mathbf{f}^p_i \in \mathbb{R}^d \mid i = 1, 2, \dots, N\}$ and $\mathbf{F}^q = \{\mathbf{f}^q_j \in \mathbb{R}^d \mid j = 1, 2, \dots, M\}$, where $d = 512$.
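For background, one EdgeConv layer of the kind DGCNN stacks can be sketched as follows (a simplified PyTorch-style sketch with our own naming, not the authors' implementation; `mlp` stands for the shared point-wise MLP, e.g. `torch.nn.Sequential(torch.nn.Linear(2 * d_in, d_out), torch.nn.ReLU())`):

```python
import torch

def edge_conv(x, mlp, k=20):
    """One simplified EdgeConv layer (DGCNN building block).

    x:   (N, d_in) per-point features (xyz coordinates for the first layer)
    mlp: module mapping (N, k, 2 * d_in) -> (N, k, d_out), applied point-wise
    k:   number of nearest neighbors, found in the current feature space
    Returns (N, d_out) features after max-pooling over each neighborhood.
    """
    # k-nearest neighbors in feature space (the "dynamic" graph).
    idx = torch.cdist(x, x).topk(k, largest=False).indices    # (N, k)
    neighbors = x[idx]                                        # (N, k, d_in)
    center = x[:, None, :].expand_as(neighbors)               # (N, k, d_in)
    # Edge features: the center point concatenated with relative offsets.
    edges = torch.cat([center, neighbors - center], dim=-1)   # (N, k, 2*d_in)
    return mlp(edges).max(dim=1).values                       # (N, d_out)
```

The full network stacks several such layers, recomputing the k-NN graph at each one, and ends with $d = 512$ output channels.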
3.1.1 Attention Module
Transformer training and inference in previous works can be computationally expensive due to the quadratic complexity of self-attention over a long sequence of representations, especially for high-resolution correspondence prediction tasks. To mitigate this limitation, our novel cluster-based Transformer architecture operates after local feature extraction: $\mathbf{F}^p$ and $\mathbf{F}^q$ are passed through the attention module to extract context-dependent point-wise features. Intuitively, the self-attention module transforms the DGCNN features into more informative representations that facilitate matching.
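For contrast with the clustered design, vanilla self-attention materializes a full $N \times N$ attention map, which is the quadratic cost referred to above; a stripped-down sketch (single head, omitting the learned query/key/value projections):

```python
import torch

def full_self_attention(feats):
    """Vanilla scaled dot-product self-attention: O(N^2) time and memory.

    feats: (N, d) point-wise features, e.g. the DGCNN output.
    """
    d = feats.shape[-1]
    attn = torch.softmax(feats @ feats.T / d ** 0.5, dim=-1)  # (N, N) map
    return attn @ feats                                       # (N, d)
```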
Spherical positional encoding. Transformers are typically fed with only high-level features, which may not explicitly encode the geometric structure of the point cloud [39, 16]. This makes the learned features geometrically less discriminative, causing severe matching ambiguity and numerous outlier matches, especially in low-overlap cases [33]. A straightforward recipe is to explicitly inject positional encodings of the 3D point coordinates, which endow the per-point features with intrinsic geometric properties by adding unique positional information that enhances the distinction among point features in indistinctive regions [52]. However, the resulting coordinate-based attentions are naturally transformation-variant [33], while registration requires transformation invari-