Coarse-to-Fine Point Cloud Registration with SE(3)-Equivariant
Representations
Cheng-Wei Lin1, Tung-I Chen1, Hsin-Ying Lee1, Wen-Chin Chen1, and Winston H. Hsu1,2
Abstract: Point cloud registration is a crucial problem in
computer vision and robotics. Existing methods either rely
on matching local geometric features, which are sensitive to
the pose differences, or leverage global shapes, which leads
to inconsistency when facing distribution variances such as
partial overlapping. Combining the advantages of both types of
methods, we adopt a coarse-to-fine pipeline that concurrently
handles both issues. We first reduce the pose differences
between input point clouds by aligning global features; then
we match the local features to further refine the inaccurate
alignments resulting from distribution variances. As global
feature alignment requires the features to preserve the poses
of input point clouds and local feature matching expects the
features to be invariant to these poses, we propose an SE(3)-
equivariant feature extractor to simultaneously generate two
types of features. In this feature extractor, representations
that preserve the poses are first encoded by our novel SE(3)-
equivariant network and then converted into pose-invariant
ones by a pose-detaching module. Experiments demonstrate
that our proposed method increases the recall rate by 20%
compared to state-of-the-art methods when facing both pose
differences and distribution variances.
I. INTRODUCTION
Point cloud registration, the task of finding rigid transfor-
mations that align two input point clouds, is fundamental
for several applications in computer vision and robotics. De-
pending on the application domain, point cloud registration
methods need to fulfill the demand for various properties
[1]. For example, some applications such as SLAM [2], [3]
require the registration method to be real-time and accurate.
Other applications such as scene reconstruction [4] expect the
registration method to be robust to initial pose conditions.
The diversified requirements lead to a variety of regis-
tration approaches. Some prior approaches [5]–[7], namely
local approaches, focused on time-efficiency and accuracy.
However, due to the dependence on matching local geometric
features, these approaches are sensitive to the magnitude
of the rigid transformation, and thus fail to handle large
initial pose differences (Fig. 1-a). On the other hand, global
approaches [8], [9] leverage global shape information to
remain robust against initial pose differences. Nonetheless,
these global approaches usually produce alignment results
inferior to those of local approaches when facing distribution
variances, such as partial overlapping (Fig. 1-b). Distribution
variances affect the overall shape that global methods rely
on, while regional geometries remain unchanged.
To solve the registration problems with both large initial
pose differences and distribution variances, we adopt a
coarse-to-fine pipeline that takes advantage of both global
and local approaches (Fig. 2-a). A global register is applied
as our coarse-grained register to reduce pose differences
and roughly align input point clouds. Then, we utilize a
local register to refine the inaccurate alignments caused by
distribution variances.
Fig. 1 (panels: Input, (a), (b), (c)). The registration results of methods
from each category. Source point clouds are shown in red and target point
clouds in blue. (a) Local methods (e.g. IDAM) only produce accurate
alignments when initial errors are small enough. (b) Global methods
(e.g. DeepGMR) are invariant to initial errors but their performance is
unreliable. In contrast, (c) our coarse-to-fine method accurately aligns
point clouds regardless of initial errors.
1National Taiwan University, 2Mobile Drive Technology
On top of that, we employ a shared feature extractor to
generate representations for both global and local registers.
The feature alignment process in the global register requires
representations to preserve the pose of input point clouds.
On the contrary, the correspondence matching in the local
register expects representations to be unaffected by
differences in the poses.
arXiv:2210.02045v2 [cs.CV] 4 Mar 2023
In our feature extractor, an SE(3)-equivariant neural network
and a pose-detaching module are proposed to produce
pose-preserving features and convert them into pose-invariant
ones, respectively (Sec. III-A shows more details
about SE(3)-equivariance). Unlike existing SE(3)-equivariant
networks [10], [11], our SE(3)-equivariant neural network
avoids time-intensive approximations and kernels that constrain
the expressiveness of the representation. We preserve
the translations by maintaining the center of feature embedding
and preserve the rotations using Vector Neuron
[12], a rotation-equivariant framework. Furthermore, our
pose-detaching module normalizes the input point clouds
to remove the translations and utilizes the orthogonality of
rotation matrices to eliminate the rotations. Consequently,
our novel SE(3)-equivariant feature extractor concurrently
produces pose-preserving and pose-invariant representations,
allowing both coarse- and fine-grained registers to perform
registration effortlessly.
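The two ideas behind pose detaching can be illustrated with a minimal sketch. This is not the authors' module: it assumes the raw point coordinates stand in for pose-preserving features, and the helper names (`center`, `gram`, `pose_invariant`) are hypothetical. Centering removes any translation, and the product F F^T cancels any rotation because R R^T = I.

```python
import math

def center(points):
    """Subtract the centroid so that any translation of the input is removed."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in points]

def gram(features):
    """Return F F^T. For rotated rows, the rotation cancels since R R^T = I."""
    return [[sum(a * b for a, b in zip(u, v)) for v in features] for u in features]

def pose_invariant(points):
    """Hypothetical stand-in for a pose-detaching step: center, then take F F^T."""
    return gram(center(points))

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(points, R, t):
    """Apply the rigid motion p -> R p + t to every point."""
    return [[sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]
            for p in points]

cloud = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
moved = transform(cloud, rot_z(0.7), [5.0, -2.0, 1.0])

a, b = pose_invariant(cloud), pose_invariant(moved)
max_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
print(max_diff < 1e-9)  # the descriptor is unchanged by the rigid motion
```

The same cancellation holds for any pair of feature matrices that rotate together, which is why orthogonality alone is enough to strip the rotation.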
We evaluate our method on ModelNet40 [13], which is
composed of various object models. Following RPMNet
[14], we pre-process the dataset to simulate real-world
situations, including sensor noise, independent scans, and
partially overlapping point clouds. Furthermore, we evaluate
the performance over multiple initial angle ranges to exhibit
the influence of initial poses. Experimental results demon-
strate that our method outperforms state-of-the-art methods
and reaches a reliable performance under simulated real-
world scenarios. In addition, ablations (Sec. IV-F) support
that our feature extractor satisfies the feature requirements
of both global and local registers yet remains time efficient.
In summary, the contributions of this work are as follows:
• We apply a coarse-to-fine pipeline to resist impacts from
  both initial pose differences and distribution variances.
• We introduce a novel SE(3)-equivariant feature extractor,
  simultaneously obtaining representations for both
  global and local registers.
• Our method outperforms state-of-the-art methods on
  ModelNet40 under circumstances simulating the real
  world across different initial pose difference ranges.
II. RELATED WORK
A. Local Registration Methods
Local approaches are often used under circumstances
where transformations are known to be small in magnitude.
Iterative Closest Point (ICP) [5] iteratively matches the
closest points as correspondences and minimizes the distance
between these correspondences, which often causes the result
to converge at a local minimum. To resolve this problem,
a variety of strategies have been proposed to deal with
outliers [15], handle noise [16], or devise better distance
metrics [17], [18]. However, the limitations of matching
points in Euclidean space have led recent work to perform
matching in feature space. PPFNet [19], 3DSmoothNet [20],
SpinNet [21], and FCGF [22] follow this idea and solve
the Procrustes problem based on the correspondences paired
by their representations. Moreover, DCP [6], IDAM [7],
RPMNet [14], DGR [23], ImLoveNet [24], and DetarNet
[25] use the ground truth poses to supervise point matching
and feature learning. Predator [26] and REGTR [27] further
leverage the ground truth overlap regions. Another branch
of work such as D3Feat [28], DeepVCP [29], PRNet [30],
and GeoTransformer [31] leverages key points to improve
time efficiency. The remaining challenge is that perfect
correspondences rarely exist in real-world situations, and
therefore recent work [14], [32] utilizes soft matching [33] to
cope with these conditions. Even so, these local methods
still fail to handle large initial perturbations.
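Once correspondences are fixed, the least-squares rigid alignment has a closed form. The sketch below shows it for paired 2D points, which is a deliberate simplification chosen for brevity: the methods cited above work in 3D, where the same problem is typically solved with an SVD. The function name `procrustes_2d` and the sample data are assumptions made for illustration.

```python
import math

def procrustes_2d(src, tgt):
    """Least-squares rotation angle and translation mapping src onto tgt.

    src and tgt are lists of paired 2D points (src[i] corresponds to tgt[i]).
    """
    n = len(src)
    sc = [sum(p[k] for p in src) / n for k in range(2)]  # source centroid
    tc = [sum(p[k] for p in tgt) / n for k in range(2)]  # target centroid
    dot = cross = 0.0
    for s, t in zip(src, tgt):
        sx, sy = s[0] - sc[0], s[1] - sc[1]
        tx, ty = t[0] - tc[0], t[1] - tc[1]
        dot += sx * tx + sy * ty      # coefficient of cos(theta) in the objective
        cross += sx * ty - sy * tx    # coefficient of sin(theta) in the objective
    theta = math.atan2(cross, dot)
    c, s_ = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    trans = [tc[0] - (c * sc[0] - s_ * sc[1]), tc[1] - (s_ * sc[0] + c * sc[1])]
    return theta, trans

# Build a target by rotating the source by 0.5 rad and shifting it by (1, -2).
src = [[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]]
c0, s0 = math.cos(0.5), math.sin(0.5)
tgt = [[c0 * x - s0 * y + 1.0, s0 * x + c0 * y - 2.0] for x, y in src]

theta_est, trans_est = procrustes_2d(src, tgt)
print(abs(theta_est - 0.5) < 1e-9)
```

With exact correspondences the true motion is recovered; with wrong correspondences the same formula returns the best fit to the wrong pairs, which is why matching quality dominates the accuracy of local methods.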
B. Global Registration Methods
Unlike local approaches, global approaches are designed to
be invariant to the initial transformation error. Some methods
such as GO-ICP [34], GOGMA [35], and GOSMA [36]
search the SE(3) space using branch-and-bound techniques.
Other methods [37]–[39] match features with robust
optimization. However, these methods are unsuitable for
real-time applications due to their long computation times.
Fast Global Registration (FGR) [40] was proposed to address
this issue, achieving a similar speed to that of many local
methods. To further improve the accuracy of the registration
result, recent work handles the registration problem via
learned global representations. DeepGMR [8] represents the
global feature through GMM distributions and EquivReg
[9] takes rotation-equivariant implicit feature embeddings as
its global representations. Nevertheless, these learning-based
methods often struggle with distribution differences.
C. Group Equivariant Neural Network
Some research proposes group equivariant neural networks
as a means of handling group transformations. For instance,
Convolutional Neural Networks (CNNs) [41] are
translation-equivariant, which keeps their performance
consistent across copies of the same image under different 2D
translations. To counter the effect of rotation, recent studies [11],
[42], [43] construct the kernels from steerable functions.
However, these constrained kernels limit the flexibility of
the network. Other studies [10], [44] obtain the equivariance
property by lifting the input space to higher-dimensional
spaces where the group is contained. These approaches are
time-intensive and computationally expensive because they
integrate over the entire group. Vector Neuron [12] presents
a new SO(3)-equivariant framework. The major advantage
of this framework is the capability of incorporating
the SO(3)-equivariance property into existing networks, such
as PointNet [45] or DGCNN [46]. We will later see how
we design our SE(3)-equivariant feature extractor based on
this simple idea, and use the extracted representation to cope
with the registrations with notable initial transformations and
distribution variances.
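The rotation-equivariance that such a framework relies on can be checked numerically. In the sketch below, each feature channel is assumed to be a 3D vector, so a feature is a C x 3 matrix V, a rotation acts as V R, and a bias-free linear layer mixes channels as W V. The matrices are made-up values and the helper names are hypothetical; this is an illustration of the property, not the network used in this paper.

```python
import math

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# Three input channels, each holding one 3D vector (illustrative values).
V = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0], [3.0, 1.0, 0.0]]
# Channel-mixing weights: 3 input channels -> 2 output channels, no bias.
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
R = rot_x(0.9)

rotate_then_layer = matmul(W, matmul(V, R))   # f(g . u)
layer_then_rotate = matmul(matmul(W, V), R)   # g . f(u)

max_diff = max(abs(a - b)
               for ra, rb in zip(rotate_then_layer, layer_then_rotate)
               for a, b in zip(ra, rb))
print(max_diff < 1e-12)  # both orders agree, so the layer is rotation-equivariant
```

The weights act on the channel axis while the rotation acts on the coordinate axis, so the two operations commute. Adding a constant bias vector to each channel would break this, since the bias would not rotate with the input.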
III. COARSE-TO-FINE REGISTRATION
As illustrated in Fig. 2, our coarse-to-fine registration pipeline
begins with extracting global and local feature representa-
tions. These global representations are fed into the global
register to estimate a rough alignment between input point
clouds. Focusing on roughly aligned point clouds, the local
register refines the alignment results given the correspon-
dences formed by matching these local representations.
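How the two stages combine can be sketched as a composition of rigid transforms. The sketch assumes each register returns a pair (R, t) acting as p -> R p + t, and that the local register is run on the output of the global one; the numeric values and the names `apply` and `compose` are hypothetical and stand in for the actual register outputs.

```python
import math

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(transform, p):
    """Apply the rigid transform (R, t) to a 3D point: p -> R p + t."""
    R, t = transform
    return [sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]

def compose(fine, coarse):
    """Return the single transform equal to applying coarse first, then fine."""
    Rf, tf = fine
    Rc, tc = coarse
    R = [[sum(Rf[i][k] * Rc[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(Rf[i][k] * tc[k] for k in range(3)) + tf[i] for i in range(3)]
    return R, t

# Hypothetical stage outputs (values made up for illustration).
coarse = (rot_z(1.2), [0.5, -0.3, 0.1])     # large, rough correction
fine = (rot_z(0.05), [0.01, 0.02, -0.01])   # small refinement

total = compose(fine, coarse)
p = [1.0, 2.0, 3.0]
two_step = apply(fine, apply(coarse, p))
one_step = apply(total, p)
err = max(abs(a - b) for a, b in zip(two_step, one_step))
print(err < 1e-12)
```

The composed rotation is R_f R_c and the composed translation is R_f t_c + t_f, so the final estimate is a single rigid transform even though it is computed in two stages.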
A. Preliminaries
A function $f : U \to V$ is equivariant to a set of transformations
$G$ if, for any $g \in G$, $f$ and $g$ commute, i.e.,
$f(g \cdot u) = g \cdot f(u), \forall u \in U$. For instance, convolution layers
are translation-equivariant because the outcome of applying
a 2D translation to the input taken by convolution layers is
identical to that of applying the 2D translation to the feature