CU-Net: LiDAR Depth-only Completion with
Coupled U-Net
Yufei Wang, Yuchao Dai, Qi Liu, Peng Yang, Jiadai Sun, and Bo Li

This work was partly supported by the National Key Research and Development Program of China (2018AAA0102803) and the National Natural Science Foundation of China (61871325, 62001394, 61901387). Corresponding authors: Yuchao Dai and Bo Li. All authors are with the School of Electronics and Information, Northwestern Polytechnical University, Xi’an, 710129, China (e-mail: {wangyufei1951, daiyuchao, changersunjd}@gmail.com; {liuqi, yangpeng, libo}@nwpu.edu.cn).
Abstract—LiDAR depth-only completion is a challenging task
to estimate a dense depth map only from sparse measure-
ment points obtained by LiDAR. Even though depth-only
completion methods have been widely studied, there is still
a significant performance gap with the RGB-guided methods
that utilize extra color images. We find that existing depth-only
methods can obtain satisfactory results in the areas where the
measurement points are almost accurate and evenly distributed
(denoted as normal areas), while the performance is limited
in the areas where the foreground and background points are
overlapped due to occlusion (denoted as overlap areas) and the
areas where there are no available measurement points around
(denoted as blank areas), since the methods have no reliable input
information in these areas. Building upon these observations,
we propose an effective Coupled U-Net (CU-Net) architecture
for depth-only completion. Instead of directly using a large
network for regression, we employ the local U-Net to estimate
accurate values in the normal areas and provide the global U-Net
with reliable initial values in the overlap and blank areas. The
depth maps predicted by the two coupled U-Nets complement
each other and can be fused by learned confidence maps to
obtain the final completion results. In addition, we propose
a confidence-based outlier removal module, which identifies
the regions with outliers and removes outliers using simple
judgment conditions. The proposed method boosts the final
dense depth maps with fewer parameters and achieves state-
of-the-art results on the KITTI benchmark. Moreover, it
generalizes well across different depth densities and varying
lighting and weather conditions. The code is released at
https://github.com/YufeiWang777/CU-Net.
Index Terms—Computer vision for transportation, range sens-
ing, depth completion
I. INTRODUCTION
ACQUIRING scene geometry using active depth sensors,
e.g. LiDAR, has multiple applications in robotics [1] and
SLAM [2]. However, due to the limited number of scan beams
and angular resolution, existing LiDARs can only provide
sparse measurements. Therefore, depth completion methods
have been developed to predict dense depth maps from sparse
depth measurements. According to whether the color images
are utilized for guidance, existing depth completion methods
are divided into the RGB-guided methods and the depth-only
methods. Despite the RGB-guided methods outperforming the depth-only methods by a wide margin in the evaluation metrics, their performance heavily depends on the quality and style of the color images: unpredictable imaging conditions may adversely affect such methods, especially in rain, snow, and at night [3]. In this paper, we focus on the depth-only methods.
[Figure 1: (a) the areas to be filled are divided into three types (normal, overlap, blank) according to the distribution of measurement points (ordered points, overlap points, no points), shown with the sparse depth map and its division map; (b) performance of the methods in the different areas, comparing dense depth maps obtained by a single U-Net depth-only method, a double U-Net depth-only method, and an RGB-guided method.]
Figure 1: The areas to be filled are divided into normal
areas, overlap areas, and blank areas according to different
distributions of sparse measurement points. Existing depth-
only methods can obtain promising results in the normal areas,
while in the overlap and blank areas, due to the lack of reliable
input information, there is a significant performance gap with
the RGB-guided methods.
Existing depth-only methods usually employ off-the-shelf
network structures to directly regress the sparse depth maps
to the dense depth maps, such as sparsity invariant CNN [4],
encoder-decoder [5], [6], and hourglass [7], [8]. Although the
performance of the depth-only methods has been dramatically
improved, there is still a large gap with the RGB-guided
methods. We observe that there are three different distributions
of measurement points in the areas to be filled: 1) the measure-
ment points are almost accurate and the distribution is even,
2) the measurement points are overlapped by foreground and
background points due to occlusion, and points whose depth values change abruptly should be removed as outliers [9], [10], and 3) there are no available measurement points nearby.
As shown in Fig. 1, according to the distributions, we divide
the areas that need to be filled into normal areas, overlap
areas, and blank areas. Through experiments, we observe that
existing depth-only methods such as S2D (depth-only) [6] and IP-Basic [11] can obtain promising results in normal areas; the performance gap between them and state-of-the-art RGB-guided methods such as PENet [12] and S2D (RGB-guided) [6] primarily derives from the overlap and blank
areas. We consider that this is because the depth-only methods
have no reliable input information in these areas, while the
RGB-guided methods can take advantage of the rich semantic
information of the color images to get better results.
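
To make this division concrete, the following sketch (ours, not code from the paper) labels each pixel to be filled from the statistics of measurement points in a local window; the window size and the relative depth-range threshold are illustrative assumptions, not the paper's exact criteria.

```python
import numpy as np

def divide_areas(sparse_depth, win=7, rel_range_thresh=0.1):
    """Label each empty pixel as blank / overlap / normal from local LiDAR statistics.

    sparse_depth: (H, W) array with 0 where no measurement is available.
    win and rel_range_thresh are illustrative choices, not the paper's criteria.
    """
    H, W = sparse_depth.shape
    r = win // 2
    labels = np.full((H, W), "", dtype=object)
    for y in range(H):
        for x in range(W):
            if sparse_depth[y, x] > 0:
                continue  # only pixels to be filled are labeled
            patch = sparse_depth[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            pts = patch[patch > 0]
            if pts.size == 0:
                labels[y, x] = "blank"    # no measurements nearby
            elif (pts.max() - pts.min()) / pts.mean() > rel_range_thresh:
                labels[y, x] = "overlap"  # foreground/background depths mixed
            else:
                labels[y, x] = "normal"   # nearby points are consistent
    return labels
```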
Therefore, predicting initial input information for the blank and overlap areas, and removing the outlier points from the overlap areas, are key to improving the depth-only methods. To
address the issue, we propose the Coupled U-Net (CU-Net)
method, which is enhanced by an outlier removal module.
First, we adopt the first U-Net to predict an initial depth map
and a corresponding confidence map from the sparse LiDAR
measurements. The initial dense depth map has accurate depth
values and high confidence in the normal areas, and it can also
provide the second U-Net with reliable initial depth values in
the overlap and blank areas. Second, we propose a confidence-
based outlier removal method. Unlike removing the outliers
by judging whether each measurement point meets complex
conditions [10], the proposed method uses the confidence map
predicted by the first U-Net to identify the regions with outliers
and removes the outliers by a simple judgment condition.
Then, the corrected sparse depth map and the initial dense depth map are fed to the second U-Net to obtain a dense depth map with improved results in the overlap and blank areas, along with the corresponding confidence map. Since the depth map predicted by the first U-Net is accurate at the ordered depth points and in their local neighborhoods, while the depth map predicted by the second U-Net performs well in the remaining global areas, we refer to the two U-Nets as the local U-Net and the global U-Net, respectively. The dense depth maps
predicted by the local U-Net and the global U-Net are fused
by the confidence maps to obtain the final completion result.
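
The two-stage pipeline can be summarized in a short PyTorch-style sketch. The module interfaces, the outlier test, and the threshold tau below are our assumptions for illustration; the paper's exact judgment condition and fusion details may differ.

```python
import torch

def cu_net_forward(sparse_depth, local_unet, global_unet, tau=1.0):
    """Sketch of the coupled U-Net pipeline; module names and tau are placeholders.

    local_unet / global_unet: callables returning (depth, confidence) pairs
    for input tensors of shape (B, C, H, W).
    """
    # Stage 1: the local U-Net predicts an initial dense depth map and its confidence.
    d_local, c_local = local_unet(sparse_depth)

    # Confidence-based outlier removal: inside low-confidence regions, drop
    # measurement points that deviate strongly from the initial prediction.
    # (This concrete condition is an assumption for illustration.)
    valid = sparse_depth > 0
    suspect = valid & (c_local < c_local[valid].median())
    outlier = suspect & ((sparse_depth - d_local).abs() > tau)
    corrected = torch.where(outlier, torch.zeros_like(sparse_depth), sparse_depth)

    # Stage 2: the global U-Net refines the overlap and blank areas from the
    # corrected sparse depth and the initial dense prediction.
    d_global, c_global = global_unet(torch.cat([corrected, d_local], dim=1))

    # Fuse the two predictions with softmax-normalized learned confidences.
    w = torch.softmax(torch.stack([c_local, c_global]), dim=0)
    return w[0] * d_local + w[1] * d_global
```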
We conduct comprehensive experiments to verify the ef-
fectiveness and generalization of our method on the KITTI
dataset [13], [14] and DDAD dataset [15]. Our contributions
can be summarized as follows:
• We quantitatively analyze the cause of the performance
gap between depth-only methods and RGB-guided meth-
ods, and show that the primary reason for the limited per-
formance of depth-only methods is their lack of reliable
input information in the overlap and blank areas.
• To address the issue, we propose a two-stage network
with learned intermediate confidence maps, where the
first network provides initial depth values of the overlap
and blank areas for the second network. Furthermore, we
propose a confidence-based outlier removal method to
enhance the proposed method, which employs a learned
confidence map to identify the areas with outliers and
remove them.
• Experimental results on the popular KITTI benchmark
and the DDAD dataset show that our method achieves
state-of-the-art performance among all published papers
that employ only depth data during training and inference.
Meanwhile, it shows powerful generalization capabilities
under different depth densities, changing lighting, and
weather conditions.
II. RELATED WORK
This section introduces representative works of the RGB-
guided methods and the depth-only methods.
RGB-guided methods. The input of the RGB-guided methods
includes the sparse depth maps and their corresponding color
images. How to fuse the information of these two different modalities is an open problem. A straightforward approach, called “early fusion”, is to concatenate the depth maps and the color images to form a 4D tensor. Ma et al. [6] propose
a “late fusion” method that extracts features from the color
images and the sparse depth maps separately, and feeds the
fused features into an encoder-decoder network. Gansbeke et
al. [8] propose a method based on color image guidance and
uncertainty, which employs two branches and achieves better
results. Qiu et al. [16] consider that the color images and depth
maps are not strongly correlated, and propose a method consisting of a surface-normal-guided branch and an RGB-guided branch. Their method first predicts surface normals from the color images, and the results of the two branches are fused through confidence images to obtain the final dense
depth maps. Hu et al. [12] add the 3D information of the
sparse depth maps to the convolution operation. Considering
that the existing fusion methods between the color images and
depth maps are too simple, Tang et al. [17] propose a guided
convolutional network for feature fusion. To sharpen the object boundaries that end-to-end networks tend to blur in the predicted dense depth maps, a series of affinity-based spatial propagation methods [18], [19], [20], [21], [22] have been proposed.
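
As a minimal illustration of the two fusion schemes (with single convolutions standing in for the full networks used in the cited methods, and tensor sizes chosen arbitrarily):

```python
import torch
import torch.nn as nn

rgb = torch.rand(1, 3, 256, 1216)    # color image
depth = torch.rand(1, 1, 256, 1216)  # sparse depth map

# Early fusion: concatenate image and depth into one (B, C, H, W) tensor
# with C = 4, processed by a single encoder.
early_input = torch.cat([rgb, depth], dim=1)          # (1, 4, 256, 1216)
early_feat = nn.Conv2d(4, 32, 3, padding=1)(early_input)

# Late fusion: extract features from each modality separately, then merge,
# as in Ma et al. [6].
rgb_feat = nn.Conv2d(3, 16, 3, padding=1)(rgb)
depth_feat = nn.Conv2d(1, 16, 3, padding=1)(depth)
late_feat = torch.cat([rgb_feat, depth_feat], dim=1)  # fed to an encoder-decoder
```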
Depth-only methods. The depth-only methods adopt only the
given sparse depth maps to predict the corresponding dense
depth maps. Many classical methods, such as bilateral filtering, can only efficiently fill relatively dense depth maps. To fill
highly sparse depth maps, Ku et al. [11] propose a method that
employs traditional image processing techniques, such as mor-
phological transformations and image smoothing. The method
has a fast processing speed and its performance exceeds that of
contemporary learning methods. Zhao et al. [10] also propose
a non-learning method based on surface geometry, which is
enhanced by an outlier removal algorithm. In the deep learning
era, considering the sparsity of the input depth maps, Uhrig et al. [4] propose the sparsity invariant CNN, which normalizes each convolution by a validity mask so that only observed inputs contribute to the output.
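
A simplified sketch of the sparsity invariant convolution idea: the output is normalized by the number of valid inputs under the kernel, and the validity mask is propagated (bias and other details from [4] are omitted here).

```python
import torch
import torch.nn.functional as F

def sparse_conv(x, mask, weight, eps=1e-8):
    """Simplified sparsity invariant convolution in the spirit of [4].

    x:      (B, C_in, H, W) sparse input (zeros where unobserved)
    mask:   (B, 1, H, W) validity mask (1 = observed)
    weight: (C_out, C_in, k, k) convolution kernel
    """
    k = weight.shape[-1]
    # Convolve only the observed values ...
    num = F.conv2d(x * mask, weight, padding=k // 2)
    # ... and renormalize by how many inputs under the kernel were valid.
    den = F.conv2d(mask, torch.ones(1, 1, k, k), padding=k // 2)
    out = num / (den + eps)
    # A location stays valid if any input under the kernel was observed.
    new_mask = (den > 0).float()
    return out, new_mask
```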