LiDAR-guided object search and detection in Subterranean Environments Manthan Patel Gabriel Waibel Shehryar Khattak Marco Hutter

2025-05-03 0 0 6.55MB 6 页 10玖币

侵权投诉

LiDAR-guided object search and detection in Subterranean

Environments

Manthan Patel, Gabriel Waibel, Shehryar Khattak, Marco Hutter

Abstract— Detecting objects of interest, such as human sur-

vivors, safety equipment, and structure access points, is critical

to any search-and-rescue operation. Robots deployed for such

time-sensitive efforts rely on their onboard sensors to perform

their designated tasks. However, as disaster response operations

are predominantly conducted under perceptually degraded

conditions, commonly utilized sensors such as visual cameras

and LiDARs suffer in terms of performance degradation. In

response, this work presents a method that utilizes the comple-

mentary nature of vision and depth sensors to leverage multi-

modal information to aid object detection at longer distances.

In particular, depth and intensity values from sparse LiDAR

returns are used to generate proposals for objects present

in the environment. These proposals are then utilized by a

Pan-Tilt-Zoom (PTZ) camera system to perform a directed

search by adjusting its pose and zoom level for performing

object detection and classiﬁcation in difﬁcult environments. The

proposed work has been thoroughly veriﬁed using an ANYmal

quadruped robot in underground settings and on datasets

collected during the DARPA Subterranean Challenge ﬁnals.

I. INTRODUCTION

Rapid advancement of robotics systems over the past

decade have facilitated their application towards time-sensitive

and mission-critical operations, such as search-and-rescue [1,

2], disaster response [3,4] and infrastructure inspection [5,6],

across complex environments and under difﬁcult operational

conditions. In response, ﬁeld-ready robotic deployment in

challenging scenarios has recently become an area of interest

for robotics researchers and the wider stakeholder audience,

as showcased by the recently concluded DARPA Subterranean

(SubT) Challenge [7]. A key performance indicator for the

SubT challenge, in particular, and for search-and-rescue

missions, in general, is to detect and identify objects of

interest while exploring the target areas. Human survivors,

human identiﬁers (such as clothes, helmets, and backpacks),

safety equipment (such as ﬁre extinguishers and tools), and

environment access points (such as doors and ducts) constitute

examples of the vital objects that need to be detected in a

time-efﬁcient manner using on-board sensors of the robots

during critical tasks.

For object detection, visual cameras have been the sensor

of choice due to their cost-effective, lightweight, and power-

efﬁcient nature. However, in disaster response scenarios, poor

illumination and the presence of obscurants, such as dust,

smoke, and fog, severely degrade the camera performance and

effectively limits the range of object detection to a couple

of meters. Furthermore, rigidly mounted (static) cameras

This work was supported in part by the ETH RobotX student fellowship

The authors are with the Robotic Systems Lab, ETH Z¨

urich

Correspondence email: patelm@ethz.ch

Fig. 1: (A) Instance of autonomous exploration mission conducted at the Salt-Peter

Cave, Kentucky, USA using the ANYmal-C robot, with zoomed-in views showing

relevant onboard sensors. Far-away objects cannot be detected by static cameras alone

due to limited illumination (C). Utilizing the proposed method, object proposals are

generated from LiDAR (B) to direct the Pan-Tilt-Zoom camera to perform detection,

resulting in the backpack (D) being detected at 10 m as compared to 3 m when using

only a static camera.

provide only a limited observation of the environment and

depend on the robot pose to be such that the object is

within their Field-of-View (FoV) to be detected. Articulated

cameras, such as the Pan-Tilt-Zoom (PTZ) camera shown in

Figure 1, can independently orient themselves and change

their zoom level to obtain better observation of far-away

objects. Nevertheless, given the large number of combinations

of orientations and zoom levels required to fully observe

the surrounding environment, it is typically not feasible to

perform complete coverage without impacting the speed

of environment exploration. In contrast, LiDAR sensors

provide 360

◦

observation, depth measurements at long-range,

and remain unaffected by scene illumination, providing an

alternate sensor choice for object detection. However, the

sparse nature and low ﬁdelity of LiDAR data compared to

visual data make accurate object detection difﬁcult.

Motivated by the discussion above, this work presents a

arXiv:2210.14997v1 [cs.RO] 26 Oct 2022

method that utilizes the complementary nature of camera and

LiDAR data to facilitate object detection at long ranges. In

particular, depth and intensity values from sparse LiDAR

returns are used to detect and generate location proposals

for the objects present in the environment. These location

proposals are then used by a PTZ camera system to perform a

directed search by adjusting its orientation and zoom level to

perform object detection and classiﬁcation in difﬁcult environ-

ments at long ranges. The performance and applicability of

the proposed method is thoroughly evaluated on data collected

by an ANYmal-C quadruped robot during ﬁeld deployments

conducted in challenging underground settings, including the

SubT Challenge ﬁnals event consisting of an underground

urban environment, a cave network, and a tunnel system.

II. RELATED WORK

Visual cameras have been the preferred sensor choice for

object detection due to having rich scene information includ-

ing texture and context. Especially with the emergence of

Convolutional Neural Network (CNN) based object detection

approaches such as YOLO [8], SSD [9], faster R-CNN [10],

on-par human-level performance has been achieved. Moreover,

recent approaches such as Mask R-CNN [11], DetectoRS

[12] are able to perform instance segmentation in which each

pixel of the image is assigned a class label and an instance

label. However, due to the absence of depth information,

localizing the detected objects in 3D environment remains a

challenge. This has motivated the teams participating in the

DARPA SubT challenge to use LiDAR scans for localizing

the detected objects. Team CERBERUS [13] made use of

a YOLO architecture trained to include competition-speciﬁc

objects for detection. The 3D location of the object in world

coordinates is obtained by projecting the bounding box into

the robot occupancy map built using the LiDAR scans. Other

teams also made use of similar approaches utilizing both

camera and LiDAR data [14,15]. A common problem reported

by all teams was the reduced object detection range using

only visual cameras due to poor illumination in complex

underground environments.

LiDAR-based 3D object detection methods which make

use of CNNs and operate on point clouds (Point R-CNN

[16]) or voxel-based representation (Voxel R-CNN [17])

have also gained popularity. While these approaches are well

suited for detecting and localizing objects like vehicles and

pedestrians in a structured environment like that of a self-

driving vehicle, they are not well suited for detecting highly

speciﬁc objects in an unstructured environment as required

in our case. Thus, we propose to use LiDAR and a PTZ

camera in a coupled manner to improve the object detection

range. In particular, we propose to use LiDAR scans to

generate object proposals by performing clustering based on

LiDAR intensity and depth difference. These clusters are then

scanned by the PTZ camera and classiﬁed using a CNN-based

object detection model. Existing methods have performed

object segmentation and clustering using sparse LiDAR scans,

with a simple clustering approach based on the Euclidean

distance proposed in [18]. The approach operates directly

Point cloud

Accumulation

Waypoint

Generation

Aggregated Point cloud

Filtered

Images

Cluster

Centers

Ground Points

Removal

Object Segmentation

(Depth + Intensity)

Image Filter

Point clouds Odometry

Point cloud

Projection

Range

Intensity

Surface Normals

Object Proposal

Cluster Merge

Waypoints

To PTZ Camera Controller

Object Point Cluster

Segmentation

Object Cluster Filter

Volume-based

Surface-Normals Std Dev

Cluster Points Size

Fig. 2: An overview of the proposed method.

on the 3D point clouds and introduces a radially bounded

nearest neighbor algorithm for clustering which is able to

handle outliers as opposed to a ’k’ nearest neighbor clustering

[19]. This approach was further extended in [20] to work in

real-time on a continuous stream of data. Methods operating

directly on unordered point clouds are relatively slow due to

the expensive nearest neighbor search queries. Thus, for speed-

up, approaches choose to operate on range images generated

from point clouds instead. Performing computations on range

images have the advantages of exploitable neighborhood

relations and the reduction of redundant points to a single

representative pixel in the image. In [21], the authors propose

to use the depth angle for clustering on range images. In

another clustering approach, Scan-Line-Run (SLR) [22], the

authors propose to modify the two-run connected component

labeling technique for binary images [23] and apply it for

clustering the range images. In recent work [24], the authors

extend the depth-angle-based clustering approach of [21] to

make it robust to instance over-segmentation by introducing

additional sparse connections in the range image, termed map

connections.

III. PROPOSED METHOD

To aid the camera object detection and classiﬁcation

process, especially in challenging and visually-degraded

environments, this work proposes to utilize LiDAR data to

generate object proposals at longer distances. In addition to

utilizing depth data, this work uses auxiliary LiDAR data, such

as intensity return information, to distinguish and segment

objects from the environment. An overview of the proposed

approach is presented in Figure 2, with each component

detailed below:

A. Point cloud Accumulation

To facilitate object detection from sparse LiDAR scans

(Figure 3A), such as that obtained from low-cost LiDARs with

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

LiDAR-guidedobjectsearchanddetectioninSubterraneanEnvironmentsManthanPatel,GabrielWaibel,ShehryarKhattak,MarcoHutterAbstractDetectingobjectsofinterest,suchashumansur-vivors,safetyequipment,andstructureaccesspoints,iscriticaltoanysearch-and-rescueoperation.Robotsdeployedforsuchtime-sensitiveeffortsr...

展开>> 收起<<

LiDAR-guided object search and detection in Subterranean Environments Manthan Patel Gabriel Waibel Shehryar Khattak Marco Hutter.pdf

共6页,预览2页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

LiDAR-guided object search and detection in Subterranean Environments Manthan Patel Gabriel Waibel Shehryar Khattak Marco Hutter

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: