ROBUST MONOCULAR LOCALIZATION OF DRONES BY ADAPTING DOMAIN MAPS TO
DEPTH PREDICTION INACCURACIES
Priyesh Shukla, Sureshkumar S., Alex C. Stutts, Sathya Ravi, Theja Tulabandhula, and Amit R. Trivedi
University of Illinois at Chicago, USA
ABSTRACT
We present a novel monocular localization framework by
jointly training deep learning-based depth prediction and
Bayesian filtering-based pose reasoning. The proposed cross-
modal framework significantly outperforms deep learning-
only predictions with respect to model scalability and tol-
erance to environmental variations. Specifically, we show
little-to-no degradation of pose accuracy even with extremely
poor depth estimates from a lightweight depth predictor. Our framework also maintains high pose accuracy under extreme lighting variations compared to standard deep learning, even without explicit domain adaptation. By openly representing the map and intermediate feature maps (such as depth estimates), our framework also allows for faster map updates and the reuse of intermediate predictions for other tasks, such as obstacle avoidance, resulting in much higher resource efficiency.
Index Terms: Depth neural network, drone localization.
1 Introduction
For self-navigation, the most fundamental computation required of a vehicle is to determine its position and orientation, i.e., its pose, during motion. Higher-level path planning objectives such as motion tracking and obstacle avoidance operate by continuously estimating the vehicle's pose. Recently, deep neural networks (DNNs) have shown a remarkable ability for vision-based pose estimation in highly complex and cluttered environments [1-3]. For visual pose estimation, DNNs can learn the correlation between the vehicle's position/orientation and the visual field of a mounted camera. Thereby, the vehicle's pose can be predicted using a monocular camera alone. In contrast, traditional methods required bulky and power-hungry range sensors or stereo vision sensors to resolve the ambiguity between an object's distance and its scale [4, 5].
However, a standard pose-DNN also implicitly learns flying-domain features such as the map, placement of objects, coordinate frame, and domain structure, which limits the robustness and adaptability of its pose estimates. Traditional filtering-based approaches [6] account for the flying-space structure using explicit representations such as voxel grids, occupancy grids, or Gaussian mixture models (GMMs) [7]; thereby, updates to the flying space, such as map extensions, new objects, and new locations, can be more easily accommodated. Comparatively, DNN-based estimators cannot handle selective map updates, and the entire model must be retrained even under small randomized or structured perturbations. Additionally, filtering loops in traditional methods can adjudicate predictive uncertainties against measurements to systematically prune the hypothesis space and can express prediction confidence along with the prediction itself [8], whereas feed-forward pose estimates from a deterministic DNN are vulnerable to measurement and modeling uncertainties.
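To make the contrast concrete, the following is a minimal sketch of one such filtering step in Python (an illustration only, not the filter evaluated in this paper); the motion_model and depth_likelihood callables are hypothetical placeholders for a drone motion model and a depth-based measurement model.

    import numpy as np

    rng = np.random.default_rng(0)

    def filter_step(particles, weights, control, depth_obs,
                    motion_model, depth_likelihood):
        """One predict-update-resample step over pose hypotheses."""
        # Predict: propagate each pose hypothesis through the motion model.
        particles = motion_model(particles, control)
        # Update: re-weight each hypothesis by how well the depth rendered
        # at that pose explains the observed depth map.
        weights = weights * depth_likelihood(depth_obs, particles)
        weights = weights / weights.sum()
        # Prune: resampling discards low-probability hypotheses.
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
        # Report confidence (spread) alongside the point estimate.
        pose_mean = particles.mean(axis=0)
        pose_cov = np.cov(particles, rowvar=False)
        return particles, weights, pose_mean, pose_cov

Because the filter carries a population of weighted hypotheses rather than a single point estimate, the covariance of the resampled particles directly expresses prediction confidence.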
In this paper, we integrate traditional filtering techniques with deep learning to overcome such limitations of DNN-based pose estimation while retaining deep learning's ability to operate efficiently with monocular cameras alone. Specifically, we present a novel framework for visual localization by integrating DNN-based depth prediction and Bayesian filtering-based pose localization. As shown in Figure 1, to avoid range sensors for localization, we utilize a lightweight DNN-based depth prediction network at the front end and sequential Bayesian estimation at the back end. Our key observation is that, unlike pose estimation, which innately depends on map characteristics such as spatial structure, objects, coordinate frame, etc., depth prediction is map-independent [9, 10]. Thus, applying deep learning only to domain-independent tasks and utilizing traditional models where the domain is openly (or explicitly) represented improves predictive robustness. Limiting deep learning to only domain-independent tasks also allows our framework to utilize vast training sets from unrelated domains. Open representation of the map and depth estimates enables faster domain-specific updates and the reuse of intermediate feature maps for other autonomy objectives, such as obstacle avoidance, thus improving computational efficiency.
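As a simple illustration of such an open map representation (a hypothetical sketch using a plain occupancy grid; the dimensions and resolution are arbitrary), a new object can be accommodated by a local voxel update, with no retraining of any learned component:

    import numpy as np

    # A 3D occupancy grid over a 20 m x 20 m x 5 m volume at 0.1 m resolution.
    occupancy = np.zeros((200, 200, 50), dtype=np.uint8)

    def add_object(grid, min_xyz, max_xyz, resolution=0.1):
        # Mark the voxels spanned by a new object's bounding box as occupied.
        lo = (np.asarray(min_xyz) / resolution).astype(int)
        hi = (np.asarray(max_xyz) / resolution).astype(int)
        grid[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = 1
        return grid

    # A newly placed obstacle updates the map in place; the pose filter uses
    # the revised map on its next step, with no retraining of the depth DNN.
    occupancy = add_object(occupancy, (4.0, 6.0, 0.0), (5.0, 7.0, 2.5))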
2 Monocular Localization with Depth Neural Network and Pose Filters
As shown in Figure 1, our framework integrates deep learning-based depth prediction and Bayesian filters for visual pose localization in 3D space. At the front end, a depth DNN scans monocular camera images to predict the relative depth of image pixels from the camera's focal point. A particle filter at the back end then localizes the vehicle by matching these depth estimates against the explicitly represented domain map.
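As an illustrative stand-in for this front end (an assumption for exposition, not the lightweight depth predictor used in this paper), the following sketch runs a small, publicly available relative-depth model through torch.hub on a hypothetical image frame.png:

    import cv2
    import torch

    # Load a small, publicly available relative-depth model (MiDaS small)
    # as a stand-in for the paper's lightweight depth predictor.
    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
    midas.eval()
    transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

    # Read a monocular camera frame (hypothetical file name) as RGB.
    img = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        # Output: a relative depth map at the model's output resolution.
        relative_depth = midas(transform(img)).squeeze()
    # The back-end particle filter then scores each pose hypothesis by
    # comparing this predicted depth against depth rendered from the map.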