DOTIE - Detecting Objects through Temporal Isolation of Events using
a Spiking Architecture
Manish Nagaraj, Chamika Mihiranga Liyanagedera and Kaushik Roy
Abstract— Vision-based autonomous navigation systems rely
on fast and accurate object detection algorithms to avoid
obstacles. Algorithms and sensors designed for such systems
need to be computationally efficient, due to the limited energy
of the hardware used for deployment. Biologically inspired
event cameras are a good candidate as a vision sensor for such
systems due to their speed, energy efficiency, and robustness
to varying lighting conditions. However, traditional computer
vision algorithms fail to work on event-based outputs, as these
outputs lack photometric features such as light intensity and texture.
In this work, we propose a novel technique that utilizes
the temporal information inherently present in the events
to efficiently detect moving objects. Our technique consists
of a lightweight spiking neural architecture that is able to
separate events based on the speed of the corresponding objects.
These separated events are then further grouped spatially to
determine object boundaries. This method of object detection
is both asynchronous and robust to camera noise. In addition,
it shows good performance in scenarios with events generated
by static objects in the background, where existing event-based
algorithms fail. We show that by utilizing our architecture,
autonomous navigation systems can have minimal latency and
energy overheads for performing object detection.
I. INTRODUCTION
Autonomous navigation is emerging as an important area
of research, with applications ranging from videography and
surveillance in remote regions to transportation systems in
populated urban areas. The Society of Automotive Engineers
(SAE) has identified six levels of automation for autonomous
navigation [1], ranging from systems with no automation
(level 0) to systems that are fully automated and do not
require any operator (level 5). One of the most basic and
essential requirements of a system with automation (level 1
and above) is the ability to detect and avoid obstacles. Most
importantly, these systems must be able to perform object
detection accurately at very high speed.
Recent advancements in imaging technology have led to
the development of biologically inspired event cameras [2]–
[5]. While traditional frame cameras capture photometric
features such as light intensity and texture with high spatial
resolution at fixed intervals, event cameras are asynchronous
and only capture the change in light intensities at each
pixel. Although event cameras fail at capturing photometric
features, they do not suffer from issues such as motion blur
and can operate at a much higher frequency and under a
wider range of illumination. Owing to their higher output
frequency, event cameras can capture high-resolution temporal
information that is missed by frame cameras. Since object
detection and subsequent applications such as collision
avoidance need to be performed at very high speed and over a
wide range of illumination, event cameras become ideal
candidates for this task.

*This work was supported in part by the Center for Brain-inspired
Computing (C-BRIC), a DARPA-sponsored JUMP center, the Semiconductor
Research Corporation (SRC), the National Science Foundation, the DoD
Vannevar Bush Fellowship, and IARPA MicroE4AI. All authors are with
Purdue University, West Lafayette, IN 47907, USA.
{mnagara, cliyanag, kaushik}@purdue.edu
While there has been a plethora of research on object
and feature detection algorithms, most of these algorithms
are designed for frame camera outputs. These algorithms
fail on event camera outputs as they rely on photometric
features that are only present in frame data. For example,
Fig. 1 shows the performance of a basic edge detection
algorithm, the Canny filter [6], and a complex learning-based
object recognition algorithm, YOLOv3 [7], on a frame camera
output and an event camera output of the same scene. While
both algorithms function successfully on the former, they fail
on the latter.
Fig. 1: Example showing the inability of traditional computer
vision algorithms to operate on event camera outputs. (a)
and (d) show the frame and event camera outputs captured
of the same scene (taken from the MVSEC dataset [8]). (b)
and (e) show the outputs of applying a Canny filter to the
respective images. (c) and (f) show the outputs of YOLOv3 on
the respective images. Both Canny filtering and YOLOv3 fail
on the event camera output.
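This failure is straightforward to reproduce. The sketch below applies OpenCV's Canny detector to a grayscale frame and to events accumulated into a binary image (a common visualization); the file name and the event arrays are placeholders, not data from this work.

```python
# A minimal sketch reproducing a Fig. 1 style comparison: Canny edge
# detection on a frame image vs. on an event stream rendered as an
# image. "frame.png" and the event coordinate arrays are placeholders.
import cv2
import numpy as np

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Accumulate events into a binary image: set each pixel that fired
# within a short time window (xs, ys would come from the camera).
events_img = np.zeros_like(frame)
# events_img[ys, xs] = 255

frame_edges = cv2.Canny(frame, 100, 200)       # clean object contours
event_edges = cv2.Canny(events_img, 100, 200)  # fragmented, noisy edges
```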
This motivates us to explore and utilize temporal features
that are inherently present in the events but ignored by
traditional computer vision algorithms. Leveraging such
temporal information can help us detect and differentiate
objects in a scene. For this purpose, we identify two
properties important for object detection:
(i) Events generated by the same object are temporally
close to each other.
(ii) Events generated by the same object are spatially close
to each other.
Based on these two properties, we aim to group events
along two dimensions - spatial and temporal, to isolate events
based on their source objects and thus identify the object
boundaries.
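For concreteness, the sketch below spells out the event representation this discussion assumes (each event carries pixel coordinates, a timestamp, and a polarity) together with a toy test of the two properties; the helper likely_same_object and its thresholds are hypothetical, purely for illustration.

```python
# A sketch of the event representation assumed here: each event is
# (x, y, t, p) - pixel coordinates, timestamp, and polarity.
from typing import NamedTuple

class Event(NamedTuple):
    x: int    # pixel column
    y: int    # pixel row
    t: float  # timestamp (e.g., microseconds)
    p: int    # polarity: +1 (brightness increase) or -1 (decrease)

# Properties (i) and (ii): events from the same object are close in
# time and in space. Thresholds below are hypothetical illustrations.
def likely_same_object(a: Event, b: Event,
                       dt_max: float = 1e3, dxy_max: int = 5) -> bool:
    temporally_close = abs(a.t - b.t) < dt_max            # property (i)
    spatially_close = (abs(a.x - b.x) <= dxy_max and
                       abs(a.y - b.y) <= dxy_max)         # property (ii)
    return temporally_close and spatially_close
```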
Biologically inspired spiking neurons, specifically Leaky
Integrate and Fire (LIF) neurons [9], can leverage spatio-temporal
information and are hence well suited for achieving boundary
detection along both these dimensions. These neurons mimic
neuronal activity in the brain and respond to the temporal
properties of their inputs: a neuron generates an output spike
only if its input events arrive at a rate higher than a certain
frequency. A neural architecture built from such spiking neurons
is therefore sensitive to the temporal structure of input events.
It is also possible to arrange the connectivity between neurons
in the network so that spatially close events reinforce each
other. Further, spiking neurons operate asynchronously, making
them compatible with the asynchronous outputs of event cameras.
These properties, along with the fact that spiking neural
architectures can be more energy efficient than their artificial
neural counterparts (as shown in [10]–[12]), make them an ideal
candidate for performing object detection with low latency and
energy overhead in autonomous navigation systems.
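As a concrete illustration of this rate-dependent behavior, here is a minimal discrete-time LIF sketch; the leak and threshold values are arbitrary demonstration choices, not the parameters used in this work.

```python
# A minimal discrete-time sketch of a Leaky Integrate and Fire (LIF)
# neuron, showing why it spikes only for sufficiently frequent inputs.
# The leak and threshold values are illustrative, not from the paper.
class LIFNeuron:
    def __init__(self, leak: float = 0.9, threshold: float = 1.0):
        self.leak = leak            # fraction of potential kept per step
        self.threshold = threshold  # firing threshold
        self.v = 0.0                # membrane potential

    def step(self, input_current: float) -> bool:
        self.v = self.leak * self.v + input_current  # leak, then integrate
        if self.v >= self.threshold:
            self.v = 0.0            # reset after firing
            return True             # output spike
        return False

neuron = LIFNeuron()
# Dense input (an event every step): potential builds up and spikes.
fast = [neuron.step(0.4) for _ in range(10)]
neuron.v = 0.0
# Sparse input (an event every 5th step): potential leaks away first.
slow = [neuron.step(0.4 if i % 5 == 0 else 0.0) for i in range(10)]
print(any(fast), any(slow))  # True False
```

Densely spaced events accumulate faster than the potential leaks and cross the threshold, while sparse events decay away; this is the property the architecture exploits to separate events by object speed.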
In this work, we develop a novel, energy-efficient object
detection technique that first isolates objects based on the
speed of their movement. This is done using inputs from an
event camera and a lightweight single-layer spiking neural
network. Once objects are separated based on their speed of
movement, their corresponding events are then grouped together
based on their spatial characteristics. For this, we utilize
existing clustering techniques to further separate the events
belonging to different objects based on their spatial proximity.
Here, we take advantage of the fact that events from the same
object share similar temporal characteristics, such as their
speed of movement, and spatial characteristics, such as being
generated by pixels that are spatially close together. Our
experiments show that this approach is more efficient at
detecting objects in terms of both latency and energy
consumption. By isolating events based on the speed of movement
of their corresponding objects, we are also able to eliminate
the irrelevant information caused by noise and static background
objects¹. Further, this benefits the clustering techniques, as
they have a lower operational complexity due to the reduced
number of samples (events) that need to be clustered.

¹If there is a need to detect static objects, this can be done by
clustering the residual events that do not propagate through the
spiking architecture.
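To make the two-stage flow concrete, the sketch below pairs a per-pixel spiking stage with DBSCAN from scikit-learn as the spatial clustering stage. The per-pixel wiring, the time-scaled leak, all parameter values, and the choice of DBSCAN (one clustering method that requires no preset number or size of clusters) are our assumptions for illustration, not the exact implementation.

```python
# A hedged sketch of the two-stage pipeline: a spiking layer keeps only
# events arriving fast enough at a pixel (i.e., fast-moving objects),
# then DBSCAN groups the surviving events spatially into objects.
# Resolution, parameters, and the per-pixel wiring are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

H, W = 260, 346                       # sensor resolution (illustrative)
LEAK, THRESHOLD = 0.9, 1.0

potentials = np.zeros((H, W))         # one LIF neuron per pixel
last_t = np.zeros((H, W))             # last event time per pixel

def passes_spiking_layer(x, y, t, weight=0.4, tau=1e3):
    """Return True if the event at (x, y, t) makes its neuron spike."""
    dt = t - last_t[y, x]
    potentials[y, x] *= LEAK ** (dt / tau)  # decay since the last event
    potentials[y, x] += weight              # integrate this event
    last_t[y, x] = t
    if potentials[y, x] >= THRESHOLD:       # fast motion at this pixel
        potentials[y, x] = 0.0              # reset after firing
        return True
    return False

def detect_objects(events):
    """events: iterable of (x, y, t) tuples; returns cluster labels."""
    kept = [(x, y) for (x, y, t) in events if passes_spiking_layer(x, y, t)]
    if not kept:
        return []
    # eps/min_samples chosen arbitrarily; -1 labels mark residual noise.
    return DBSCAN(eps=8, min_samples=10).fit_predict(np.array(kept))
```

Bounding boxes around each labeled cluster then give the object boundaries, with the -1 label collecting residual noise events.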
We summarize the main contributions as follows:
1) We develop an object detection algorithm that solely
relies on the event camera outputs and does not need any
additional information from traditional frame cameras.
2) Unlike many existing works on object detection, which
accumulate events for a time duration to create a frame,
we perform object detection asynchronously as the events
are being generated by the event camera.
3) The proposed spike-based network for separating objects
based on their motion consists of a single layer, resulting
in a detection algorithm with lower latency and energy
overhead.
4) The outputs of the proposed spiking architecture (which
isolates objects based on their speed) can be used with
any spatial clustering technique that does not require
prior knowledge of the number or the size of clusters.
5) The spiking architecture is scene-independent: we do not
have to train the parameters of the architecture based
on the scene of deployment. These parameters directly
correspond to the speed of the objects and can be
fine-tuned prior to deployment.
II. RELATED WORK
A. Object Detection in the Event Camera Domain
Object detection has been extensively studied in the
computer vision community, with works ranging from simple
feature detectors [6], [13] to more complex learning-based
methods [14]. There has also been significant interest in
the learning community in neural networks that can not only
detect objects but also classify them into different classes
[7]. However, for autonomous navigation, where object
classification is not a priority, the latency and energy
efficiency of the underlying algorithms take precedence.
As discussed earlier, traditional frame-based algorithms
fail to operate on event camera outputs due to the absence of
photometric characteristics such as texture and light intensity.
However, owing to the numerous advantages of event cameras,
including higher operating speed, wider dynamic range, and
lower power consumption, there has been substantial interest
in the community in developing algorithms better suited to
this domain.
Initial event-based detection algorithms such as [15], [16]
focused on detecting patterns present in the event camera
output. The authors in [17] used a simple blob detector
to detect the inherent patterns present in the event data, and
[18] used a plane fitting method on the distribution of events
to identify corners. A recent work adapted Gaussian mixture
modelling to detect patterns in the event data [19]. These
methods, however, fail in scenarios where there are events
generated by the background. As a solution, [20] proposed a
motion compensation technique to eliminate events generated
by the background, by estimating the system’s ego-motion.
The optimization involved, however, adds significant latency
and computational overhead to the system.
To improve detection accuracy, several recent research
efforts have focused on utilizing information from both
frame and event cameras [21], [22]. These hybrid methods
detect features in the frames and track the objects through
events. Since their detection relies on frame inputs, they
cannot operate in scenarios with a wide dynamic range and
are computationally expensive.