
A PREPRINT - OCTOBER 28, 2022
Figure 1: Visualization of Box event output in space-time. Red and blue represent events with polarity '0' and '1', respectively. (a) Without polarity separation; (b) with polarity separation and creation of a super-frame.
the location of the event at a particular timestamp t, with polarity p indicating an increase or decrease in brightness. Each tuple is represented by 64 bits: the timestamp occupies 32 bits, and the remaining three fields share the other 32 bits. The goal is to extract useful information from the event data and utilize it for processing.
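Since the exact bit layout of the non-timestamp fields is not specified here, the sketch below illustrates the 64-bit tuple representation under an assumed split of the second 32-bit word (15 bits for x, 15 bits for y, 1 bit for polarity); the widths are for demonstration only, not the sensor's actual format.

```python
import struct

# Illustrative 64-bit packing of a DVS event tuple (x, y, t, p):
# a 32-bit timestamp plus 32 bits shared by the three remaining fields.
# Assumed layout of the shared word: x (15 bits), y (15 bits), p (1 bit).

def pack_event(x, y, t, p):
    assert x < (1 << 15) and y < (1 << 15) and p in (0, 1)
    addr = (x << 17) | (y << 2) | (p << 1)  # shared 32-bit word
    return struct.pack("<II", t, addr)       # 8 bytes = 64 bits

def unpack_event(buf):
    t, addr = struct.unpack("<II", buf)
    x = (addr >> 17) & 0x7FFF
    y = (addr >> 2) & 0x7FFF
    p = (addr >> 1) & 0x1
    return x, y, t, p

event = (120, 64, 1_000_000, 1)
assert unpack_event(pack_event(*event)) == event
```

A round trip through pack/unpack recovers the original tuple, which makes such a fixed-width representation convenient to convert into the integer streams consumed by the lossless encoders discussed below.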
A DVS acquires information asynchronously and sparsely, with high temporal resolution and low latency. Hence, the temporal aspect, particularly latency, is critical in event data processing. The output stream cannot be fed to traditional vision algorithms, since it is a series of asynchronous events rather than actual intensity images. Therefore, new algorithms that take advantage of the sensor's high temporal resolution and asynchronous nature must be developed. Algorithms fall into two types according to the number of events they process at a time. The first approach operates on an event-by-event basis, in which the system's state changes upon the occurrence of a single event, resulting in minimal latency. The second approach operates on groups or packets of events and therefore incurs latency; it can still provide a system state update upon the occurrence of each event if the window moves by one event at a time. The data storage and transmission bandwidth limitations of onboard DVS processing are an open challenge and require immediate solutions. Spike coding [10] is a dedicated lossless compression strategy that exploits the time-series and asynchronous nature of event data. It follows a cube-based coding framework in which the spike sequence is divided into multiple macro-cubes and encoded accordingly. Entropy-based coding strategies such as Huffman and arithmetic coding can effectively encode DVS data by treating each spike-event field as an input symbol. Existing lossless coding schemes such as dictionary-based [11, 12, 13] and fast-integer [14, 15] encoders can also compress DVS data after converting the spike events into a multivariate stream of integers.
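As a concrete illustration of the entropy-coding idea, the sketch below builds a Huffman code over a single event field, treating each field value as an input symbol. The symbol stream (resembling a polarity field) and the field choice are assumptions for demonstration, not the coding scheme of any cited work.

```python
import heapq
from collections import Counter

# Minimal Huffman coder: each value of one spike-event field is a symbol.
# Codes are built bottom-up by prefixing '0'/'1' at every merge.

def huffman_code(symbols):
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate one-symbol stream
        return {next(iter(freq)): "0"}
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)                          # tiebreaker avoids dict compare
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

stream = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]      # illustrative polarity-like field
code = huffman_code(stream)
encoded = "".join(code[s] for s in stream)
```

Because Huffman codes are prefix-free, the bitstream decodes unambiguously; skewed field distributions (as in real event data) are what make such symbol-based encoders effective.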
The applications of DVS range from self-driving cars [16] to robotics [17] and drones [18]. Applications such as coordinating multiple intelligent vehicles (cars, drones, etc.) with onboard processing constraints require real-time data sharing and feedback. In comparison to traditional sensing techniques, neuromorphic sensing provides intrinsic compression. Further compression of event data is advantageous for transmission in the Internet of Things (IoT) and the Internet of Vehicles (IoV). This paper presents a novel approach to DVS data compression based on a deep learning algorithm, the Deep Belief Network (DBN). Figure 2 depicts the complete workflow of event compression. The entire stream of events is converted by the DBN into a dimensionally reduced latent representation composed of multiple code-layer blocks. The compact latent code blocks contain recurring information suitable for lossless symbol-based encoders; hence, we compress the latent code using an entropy-based Huffman coding technique. The primary contributions of the proposed scheme are as follows:
• The proposed framework is among the first to incorporate deep learning techniques for event data processing. High-dimensional event data is transformed into low-dimensional latent code using a multilayer neural network called a deep belief network. We perform lossless encoding of the low-dimensional latent features using entropy-based encoders to achieve a further compressed representation.
• We formulate a unique arrangement of events deemed more suitable for processing by the proposed framework. The events are time-aggregated by accumulating spike events over time into super-frame sequences, as explained in Section 2.1. Super-frames exhibit high spatial and temporal correlation among the event data.
• We conduct extensive comparisons with lossless benchmark strategies on a diverse standard dataset with varying scene complexity and camera movement. As a result of the learning-based framework, we obtain a