1 INTRODUCTION
Event-based or neuromorphic cameras provide many advantages such as high-frequency output,
high dynamic range, and low power consumption. However, their sensor output is a sparse,
asynchronous image representation, which is fundamentally different from traditional, dense images.
This hinders the use of convolutional layers, an essential building block of current
state-of-the-art image-processing networks. Classical convolutions on sparse data, as produced
e.g. by event cameras, are inefficient, as a large part of the computed feature map defaults to zero.
Furthermore, the sparsity of the data is quickly lost, as the non-zero sites spread rapidly with each
convolution. To alleviate this problem, changes to the convolutional layers have been proposed.
Sparse convolutional layers [6] compute convolutions only at active (i.e. non-zero) sites. The
sub-type of 'valid' or 'submanifold' sparse convolutional layers furthermore tries to preserve the
sparsity of the data by producing output signals only at active sites, which makes them highly
efficient at the cost of restricting signal propagation. Non-valid sparse convolutions are semantically
equivalent to dense convolution layers in that they compute the same result given identical inputs.
Valid or submanifold sparse convolution layers, on the other hand, differ from dense convolutions,
but still provide a good approximation of full convolutions on sparse data.
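The following sketch shows a submanifold sparse convolution built with SCN; it is modeled on the
SparseConvNet example code, so the exact signatures may differ between library versions:

import torch
import sparseconvnet as scn

# 2D submanifold convolution: dimension, nIn, nOut, filter size, bias.
model = scn.Sequential() \
    .add(scn.InputLayer(2, torch.LongTensor([64, 64]))) \
    .add(scn.SubmanifoldConvolution(2, 1, 16, 3, False)) \
    .add(scn.SparseToDense(2, 16))

# Three active sites, given as (y, x, batch index) with one feature each.
locations = torch.LongTensor([[10, 10, 0], [10, 11, 0], [40, 20, 0]])
features = torch.ones(3, 1)

out = model([locations, features])
# Outputs are produced only at the three active sites; the densified
# tensor is zero everywhere else.
print(out.shape)  # torch.Size([1, 16, 64, 64])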
Messikommer et al. [8] further introduce asynchronicity into the network. This allows
samples to be fed into the network in parts, as they are produced by the sensor, and thus reduces
latency in real-time applications. Several small batches of events from the same sample can be
processed sequentially, producing identical results to synchronous layers once the whole sample
has been processed. However, [8] only implemented a proof of concept: the project only includes
asynchronous submanifold sparse convolutional and batch-norm layers, whereas the sparseconvnet
(SCN) project [6] provides a full-fledged library. Furthermore, asynchronous models cannot be
trained, as the index_add operation used in the forward function is not supported by PyTorch's
automatic gradient tracking. This, however, is not a problem in practice, as each layer is functionally
equivalent to its SCN counterpart: it is possible to train an architecturally identical SCN
network and transfer the weights. As the asynchronous property is only relevant during inference,
this does not pose a limitation.
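Such a weight transfer can be as simple as copying the trained parameters layer by layer, provided
both models enumerate their layers in the same order; a minimal sketch (the one-to-one parameter
correspondence and the model names are assumptions of this illustration):

import torch

def transfer_weights(scn_model, asyn_model):
    # Copy trained SCN parameters into an architecturally identical
    # asynchronous model (illustration only; assumes both models list
    # their parameters in the same order with matching shapes).
    with torch.no_grad():
        for scn_param, asyn_param in zip(scn_model.parameters(),
                                         asyn_model.parameters()):
            assert scn_param.shape == asyn_param.shape
            asyn_param.copy_(scn_param)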
An alternative approach is to first convert the sparse event representation to dense frames,
using a learning-based approach [9]. This way, however, one loses all computational advantages
that the sparse representation offers. Notably, it is also possible to synthesize events from a dense
frame-based representation [3].
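For intuition, the simplest, non-learned variant of such a conversion just accumulates event
polarities into a dense frame; [9] instead learns the reconstruction. A minimal sketch, assuming
events are given as (x, y, t, p) tuples:

import numpy as np

def events_to_frame(events, height, width):
    # Naive event-to-frame conversion: accumulate signed polarities
    # into a dense 2D histogram.
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, t, p in events:
        frame[y, x] += 1.0 if p > 0 else -1.0
    return frame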
Furthermore, Cannici et al. [1] proposed a leaky surface layer, which integrates the
event-to-frame conversion directly into the target network. This way the network becomes stateful
and resembles a spiking model [7].
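Conceptually, such a leaky surface maintains a per-pixel state that is incremented by incoming
events and decays as time passes; the following sketch captures this idea (the linear decay and
parameter names are our assumptions, see [1] for the exact formulation):

import torch

class LeakySurface(torch.nn.Module):
    # Stateful event-to-frame layer: events increment the surface,
    # which leaks towards zero between updates.

    def __init__(self, height, width, decay=1e-3):
        super().__init__()
        self.decay = decay
        self.register_buffer("surface", torch.zeros(height, width))
        self.last_t = 0.0

    def forward(self, events):
        # events: iterable of (x, y, t, p) rows, sorted by timestamp t
        for x, y, t, p in events:
            leak = self.decay * (float(t) - self.last_t)
            self.surface = (self.surface - leak).clamp(min=0)
            self.surface[int(y), int(x)] += 1.0
            self.last_t = float(t)
        return self.surface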
1.1 Contributions and outline
In this work, we use the YOLO v1 model [10] as a simple but powerful dense object-detection
baseline. We model sparse networks architecturally identical to YOLO v1 using the SCN [6] and
asynet [8] frameworks. These serve as a case study to evaluate the performance of sparse and
asynchronous vs. dense object detection.
We implement all variants in PyTorch and evaluate their predictive performance and runtime
requirements against the dense variant. To this end, we convert the KITTI Vision dataset to events
using [3]. This allows us to answer the question of whether these novel technologies are a viable
optimization over dense convolutional layers, or whether they fall short of the expectations in practice.
The remainder of this work is structured as follows: First, section 2 introduces the data formats
required in the following sections. Next, section 3 details the major changes and additions to