Energy-Efficient Deployment of Machine Learning
Workloads on Neuromorphic Hardware
Peyton Chandarana, Mohammadreza Mohammadi, James Seekings, Ramtin Zand
Department of Computer Science and Engineering, University of South Carolina, Columbia, SC
Abstract—As the technology industry is moving towards im-
plementing tasks such as natural language processing, path
planning, image classification, and more on smaller edge com-
puting devices, the demand for more efficient implementations of
algorithms and hardware accelerators has become a significant
area of research. In recent years, several edge deep learning
hardware accelerators have been released that specifically focus
on reducing the power and area consumed by deep neural
networks (DNNs). On the other hand, spiking neural networks
(SNNs), which operate on discrete time-series data, have been
shown to achieve substantial power reductions over even the
aforementioned edge DNN accelerators when deployed on special-
ized neuromorphic event-based/asynchronous hardware. While
neuromorphic hardware has demonstrated great potential for
accelerating deep learning tasks at the edge, the current space
of algorithms and hardware is limited and still in rather early
development. Thus, many hybrid approaches have been proposed
which aim to convert pre-trained DNNs into SNNs. In this work,
we provide a general guide to converting pre-trained DNNs into
SNNs while also presenting techniques to improve the deployment
of converted SNNs on neuromorphic hardware with respect to
latency, power, and energy. Our experimental results show that
when compared against the Intel Neural Compute Stick 2, Intel’s
neuromorphic processor, Loihi, consumes up to 27× less power
and 5× less energy in the tested image classification tasks by
using our SNN improvement techniques.
Index Terms—edge computing, hardware accelerator, spiking
neural networks, deep neural networks, neuromorphic computing
I. INTRODUCTION
In the last 10 years, deep learning has transformed the
technology industry enabling computers to perform image
classification and recognition, translation, path planning, and
more [1]–[4]. While these efforts have been fruitful in terms
of providing the desired functionality, most of these implementations rely on power-hungry hardware such as GPUs and TPUs [5] and are deployed in systems that are not constrained by power limitations. In recent years,
many approaches have been proposed to alleviate these power
constraints with methods such as quantization [6], [7] and
approximate computing [8]. With these approaches, many new
edge-specific devices have been introduced, such as the Nvidia
Jetson Nano, Intel Neural Compute Stick 2, and Google Coral
Edge TPU. Many of these edge devices were created to take
advantage of quantized networks that operate on lower preci-
sion values rather than the standard single or double precision
floating point representations. As a result, the overall power
consumption and architecture area are reduced to specifically
benefit applications where space and power are limited.
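For illustration, the following Python sketch shows a generic uniform 8-bit post-training quantization of a weight tensor using a per-tensor scale and zero point; it is a textbook scheme given only for intuition, not the particular quantization method used by any of the devices named above.

import numpy as np

def quantize_uint8(w):
    """Uniform asymmetric quantization of a float32 tensor to uint8."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 or 1.0        # guard against a constant tensor
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 tensor from its quantized form."""
    return scale * (q.astype(np.float32) - zero_point)

# Example: the round-trip error stays small relative to the weight range.
w = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize_uint8(w)
print(np.abs(w - dequantize(q, s, z)).max())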
Spiking neural networks (SNNs), considered the latest gen-
eration of artificial neural networks (ANNs), are a new class
of neural networks that focus on biological plausibility, energy
efficiency, and event-based computing [9]. Unlike their continuous-valued, floating-point deep neural network (DNN) counterparts, SNNs operate on discrete events in time, which represent the action potentials of biological neurons in the brain [9]. SNNs have been shown in many works [10]–[12] to achieve accuracies comparable to those of DNNs while also significantly reducing power and energy consumption.
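As a concrete, hardware-agnostic illustration of such event-based operation, the following sketch simulates a discrete-time leaky integrate-and-fire neuron; the decay factor, threshold, and reset-to-zero behavior are simplifying assumptions and are not meant to reproduce the exact neuron model of any particular neuromorphic chip.

import numpy as np

def lif_neuron(input_current, decay=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over discrete timesteps.

    input_current: 1-D array of input values (one per timestep).
    Returns a binary spike train of the same length.
    """
    v = 0.0
    spikes = np.zeros(len(input_current), dtype=np.uint8)
    for t, i_t in enumerate(input_current):
        v = decay * v + i_t          # leaky integration of the input
        if v >= threshold:           # membrane potential crosses the threshold
            spikes[t] = 1            # emit a discrete spike event
            v = 0.0                  # reset the membrane potential
    return spikes

# Example: a constant input of 0.3 produces a regular spike train whose
# rate grows with the input magnitude, the basis of rate-coded computation.
print(lif_neuron(np.full(20, 0.3)))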
While SNNs can be more energy and power efficient,
training deep SNNs (DSNNs) has been a recurring challenge
due to the lack of suitable training/learning algorithms that per-
form as well as the backpropagation algorithm used in DNNs
[13]–[15]. Many SNN-specific learning algorithms have been proposed, such as spike-timing dependent plasticity (STDP) and its variants [14], [16]. These learning approaches rely on the temporal patterns found in the time between spikes to adapt the weight values as the network sees more input [17].
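To make the learning rule concrete, the following sketch implements a generic pair-based STDP weight update driven by the time difference between pre- and postsynaptic spikes; the amplitudes and time constants are illustrative assumptions rather than values taken from the cited works.

import numpy as np

# Illustrative pair-based STDP constants (assumed values).
A_PLUS = 0.01     # potentiation amplitude
A_MINUS = 0.012   # depression amplitude
TAU_PLUS = 20.0   # potentiation time constant (ms)
TAU_MINUS = 20.0  # depression time constant (ms)

def stdp_delta_w(t_pre, t_post):
    """Weight change for one pre/post spike pair, based on their timing."""
    dt = t_post - t_pre
    if dt >= 0:
        # Pre fires before post: strengthen the synapse (potentiation).
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    # Post fires before pre: weaken the synapse (depression).
    return -A_MINUS * np.exp(dt / TAU_MINUS)

# Example: a presynaptic spike at 10 ms followed by a postsynaptic spike
# at 15 ms yields a small positive weight update.
print(stdp_delta_w(10.0, 15.0))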
Such STDP-based learning, while efficient and well suited to shallow SNNs, does not typically scale well to deeper networks due to the lack of feedback from subsequent layers during training [13], [15]. To address or even bypass the training and design challenges introduced in SNNs, many DNN-to-SNN conversion approaches have been proposed [18]–[20].
One such conversion approach, the SNN Conversion Toolbox [20], uses the parameters of a pre-trained DNN to create an equivalent SNN and deploy it on Loihi to provide energy-efficient and event-based computation for highly constrained edge computing environments (a brief sketch of the rate-coding idea behind such conversions is given after the list of contributions below). In this work, we aim to generalize the process of converting pre-trained DNNs into SNNs and deploying the resulting SNNs on neuromorphic hardware such as Loihi by contributing the following:
• We provide general guidelines for designing and training DNNs for conversion into SNNs.
• After the SNNs are created, we present analysis and optimization techniques to improve the converted SNNs with respect to power, latency, and energy.
• We compare the performance of SNNs on Loihi against the Intel Neural Compute Stick 2 in classifying static images.
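As referenced above, the core intuition behind such DNN-to-SNN conversions is that the ReLU activation of a unit in the original DNN can be approximated by the firing rate of an integrate-and-fire neuron over a finite number of timesteps. The following sketch illustrates this rate-coding view; the soft-reset neuron model, timestep count, and unit-range input scaling are illustrative assumptions, and tools such as the SNN Conversion Toolbox automate and refine these steps.

def if_rate(x, timesteps=100, threshold=1.0):
    """Firing rate of a (non-leaky) integrate-and-fire neuron driven by a
    constant input x for a fixed number of timesteps."""
    v, spikes = 0.0, 0
    for _ in range(timesteps):
        v += x                   # integrate the constant input
        if v >= threshold:
            spikes += 1          # emit a spike
            v -= threshold       # soft reset, preserving the residual charge
    return spikes / timesteps

# The spike rate tracks the ReLU of the input: negative inputs never spike,
# and positive inputs scaled to [0, 1] spike roughly in proportion to x.
for x in (-0.5, 0.0, 0.25, 0.5, 1.0):
    print(x, max(x, 0.0), round(if_rate(x), 2))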
The remainder of this work is organized as follows. In
Section II, we provide an overview of the two hardware
platforms used in this work, the Intel Neural Compute Stick
2 and Intel Loihi, along with their respective APIs. In Section