Energy Consumption of Neural Networks on
NVIDIA Edge Boards: an Empirical Model
Seyyidahmed Lahmer, Aria Khoshsirat, Michele Rossi, Andrea Zanella
Department of Information Engineering
University of Padova
Padova, Italy
{firstname.lastname}@unipd.it
Abstract—Recently, there has been a trend of shifting the
execution of deep learning inference tasks toward the edge of
the network, closer to the user, to reduce latency and preserve
data privacy. At the same time, growing interest is being
devoted to the energetic sustainability of machine learning. At
the intersection of these trends, in this paper we focus on the
energetic characterization of machine learning at the edge, which
is attracting increasing attention. Unfortunately, calculating the
energy consumption of a given neural network during inference
is complicated by the heterogeneity of the possible underlying
hardware implementations. In this work, we aim to profile the energy consumption of inference tasks on some modern edge nodes by deriving simple but accurate models. To this
end, we performed a large number of experiments to collect the
energy consumption of fully connected and convolutional layers
on two well-known edge boards by NVIDIA, namely, Jetson TX2
and Xavier. From these experimental measurements, we have
then distilled a simple and practical model that can provide an
estimate of the energy consumption of a certain inference task
on these edge computers. We believe that this model can prove
useful in many contexts as, for instance, to guide the search for
efficient neural network architectures, as a heuristic in neural
network pruning, to find energy-efficient offloading strategies in
a split computing context, or to evaluate and compare the energy
performance of deep neural network architectures.
Index Terms—Energy consumption, Deep Neural Networks,
Edge Computing, Inference
I. INTRODUCTION
Machine learning is being used in many applications,
exploiting the abundance of data in the modern era and
delivering state-of-the-art performance on a huge number of
tasks. For new emerging mobile applications, the traditional
way of running inference tasks in cloud computing facilities
and sending back the predictions to the end users is not
always feasible because of the need for preserving privacy and
ensuring low latency. On the other hand, shifting the inference
towards the end devices presents its challenges, due to the
limited resources available on these devices. To tackle these
limitations, computation offloading from resource-limited end
users to more powerful edge servers is being advocated as
a promising method to schedule and execute user-generated
tasks [1].
In fact, Edge Computing can not only provide faster online computations, closer to the end users, but also exploit the
smart distribution and scheduling of computations to benefit
from renewable energy resources (RERs), so as to reduce
the carbon footprint of computing technology [2]. Besides
improving throughput and latency, the energy efficiency of
edge networks has gained much attention lately. For example,
reference [3] studies the whole network’s energy consumption,
including access points, edge servers, and user equipment for
a computation offloading scenario. According to this paper,
the more we push the computation from cloud servers to the
network’s edge, the more crucial it becomes to consider the
energy consumption of the models that are being exploited by
end user applications.
Although deep learning (DL) [4] has been known for its
great success in terms of accurate predictions in a wide variety
of tasks, energy and memory requirements of modern DL ar-
chitectures may make the use of large deep neural networks for
edge computing challenging. Split computing techniques have
been proposed to tackle this problem. They split a neural network at candidate points and perform early exits at those points, trading computing effort for quality of the result. This
facilitates the deployment of deep networks at the network’s
edge, see, e.g., [5]. Further, designing energy efficient neural
networks that have the same prediction accuracy as their more
power hungry versions is receiving much attention from the
research community [6], [7].
Overall, current developments are evolving along two main
axes: (i) providing online and energy efficient schedulers for
edge computing networks that allow end users to offload their
tasks, e.g., [1], [2], and (ii) devising new energy efficient
DL architectures, also entailing but not limited to the split
computing paradigm, e.g., [5]–[7]. We advocate that proper
designs along both axes would greatly benefit from accurate
energy consumption models of DL, especially tailored to
modern edge computing hardware. These models are largely
missing in the literature and are the objective of the present
work.
In most of the existing literature on edge task scheduling,
the energy cost models that were used for predicting the energy
consumption mainly used the number of CPU cycles required
to perform the tasks [8] or the amount of workload that a
task produces [9], using simple equations that proportionally
depend on the squared CPU frequency or on the workload.
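These classic cycle-based models typically take the dynamic-power form E ≈ κ·C·f², with C the number of CPU cycles the task requires, f the clock frequency, and κ an effective switched-capacitance constant. A minimal sketch of this kind of model (the value of κ and the cycle count below are illustrative placeholders, not measured values):

```python
def cycle_energy(cycles: float, freq_hz: float, kappa: float = 1e-28) -> float:
    """Classic CPU energy model: E = kappa * cycles * f^2 (joules).

    kappa is an effective switched-capacitance constant; the default
    value here is a placeholder for illustration only.
    """
    return kappa * cycles * freq_hz ** 2

# Example: a task needing 1e9 cycles on a core clocked at 2 GHz
energy_j = cycle_energy(1e9, 2e9)  # 0.4 J with the placeholder kappa
```

Note how the model depends only on cycles and frequency: it carries no information about how the workload maps onto parallel hardware, which is precisely the limitation discussed next.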
arXiv:2210.01625v1 [cs.LG] 4 Oct 2022
While these models were very valuable for deriving initial theories and results on scheduling algorithms, they may not be well suited to the parallelizable computations of modern multi-core processing unit architectures. In fact, an accurate energy
consumption estimation tool requires one to take into account
the architecture of the host device, the different parameters in
the neural network model that can exploit the parallel hardware
architectures, and the exact number of operations a neural
network requires for inference.
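For reference, the exact MAC (multiply-and-accumulate) counts of the two layer types studied in this paper follow directly from the layer hyper-parameters; the formulas below are the standard ones and carry no board-specific assumptions:

```python
def fc_macs(n_in: int, n_out: int) -> int:
    """MACs of a fully connected layer: one multiply-accumulate per weight."""
    return n_in * n_out

def conv2d_macs(h_out: int, w_out: int, c_in: int, c_out: int,
                k_h: int, k_w: int) -> int:
    """MACs of a 2-D convolutional layer: every output element needs
    k_h * k_w * c_in multiply-accumulates, for c_out output channels."""
    return h_out * w_out * c_out * k_h * k_w * c_in

# Examples: a 784 -> 100 dense layer, and a 3x3 conv mapping
# 64 -> 128 channels on a 56x56 output feature map
dense = fc_macs(784, 100)                       # 78,400 MACs
conv = conv2d_macs(56, 56, 64, 128, 3, 3)       # ~231 million MACs
```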
In this paper, we propose an experimentally validated and
simple energy consumption model for neural networks on
recent NVIDIA Jetson edge computers. The model allows one
to estimate the energy drained by performing inference tasks
on DL models composed of fully connected and convolutional
layers, without having to perform online measurements of
the energy drained. As we elaborate in the following, the
main indicator for the energy consumption is the total number
of multiply and accumulate (MAC) operations that are per-
formed, as expected. Based on this number, for a convolutional
layer, the energy consumption shows a multi-modal behavior
governed by the number of kernels that are exploited. The
derived empirical model is fully described by two hardware
dependent parameters, which are here provided for Jetson TX2
and Xavier NX boards from NVIDIA. The model fitting for a
simpler fully connected layer follows a similar rationale, but
only requires a single parameter and shows a single slope in
the MAC vs energy plot.
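To illustrate how such a model can be evaluated in practice, the sketch below mirrors its structure: a single energy-per-MAC slope for fully connected layers, and a kernel-dependent, multi-modal per-MAC cost for convolutional layers. The piecewise form, the parallelism width of 128 kernels, and all parameter values are our own illustrative assumptions, not fitted values; the actual model and the two hardware-dependent parameters for the TX2 and Xavier NX are derived from the measurements later in the paper.

```python
import math

def fc_energy_j(macs: int, alpha: float) -> float:
    """Fully connected layer: a single slope in the MAC count,
    E = alpha * MACs, with alpha in joules per MAC."""
    return alpha * macs

def conv_energy_j(macs: int, kernels: int, alpha: float, beta: float,
                  kernels_per_step: int = 128) -> float:
    """Convolutional layer: a hypothetical multi-modal model in which
    the per-MAC cost steps up each time the kernel count exceeds a
    multiple of the assumed hardware parallelism (kernels_per_step)."""
    step = math.ceil(kernels / kernels_per_step)  # which "mode" the layer is in
    return (alpha + beta * (step - 1)) * macs

# Illustrative parameters only (joules per MAC), not fitted values
e_fc = fc_energy_j(78_400, alpha=1e-11)
e_conv = conv_energy_j(231_211_008, kernels=128, alpha=1e-11, beta=5e-12)
```

Summing such per-layer estimates over all layers of a feed-forward network then gives a whole-model energy estimate without any online measurement.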
The remainder of this paper is organized as follows. Related work is briefly reviewed in Section II. In Section III
we present the experiment setup and configurations. The ob-
servations and discussions, in addition to an energy estimation
model are provided in Section IV. Finally, conclusions and
future research lines are discussed in Section V.
II. RELATED WORK
Profiling the power/energy consumption of running Neural Networks (NNs) on low-power edge devices has gained increasing attention in recent years. In [10], the authors
measured the power consumption of an entire NN as well as
single NN layers on an NVIDIA Jetson Nano. A framework
that predicts the energy consumption of CNNs on the Jetson
TX1 based on real measurements has been proposed in [11].
This work is however still very preliminary, as it just presents
the general measurement setup/methodology and some limited
results. For the Jetson TX2 device, in [12] the authors have re-
ported the power consumption of GPU and CPU, the memory
usage and the time of executing the test phase on a fixed small
Convolutional Neural Network (CNN) architecture. Although
the results in this paper are measured from real hardware, no
analytical model is provided to gauge the energy consumption
of the edge board from the neural network parameters.
In a research paper more similar to our present work, but
based on simulations instead of real measurements [13], the
authors have provided an energy estimation tool for different
types of neural network layers. They have shown that the
energy consumption is not always proportional to the number
of computations or parameters involved in a layer. Our results
somehow confirm these observations, since the pure number of
operations, per se, is not sufficient to characterize the energy
consumption of the boards. Nonetheless, with a careful and
systematic analysis of the collected measurements, we were
able to identify the effect of the different computational model
parameters on the energy consumption of a single inference
stage and, hence, define a model that captures reasonably well
the experimental behavior of the computing boards.
To the best of our knowledge, this is the first work to
explore the real-world effect of choosing different configu-
rations of a NN layer on the energy consumption of two
NVIDIA Jetson edge devices (TX2 and Xavier NX), providing
a parameterized analytical energy estimation model based on
empirical measurements. Our model allows estimating the
energy consumption of any custom set of layer configurations
in common feed-forward deep neural networks.
III. EXPERIMENTAL SETUP
We experimentally characterize the energy consumption
of two energy-efficient embedded computing devices from
NVIDIA, namely, Jetson TX2, and Jetson Xavier NX. These
two edge computers are currently being used in several fields
such as manufacturing, agriculture, retail, life sciences, etc. For
instance, an image processing algorithm for thermal events
has been recently proposed for the Jetson TX2 [14]. The
configurations of both devices are shown in Table I (Jetson
TX2) and II (Jetson Xavier NX).
TABLE I: NVIDIA Jetson TX2 configurations
CPU Quad-Core ARM Cortex-A57 @ 2 GHz + Dual-Core
NVIDIA Denver2 @ 2 GHz
GPU NVIDIA Pascal 256 CUDA cores @ 1300 MHz
Memory 8 GB 128-bit LPDDR4 @ 1866 MHz, 59.7 GB/s
Performance 1.3 TFLOPS
TABLE II: NVIDIA Jetson Xavier NX configurations
CPU 6-core NVIDIA Carmel ARM®v8.2 64-bit CPU 6 MB
L2 + 4 MB L3
GPU 384-core NVIDIA Volta™ GPU with 48 Tensor Cores
Memory 8 GB 128-bit LPDDR4x 59.7 GB/s
Performance 21 TOPS
To assess the energy profile of these edge computers, we
measure the timing and energy figures of neural network
architectures, focusing on one single layer of the whole NN
architecture. In fact, as demonstrated in [13], and also independently verified by us, the energy consumption of two neural network layers L1 and L2 that are executed in sequence adds up, i.e., if their energy consumptions are respectively E(L1) = E1 and E(L2) = E2, then using these two layers in sequence in a single model results in a total energy consumption of E(L1, L2) ≈ E1 + E2, where the approximation accounts for the measurement noise and the intrinsic variability of the energy consumption of each single layer (as will be seen later in this paper). We hence focus our analysis on
two widely utilized layer types, namely fully connected and
convolutional, as better described in the following.
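Operationally, per-layer energy figures of this kind are obtained by sampling the board's instantaneous power draw while a layer runs and integrating over time. On the Jetson boards, the on-board INA3221 monitors expose power readings through sysfs; the sketch below abstracts that detail behind a read_power_w callable, since the exact sysfs node differs across boards and JetPack versions:

```python
import threading
import time

def measure_energy_j(task, read_power_w, period_s: float = 0.01) -> float:
    """Run `task` in a background thread while sampling power (in watts)
    with `read_power_w`; integrate the samples with the trapezoidal
    rule to obtain the energy in joules."""
    stamps, samples = [], []
    worker = threading.Thread(target=task)
    worker.start()
    while worker.is_alive():
        stamps.append(time.monotonic())
        samples.append(read_power_w())
        time.sleep(period_s)
    worker.join()
    stamps.append(time.monotonic())  # one last sample after the task ends
    samples.append(read_power_w())
    energy = 0.0
    for i in range(1, len(samples)):
        energy += 0.5 * (samples[i] + samples[i - 1]) * (stamps[i] - stamps[i - 1])
    return energy
```

On a real board, read_power_w would parse the INA3221 channel reading (reported in milliwatts) and divide by 1000; subtracting an idle-power baseline then isolates the energy attributable to the inference itself.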