Energy Consumption of Neural Networks on
NVIDIA Edge Boards: an Empirical Model
Seyyidahmed Lahmer, Aria Khoshsirat, Michele Rossi, Andrea Zanella
Department of Information Engineering
University of Padova
Padova, Italy
{firstname.lastname}@unipd.it
Abstract—Recently, there has been a trend of shifting the
execution of deep learning inference tasks toward the edge of
the network, closer to the user, to reduce latency and preserve
data privacy. At the same time, growing interest is being
devoted to the energetic sustainability of machine learning. At
the intersection of these trends, in this paper we focus on the
energetic characterization of machine learning at the edge, which
is attracting increasing attention. Unfortunately, calculating the
energy consumption of a given neural network during inference
is complicated by the heterogeneity of the possible underlying
hardware implementations. In this work, we aim to profile the energy consumption of inference tasks on some modern edge nodes by deriving simple but accurate models. To this
end, we performed a large number of experiments to collect the
energy consumption of fully connected and convolutional layers
on two well-known edge boards by NVIDIA, namely, Jetson TX2
and Xavier. From these experimental measurements, we have
then distilled a simple and practical model that can provide an
estimate of the energy consumption of a certain inference task
on these edge computers. We believe that this model can prove
useful in many contexts as, for instance, to guide the search for
efficient neural network architectures, as a heuristic in neural
network pruning, to find energy-efficient offloading strategies in
a split computing context, or to evaluate and compare the energy
performance of deep neural network architectures.
Index Terms—Energy consumption, Deep Neural Networks,
Edge Computing, Inference
I. INTRODUCTION
Machine learning is being used in many applications,
exploiting the abundance of data in the modern era and
delivering state-of-the-art performance on a huge number of
tasks. For new emerging mobile applications, the traditional
way of running inference tasks in cloud computing facilities
and sending back the predictions to the end users is not
always feasible because of the need for preserving privacy and
ensuring low latency. On the other hand, shifting the inference
towards the end devices presents its challenges, due to the
limited resources available on these devices. To tackle these
limitations, computation offloading from resource-limited end
users to more powerful edge servers is being advocated as
a promising method to schedule and execute user-generated
tasks [1].
In fact, Edge Computing can not only provide faster online computations, closer to the end users, but also exploit the
smart distribution and scheduling of computations to benefit
from renewable energy resources (RERs), so as to reduce
the carbon footprint of computing technology [2]. Besides
improving throughput and latency, the energy efficiency of
edge networks has gained much attention lately. For example,
reference [3] studies the whole network’s energy consumption,
including access points, edge servers, and user equipment for
a computation offloading scenario. According to this paper,
the more we push the computation from cloud servers to the
network’s edge, the more crucial it becomes to consider the
energy consumption of the models that are being exploited by
end user applications.
Although deep learning (DL) [4] has been known for its
great success in terms of accurate predictions in a wide variety
of tasks, energy and memory requirements of modern DL ar-
chitectures may make the use of large deep neural networks for
edge computing challenging. Split computing techniques have
been proposed to tackle this problem. They split a neural network at candidate points and perform early exits at those points, trading computing effort for quality of the result. This
facilitates the deployment of deep networks at the network’s
edge, see, e.g., [5]. Further, designing energy efficient neural
networks that have the same prediction accuracy as their more
power hungry versions is receiving much attention from the
research community [6], [7].
Overall, current developments are evolving along two main
axes: (i) providing online and energy efficient schedulers for
edge computing networks that allow end users to offload their
tasks, e.g., [1], [2], and (ii) devising new energy efficient
DL architectures, also entailing but not limited to the split
computing paradigm, e.g., [5]–[7]. We advocate that proper
designs along both axes would greatly benefit from accurate
energy consumption models of DL, especially tailored to
modern edge computing hardware. These models are largely
missing in the literature and are the objective of the present
work.
In most of the existing literature on edge task scheduling,
the energy cost models that were used for predicting the energy
consumption mainly used the number of CPU cycles required
to perform the tasks [8] or the amount of workload that a
task produces [9], using simple equations that proportionally
depend on the squared CPU frequency or on the workload.
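These classic cycle-based models typically take the dynamic-power form E ≈ κ·C·f², with C the number of CPU cycles the task requires, f the clock frequency, and κ an effective switched-capacitance constant. A minimal sketch of this kind of model (the value of κ and the cycle count below are illustrative placeholders, not measured values):

```python
def cycle_energy(cycles: float, freq_hz: float, kappa: float = 1e-28) -> float:
    """Classic CPU energy model: E = kappa * cycles * f^2 (joules).

    kappa is an effective switched-capacitance constant; the default
    value here is a placeholder for illustration only.
    """
    return kappa * cycles * freq_hz ** 2

# Example: a task needing 1e9 cycles on a core clocked at 2 GHz
energy_j = cycle_energy(1e9, 2e9)  # 0.4 J with the placeholder kappa
```

Note how the model depends only on cycles and frequency: it carries no information about how the workload maps onto parallel hardware, which is precisely the limitation discussed next.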
arXiv:2210.01625v1 [cs.LG] 4 Oct 2022
While these models were very valuable for deriving initial theories and results on scheduling algorithms, they may not be well suited to the parallelizable computations of modern multi-core processing unit architectures. In fact, an accurate energy
consumption estimation tool requires one to take into account
the architecture of the host device, the different parameters in
the neural network model that can exploit the parallel hardware
architectures, and the exact number of operations a neural
network requires for inference.
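For reference, the exact MAC (multiply-and-accumulate) counts of the two layer types studied in this paper follow directly from the layer hyper-parameters; the formulas below are the standard ones and carry no board-specific assumptions:

```python
def fc_macs(n_in: int, n_out: int) -> int:
    """MACs of a fully connected layer: one multiply-accumulate per weight."""
    return n_in * n_out

def conv2d_macs(h_out: int, w_out: int, c_in: int, c_out: int,
                k_h: int, k_w: int) -> int:
    """MACs of a 2-D convolutional layer: every output element needs
    k_h * k_w * c_in multiply-accumulates, for c_out output channels."""
    return h_out * w_out * c_out * k_h * k_w * c_in

# Examples: a 784 -> 100 dense layer, and a 3x3 conv mapping
# 64 -> 128 channels on a 56x56 output feature map
dense = fc_macs(784, 100)                       # 78,400 MACs
conv = conv2d_macs(56, 56, 64, 128, 3, 3)       # ~231 million MACs
```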
In this paper, we propose an experimentally validated and
simple energy consumption model for neural networks on
recent NVIDIA Jetson edge computers. The model allows one
to estimate the energy drained by performing inference tasks
on DL models composed of fully connected and convolutional
layers, without having to perform online measurements of
the energy drained. As we elaborate in the following, the
main indicator for the energy consumption is the total number
of multiply and accumulate (MAC) operations that are per-
formed, as expected. Based on this number, for a convolutional
layer, the energy consumption shows a multi-modal behavior
governed by the number of kernels that are exploited. The
derived empirical model is fully described by two hardware
dependent parameters, which are here provided for Jetson TX2
and Xavier NX boards from NVIDIA. The model fitting for a
simpler fully connected layer follows a similar rationale, but
only requires a single parameter and shows a single slope in
the MAC vs energy plot.
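To illustrate how such a model can be evaluated in practice, the sketch below mirrors its structure: a single energy-per-MAC slope for fully connected layers, and a kernel-dependent, multi-modal per-MAC cost for convolutional layers. The piecewise form, the parallelism width of 128 kernels, and all parameter values are our own illustrative assumptions, not fitted values; the actual model and the two hardware-dependent parameters for the TX2 and Xavier NX are derived from the measurements later in the paper.

```python
import math

def fc_energy_j(macs: int, alpha: float) -> float:
    """Fully connected layer: a single slope in the MAC count,
    E = alpha * MACs, with alpha in joules per MAC."""
    return alpha * macs

def conv_energy_j(macs: int, kernels: int, alpha: float, beta: float,
                  kernels_per_step: int = 128) -> float:
    """Convolutional layer: a hypothetical multi-modal model in which
    the per-MAC cost steps up each time the kernel count exceeds a
    multiple of the assumed hardware parallelism (kernels_per_step)."""
    step = math.ceil(kernels / kernels_per_step)  # which "mode" the layer is in
    return (alpha + beta * (step - 1)) * macs

# Illustrative parameters only (joules per MAC), not fitted values
e_fc = fc_energy_j(78_400, alpha=1e-11)
e_conv = conv_energy_j(231_211_008, kernels=128, alpha=1e-11, beta=5e-12)
```

Summing such per-layer estimates over all layers of a feed-forward network then gives a whole-model energy estimate without any online measurement.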
The remainder of this paper is organized as follows. Related work is briefly reviewed in Section II. In Section III
we present the experiment setup and configurations. The ob-
servations and discussions, in addition to an energy estimation
model are provided in Section IV. Finally, conclusions and
future research lines are discussed in Section V.
II. RELATED WORK
Profiling the power/energy consumption of running Neural Networks (NNs) on low-power edge devices has gained increasing attention in recent years. In [10], the authors
measured the power consumption of an entire NN as well as
single NN layers on an NVIDIA Jetson Nano. A framework
that predicts the energy consumption of CNNs on the Jetson
TX1 based on real measurements has been proposed in [11].
This work is however still very preliminary, as it just presents
the general measurement setup/methodology and some limited
results. For the Jetson TX2 device, in [12] the authors have re-
ported the power consumption of GPU and CPU, the memory
usage and the time of executing the test phase on a fixed small
Convolutional Neural Network (CNN) architecture. Although
the results in this paper are measured from real hardware, no
analytical model is provided to gauge the energy consumption
of the edge board from the neural network parameters.
In a research paper more similar to our present work, but
based on simulations instead of real measurements [13], the
authors have provided an energy estimation tool for different
types of neural network layers. They have shown that the
energy consumption is not always proportional to the number
of computations or parameters involved in a layer. Our results
somehow confirm these observations, since the pure number of
operations, per se, is not sufficient to characterize the energy
consumption of the boards. Nonetheless, with a careful and
systematic analysis of the collected measurements, we were
able to identify the effect of the different computational model
parameters on the energy consumption of a single inference
stage and, hence, define a model that captures reasonably well
the experimental behavior of the computing boards.
To the best of our knowledge, this is the first work to
explore the real-world effect of choosing different configu-
rations of a NN layer on the energy consumption of two
NVIDIA Jetson edge devices (TX2 and Xavier NX), providing
a parameterized analytical energy estimation model based on
empirical measurements. Our model allows estimating the
energy consumption of any custom set of layer configurations
in common feed-forward deep neural networks.
III. EXPERIMENTAL SETUP
We experimentally characterize the energy consumption
of two energy-efficient embedded computing devices from
NVIDIA, namely, Jetson TX2, and Jetson Xavier NX. These
two edge computers are currently being used in several fields
such as manufacturing, agriculture, retail, life sciences, etc. For
instance, an image processing algorithm for thermal events
has been recently proposed for the Jetson TX2 [14]. The
configurations of both devices are shown in Table I (Jetson
TX2) and II (Jetson Xavier NX).
TABLE I: NVIDIA Jetson TX2 configurations
CPU Quad-Core ARM Cortex-A57 @ 2 GHz + Dual-Core
NVIDIA Denver2 @ 2 GHz
GPU NVIDIA Pascal 256 CUDA cores @ 1300 MHz
Memory 8 GB 128-bit LPDDR4 @ 1866 MHz, 59.7 GB/s
Performance 1.3 TFLOPS
TABLE II: NVIDIA Jetson Xavier NX configurations
CPU 6-core NVIDIA Carmel ARM®v8.2 64-bit CPU 6 MB
L2 + 4 MB L3
GPU 384-core NVIDIA Volta™ GPU with 48 Tensor Cores
Memory 8 GB 128-bit LPDDR4x 59.7 GB/s
Performance 21 TOPS
To assess the energy profile of these edge computers, we
measure the timing and energy figures of neural network
architectures, focusing on one single layer of the whole NN
architecture. In fact, as demonstrated in [13], and also independently verified by us, the energy consumption of two neural network layers L1 and L2 that are executed in sequence adds up, i.e., if their energy consumptions are respectively E(L1) = E1 and E(L2) = E2, then using these two layers in sequence in a single model results in a total energy consumption of E(L1, L2) ≈ E1 + E2, where the approximation accounts for the measurement noise and the intrinsic variability of the energy consumption of each single layer (as will be seen later in this paper). We hence focus our analysis on
two widely utilized layer types, namely fully connected and
convolutional, as better described in the following.
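Operationally, per-layer energy figures of this kind are obtained by sampling the board's instantaneous power draw while a layer runs and integrating over time. On the Jetson boards, the on-board INA3221 monitors expose power readings through sysfs; the sketch below abstracts that detail behind a read_power_w callable, since the exact sysfs node differs across boards and JetPack versions:

```python
import threading
import time

def measure_energy_j(task, read_power_w, period_s: float = 0.01) -> float:
    """Run `task` in a background thread while sampling power (in watts)
    with `read_power_w`; integrate the samples with the trapezoidal
    rule to obtain the energy in joules."""
    stamps, samples = [], []
    worker = threading.Thread(target=task)
    worker.start()
    while worker.is_alive():
        stamps.append(time.monotonic())
        samples.append(read_power_w())
        time.sleep(period_s)
    worker.join()
    stamps.append(time.monotonic())  # one last sample after the task ends
    samples.append(read_power_w())
    energy = 0.0
    for i in range(1, len(samples)):
        energy += 0.5 * (samples[i] + samples[i - 1]) * (stamps[i] - stamps[i - 1])
    return energy
```

On a real board, read_power_w would parse the INA3221 channel reading (reported in milliwatts) and divide by 1000; subtracting an idle-power baseline then isolates the energy attributable to the inference itself.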