
well with the parallelizable computations on modern multi-
core processing unit architectures. In fact, an accurate energy
consumption estimation tool must take into account the
architecture of the host device, the neural network model
parameters that can exploit the parallel hardware architectures,
and the exact number of operations the network requires for
inference.
In this paper, we propose an experimentally validated and
simple energy consumption model for neural networks on
recent NVIDIA Jetson edge computers. The model allows one
to estimate the energy drained by performing inference tasks
on DL models composed of fully connected and convolutional
layers, without having to perform online measurements of
the energy drained. As we elaborate in the following, the main
indicator of energy consumption is, as expected, the total number
of multiply-and-accumulate (MAC) operations performed. For a
convolutional layer, the energy consumption as a function of this
number exhibits a multi-modal behavior governed by the number
of kernels that are used. The
derived empirical model is fully described by two hardware
dependent parameters, which are here provided for Jetson TX2
and Xavier NX boards from NVIDIA. The model fitting for a
simpler fully connected layer follows a similar rationale, but
requires only a single parameter and exhibits a single slope in
the MAC-vs-energy plot.
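To make the role of the MAC count concrete, the snippet below computes the number of MAC operations of a convolutional and of a fully connected layer; the layer sizes are illustrative examples only, not the configurations swept in our experiments.

def conv2d_macs(c_in, c_out, k_h, k_w, h_out, w_out):
    # Each of the c_out * h_out * w_out output elements requires
    # c_in * k_h * k_w multiply-and-accumulate operations.
    return c_out * h_out * w_out * c_in * k_h * k_w

def fc_macs(n_in, n_out):
    # One MAC per weight of the fully connected layer.
    return n_in * n_out

# Example: 64 kernels of size 3x3 on a 224x224 RGB input (stride 1,
# "same" padding) vs. a 4096 -> 1024 fully connected layer.
print(conv2d_macs(3, 64, 3, 3, 224, 224))  # 86,704,128 MACs
print(fc_macs(4096, 1024))                 # 4,194,304 MACs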
The remainder of this paper is organized as follows. Related
work is briefly reviewed in Section II. In Section III we present
the experimental setup and configurations. The observations and
discussion, together with the proposed energy estimation model,
are provided in Section IV. Finally, conclusions and future
research directions are discussed in Section V.
II. RELATED WORK
Profiling the power/energy consumption of Neural Networks
(NNs) running on low-power edge devices has gained increasing
attention in recent years. In [10], the authors
measured the power consumption of an entire NN as well as
single NN layers on an NVIDIA Jetson Nano. A framework
that predicts the energy consumption of CNNs on the Jetson
TX1 based on real measurements has been proposed in [11].
That work is, however, still preliminary, as it only presents the
general measurement setup and methodology along with some
limited results. For the Jetson TX2 device, the authors of [12]
have reported the GPU and CPU power consumption, the memory
usage, and the execution time of the test phase on a fixed, small
Convolutional Neural Network (CNN) architecture. Although the
results in that paper are measured on real hardware, no analytical
model is provided to gauge the energy consumption of the edge
board from the neural network parameters.
In a work closer to ours, but based on simulations rather than
real measurements, the authors of [13] provide an energy
estimation tool for different types of neural network layers. They
show that the energy consumption is not always proportional to
the number of computations or parameters involved in a layer.
Our results partly confirm these observations, since the raw
number of operations alone is not sufficient to characterize the
energy consumption of the boards. Nonetheless, with a careful and
systematic analysis of the collected measurements, we were
able to identify the effect of the different computational model
parameters on the energy consumption of a single inference
stage and, hence, define a model that captures reasonably well
the experimental behavior of the computing boards.
To the best of our knowledge, this is the first work to
explore the real-world effect of choosing different configu-
rations of an NN layer on the energy consumption of two
NVIDIA Jetson edge devices (TX2 and Xavier NX), providing
a parameterized analytical energy estimation model based on
empirical measurements. Our model allows estimating the
energy consumption of any custom set of layer configurations
in common feed-forward deep neural networks.
III. EXPERIMENTAL SETUP
We experimentally characterize the energy consumption
of two energy-efficient embedded computing devices from
NVIDIA, namely, Jetson TX2, and Jetson Xavier NX. These
two edge computers are currently used in several fields such
as manufacturing, agriculture, retail, and life sciences. For
instance, an image processing algorithm for thermal events
has been recently proposed for the Jetson TX2 [14]. The
configurations of both devices are shown in Tables I (Jetson
TX2) and II (Jetson Xavier NX).
TABLE I: NVIDIA Jetson TX2 configurations
  CPU:         Quad-Core ARM Cortex-A57 @ 2 GHz + Dual-Core NVIDIA Denver2 @ 2 GHz
  GPU:         NVIDIA Pascal, 256 CUDA cores @ 1300 MHz
  Memory:      8 GB 128-bit LPDDR4 @ 1866 MHz, 59.7 GB/s
  Performance: 1.3 TFLOPS

TABLE II: NVIDIA Jetson Xavier NX configurations
  CPU:         6-core NVIDIA Carmel ARM v8.2 64-bit CPU, 6 MB L2 + 4 MB L3
  GPU:         384-core NVIDIA Volta GPU with 48 Tensor Cores
  Memory:      8 GB 128-bit LPDDR4x, 59.7 GB/s
  Performance: 21 TOPS
To assess the energy profile of these edge computers, we
measure the timing and energy figures of neural network
architectures, focusing on one single layer of the whole NN
architecture. In fact, as demonstrated in [13], and also indepen-
dently verified by us, the energy consumption of two neural
network layers L1 and L2 that are executed in sequence adds up,
i.e., if their energy consumptions are respectively E(L1) = E1
and E(L2) = E2, then stacking these two layers into a single
model results in a total energy consumption of
E(L1, L2) ≈ E1 + E2, where the approximation accounts for
the measurement noise and the intrinsic variability of the energy
consumption of each single layer (as will be seen later in this
paper). We hence focus our analysis on
two widely utilized layer types, namely fully connected and
convolutional, as better described in the following.
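As a minimal sketch of this single-layer measurement methodology (the benchmarking framework and the power-sampling interface are not prescribed above), one could proceed as follows; the example assumes PyTorch layers under test and a hypothetical read_power_mw() helper that returns the instantaneous board power reported by the Jetson's on-board power monitor.

import time
import torch
import torch.nn as nn

def read_power_mw():
    # Hypothetical placeholder: return the instantaneous board power in mW,
    # e.g. read from the Jetson's on-board power monitor (the exact sysfs
    # path depends on the board and the L4T release).
    raise NotImplementedError

def energy_per_inference_j(layer, x, power_mw, n_runs=1000):
    # Crude Riemann-sum integration of sampled power over the run time.
    layer.eval()
    energy_j = 0.0
    with torch.no_grad():
        t_prev = time.perf_counter()
        for _ in range(n_runs):
            layer(x)
            torch.cuda.synchronize()   # wait for the GPU kernels to finish
            t_now = time.perf_counter()
            energy_j += power_mw() * 1e-3 * (t_now - t_prev)  # P * dt
            t_prev = t_now
    return energy_j / n_runs           # average energy per inference

# Additivity check with two stacked fully connected layers (illustrative sizes):
# E12 should approximately equal E1 + E2.
L1 = nn.Linear(4096, 1024).cuda()
L2 = nn.Linear(1024, 256).cuda()
x = torch.randn(1, 4096, device="cuda")
E1 = energy_per_inference_j(L1, x, read_power_mw)
E2 = energy_per_inference_j(L2, L1(x).detach(), read_power_mw)
E12 = energy_per_inference_j(nn.Sequential(L1, L2), x, read_power_mw)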