
2. ENERGY AWARENESS IN NAS
Building upon the foundational NAS-Bench-101 [10], we introduce our benchmark, EC-NAS, to emphasize the imperative of energy efficiency in NAS. Our adaptation of this dataset, whose original computation required an exorbitant 100 TPU-years of compute time, serves our broader mission of steering NAS methodologies towards energy-consumption awareness.
2.1. Architectural Design and Blueprint
Central to our method are architectures tailored for CIFAR-10 image classification [15]. We introduce additional objectives to emphasize hardware-specific efficiency trends in deep learning models. The architecture space is confined to the topological space of cells, each cell being a configurable feedforward network.
In terms of cell encoding, individual cells are represented as directed acyclic graphs (DAGs). Each DAG, $G(V, M)$, has $N = |V|$ vertices (or nodes) and edges described by a binary adjacency matrix $M \in \{0,1\}^{N \times N}$. The set of operations (labels) that each node can realise is given by $\mathcal{L}' = \{\texttt{input}, \texttt{output}\} \cup \mathcal{L}$, where $\mathcal{L} = \{\texttt{3x3conv}, \texttt{1x1conv}, \texttt{3x3maxpool}\}$. Two of the $N$ nodes are always fixed as the input and output of the network; the remaining $N-2$ nodes can take one of the labels in $\mathcal{L}$. The connections between nodes of the DAG are encoded in the upper-triangular adjacency matrix with no self-connections (zero main-diagonal entries). For a given architecture $A$, every entry $\alpha_{i,j} \in M_A$ denotes an edge from node $i$ to node $j$ with operations $i, j \in \mathcal{L}$, and its labelled adjacency matrix is $L_A \in M_A \times \mathcal{L}'$.
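For concreteness, the sketch below encodes one such cell in plain NumPy; the variable names and the specific 5-node topology are illustrative and are not drawn from the benchmark's API.

```python
import numpy as np

# Illustrative cell with N = 5 nodes: input, three labelled operation
# nodes, and output (a hypothetical example, not a benchmark entry).
N = 5
ops = ["input", "3x3conv", "1x1conv", "3x3maxpool", "output"]

# Upper-triangular adjacency matrix M in {0,1}^{N x N} with zero main
# diagonal (no self-connections); entry (i, j) = 1 is an edge from i to j.
adjacency = np.array([
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])

# Sanity checks matching the constraints of the search space.
assert np.allclose(adjacency, np.triu(adjacency, k=1)), "must be upper-triangular"
assert ops[0] == "input" and ops[-1] == "output"
assert all(op in {"3x3conv", "1x1conv", "3x3maxpool"} for op in ops[1:-1])
```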
2.2. Energy Measures in NAS
Traditional benchmarks, while insightful, often fall short of providing a complete energy consumption profile. In EC-NAS, we bring the significance of energy measures to the forefront, crafting a comprehensive view that synthesizes both hardware and software intricacies. The mainstays of neural network training – GPUs and TPUs – are notorious for their high energy consumption [6, 16]. To capture these nuances, we adapt the Carbontracker tool [6] to our specific needs, allowing us to observe total energy costs, computational times, and aggregate carbon footprints.
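As an illustration, the snippet below shows how a training run could be instrumented with Carbontracker's epoch-level interface; the epoch budget and the training call are placeholders rather than our exact pipeline.

```python
from carbontracker.tracker import CarbonTracker

max_epochs = 108  # placeholder epoch budget, not necessarily our setting

# Track energy and carbon footprint per training epoch.
tracker = CarbonTracker(epochs=max_epochs)

for epoch in range(max_epochs):
    tracker.epoch_start()
    # train_one_epoch(model, loader)  # placeholder for the actual training step
    tracker.epoch_end()

# Reports the aggregate energy (kWh) and CO2eq estimates.
tracker.stop()
```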
2.3. Surrogate Model for Energy Estimation
The landscape of NAS has transformed to encompass a
broader spectrum of metrics. Energy consumption, pivotal
during model training, offers insights beyond the purview
of traditional measures such as floating-point operations
(FPOPs) and computational time. Because computational time varies with factors such as the degree of parallelism in the underlying infrastructure, it can be a misleading metric.
[Fig. 2: left panel, predicted vs. actual energy (kWh), Kendall-Tau R² = 0.9030; right panel, MAE on a fixed test set vs. number of training datapoints.]
Fig. 2. Scatter plot depicting the Kendall-Tau correlation coefficient between predicted and actual energy consumption (left) and the influence of training data size on the surrogate's prediction error on a fixed test set (right). Error bars are based on 10 random initializations.
Energy consumption, in contrast, lends itself as a more consistent and comprehensive measure, factoring in software and hardware variations. We measure the energy consumption of training the architectures on the CIFAR-10 dataset, following the protocols of NAS-Bench-101. Training is run on an in-house SLURM cluster node equipped with an NVIDIA Quadro RTX 6000 GPU and two Intel CPUs, providing a consistent hardware environment for these measurements.
The vast architecture space, however, makes direct energy measurement of every architecture impractical. Our remedy is a surrogate model: a multi-layer perceptron (MLP) trained on the measured energy consumption of a representative subset of architectures. This surrogate model predicts the energy consumption of the remaining architectures, bridging computational demand and energy efficiency. Its efficacy is highlighted by the strong correlation between its predictions and actual energy consumption values, as illustrated in Figure 2.
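A minimal sketch of such a surrogate is given below, assuming architectures are summarized by fixed-length feature vectors (e.g., flattened adjacency entries plus one-hot operation labels) and using scikit-learn's MLPRegressor in place of our actual model; the dummy data, layer sizes, and split are purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from scipy.stats import kendalltau

# X: fixed-length encodings of measured architectures (placeholder values),
# y: measured training energy in kWh (also placeholder values).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(3000, 40)).astype(float)
y = X.sum(axis=1) + rng.normal(scale=0.5, size=3000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Small MLP regressor as the energy surrogate (layer sizes are illustrative).
surrogate = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=2000, random_state=0)
surrogate.fit(X_train, y_train)

# Rank agreement between predicted and measured energy, as in Fig. 2 (left).
tau, _ = kendalltau(surrogate.predict(X_test), y_test)
print(f"Kendall-Tau: {tau:.3f}")
```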
2.4. Dataset Analysis and Hardware Consistency
Understanding architectural characteristics and the trade-offs
they introduce is crucial. This involves studying operations,
their impacts on efficiency and performance, as well as the
overarching influence of hardware on energy costs. Training
time and energy consumption trends naturally increase with
model size. However, gains in performance tend to plateau for
models characterized by larger DAGs. Interestingly, while parameter variation across model sizes remains minimal, training time and energy consumption show more significant variability for larger models. These findings highlight the multifaceted factors affecting performance and efficiency.
Different operations can also have a profound impact on
performance. For instance, specific operation replacements
significantly boost validation accuracy while increasing en-
ergy consumption without increasing training time. This
complex relationship between training time, energy consumption, and performance underscores the importance of a comprehensive approach in NAS. The impact of swapping
one operation for another on various metrics, including en-
ergy consumption, training time, validation accuracy, and
parameter count, is captured in Figure 3.
In EC-NAS, we further probed the energy consumption