NAS-Bench-Suite-Zero:
Accelerating Research on Zero Cost Proxies
Arjun Krishnakumar1, Colin White2, Arber Zela1, Renbo Tu3,
Mahmoud Safari1, Frank Hutter1,4
1University of Freiburg, 2Abacus.AI, 3University of Toronto,
4Bosch Center for Artificial Intelligence
Abstract
Zero-cost proxies (ZC proxies) are a recent architecture performance prediction technique aiming to significantly speed up algorithms for neural architecture search (NAS). Recent work has shown that these techniques show great promise, but certain aspects, such as evaluating and exploiting their complementary strengths, are under-studied. In this work, we create NAS-Bench-Suite-Zero: we evaluate 13 ZC proxies across 28 tasks, creating by far the largest dataset (and unified codebase) for ZC proxies, enabling orders-of-magnitude faster experiments on ZC proxies, while avoiding confounding factors stemming from different implementations. To demonstrate the usefulness of NAS-Bench-Suite-Zero, we run a large-scale analysis of ZC proxies, including a bias analysis, and the first information-theoretic analysis, which concludes that ZC proxies capture substantial complementary information. Motivated by these findings, we present a procedure to improve the performance of ZC proxies by reducing biases such as cell size, and we also show that incorporating all 13 ZC proxies into the surrogate models used by NAS algorithms can improve their predictive performance by up to 42%. Our code and datasets are available at https://github.com/automl/naslib/tree/zerocost.
1 Introduction
Algorithms for neural architecture search (NAS) seek to automate the design of high-performing neural architectures for a given dataset. NAS has successfully been used to discover architectures with better accuracy/latency tradeoffs than the best human-designed architectures [5, 9, 27, 37]. Since early NAS algorithms were prohibitively expensive to run [57], a long line of recent work has focused on improving the runtime and efficiency of NAS methods (see [9, 48] for recent surveys).
A recent thread of research within NAS focuses on zero-cost proxies (ZC proxies) [1, 22]. These novel techniques aim to give an estimate of the (relative) performance of neural architectures from just a single minibatch of data. Often taking just five seconds to run, these techniques are essentially “zero cost” compared to training an architecture or to any other method of predicting the performance of neural architectures [47]. Since the initial ZC proxy was introduced [22], there have been many follow-up methods [1, 16]. However, several recent works have shown that simple baselines such as “number of parameters” and “FLOPS” are competitive with all existing ZC proxies across most settings, and that most ZC proxies do not generalize well across different benchmarks, thus requiring broader large-scale evaluations in order to assess their strengths [2, 24]. A recent landscape overview concluded that ZC proxies show great promise, but certain aspects are under-studied and their true potential has not been realized thus far [44]. In particular, it is still largely unknown whether ZC proxies can be effectively combined, and how best to integrate ZC proxies into NAS algorithms.
Equal contribution. Work done while RT was part-time at Abacus.AI. Email to: {krishnan, zelaa, fh}@cs.uni-freiburg.de, colin@abacus.ai, renbo.tu@mail.utoronto.ca, safarim@informatik.uni-freiburg.de.
36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks.
arXiv:2210.03230v1 [cs.LG] 6 Oct 2022
Figure 1: Overview of NAS-Bench-Suite-Zero. We implement and pre-compute 13 ZC proxies on 28 tasks in a unified framework, and then use this dataset to analyze the generalizability, complementary information, biases, and NAS integration of ZC proxies.
In this work, we introduce NAS-Bench-Suite-Zero: a unified and extensible collection of 13 ZC proxies, accessible through a unified interface, which can be evaluated on a suite of 28 tasks through NASLib [29] (see Figure 1). In addition to the codebase itself, we release precomputed ZC proxy scores across all 13 ZC proxies and 28 tasks, which can be used to speed up ZC proxy experiments. Specifically, we show that the runtime of ZC proxy experiments such as NAS analyses and bias analyses is shortened by a factor of at least $10^3$ when using the precomputed ZC proxies in NAS-Bench-Suite-Zero. By providing a unified framework with ready-to-use scripts to run large-scale experiments, NAS-Bench-Suite-Zero eliminates the overhead for researchers to compare against many other methods and across all popular NAS benchmark search spaces, helping the community to rapidly increase the speed of research in this promising direction. Our benchmark suite was very recently used successfully in the Zero Cost NAS Competition at AutoML-Conf 2022. See Appendix E for more details. In Appendix A, we give detailed documentation, including a datasheet [10], license, author responsibility, code of conduct, and maintenance plan. We welcome contributions from the community and hope to grow the repository and benchmark suite as more ZC proxies and NAS benchmarks are released.
To demonstrate the usefulness of NAS-Bench-Suite-Zero, we run a large-scale analysis of ZC proxies: we give a thorough study of generalizability and biases, and we give the first information-theoretic analysis. Interestingly, based on the bias study, we present a concrete method for improving the performance of a ZC proxy by reducing biases (such as the tendency to favor larger architectures or architectures with more conv operations). This may have important consequences for the future design of ZC proxies. Furthermore, based on the information-theoretic analysis, we find that there is high information gain of the validation accuracy when conditioned on multiple ZC proxies, suggesting that ZC proxies do indeed compute substantial complementary information. Motivated by these findings, we incorporate all 13 proxies into the surrogate models used by NAS algorithms [43, 46], showing that the Spearman rank correlation of the surrogate predictions can increase by up to 42%. We show that this results in improved performance for two predictor-based NAS algorithms: BANANAS [46] and NPENAS [43].
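To make the surrogate-augmentation idea concrete, the following is a minimal sketch, not the paper's exact BANANAS/NPENAS setup: the ZC proxy scores of an architecture are simply appended to its encoding as extra input features before fitting a predictor. The helpers encode(arch) and zc_scores(arch), and the random-forest choice, are illustrative assumptions.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor

def fit_surrogate(archs, val_accs, encode, zc_scores, use_zc=True):
    """Fit a performance predictor, optionally appending the 13 ZC proxy
    scores of each architecture to its encoding as extra features."""
    X = []
    for arch in archs:
        feats = list(encode(arch))          # e.g., one-hot cell encoding (assumed helper)
        if use_zc:
            feats += list(zc_scores(arch))  # 13 precomputed proxy values (assumed helper)
        X.append(feats)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(np.array(X), np.array(val_accs))
    return model

def rank_quality(model, archs, val_accs, encode, zc_scores, use_zc=True):
    """Spearman rank correlation of predictions on held-out architectures."""
    X = [list(encode(a)) + (list(zc_scores(a)) if use_zc else []) for a in archs]
    rho, _ = spearmanr(model.predict(np.array(X)), np.array(val_accs))
    return rho
```

Comparing rank_quality with use_zc=True against use_zc=False on the same held-out split is one simple way to measure the kind of improvement reported above.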
Our contributions. We summarize our main contributions below.
• We release NAS-Bench-Suite-Zero, a collection of benchmarks and ZC proxies that unifies and accelerates research on ZC proxies – a promising new sub-field of NAS – by enabling orders-of-magnitude faster evaluations on a large suite of diverse benchmarks.
• We run a large-scale analysis of 13 ZC proxies across 28 different combinations of search spaces and tasks by studying the generalizability, bias, and mutual information among ZC proxies.
• Motivated by our analysis, we present a procedure to improve the performance of ZC proxies by reducing biases, and we show that the complementary information of ZC proxies can significantly improve the predictive power of surrogate models commonly used for NAS.
Table 1: List of ZC proxies in NAS-Bench-Suite-Zero. Note that “neuron-wise” denotes whether the total score is a sum of scores over individual weights.

Name             Data-dependent   Neuron-wise   Type              In NAS-Bench-Suite-Zero
epe-nas [20]     ✓                ✗             Jacobian          ✓
fisher [41]      ✓                ✓             Pruning-at-init   ✓
flops [24]       ✓                ✓             Baseline          ✓
grad-norm [1]    ✓                ✓             Pruning-at-init   ✓
grasp [42]       ✓                ✓             Pruning-at-init   ✓
l2-norm [1]      ✗                ✗             Baseline          ✓
jacov [22]       ✓                ✗             Jacobian          ✓
nwot [22]        ✓                ✗             Jacobian          ✓
params [24]      ✗                ✓             Baseline          ✓
plain [1]        ✓                ✓             Baseline          ✓
snip [14]        ✓                ✓             Pruning-at-init   ✓
synflow [38]     ✗                ✓             Pruning-at-init   ✓
zen-score [16]   ✗                ✗             Piece. Lin.       ✓
2 Background and Related Work
Given a dataset and a search space – a large set of neural architectures – NAS seeks to find the architecture with the highest validation accuracy (or the best application-specific trade-off among accuracy, latency, size, and so on) on the dataset. NAS has been studied since the late 1980s [23, 39] and has seen a resurgence in the last few years [17, 57], with over 1000 papers on NAS in the last two years alone. For a survey of the different techniques used for NAS, see [9, 48].
Many NAS methods make use of performance prediction. A performance prediction method is any function which predicts the (relative) performance of architectures, without fully training the architectures [47]. BRP-NAS [8], BONAS [33], and BANANAS [46] are all examples of NAS methods that make use of performance prediction. While performance prediction speeds up NAS algorithms by avoiding fully training neural networks, many such methods still require non-trivial computation time. On the other hand, a recently proposed line of techniques, zero-cost proxies (ZC proxies), requires just a single forward pass through the network, often taking just five seconds [22].
Zero-cost proxies. The original ZC proxy estimated the separability of the minibatch of data into different linear regions of the output space [22]. Many other ZC proxies have been proposed since then, including data-independent ZC proxies [1, 15, 16, 38], ZC proxies inspired by pruning-at-initialization techniques [1, 14, 38, 42], and ZC proxies inspired by neural tangent kernels [4, 34]. See Table 1 for a full list of the ZC proxies we use in this paper. We describe theoretical ZC proxy results in Appendix B.1.
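To give a flavor of what these techniques compute, below is a minimal PyTorch sketch of one pruning-at-initialization proxy from Table 1, synflow [38]. It is data-independent: the network is “linearized” by taking absolute values of the weights, an all-ones input is pushed through, and the score sums the per-weight saliencies |w · ∂L/∂w|. This is a sketch following the published description, not the reference implementation (which, e.g., also casts to double precision and handles batch normalization more carefully).

```python
import torch
import torch.nn as nn

def synflow_score(model: nn.Module, input_shape=(3, 32, 32)) -> float:
    """Data-independent synflow proxy: one forward/backward pass on ones."""
    model.eval()  # avoid batch-norm batch statistics on the size-1 ones batch
    # Linearize: remember signs, replace every weight by its absolute value.
    signs = {}
    for name, param in model.state_dict().items():
        signs[name] = torch.sign(param)
        param.abs_()
    model.zero_grad()
    ones = torch.ones(1, *input_shape)     # all-ones "data": no real minibatch needed
    torch.sum(model(ones)).backward()
    score = sum(torch.sum(torch.abs(p * p.grad)).item()
                for p in model.parameters() if p.grad is not None)
    # Restore the original weights by reapplying the signs.
    for name, param in model.state_dict().items():
        param *= signs[name]
    return score
```

A single call costs one forward and one backward pass, which is what makes pre-computing scores over entire search spaces feasible.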
Search spaces and tasks. In our experiments, we make use of several different NAS benchmark search spaces and tasks. NAS-Bench-101 [53] is a popular cell-based search space for NAS research. It consists of 423 624 architectures trained on CIFAR-10. The cell-based search space is designed to model ResNet-like and Inception-like cells [12, 36]. NAS-Bench-201 [6] is a cell-based search space consisting of 15 625 architectures (6 466 non-isomorphic) trained on CIFAR-10, CIFAR-100, and ImageNet16-120. NAS-Bench-301 [55] is a surrogate NAS benchmark for the DARTS search space [18]. The search space consists of normal and reduction cells, with $10^{18}$ total architectures. TransNAS-Bench-101 [7] is a NAS benchmark consisting of two different search spaces: a “micro” (cell-based) search space of size 4 096, and a macro search space of size 3 256. The architectures are trained on seven different tasks from the Taskonomy dataset [54]. NAS-Bench-Suite [21] collects these search spaces and tasks within the unified framework of NASLib [29]. In this work, we extend this collection by adding two datasets from NAS-Bench-360 [40], SVHN, and four datasets from Taskonomy. NAS-Bench-360 is a collection of diverse tasks that are ready-to-use for NAS research.
Large-scale studies of ZC proxies. A few recent works [2, 24, 44, 47] investigated the performance of ZC proxies in ranking architectures over different NAS benchmarks, showing that the relative performance highly depends on the search space, but none study more than 12 total tasks, and none make the ZC proxy values publicly available. Two predictor-based NAS methods have recently been introduced: OMNI [47] and ProxyBO [32]. However, OMNI only uses a single ZC proxy, and while ProxyBO uses three, the algorithm dynamically chooses one in each iteration (so individual predictions are made using a single ZC proxy at a time). Recently, NAS-Bench-Zero was introduced [2]: a new benchmark based on the popular computer vision models ResNet [12] and MobileNetV2 [30], which includes 10 ZC proxies. However, the NAS-Bench-Zero dataset is currently not publicly available. For more related work details, see Appendix B.
Table 2: Overview of ZC proxy evaluations in NAS-Bench-Suite-Zero. Note that EPE-NAS is only defined for classification tasks [20].

Search space                 Tasks   Num. ZC proxies   Num. architectures   Total ZC proxy evaluations
NAS-Bench-101                1       13                10 000               130 000
NAS-Bench-201                3       13                15 625               609 375
NAS-Bench-301                1       13                11 221               145 873
TransNAS-Bench-101-Micro     7       12                4 096                344 064
TransNAS-Bench-101-Macro     7       12                3 256                273 504
Add'l. 201, 301, TNB-Micro   9       13                600                  23 400
Total                        28      13                44 798               1 526 216
Only two prior works combine the information of multiple ZC proxies together in architecture predictions [1, 2], and both only use the voting strategy to combine at most four ZC proxies. Our work is the first to publicly release ZC proxy values, combine ZC proxies in a nontrivial way, and exploit the complementary information of 13 ZC proxies simultaneously.
3 Overview of NAS-Bench-Suite-Zero
In this section, we give an overview of the NAS-Bench-Suite-Zero codebase and dataset, which allows researchers to quickly develop ZC proxies, compare against existing ZC proxies across diverse datasets, and integrate them into NAS algorithms, as shown in Sections 4 and 5.
We implement all ZC proxies from Table 1 in the same codebase (NASLib [29]). For all ZC proxies, we use the default implementation from the original work. While this list covers 13 ZC proxies – the majority of ZC proxies released to date – we did not yet include a few other ZC proxies, for example, because they require a trained supernetwork to make evaluations [4, 34] (and would therefore need a supernetwork implementation on all 28 benchmarks), are implemented in TensorFlow rather than PyTorch [25], or have unreleased code. Our modular framework easily allows additional ZC proxies to be added to NAS-Bench-Suite-Zero in the future.
To build NAS-Bench-Suite-Zero, we extend the collection of NASLib's publicly available benchmarks, known as NAS-Bench-Suite [21]. This allows us to evaluate and fairly compare all ZC proxies in the same framework without confounding factors stemming from different implementations, software versions, or training pipelines. Specifically, for the search spaces and tasks, we use NAS-Bench-101 (CIFAR-10), NAS-Bench-201 (CIFAR-10, CIFAR-100, and ImageNet16-120), NAS-Bench-301 (CIFAR-10), and TransNAS-Bench-101 Micro and Macro (Jigsaw, Object Classification, Scene Classification, Autoencoder) from NAS-Bench-Suite. We add the remaining tasks from TransNAS-Bench-101 (Room Layout, Surface Normal, Semantic Segmentation), and three tasks each for NAS-Bench-201, NAS-Bench-301, and TransNAS-Bench-101-Micro: Spherical-CIFAR-100, NinaPro, and SVHN. This yields a total of 28 benchmarks in our analysis. For all NAS-Bench-201 and TransNAS-Bench-101 tasks, we evaluate all ZC proxy values and the respective runtimes, for all architectures. For NAS-Bench-301, we evaluate on all 11 221 randomly sampled architectures from the NAS-Bench-301 dataset, due to the computational infeasibility of exhaustively evaluating the full set of $10^{18}$ architectures. Similarly, we evaluate 10 000 architectures from NAS-Bench-101. Finally, for Spherical-CIFAR-100, NinaPro, and SVHN, we evaluate 200 architectures per search space, since only 200 architectures are fully trained for each of these tasks. See Table 2.
We run all ZC proxies from Table 1 on Intel Xeon Gold 6242 CPUs and save their evaluations in order to create a queryable table with these pre-computed values. We use a batch size of 64 for all ZC proxy evaluations, except for the case of TransNAS-Bench-101: due to the extreme memory usage of the Taskonomy tasks (>30 GB memory), we used a batch size of 32. The total computation time for all 1.5M evaluations was 1100 CPU hours.
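Conceptually, the pre-computation amounts to the loop sketched below: score every architecture with every proxy, record the wall-clock runtime, and dump the result as a queryable table. The proxy callables and the JSON layout here are illustrative assumptions, not NASLib's actual API or the released file format.

```python
import json
import time

def precompute_zc_table(architectures, proxies, out_path="zc_table.json"):
    """Evaluate every ZC proxy on every architecture once, storing both the
    score and its runtime so later experiments become table lookups."""
    table = {}
    for arch_id, arch in architectures:          # e.g., (hash, architecture) pairs
        table[arch_id] = {}
        for name, proxy_fn in proxies.items():   # e.g., {"synflow": synflow_score, ...}
            start = time.time()
            score = proxy_fn(arch)               # one forward/backward pass, as above
            table[arch_id][name] = {"score": score, "time": time.time() - start}
    with open(out_path, "w") as f:
        json.dump(table, f)
```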
Figure 2: Spearman rank correlation coefficient between ZC proxy values and validation accuracies,
for each ZC proxy and benchmark. The rows and columns are ordered based on the mean scores
across columns and rows, respectively.
Speedups and recommended usage. The average time to compute a ZC proxy across all tasks is 2.6 seconds, and the maximum time (computing grasp on TNB-Macro Autoencoder) is 205 seconds, compared to $10^{-5}$ seconds when instead querying the NAS-Bench-Suite-Zero API.
When researchers evaluate ZC proxy-based NAS algorithms using queryable NAS benchmarks, the bottleneck is often (ironically) the ZC proxy evaluations. For example, for OMNI [47] or ProxyBO [32] running for 100 iterations and 100 candidates per iteration, the total evaluation time is roughly 9 hours, yet they can be run on NAS-Bench-Suite-Zero in under one minute. Across all experiments done in this paper (mutual information study, bias study, NAS study, etc.), we calculate that using NAS-Bench-Suite-Zero decreases the computation time by at least three orders of magnitude. See Appendix C.4 for more details.
Since NAS-Bench-Suite-Zero reduces the runtime of experiments by at least three orders of magnitude (on queryable NAS benchmarks), we recommend researchers take advantage of NAS-Bench-Suite-Zero to (i) run hundreds of trials of ZC proxy-based NAS algorithms, to reach statistically significant conclusions, (ii) run extensive ablation studies, including the type and usage of ZC proxies, and (iii) increase the total number of ZC proxies evaluated in the NAS algorithm. Finally, when using NAS-Bench-Suite-Zero, researchers should report the real-world time NAS algorithms would take, by adding the time to run each ZC proxy evaluation (which can be queried in NAS-Bench-Suite-Zero) to the total runtime of the NAS algorithm.
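One simple way to follow this reporting recommendation is sketched below, under the same assumed JSON layout as the pre-computation loop above (the released dataset's actual format may differ): replace each live proxy evaluation with a table lookup, and accumulate the stored runtimes into a simulated wall-clock total that is reported alongside the NAS results.

```python
import json

class ZCTable:
    """Query pre-computed ZC proxy scores while tracking the real-world
    time the proxy evaluations would have cost a live NAS run."""
    def __init__(self, path="zc_table.json"):
        with open(path) as f:
            self.table = json.load(f)
        self.simulated_runtime = 0.0   # seconds of "virtual" proxy compute

    def query(self, arch_id, proxy_name):
        entry = self.table[arch_id][proxy_name]
        self.simulated_runtime += entry["time"]  # charge the original cost
        return entry["score"]                    # the lookup itself is ~1e-5 s
```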
4 Generalizability, Mutual Information, and Bias of ZC Proxies
In this section, we use NAS-Bench-Suite-Zero to study concrete research questions relating to the generalizability, complementary information, and bias of ZC proxies.
4.1 RQ 1: How well do ZC proxies generalize across different benchmarks?
In Figure 2, for each ZC proxy and each benchmark, we compute the Spearman rank correlation between the ZC proxy values and the validation accuracies over a set of 1000 randomly drawn architectures (see Appendix C for the full results on all benchmarks). Out of all the ZC proxies, nwot and flops have the highest rank correlations across all benchmarks. On some of the benchmarks, such as TransNAS-Bench-101-Micro Autoencoder and Room Layout, all of the ZC proxies exhibit poor performance on average, while on the widely used NAS-Bench-201 benchmarks, almost all of them perform well. Several methods, such as snip and grasp, perform well on the NAS-Bench-201 tasks, but on average are outperformed by params and flops on the other benchmarks.
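These correlations are cheap to reproduce from the pre-computed scores. A minimal sketch with scipy, assuming the scores and validation accuracies are available as nested dictionaries keyed by proxy and benchmark (an assumed layout, not the dataset's exact schema):

```python
import numpy as np
from scipy.stats import spearmanr

def figure2_matrix(zc_scores, val_accs):
    """zc_scores: {proxy: {benchmark: [values]}}; val_accs: {benchmark: [accs]},
    with values aligned to the same 1000 sampled architectures per benchmark.
    Returns (proxies, benchmarks, matrix of Spearman coefficients)."""
    proxies = sorted(zc_scores)
    benchmarks = sorted(val_accs)
    mat = np.zeros((len(proxies), len(benchmarks)))
    for i, proxy in enumerate(proxies):
        for j, bench in enumerate(benchmarks):
            mat[i, j], _ = spearmanr(zc_scores[proxy][bench], val_accs[bench])
    return proxies, benchmarks, mat
```

Sorting the rows and columns of the resulting matrix by their means reproduces the ordering used in Figure 2.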
Although no ZC proxy performs consistently across all benchmarks, we may ask a related question: is the performance of all ZC proxies across benchmarks correlated enough to capture similarities among benchmarks? In other words, can we use ZC proxies as a tool to assess the similarities among tasks? This is particularly important in meta-learning or transfer learning, where a meta-algorithm aims to learn and transfer knowledge across a set of similar tasks. To answer this question, we