
Table 2: Overview of ZC proxy evaluations in NAS-Bench-Suite-Zero. ∗Note that EPE-NAS is only defined for classification tasks [20].

Search space                  Tasks   Num. ZC proxies   Num. architectures   Total ZC proxy evaluations
NAS-Bench-101                   1          13               10 000                  130 000
NAS-Bench-201                   3          13               15 625                  609 375
NAS-Bench-301                   1          13               11 221                  145 873
TransNAS-Bench-101-Micro        7          12∗               3 256                  273 504
TransNAS-Bench-101-Macro        7          12∗               4 096                  344 064
Add’l. 201, 301, TNB-Micro      9          13                  600                   23 400
Total                          28          13               44 798                1 526 216
while ProxyBO uses three, the algorithm dynamically chooses one in each iteration (so individual predictions are made using a single ZC proxy at a time). Recently, NAS-Bench-Zero was introduced [2], a new benchmark based on the popular computer vision models ResNet [12] and MobileNetV2 [30], which includes 10 ZC proxies. However, the NAS-Bench-Zero dataset is currently not publicly available. For more details on related work, see Appendix B.
Only two prior works combine the information of multiple ZC proxies in architecture predictions [1, 2], and both use only a voting strategy to combine at most four ZC proxies. Our work is the first to publicly release ZC proxy values, combine ZC proxies in a nontrivial way, and exploit the complementary information of 13 ZC proxies simultaneously.
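For intuition, the voting strategy used by prior work can be sketched as a majority vote over pairwise comparisons: each ZC proxy votes for the architecture it scores higher. The snippet below is a minimal illustrative sketch with placeholder proxy names and scores, not the exact implementation from [1, 2]:

```python
# Minimal sketch of the voting strategy from prior work: each ZC proxy "votes"
# for the architecture it scores higher, and the majority decides the pairwise
# comparison. The proxy scores below are placeholders, not real measurements.

def vote_compare(scores_a: dict, scores_b: dict) -> int:
    """Return +1 if the majority of proxies prefers architecture A, else -1."""
    votes = sum(1 if scores_a[p] > scores_b[p] else -1 for p in scores_a)
    return 1 if votes > 0 else -1

arch_a = {"synflow": 1.2e8, "jacov": -310.5, "snip": 45.1}
arch_b = {"synflow": 0.9e8, "jacov": -295.0, "snip": 47.3}
print(vote_compare(arch_a, arch_b))  # -1: two of the three proxies prefer B
```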
3 Overview of NAS-Bench-Suite-Zero
In this section, we give an overview of the NAS-Bench-Suite-Zero codebase and dataset, which allow researchers to quickly develop ZC proxies, compare against existing ZC proxies across diverse datasets, and integrate them into NAS algorithms, as shown in Sections 4 and 5.
We implement all ZC proxies from Table 1 in the same codebase (NASLib [29]). For all ZC proxies, we use the default implementation from the original work. While this list covers 13 ZC proxies, the majority of ZC proxies released to date, we did not yet include a few others: for example, because they require a trained supernetwork to make evaluations [4, 34] (and would therefore require implementing a supernetwork on all 28 benchmarks), are implemented in TensorFlow rather than PyTorch [25], or have unreleased code. Our modular framework easily allows additional ZC proxies to be added to NAS-Bench-Suite-Zero in the future.
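To illustrate how lightweight such a computation can be, the following is a minimal sketch of a grad-norm-style score (sum of parameter-gradient norms after a single backward pass on one minibatch) for an arbitrary PyTorch model. The toy network and random data are stand-ins; this is not the NASLib interface or the exact default implementations we use:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_norm_proxy(model: nn.Module, inputs: torch.Tensor, targets: torch.Tensor) -> float:
    """Illustrative grad-norm-style ZC proxy: sum of parameter-gradient norms
    after one forward/backward pass on a single minibatch (a sketch only)."""
    model.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    return sum(p.grad.norm().item() for p in model.parameters() if p.grad is not None)

# Toy stand-in network and a random CIFAR-10-sized minibatch (batch size 64).
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
x = torch.randn(64, 3, 32, 32)
y = torch.randint(0, 10, (64,))
print(grad_norm_proxy(net, x, y))
```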
To build NAS-Bench-Suite-Zero, we extend the collection of NASLib’s publicly available benchmarks, known as NAS-Bench-Suite [21]. This allows us to evaluate and fairly compare all ZC proxies in the same framework, without confounding factors stemming from different implementations, software versions, or training pipelines. Specifically, for the search spaces and tasks, we
use NAS-Bench-101 (CIFAR-10), NAS-Bench-201 (CIFAR-10, CIFAR-100, and ImageNet16-120),
NAS-Bench-301 (CIFAR-10), and TransNAS-Bench-101 Micro and Macro (Jigsaw, Object Classifi-
cation, Scene Classification, Autoencoder) from NAS-Bench-Suite. We add the remaining tasks from
TransNAS-Bench-101 (Room Layout, Surface Normal, Semantic Segmentation), and three tasks each
for NAS-Bench-201, NAS-Bench-301, and TransNAS-Bench-101-Micro: Spherical-CIFAR-100,
NinaPro, and SVHN. This yields a total of 28 benchmarks in our analysis. For all NAS-Bench-201
and TransNAS-Bench-101 tasks, we evaluate all ZC proxy values and the respective runtimes for all architectures. For NAS-Bench-301, we evaluate all 11 221 randomly sampled architectures from the NAS-Bench-301 dataset, since exhaustively evaluating the full set of 10^18 architectures is computationally infeasible. Similarly, we evaluate 10 000 architectures from NAS-Bench-101.
Finally, for Spherical-CIFAR-100, NinaPro, and SVHN, we evaluate 200 architectures per search
space, since only 200 architectures are fully trained for each of these tasks. See Table 2.
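As a sanity check, the per-benchmark counts multiply out to the totals reported in Table 2; the last row corresponds to 200 architectures for each of the nine additional benchmarks, i.e., 600 distinct architectures across the three search spaces:

```python
# Sanity check of the totals in Table 2:
# benchmarks (search-space/task pairs) x ZC proxies x architectures per benchmark.
rows = {
    "NAS-Bench-101":              (1, 13, 10_000),
    "NAS-Bench-201":              (3, 13, 15_625),
    "NAS-Bench-301":              (1, 13, 11_221),
    "TransNAS-Bench-101-Micro":   (7, 12, 3_256),
    "TransNAS-Bench-101-Macro":   (7, 12, 4_096),
    "Add'l. 201, 301, TNB-Micro": (9, 13, 200),   # 200 architectures per benchmark
}
for name, (benchmarks, proxies, archs) in rows.items():
    print(f"{name}: {benchmarks * proxies * archs:,}")
print("Total:", f"{sum(b * p * a for b, p, a in rows.values()):,}")  # 1,526,216
```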
We run all ZC proxies from Table 1 on Intel Xeon Gold 6242 CPUs and save their evaluations in order to create a queryable table of pre-computed values. We use a batch size of 64 for all ZC proxy evaluations, except for TransNAS-Bench-101: due to the extreme memory usage of the Taskonomy tasks (>30 GB of memory), we use a batch size of 32 there. The total computation time for all 1.5M evaluations was 1100 CPU hours.
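The pre-computed table turns every ZC proxy evaluation into a constant-time lookup. The sketch below shows how one might query such a table, assuming a hypothetical JSON layout keyed by architecture identifier and proxy name; the actual NAS-Bench-Suite-Zero file format and the NASLib query API may differ:

```python
import json

# Hypothetical file name and key layout for the pre-computed ZC proxy table;
# the actual NAS-Bench-Suite-Zero data files and the NASLib query API may differ.
with open("zc_nasbench201_cifar10.json") as f:
    zc_table = json.load(f)

arch_id = "(0, 1, 2, 3, 4, 0)"        # placeholder architecture identifier
entry = zc_table[arch_id]["synflow"]  # assumed to hold the proxy value and its runtime
print(entry["score"], entry["time"])
```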