NAS-Bench-x11 and the Power of Learning Curves
- URL: http://arxiv.org/abs/2111.03602v1
- Date: Fri, 5 Nov 2021 16:41:06 GMT
- Title: NAS-Bench-x11 and the Power of Learning Curves
- Authors: Shen Yan, Colin White, Yash Savani, Frank Hutter
- Abstract summary: We present a method using singular value decomposition and noise modeling to create surrogate benchmarks, NAS-Bench-111, NAS-Bench-311, and NAS-Bench-NLP11.
We demonstrate the power of using the full training information by introducing a learning curve extrapolation framework to modify single-fidelity algorithms.
- Score: 43.4379778935488
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While early research in neural architecture search (NAS) required extreme
computational resources, the recent releases of tabular and surrogate
benchmarks have greatly increased the speed and reproducibility of NAS
research. However, two of the most popular benchmarks do not provide the full
training information for each architecture. As a result, on these benchmarks it
is not possible to run many types of multi-fidelity techniques, such as
learning curve extrapolation, that require evaluating architectures at
arbitrary epochs. In this work, we present a method using singular value
decomposition and noise modeling to create surrogate benchmarks, NAS-Bench-111,
NAS-Bench-311, and NAS-Bench-NLP11, that output the full training information
for each architecture, rather than just the final validation accuracy. We
demonstrate the power of using the full training information by introducing a
learning curve extrapolation framework to modify single-fidelity algorithms,
showing that it leads to improvements over popular single-fidelity algorithms
which claimed to be state-of-the-art upon release. Our code and pretrained
models are available at https://github.com/automl/nas-bench-x11.
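The abstract describes compressing full learning curves with a singular value decomposition so a surrogate only has to predict a few coefficients per architecture. A minimal sketch of that idea on synthetic data (the shapes, basis functions, and variable names here are illustrative assumptions, not the authors' actual implementation) might look like:

```python
import numpy as np

# Hypothetical sketch of the NAS-Bench-x11 compression idea: stack learning
# curves into a matrix, take a truncated SVD, and represent each architecture
# by its low-dimensional coefficients over a shared temporal basis.

rng = np.random.default_rng(0)
n_archs, n_epochs, rank = 200, 100, 5

# Synthetic learning curves: each is a mixture of a few shared trajectories.
epochs = np.linspace(0, 1, n_epochs)
basis = np.stack([1 - np.exp(-(k + 1) * 3 * epochs) for k in range(rank)])
coeffs = rng.uniform(0.0, 1.0, size=(n_archs, rank))
curves = coeffs @ basis  # shape (n_archs, n_epochs)

# Truncated SVD: keep the top-k singular directions of the curve matrix.
U, S, Vt = np.linalg.svd(curves, full_matrices=False)
k = rank
codes = U[:, :k] * S[:k]   # per-architecture low-dimensional representation
components = Vt[:k]        # shared temporal basis across architectures

# A surrogate would regress `codes` from an architecture encoding (and add a
# noise model on top); here we only verify that the rank-k reconstruction
# recovers the full curves.
recon = codes @ components
err = float(np.max(np.abs(recon - curves)))
```

Since the synthetic curve matrix has rank at most `k`, the truncated reconstruction is exact up to floating-point error; for real, noisy learning curves one would pick `k` by explained variance instead.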
Related papers
- Graph is all you need? Lightweight data-agnostic neural architecture search without training [45.79667238486864]
Neural architecture search (NAS) enables the automatic design of neural network models.
Our method, dubbed nasgraph, remarkably reduces the computational costs by converting neural architectures to graphs.
It can find the best architecture among 200 randomly sampled architectures from NAS-Bench-201 in 217 CPU seconds.
arXiv Detail & Related papers (2024-05-02T14:12:58Z) - DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - Efficacy of Neural Prediction-Based Zero-Shot NAS [0.04096453902709291]
We propose a novel approach for zero-shot Neural Architecture Search (NAS) using deep learning.
Our method employs Fourier sum of sines encoding for convolutional kernels, enabling the construction of a computational feed-forward graph with a structure similar to the architecture under evaluation.
Experimental results show that our approach surpasses previous methods using graph convolutional networks in terms of correlation on the NAS-Bench-201 dataset and exhibits a higher convergence rate.
arXiv Detail & Related papers (2023-08-31T14:54:06Z) - Neural Architecture Search via Two Constant Shared Weights Initialisations [0.0]
We present a zero-cost metric that is highly correlated with the training set accuracy across the NAS-Bench-101, NAS-Bench-201 and NAS-Bench-NLP benchmark datasets.
Our method is easy to integrate within existing NAS algorithms and takes a fraction of a second to evaluate a single network.
arXiv Detail & Related papers (2023-02-09T02:25:38Z) - NAAP-440 Dataset and Baseline for Neural Architecture Accuracy Prediction [1.2183405753834562]
We introduce the NAAP-440 dataset of 440 neural architectures, which were trained on CIFAR10 using a fixed recipe.
Experiments indicate that, using off-the-shelf regression algorithms and running up to 10% of the training process, it is possible to predict an architecture's accuracy rather precisely.
This approach may serve as a powerful tool for accelerating NAS-based studies and thus dramatically increase their efficiency.
arXiv Detail & Related papers (2022-09-14T13:21:39Z) - BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates the neural architecture search as a distribution learning problem through relaxing the architecture weights into Gaussian distributions.
We demonstrate how the differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability.
arXiv Detail & Related papers (2021-11-25T18:13:42Z) - FNAS: Uncertainty-Aware Fast Neural Architecture Search [54.49650267859032]
Reinforcement learning (RL)-based neural architecture search (NAS) generally guarantees better convergence yet suffers from the requirement of huge computational resources.
We propose a general pipeline to accelerate the convergence of the rollout process as well as the RL process in NAS.
Experiments on the Mobile Neural Architecture Search (MNAS) search space show the proposed Fast Neural Architecture Search (FNAS) accelerates standard RL-based NAS process by 10x.
arXiv Detail & Related papers (2021-05-25T06:32:52Z) - DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables, modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2019-05-28T06:35:52Z) - DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints.
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
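Several entries above (DDPNAS in particular) describe sampling architectures from a categorical distribution, updating it from observed rewards, and periodically pruning low-probability choices. A toy sketch of that loop, with an assumed noisy reward and an assumed pruning schedule purely for illustration, might look like:

```python
import numpy as np

# Toy sketch of distribution-based search with dynamic pruning, loosely in the
# spirit of the DDPNAS summary above. The reward model, update rule, and
# pruning schedule are illustrative assumptions, not the paper's method.

rng = np.random.default_rng(1)
n_ops = 8
scores_true = rng.uniform(size=n_ops)    # hidden "quality" of each operation
probs = np.full(n_ops, 1.0 / n_ops)      # categorical distribution (one slot)
alive = np.ones(n_ops, dtype=bool)

for step in range(50):
    # Sample an operation and observe a noisy reward for it.
    op = rng.choice(n_ops, p=probs)
    reward = scores_true[op] + 0.05 * rng.standard_normal()

    # Exponentiated-gradient style update toward higher-reward operations.
    probs[op] *= np.exp(0.5 * reward)
    probs /= probs.sum()

    # Every 10 steps, prune the lowest-probability surviving operation.
    if step % 10 == 9 and alive.sum() > 1:
        worst = int(np.argmin(np.where(alive, probs, np.inf)))
        alive[worst] = False
        probs[~alive] = 0.0
        probs /= probs.sum()

best = int(np.argmax(probs))  # surviving op with the highest probability
```

Pruned operations keep zero probability for the rest of the search, so later samples and the final `best` pick are always drawn from the shrinking surviving set.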
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.