Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities
- URL: http://arxiv.org/abs/2307.01998v3
- Date: Tue, 18 Jun 2024 16:09:26 GMT
- Title: Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities
- Authors: Guihong Li, Duc Hoang, Kartikeya Bhardwaj, Ming Lin, Zhangyang Wang, Radu Marculescu
- Abstract summary: The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of some given networks without training the network parameters.
This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches.
- Score: 58.67514819895494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from the expensive training process. The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of some given networks without training the network parameters. The proxies proposed so far are usually inspired by recent progress in theoretical understanding of deep learning and have shown great potential on several datasets and NAS benchmarks. This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches, with an emphasis on their hardware awareness. To this end, we first review the mainstream zero-shot proxies and discuss their theoretical underpinnings. We then compare these zero-shot proxies through large-scale experiments and demonstrate their effectiveness in both hardware-aware and hardware-oblivious NAS scenarios. Finally, we point out several promising ideas to design better proxies. Our source code and the list of related papers are available on https://github.com/SLDGroup/survey-zero-shot-nas.
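A proxy that "predicts accuracy" is usually judged by how well it ranks architectures, not by the raw values it outputs. The following is a minimal sketch of that idea using Spearman rank correlation in plain Python. The abstract does not say which metric the survey uses, so the choice of Spearman, the function names `ranks` and `spearman`, and any numbers passed to them are assumptions for illustration only.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(proxy_scores: Sequence[float], accuracies: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(proxy_scores) != len(accuracies) or len(proxy_scores) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = ranks(proxy_scores), ranks(accuracies)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5
```

A value near 1 would mean the proxy orders the candidate networks almost exactly as their trained accuracy does; a value near 0 would mean the proxy carries little ranking information.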
Related papers
- SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape [14.550053893504764]
We introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS.
In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up".
We present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy.
arXiv Detail & Related papers (2023-11-22T05:25:24Z)
- NASiam: Efficient Representation Learning using Neural Architecture Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair).
NASiam reaches competitive performance in both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients [17.139381064317778]
We propose a new zero-shot proxy, ZiCo, that works consistently better than #Params.
ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively.
arXiv Detail & Related papers (2023-01-26T18:38:56Z)
- RD-NAS: Enhancing One-shot Supernet Ranking Ability via Ranking Distillation from Zero-cost Proxies [20.076610051602618]
We propose Ranking Distillation one-shot NAS (RD-NAS) to enhance ranking consistency.
Our evaluation on the NAS-Bench-201 and ResNet-based search spaces demonstrates that RD-NAS achieves 10.7% and 9.65% improvements in ranking ability.
arXiv Detail & Related papers (2023-01-24T07:49:04Z)
- Generalization Properties of NAS under Activation and Skip Connection Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z)
- Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
- Revisiting Neural Architecture Search [0.0]
We propose a novel approach that searches for the complete neural network without much human effort, a step closer to AutoML-nirvana.
Our method starts from a complete graph mapped to a neural network and searches for the connections and operations by balancing the exploration and exploitation of the search space.
arXiv Detail & Related papers (2020-10-12T13:57:30Z)
- GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet [63.96959854429752]
GreedyNAS is easy to follow, and experimental results on the ImageNet dataset indicate that it can achieve better Top-1 accuracy under the same search space and FLOPs or latency level.
By searching on a larger space, our GreedyNAS can also obtain new state-of-the-art architectures.
arXiv Detail & Related papers (2020-03-25T06:54:10Z)
- EcoNAS: Finding Proxies for Economical Neural Architecture Search [130.59673917196994]
In this paper, we observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates.
Inspired by these observations, we present a reliable proxy and further formulate a hierarchical proxy strategy.
The strategy spends more computation on candidate networks that are potentially more accurate, while discarding unpromising ones at an early stage with a fast proxy.
arXiv Detail & Related papers (2020-01-05T13:29:02Z)
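The ZiCo entry in the list above names its proxy after the inverse coefficient of variation of gradients, that is, mean divided by standard deviation. The sketch below shows that statistic on absolute gradient values collected over several mini-batches. The summary here does not give the paper's formula, so the aggregation step (sum within a layer, logarithm, sum across layers), the names `inverse_cv` and `proxy_score`, and the `eps` constant are assumptions for illustration, not the authors' stated implementation.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def inverse_cv(grad_samples: Sequence[float], eps: float = 1e-12) -> float:
    """Mean over standard deviation of |gradient| across mini-batches.

    `eps` (an assumed constant) avoids division by zero when the
    gradient magnitude is identical in every mini-batch.
    """
    magnitudes = [abs(g) for g in grad_samples]
    return mean(magnitudes) / (pstdev(magnitudes) + eps)


def proxy_score(per_layer_grads: Sequence[Sequence[Sequence[float]]]) -> float:
    """Aggregate per-parameter inverse CVs into one score (assumed scheme).

    per_layer_grads[layer][param] holds that parameter's gradient
    values, one per mini-batch.
    """
    total = 0.0
    for layer in per_layer_grads:
        layer_sum = sum(inverse_cv(samples) for samples in layer)
        if layer_sum > 0:  # skip layers whose gradients are all zero
            total += math.log(layer_sum)
    return total
```

Under this sketch, a parameter whose gradient is large and stable across mini-batches contributes more to the score than one whose gradient is small or fluctuates strongly.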
This list is automatically generated from the titles and abstracts of the papers on this site.