Unifying and Boosting Gradient-Based Training-Free Neural Architecture
Search
- URL: http://arxiv.org/abs/2201.09785v1
- Date: Mon, 24 Jan 2022 16:26:11 GMT
- Title: Unifying and Boosting Gradient-Based Training-Free Neural Architecture
Search
- Authors: Yao Shu, Zhongxiang Dai, Zhaoxuan Wu, Bryan Kian Hsiang Low
- Abstract summary: Neural architecture search (NAS) has gained immense popularity owing to its ability to automate neural architecture design.
A number of training-free metrics have recently been proposed to realize NAS without training, hence making NAS more scalable.
Despite their competitive empirical performances, a unified theoretical understanding of these training-free metrics is lacking.
- Score: 30.986396610873626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) has gained immense popularity owing to its
ability to automate neural architecture design. A number of training-free
metrics have recently been proposed to realize NAS without training, hence making NAS
more scalable. Despite their competitive empirical performances, a unified
theoretical understanding of these training-free metrics is lacking. As a
consequence, (a) the relationships among these metrics are unclear, (b) there
is no theoretical guarantee for their empirical performances and
transferability, and (c) there may exist untapped potential in training-free
NAS, which can be unveiled through a unified theoretical understanding. To this
end, this paper presents a unified theoretical analysis of gradient-based
training-free NAS, which allows us to (a) theoretically study their
relationships, (b) theoretically guarantee their generalization performances
and transferability, and (c) exploit our unified theoretical understanding to
develop a novel framework named hybrid NAS (HNAS) which consistently boosts
training-free NAS in a principled way. Interestingly, HNAS is able to enjoy the
advantages of both training-free (i.e., superior search efficiency) and
training-based (i.e., remarkable search effectiveness) NAS, which we have
demonstrated through extensive experiments.
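For concreteness, the following is a minimal sketch of what a gradient-based training-free metric looks like: a SNIP-style saliency score computed from a single minibatch at initialization, assuming PyTorch. It is purely illustrative (the function name and toy architecture are hypothetical), not the paper's HNAS method; in practice each candidate architecture is scored this way and candidates are ranked without any training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def snip_score(net, x, y):
    """Sum of |theta * dL/dtheta| over all parameters, evaluated at initialization."""
    net.zero_grad()
    loss = F.cross_entropy(net(x), y)  # one forward/backward pass, no weight update
    loss.backward()
    return sum((p * p.grad).abs().sum().item()
               for p in net.parameters() if p.grad is not None)

# Score a randomly initialized candidate architecture on one random minibatch.
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
x, y = torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))
print("SNIP-style score:", snip_score(net, x, y))
```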
Related papers
- Robust NAS under adversarial training: benchmark, theory, and beyond [55.51199265630444]
We release a comprehensive data set that encompasses both clean accuracy and robust accuracy for a vast array of adversarially trained networks.
We also establish a generalization theory for searching architecture in terms of clean accuracy and robust accuracy under multi-objective adversarial training.
arXiv Detail & Related papers (2024-03-19T20:10:23Z) - Robustifying and Boosting Training-Free Neural Architecture Search [49.828875134088904]
We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of our RoBoT can be theoretically guaranteed, which improves over the existing training-free NAS.
arXiv Detail & Related papers (2024-03-12T12:24:11Z) - SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss
Landscape [14.550053893504764]
We introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS.
In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up"
We present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy.
arXiv Detail & Related papers (2023-11-22T05:25:24Z) - Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities [58.67514819895494]
Key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of some given networks without training the network parameters.
This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches.
arXiv Detail & Related papers (2023-07-05T03:07:00Z) - Generalization Properties of NAS under Activation and Skip Connection
Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z) - Understanding and Accelerating Neural Architecture Search with
Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS)
NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z) - Neural Architecture Search on ImageNet in Four GPU Hours: A
Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS)
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.