Understanding and Accelerating Neural Architecture Search with
Training-Free and Theory-Grounded Metrics
- URL: http://arxiv.org/abs/2108.11939v1
- Date: Thu, 26 Aug 2021 17:52:07 GMT
- Title: Understanding and Accelerating Neural Architecture Search with
Training-Free and Theory-Grounded Metrics
- Authors: Wuyang Chen, Xinyu Gong, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi
Yang, Zhangyang Wang
- Abstract summary: This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks.
- Score: 117.4281417428145
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This work targets designing a principled and unified training-free framework
for Neural Architecture Search (NAS), with high performance, low cost, and
in-depth interpretation. NAS has been explosively studied to automate the
discovery of top-performer neural networks, but suffers from heavy resource
consumption and often incurs search bias due to truncated training or
approximations. Recent NAS works have started to explore indicators that can predict a
network's performance without training. However, they either leverage only limited
properties of deep networks, or do not carry the benefits of their training-free
indicators over to more extensive search methods. By rigorous correlation
analysis, we present a unified framework to understand and accelerate NAS, by
disentangling "TEG" characteristics of searched networks - Trainability,
Expressivity, Generalization - all assessed in a training-free manner. The TEG
indicators can be scaled up and integrated with various NAS search methods,
including both supernet and single-path approaches. Extensive studies validate
the effective and efficient guidance from our TEG-NAS framework, leading to
both improved search accuracy and over 2.3x reduction in search time cost.
Moreover, we visualize search trajectories on three landscapes of "TEG"
characteristics, observing that while a good local minimum is easier to find on
NAS-Bench-201 given its simple topology, balancing "TEG" characteristics is
much harder on the DARTS search space due to its complex landscape geometry.
Our code is available at https://github.com/VITA-Group/TEGNAS.
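For a concrete feel of how such training-free indicators can be computed, the sketch below scores a toy untrained network with two proxies in the spirit of the TEG characteristics (and of the related TE-NAS entry listed further down): the condition number of an empirical Neural Tangent Kernel as a trainability proxy, and the number of distinct ReLU activation patterns on random inputs as an expressivity proxy. The architecture, batch sizes, and exact scoring formulas are illustrative assumptions for this summary, not the authors' implementation; see the TEGNAS repository for the actual metrics.

```python
# Minimal sketch (not the authors' implementation): two training-free proxies
# in the spirit of the TEG characteristics, computed on a toy untrained MLP.
# The architecture, batch sizes, and scoring formulas are illustrative
# assumptions; see the TEGNAS repository for the actual metrics.
import torch
import torch.nn as nn


def ntk_condition_number(net: nn.Module, x: torch.Tensor) -> float:
    """Trainability proxy: condition number of the empirical NTK
    Theta[i, j] = <d f(x_i)/d w, d f(x_j)/d w> over a small batch."""
    grads = []
    for i in range(x.size(0)):
        net.zero_grad()
        net(x[i:i + 1]).sum().backward()
        grads.append(torch.cat([p.grad.flatten() for p in net.parameters()]))
    jac = torch.stack(grads)                   # (batch, n_params)
    ntk = jac @ jac.t()                        # (batch, batch) Gram matrix
    eigs = torch.linalg.eigvalsh(ntk)          # eigenvalues in ascending order
    return (eigs[-1] / eigs[0].clamp_min(1e-12)).item()


def num_activation_patterns(net: nn.Sequential, x: torch.Tensor) -> int:
    """Expressivity proxy: number of distinct ReLU activation patterns a batch
    of random inputs falls into (a crude stand-in for counting linear regions)."""
    codes, h = [], x
    for layer in net:
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            codes.append((h > 0).int())
    binary = torch.cat(codes, dim=1)           # (batch, total_relu_units)
    return len({tuple(row.tolist()) for row in binary})


if __name__ == "__main__":
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                        nn.Linear(64, 64), nn.ReLU(),
                        nn.Linear(64, 1))
    print("NTK condition number:", ntk_condition_number(net, torch.randn(8, 16)))
    print("activation patterns :", num_activation_patterns(net, torch.randn(256, 16)))
```

In a NAS loop, proxies like these would be evaluated at initialization for each candidate architecture and combined, for example by relative ranking, to guide the search without any training.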
Related papers
- Robustifying and Boosting Training-Free Neural Architecture Search [49.828875134088904]
We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of RoBoT can be theoretically guaranteed, improving over existing training-free NAS methods.
arXiv Detail & Related papers (2024-03-12T12:24:11Z)
- Generalization Properties of NAS under Activation and Skip Connection Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z)
- PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search [34.06028035262884]
We propose a novel evolutionary NAS strategy, Predictor-assisted E-NAS (PRE-NAS).
PRE-NAS leverages new evolutionary search strategies and integrates high-fidelity weight inheritance over generations.
Experiments on NAS-Bench-201 and DARTS search spaces show that PRE-NAS can outperform state-of-the-art NAS methods.
arXiv Detail & Related papers (2022-04-27T06:40:39Z)
- An Analysis of Super-Net Heuristics in Weight-Sharing NAS [70.57382341642418]
We show that simple random search achieves performance competitive with complex state-of-the-art NAS algorithms when the super-net is properly trained.
arXiv Detail & Related papers (2021-10-04T02:18:44Z)
- Generative Adversarial Neural Architecture Search [21.05611902967155]
We propose Generative Adversarial NAS (GA-NAS) with theoretically provable convergence guarantees.
We show that GA-NAS can be used to improve already optimized baselines found by other NAS methods.
arXiv Detail & Related papers (2021-05-19T18:54:44Z)
- Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
- CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search [102.67142711824748]
CATCH is a novel Context-bAsed meTa reinforcement learning algorithm for transferrable arChitecture searcH.
The combination of meta-learning and RL allows CATCH to efficiently adapt to new tasks while being agnostic to search spaces.
It is also capable of handling cross-domain architecture search, identifying competitive networks on ImageNet, COCO, and Cityscapes.
arXiv Detail & Related papers (2020-07-18T09:35:53Z)
- Neural Architecture Search without Training [8.067283219068832]
In this work, we examine the overlap of activations between datapoints in untrained networks.
We motivate how this can give a measure which is usefully indicative of a network's trained performance.
We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU.
arXiv Detail & Related papers (2020-06-08T14:53:56Z)
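To make the last entry above concrete: one way to turn the "overlap of activations between datapoints" into a training-free score is to collect each input's binary ReLU activation pattern, build a kernel from how often pairs of inputs agree, and take its log-determinant, so that architectures whose untrained activations keep inputs distinguishable score higher. The sketch below is a hedged illustration under those assumptions, not necessarily the paper's exact formulation.

```python
# Hedged sketch: score an untrained network by how distinguishable its binary
# ReLU activation patterns are across a batch of inputs (the activation-overlap
# idea). The agreement kernel and log-determinant scoring are one common
# instantiation, not necessarily the exact formula used in the paper.
import torch
import torch.nn as nn


def activation_overlap_score(net: nn.Sequential, x: torch.Tensor) -> float:
    codes, h = [], x
    for layer in net:
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            codes.append((h > 0).float())
    c = torch.cat(codes, dim=1)                    # (batch, units) binary codes
    # K[i, j] = number of ReLU units on which datapoints i and j agree.
    agree = c @ c.t() + (1.0 - c) @ (1.0 - c).t()  # (batch, batch)
    # A large log-determinant means the patterns overlap little, i.e. the
    # untrained network already separates the inputs well.
    sign, logdet = torch.linalg.slogdet(agree)
    return logdet.item() if sign.item() > 0 else float("-inf")


if __name__ == "__main__":
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                        nn.Linear(128, 128), nn.ReLU())
    with torch.no_grad():
        print("activation-overlap score:", activation_overlap_score(net, torch.randn(32, 16)))
```

Because the score needs only a forward pass at initialization, it can rank thousands of candidate architectures in seconds, which is what enables the single-GPU, seconds-scale search described in that entry.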
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.