Neural Architecture Search on ImageNet in Four GPU Hours: A
Theoretically Inspired Perspective
- URL: http://arxiv.org/abs/2102.11535v1
- Date: Tue, 23 Feb 2021 07:50:44 GMT
- Title: Neural Architecture Search on ImageNet in Four GPU Hours: A
Theoretically Inspired Perspective
- Authors: Wuyang Chen, Xinyu Gong, Zhangyang Wang
- Abstract summary: We propose a novel framework called training-free neural architecture search (TE-NAS)
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
- Score: 88.39981851247727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Architecture Search (NAS) has been explosively studied to automate the
discovery of top-performer neural networks. Current works require heavy
training of supernet or intensive architecture evaluations, thus suffering from
heavy resource consumption and often incurring search bias due to truncated
training or approximations. Can we select the best neural architectures without
involving any training and eliminate a drastic portion of the search cost? We
provide an affirmative answer, by proposing a novel framework called
training-free neural architecture search (TE-NAS). TE-NAS ranks architectures
by analyzing the spectrum of the neural tangent kernel (NTK) and the number of
linear regions in the input space. Both are motivated by recent theory advances
in deep networks and can be computed without any training and any label. We
show that: (1) these two measurements imply the trainability and expressivity
of a neural network; (2) they strongly correlate with the network's test
accuracy. Further on, we design a pruning-based NAS mechanism to achieve a more
flexible and superior trade-off between the trainability and expressivity
during the search. In NAS-Bench-201 and DARTS search spaces, TE-NAS completes
high-quality search but only costs 0.5 and 4 GPU hours with one 1080Ti on
CIFAR-10 and ImageNet, respectively. We hope our work inspires more attempts in
bridging the theoretical findings of deep networks and practical impacts in
real NAS applications. Code is available at:
https://github.com/VITA-Group/TENAS.
Related papers
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z) - NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are one of the most trending methods to achieve self-supervised visual representation learning (SSL)
NASiam is a novel approach that uses for the first time differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair)
NASiam reaches competitive performance in both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z) - Generalization Properties of NAS under Activation and Skip Connection
Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z) - PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search [34.06028035262884]
We propose a novel evolutionary-based NAS strategy, Predictor-assisted E-NAS (PRE-NAS)
PRE-NAS leverages new evolutionary search strategies and integrates high-fidelity weight inheritance over generations.
Experiments on NAS-Bench-201 and DARTS search spaces show that PRE-NAS can outperform state-of-the-art NAS methods.
arXiv Detail & Related papers (2022-04-27T06:40:39Z) - Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS - Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z) - Understanding and Accelerating Neural Architecture Search with
Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS)
NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z) - BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS)
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z) - Neural Architecture Search without Training [8.067283219068832]
In this work, we examine the overlap of activations between datapoints in untrained networks.
We motivate how this can give a measure which is usefully indicative of a network's trained performance.
We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU.
arXiv Detail & Related papers (2020-06-08T14:53:56Z) - Fast Neural Network Adaptation via Parameter Remapping and Architecture
Search [35.61441231491448]
Deep neural networks achieve remarkable performance in many computer vision tasks.
Most state-of-the-art (SOTA) semantic segmentation and object detection approaches reuse neural network architectures designed for image classification as the backbone.
One major challenge though, is that ImageNet pre-training of the search space representation incurs huge computational cost.
In this paper, we propose a Fast Neural Network Adaptation (FNA) method, which can adapt both the architecture and parameters of a seed network.
arXiv Detail & Related papers (2020-01-08T13:45:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.