Neural Architecture Search without Training
- URL: http://arxiv.org/abs/2006.04647v3
- Date: Fri, 11 Jun 2021 14:31:02 GMT
- Title: Neural Architecture Search without Training
- Authors: Joseph Mellor, Jack Turner, Amos Storkey, Elliot J. Crowley
- Abstract summary: In this work, we examine the overlap of activations between datapoints in untrained networks.
We motivate how this can give a measure which is usefully indicative of a network's trained performance.
We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU.
- Score: 8.067283219068832
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The time and effort involved in hand-designing deep neural networks is
immense. This has prompted the development of Neural Architecture Search (NAS)
techniques to automate this design. However, NAS algorithms tend to be slow and
expensive; they need to train vast numbers of candidate networks to inform the
search process. This could be alleviated if we could partially predict a
network's trained accuracy from its initial state. In this work, we examine the
overlap of activations between datapoints in untrained networks and motivate
how this can give a measure which is usefully indicative of a network's trained
performance. We incorporate this measure into a simple algorithm that allows us
to search for powerful networks without any training in a matter of seconds on
a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201,
NATS-Bench, and Network Design Spaces. Our approach can be readily combined
with more expensive search methods; we examine a simple adaptation of
regularised evolutionary search. Code for reproducing our experiments is
available at https://github.com/BayesWatch/nas-without-training.
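The activation-overlap idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their repository above is the reference). It assumes that each input is summarised by a binary code of which ReLU units fire, that overlap between two inputs is the number of units on which their codes agree, and that a network is scored by the log-determinant of the resulting matrix. The helper names (`relu_codes`, `overlap_score`, `random_mlp`) and the toy random MLPs standing in for candidate architectures are hypothetical.

```python
import numpy as np

def relu_codes(weights, x):
    """Binary activation codes: 1.0 where a ReLU unit is active, else 0.0."""
    codes = []
    h = x
    for w in weights:
        pre = h @ w
        codes.append(pre > 0)
        h = np.maximum(pre, 0.0)
    return np.concatenate(codes, axis=1).astype(np.float64)

def overlap_score(codes):
    """Log-determinant of K, where K[i, j] counts units on which inputs i and j agree."""
    # Agreements = units both active + units both inactive.
    agree = codes @ codes.T + (1.0 - codes) @ (1.0 - codes).T
    _, logdet = np.linalg.slogdet(agree)
    return logdet  # -inf when two inputs share an identical code

def random_mlp(rng, in_dim, widths):
    """Untrained weights for a toy MLP (assumed stand-in for a candidate network)."""
    dims = [in_dim, *widths]
    return [rng.standard_normal((a, b)) / np.sqrt(a)
            for a, b in zip(dims[:-1], dims[1:])]

rng = np.random.default_rng(0)
batch = rng.standard_normal((16, 32))  # one mini-batch of inputs, no labels needed
candidates = {"narrow": [8, 8], "wide": [64, 64]}
scores = {name: overlap_score(relu_codes(random_mlp(rng, 32, widths), batch))
          for name, widths in candidates.items()}
best = max(scores, key=scores.get)  # pick the untrained candidate with the highest score
```

Under these assumptions, a higher score means the inputs in the batch are mapped to more distinct activation patterns, and no gradient step or label is ever used.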
Related papers
- NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance [0.0]
We propose a zero-cost proxy Network Expressivity by Activation Rank (NEAR) to identify the optimal neural network without training.
We demonstrate a state-of-the-art correlation between this network score and the model accuracy on NAS-Bench-101 and NATS-Bench-SSS/TSS.
arXiv Detail & Related papers (2024-08-16T14:38:14Z)
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching for efficient architectures for devices with different resource constraints.
We aim to go one step further in the search for efficiency by explicitly framing the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
The presence of untrained subnetworks in graph neural networks (GNNs) still remains mysterious.
We show that the untrained subnetworks we find can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks perform well in out-of-distribution detection and are robust to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS - Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z)
- Improving the sample-efficiency of neural architecture search with reinforcement learning [0.0]
In this work, we would like to contribute to the area of Automated Machine Learning (AutoML).
Our focus is on one of the most promising research directions, reinforcement learning.
The validation accuracies of the child networks serve as a reward signal for training the controller.
We propose to modify this to a more modern and complex algorithm, PPO, which has been demonstrated to be faster and more stable in other environments.
arXiv Detail & Related papers (2021-10-13T14:30:09Z)
- Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work aims to design a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z)
- Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
- Direct Federated Neural Architecture Search [0.0]
We present an effective approach for direct federated NAS which is hardware agnostic, computationally lightweight, and a one-stage method to search for ready-to-deploy neural network models.
Our results show an order of magnitude reduction in resource consumption while edging out prior art in accuracy.
arXiv Detail & Related papers (2020-10-13T08:11:35Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Fast Neural Network Adaptation via Parameter Remapping and Architecture Search [35.61441231491448]
Deep neural networks achieve remarkable performance in many computer vision tasks.
Most state-of-the-art (SOTA) semantic segmentation and object detection approaches reuse neural network architectures designed for image classification as the backbone.
One major challenge, though, is that ImageNet pre-training of the search space representation incurs huge computational cost.
In this paper, we propose a Fast Neural Network Adaptation (FNA) method, which can adapt both the architecture and parameters of a seed network.
arXiv Detail & Related papers (2020-01-08T13:45:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.