Zero-Cost Proxies for Lightweight NAS
- URL: http://arxiv.org/abs/2101.08134v1
- Date: Wed, 20 Jan 2021 13:59:52 GMT
- Title: Zero-Cost Proxies for Lightweight NAS
- Authors: Mohamed S. Abdelfattah, Abhinav Mehrotra, Łukasz Dudziak, Nicholas D. Lane
- Abstract summary: We evaluate conventional reduced-training proxies and quantify how well they preserve ranking between multiple models during search.
We propose a series of zero-cost proxies that use just a single minibatch of training data to compute a model's score.
Our zero-cost proxies use 3 orders of magnitude less computation but can match and even outperform conventional proxies.
- Score: 19.906217380811373
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural Architecture Search (NAS) is quickly becoming the standard methodology
to design neural network models. However, NAS is typically compute-intensive
because multiple models need to be evaluated before choosing the best one. To
reduce the computational power and time needed, a proxy task is often used for
evaluating each model instead of full training. In this paper, we evaluate
conventional reduced-training proxies and quantify how well they preserve
ranking between multiple models during search when compared with the rankings
produced by final trained accuracy. We propose a series of zero-cost proxies,
based on recent pruning literature, that use just a single minibatch of
training data to compute a model's score. Our zero-cost proxies use 3 orders of
magnitude less computation but can match and even outperform conventional
proxies. For example, Spearman's rank correlation coefficient between final
validation accuracy and our best zero-cost proxy on NAS-Bench-201 is 0.82,
compared to 0.61 for EcoNAS (a recently proposed reduced-training proxy).
Finally, we use these zero-cost proxies to enhance existing NAS search
algorithms such as random search, reinforcement learning, evolutionary search
and predictor-based search. For all search methodologies and across three
different NAS datasets, we are able to significantly improve sample efficiency,
and thereby decrease computation, by using our zero-cost proxies. For example
on NAS-Bench-101, we achieved the same accuracy 4$\times$ quicker than the best
previous result.
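To make the proxy idea concrete, below is a minimal sketch (not the paper's released implementation) of one pruning-inspired zero-cost score: the network is scored by the summed gradient norms obtained from a single minibatch, and candidate rankings can then be compared against final trained accuracy with Spearman's rank correlation. The model, minibatch, and accuracy list are placeholders.

```python
import torch
import torch.nn.functional as F
from scipy.stats import spearmanr

def grad_norm_score(model, inputs, targets):
    """Zero-cost proxy: sum of gradient L2 norms from a single minibatch.

    A sketch of one pruning-inspired score; the paper also evaluates
    snip, grasp, synflow, fisher, and jacobian-covariance proxies.
    """
    model.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    return sum(p.grad.norm().item() for p in model.parameters()
               if p.grad is not None)

# Hypothetical usage: rank candidate architectures by the proxy and check
# how well that ranking agrees with final trained accuracy.
def rank_correlation(models, inputs, targets, final_accuracies):
    scores = [grad_norm_score(m, inputs, targets) for m in models]
    rho, _ = spearmanr(scores, final_accuracies)
    return rho  # the paper reports ~0.82 on NAS-Bench-201 for its best proxy
```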
Related papers
- TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search [1.30891455653235]
TG-NAS aims to create training-free proxies for architecture performance prediction.
We introduce TG-NAS, a novel model-based universal proxy that leverages a transformer-based operator embedding generator and a graph convolution network (GCN) to predict architecture performance.
TG-NAS achieves up to 300X improvements in search efficiency compared to previous SOTA ZC proxy methods.
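As a rough illustration of this model-based-proxy idea (operator embeddings fed into a GCN that regresses accuracy), the following PyTorch sketch uses a plain learned embedding table as a stand-in for TG-NAS's transformer-based operator embedding generator; all dimensions and the toy cell are made up.

```python
import torch
import torch.nn as nn

class TinyGCNPredictor(nn.Module):
    """Sketch of a GCN-based performance predictor (not the TG-NAS code).

    Each architecture is a DAG: `ops` holds an operator id per node and
    `adj` is a normalized adjacency matrix with self-loops.
    """
    def __init__(self, num_ops, dim=32):
        super().__init__()
        self.embed = nn.Embedding(num_ops, dim)   # stand-in for the transformer embedder
        self.gcn1 = nn.Linear(dim, dim)
        self.gcn2 = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, 1)             # predicted accuracy

    def forward(self, ops, adj):
        x = self.embed(ops)                       # (nodes, dim)
        x = torch.relu(self.gcn1(adj @ x))        # one graph-convolution step
        x = torch.relu(self.gcn2(adj @ x))
        return self.head(x.mean(dim=0))           # pool nodes -> scalar score

# Hypothetical usage on a 4-node cell with 5 candidate operators.
ops = torch.tensor([0, 2, 3, 1])
adj = torch.eye(4) + torch.triu(torch.ones(4, 4), diagonal=1)
adj = adj / adj.sum(dim=1, keepdim=True)          # crude row normalization
print(TinyGCNPredictor(num_ops=5)(ops, adj))
```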
arXiv Detail & Related papers (2024-03-30T07:25:30Z)
- SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS [35.041289296298565]
Training-free metrics are widely used to avoid resource-intensive neural network training.
We propose Sample-Wise Activation Patterns and their derivative, SWAP-Score, a novel high-performance training-free metric.
The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks.
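As a hedged sketch of the general idea (scoring an untrained network by how many distinct activation patterns a minibatch of samples induces), not the paper's exact SWAP-Score formulation:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def activation_pattern_score(model, inputs):
    """Count distinct post-ReLU sign patterns produced by one minibatch.

    A generic activation-pattern metric in the spirit of SWAP / NASWOT,
    not the paper's exact SWAP-Score.
    """
    patterns = []

    def hook(_module, _inp, out):
        patterns.append((out > 0).flatten(1))      # binary pattern per sample

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    model(inputs)
    for h in handles:
        h.remove()

    codes = torch.cat(patterns, dim=1)             # (batch, total_units)
    return len({tuple(row.tolist()) for row in codes})  # number of unique patterns

# Hypothetical usage: higher scores suggest richer expressivity at initialization.
net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
print(activation_pattern_score(net, torch.randn(32, 16)))
```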
arXiv Detail & Related papers (2024-03-07T02:40:42Z)
- Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities [58.67514819895494]
The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of a given network without training its parameters.
This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches.
arXiv Detail & Related papers (2023-07-05T03:07:00Z)
- Neural Architecture Search via Two Constant Shared Weights Initialisations [0.0]
We present a zero-cost metric that is highly correlated with train-set accuracy across the NAS-Bench-101, NAS-Bench-201 and NAS-Bench-NLP benchmark datasets.
Our method is easy to integrate within existing NAS algorithms and takes a fraction of a second to evaluate a single network.
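A loose sketch of what a two-constant shared-weights metric could look like, purely to illustrate the flavor (the statistic actually used by the paper may differ): set every weight to one constant, record the network's outputs on a minibatch, repeat with a second constant, and score the architecture by how much the normalized outputs differ.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def constant_init_metric(model, inputs, w1=0.1, w2=1.0):
    """Score from two constant shared-weight initialisations (illustrative only)."""
    def run_with_constant(value):
        for p in model.parameters():
            p.fill_(value)                  # every parameter shares one constant
        out = model(inputs).flatten()
        return out / (out.std() + 1e-8)     # normalize the output scale away

    o1, o2 = run_with_constant(w1), run_with_constant(w2)
    return (o1 - o2).abs().mean().item()    # larger spread -> richer response

# Hypothetical usage on a toy network and random minibatch.
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 10))
print(constant_init_metric(net, torch.randn(16, 8)))
```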
arXiv Detail & Related papers (2023-02-09T02:25:38Z)
- ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients [17.139381064317778]
We propose a new zero-shot proxy, ZiCo, that works consistently better than #Params.
ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively.
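A hedged sketch of the general recipe (per-parameter gradient mean relative to its standard deviation across a few minibatches, aggregated over parameter tensors); the exact ZiCo formula should be taken from the paper.

```python
import torch
import torch.nn.functional as F

def zico_like_score(model, batches, eps=1e-8):
    """Inverse coefficient-of-variation style gradient score (illustrative).

    `batches` is a list of (inputs, targets) minibatches; pass at least two
    so the per-parameter standard deviation is defined. Gradients with a
    large mean and small variance across batches raise the score.
    """
    grads = {n: [] for n, p in model.named_parameters() if p.requires_grad}
    for inputs, targets in batches:
        model.zero_grad()
        F.cross_entropy(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                grads[n].append(p.grad.detach().abs().flatten())

    score = 0.0
    for g in grads.values():
        if len(g) < 2:
            continue
        stacked = torch.stack(g)                       # (num_batches, num_params)
        cv_inv = stacked.mean(0) / (stacked.std(0) + eps)
        score += torch.log(cv_inv.sum() + eps).item()  # aggregate per tensor
    return score
```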
arXiv Detail & Related papers (2023-01-26T18:38:56Z)
- $\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search [96.99525100285084]
A regularization method, Beta-Decay, is proposed to regularize the DARTS-based NAS search process (i.e., $\beta$-DARTS).
In-depth theoretical analyses of how and why it works are provided.
arXiv Detail & Related papers (2023-01-16T12:30:32Z)
- FNAS: Uncertainty-Aware Fast Neural Architecture Search [54.49650267859032]
Reinforcement learning (RL)-based neural architecture search (NAS) generally offers better convergence but requires huge computational resources.
We propose a general pipeline to accelerate the convergence of the rollout process as well as the RL process in NAS.
Experiments on the Mobile Neural Architecture Search (MNAS) search space show the proposed Fast Neural Architecture Search (FNAS) accelerates standard RL-based NAS process by 10x.
arXiv Detail & Related papers (2021-05-25T06:32:52Z)
- Efficient Sampling for Predictor-Based Neural Architecture Search [3.287802528135173]
We study predictor-based NAS algorithms for neural architecture search.
We show that the sample efficiency of predictor-based algorithms decreases dramatically if the proxy is only computed for a subset of the search space.
This is an important step toward making predictor-based NAS algorithms useful in practice.
arXiv Detail & Related papers (2020-11-24T11:36:36Z)
- Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization [51.26579110596767]
We propose a novel Barrier Penalty based NAS (BP-NAS) for mixed precision quantization.
BP-NAS sets a new state of the art on both classification (CIFAR-10, ImageNet) and detection (COCO).
arXiv Detail & Related papers (2020-07-20T12:00:48Z)
- Accuracy Prediction with Non-neural Model for Neural Architecture Search [185.0651567642238]
We study an alternative approach which uses non-neural model for accuracy prediction.
We leverage a gradient boosting decision tree (GBDT) as the predictor for neural architecture search (NAS).
Experiments on NAS-Bench-101 and ImageNet demonstrate the effectiveness of using a GBDT as the predictor for NAS.
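A predictor of this kind is straightforward to sketch with a standard GBDT library; the following is a minimal, hypothetical example using scikit-learn, with random architecture encodings standing in for real NAS-Bench-101 features.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical data: each architecture is a flat feature vector (e.g. one-hot
# operators plus a flattened adjacency matrix), paired with its trained accuracy.
rng = np.random.default_rng(0)
encodings = rng.integers(0, 2, size=(500, 56)).astype(float)
accuracies = rng.uniform(0.85, 0.95, size=500)

# GBDT accuracy predictor: train on architectures that have been evaluated ...
predictor = GradientBoostingRegressor(n_estimators=200, max_depth=3)
predictor.fit(encodings[:400], accuracies[:400])

# ... then rank unseen candidates by predicted accuracy and evaluate the top ones.
candidates = encodings[400:]
ranked = candidates[np.argsort(-predictor.predict(candidates))]
print(ranked[:5].shape)  # the 5 most promising encodings
```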
arXiv Detail & Related papers (2020-07-09T13:28:49Z)
- EcoNAS: Finding Proxies for Economical Neural Architecture Search [130.59673917196994]
In this paper, we observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates.
Inspired by these observations, we present a reliable proxy and further formulate a hierarchical proxy strategy.
The strategy spends more computation on candidate networks that are potentially more accurate, while discarding unpromising ones at an early stage using a fast proxy.
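The hierarchical idea can be sketched as a successive-filtering loop: rank all candidates with a very cheap proxy, keep the top fraction, and spend more computation only on the survivors. The proxy functions and keep ratio below are placeholders, not EcoNAS's actual settings.

```python
def hierarchical_proxy_search(candidates, cheap_proxy, expensive_proxy, keep=0.25):
    """Sketch of a hierarchical proxy strategy (illustrative, not EcoNAS's exact recipe).

    `cheap_proxy` and `expensive_proxy` map an architecture to a score, e.g.
    few-epoch training at reduced resolution vs. longer training.
    """
    # Stage 1: score everything with the fast proxy and keep the top fraction.
    scored = sorted(candidates, key=cheap_proxy, reverse=True)
    survivors = scored[: max(1, int(len(scored) * keep))]

    # Stage 2: spend more computation only on the promising survivors.
    return max(survivors, key=expensive_proxy)

# Hypothetical usage with toy proxies over integer-coded "architectures".
archs = list(range(100))
best = hierarchical_proxy_search(
    archs,
    cheap_proxy=lambda a: (a * 37) % 100,   # stand-in for a reduced-training score
    expensive_proxy=lambda a: a,            # stand-in for a longer-training score
)
print(best)
```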
arXiv Detail & Related papers (2020-01-05T13:29:02Z)