Speedy Performance Estimation for Neural Architecture Search
- URL: http://arxiv.org/abs/2006.04492v2
- Date: Tue, 8 Jun 2021 02:41:51 GMT
- Title: Speedy Performance Estimation for Neural Architecture Search
- Authors: Binxin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk and
Yarin Gal
- Abstract summary: We propose to estimate the final test performance based on a simple measure of training speed.
Our estimator is theoretically motivated by the connection between generalisation and training speed.
- Score: 47.683124540824515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable yet efficient evaluation of generalisation performance of a proposed
architecture is crucial to the success of neural architecture search (NAS).
Traditional approaches face a variety of limitations: training each
architecture to completion is prohibitively expensive, early stopped validation
accuracy may correlate poorly with fully trained performance, and model-based
estimators require large training sets. We instead propose to estimate the
final test performance based on a simple measure of training speed. Our
estimator is theoretically motivated by the connection between generalisation
and training speed, and is also inspired by the reformulation of a PAC-Bayes
bound under the Bayesian setting. Our model-free estimator is simple,
efficient, and cheap to implement, and does not require hyperparameter-tuning
or surrogate training before deployment. We demonstrate on various NAS search
spaces that our estimator consistently outperforms other alternatives in
achieving better correlation with the true test performance rankings. We
further show that our estimator can be easily incorporated into both
query-based and one-shot NAS methods to improve the speed or quality of the
search.
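The estimator sketched in the abstract lends itself to a very small implementation. The following is a minimal sketch, assuming the training-speed measure is simply the sum of mini-batch training losses accumulated over the first few epochs (an architecture that drives its training loss down faster receives a better, i.e. lower, score); `build_model`, `train_loader`, and the three-epoch budget are illustrative placeholders, not the authors' exact protocol.

```python
# Hedged sketch of a training-speed ranking proxy, assuming the score is the
# sum of mini-batch training losses over a short training budget.
import torch
import torch.nn.functional as F


def training_speed_score(model, train_loader, num_epochs=3, lr=0.01, device="cpu"):
    """Return the summed training loss over a short budget (lower is better)."""
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    total_loss = 0.0
    for _ in range(num_epochs):
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Accumulate the loss: a smaller sum means the network fit the
            # training data more quickly, the signal the paper links to
            # better generalisation.
            total_loss += loss.item()
    return total_loss


# Ranking candidate architectures by the proxy (placeholders assumed):
# ranked = sorted(candidates,
#                 key=lambda arch: training_speed_score(build_model(arch), train_loader))
```

Because the score only requires the losses already computed during ordinary training, it can be read off essentially for free from the first few epochs of any candidate's training run.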
Related papers
- Robustifying and Boosting Training-Free Neural Architecture Search [49.828875134088904]
We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of RoBoT can be theoretically guaranteed, improving over existing training-free NAS methods.
arXiv Detail & Related papers (2024-03-12T12:24:11Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline that incurs no additional computational cost.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- EBJR: Energy-Based Joint Reasoning for Adaptive Inference [10.447353952054492]
State-of-the-art deep learning models have achieved significant performance levels on various benchmarks.
Light-weight architectures, on the other hand, achieve moderate accuracies, but at a much more desirable latency.
This paper presents a new method for using large, accurate models jointly with small, fast ones (a rough sketch of this adaptive-inference pattern appears after the list of related papers).
arXiv Detail & Related papers (2021-10-20T02:33:31Z)
- RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various search spaces and datasets (a minimal sketch of the underlying successive-halving schedule appears after the list of related papers).
arXiv Detail & Related papers (2021-08-18T07:45:21Z)
- AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing [6.171090327531059]
We introduce Learning to Rank methods to select the best (ace) architectures from a space.
We also propose to leverage weak supervision from weight sharing by pretraining architecture representation on weak labels obtained from the super-net.
Experiments on NAS benchmarks and large-scale search spaces demonstrate that our approach outperforms SOTA with a significantly reduced search cost.
arXiv Detail & Related papers (2021-08-06T08:31:42Z)
- SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation [81.03485688525133]
We propose a novel multi-person pose estimation framework, SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation (SIMPLE).
Specifically, in the training process, we enable SIMPLE to mimic the pose knowledge from the high-performance top-down pipeline.
In addition, SIMPLE formulates human detection and pose estimation as a unified point learning framework so that the two tasks complement each other within a single network.
arXiv Detail & Related papers (2021-04-06T13:12:51Z)
- Efficient Model Performance Estimation via Feature Histories [27.008927077173553]
An important step in the task of neural network design is the evaluation of a model's performance.
In this work, we use the evolution history of features of a network during the early stages of training to build a proxy classifier.
We show that our method can be combined with multiple search algorithms to find better solutions to a wide range of tasks.
arXiv Detail & Related papers (2021-03-07T20:41:57Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architectures and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 comprises a family of state-of-the-art compact neural networks that outperform both automatically designed and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
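The EBJR entry above pairs a small, fast model with a large, accurate one. The sketch below is only an illustration of that general cascade idea: the small model's energy score, E(x) = -logsumexp(logits), decides when to defer to the large model. The threshold, the models, and the routing rule itself are assumptions rather than the paper's exact joint-reasoning procedure.

```python
# Hedged sketch of adaptive inference: run the small model first and fall back
# to the large model only when the small model looks unsure (high energy).
import torch


def cascade_predict(small_model, large_model, x, energy_threshold=-5.0):
    # small_model, large_model, and energy_threshold are placeholders.
    with torch.no_grad():
        logits = small_model(x)
        energy = -torch.logsumexp(logits, dim=-1)  # higher energy ~ less confident
        if energy.mean() > energy_threshold:
            logits = large_model(x)  # defer the hard input to the big model
    return logits.argmax(dim=-1)
```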
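The RANK-NOSH entry above revolves around terminating the training of underperforming architectures early. As a rough illustration, the sketch below implements plain (uniform) successive halving, the scheduling idea that NOSH generalises with a non-uniform budget schedule; `train_for` and `validate` are hypothetical hooks, and RANK-NOSH's pairwise ranking predictor is not modelled here.

```python
# Hedged sketch of uniform successive halving: train all candidates for a short
# budget, keep the best half, double the budget, and repeat.
def successive_halving(candidates, train_for, validate, initial_budget=1, rounds=4):
    budget = initial_budget
    survivors = list(candidates)
    for _ in range(rounds):
        # Give every surviving architecture the same extra training budget,
        # then score it on validation data (higher score assumed better).
        scores = {arch: validate(train_for(arch, budget)) for arch in survivors}
        # Terminate the worse-performing half early to avoid wasting budget.
        survivors = sorted(survivors, key=lambda a: scores[a], reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]
        budget *= 2
    return survivors[0]
```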