Speedy Performance Estimation for Neural Architecture Search
- URL: http://arxiv.org/abs/2006.04492v2
- Date: Tue, 8 Jun 2021 02:41:51 GMT
- Title: Speedy Performance Estimation for Neural Architecture Search
- Authors: Binxin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk and
Yarin Gal
- Abstract summary: We propose to estimate the final test performance based on a simple measure of training speed.
Our estimator is theoretically motivated by the connection between generalisation and training speed.
- Score: 47.683124540824515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable yet efficient evaluation of generalisation performance of a proposed
architecture is crucial to the success of neural architecture search (NAS).
Traditional approaches face a variety of limitations: training each
architecture to completion is prohibitively expensive, early stopped validation
accuracy may correlate poorly with fully trained performance, and model-based
estimators require large training sets. We instead propose to estimate the
final test performance based on a simple measure of training speed. Our
estimator is theoretically motivated by the connection between generalisation
and training speed, and is also inspired by the reformulation of a PAC-Bayes
bound under the Bayesian setting. Our model-free estimator is simple,
efficient, and cheap to implement, and does not require hyperparameter-tuning
or surrogate training before deployment. We demonstrate on various NAS search
spaces that our estimator consistently outperforms other alternatives in
achieving better correlation with the true test performance rankings. We
further show that our estimator can be easily incorporated into both
query-based and one-shot NAS methods to improve the speed or quality of the
search.
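The estimator sketched in the abstract lends itself to a very small implementation. The following is a minimal sketch, assuming the training-speed measure is simply the sum of mini-batch training losses accumulated over the first few epochs (an architecture that drives its training loss down faster receives a better, i.e. lower, score); `build_model`, `train_loader`, and the three-epoch budget are illustrative placeholders, not the authors' exact protocol.

```python
# Hedged sketch of a training-speed ranking proxy, assuming the score is the
# sum of mini-batch training losses over a short training budget.
import torch
import torch.nn.functional as F


def training_speed_score(model, train_loader, num_epochs=3, lr=0.01, device="cpu"):
    """Return the summed training loss over a short budget (lower is better)."""
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    total_loss = 0.0
    for _ in range(num_epochs):
        for inputs, targets in train_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(inputs), targets)
            loss.backward()
            optimizer.step()
            # Accumulate the loss: a smaller sum means the network fit the
            # training data more quickly, the signal the paper links to
            # better generalisation.
            total_loss += loss.item()
    return total_loss


# Ranking candidate architectures by the proxy (placeholders assumed):
# ranked = sorted(candidates,
#                 key=lambda arch: training_speed_score(build_model(arch), train_loader))
```

Because the score only requires the losses already computed during ordinary training, it can be read off essentially for free from the first few epochs of any candidate's training run.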
Related papers
- Robustifying and Boosting Training-Free Neural Architecture Search [49.828875134088904]
We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of RoBoT can be theoretically guaranteed, improving over existing training-free NAS methods.
arXiv Detail & Related papers (2024-03-12T12:24:11Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline that incurs no additional computational cost.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- EBJR: Energy-Based Joint Reasoning for Adaptive Inference [10.447353952054492]
State-of-the-art deep learning models have achieved significant performance levels on various benchmarks.
Light-weight architectures, on the other hand, achieve moderate accuracies, but at a much more desirable latency.
This paper presents a new method for using large, accurate models jointly with small, fast ones (a rough sketch of this adaptive-inference pattern appears after the list of related papers).
arXiv Detail & Related papers (2021-10-20T02:33:31Z)
- RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various search spaces and datasets (a minimal sketch of the underlying successive-halving schedule appears after the list of related papers).
arXiv Detail & Related papers (2021-08-18T07:45:21Z)
- AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing [6.171090327531059]
We introduce Learning to Rank methods to select the best (ace) architectures from a space.
We also propose to leverage weak supervision from weight sharing by pretraining architecture representation on weak labels obtained from the super-net.
Experiments on NAS benchmarks and large-scale search spaces demonstrate that our approach outperforms SOTA with a significantly reduced search cost.
arXiv Detail & Related papers (2021-08-06T08:31:42Z)
- SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation [81.03485688525133]
We propose a novel multi-person pose estimation framework, SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation (SIMPLE).
Specifically, in the training process, we enable SIMPLE to mimic the pose knowledge from the high-performance top-down pipeline.
In addition, SIMPLE formulates human detection and pose estimation as a unified point learning framework so that the two tasks complement each other within a single network.
arXiv Detail & Related papers (2021-04-06T13:12:51Z)
- Efficient Model Performance Estimation via Feature Histories [27.008927077173553]
An important step in the task of neural network design is the evaluation of a model's performance.
In this work, we use the evolution history of features of a network during the early stages of training to build a proxy classifier.
We show that our method can be combined with multiple search algorithms to find better solutions to a wide range of tasks.
arXiv Detail & Related papers (2021-03-07T20:41:57Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architectures and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 comprises a family of state-of-the-art compact neural networks that outperform both automatically designed and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
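The EBJR entry above pairs a small, fast model with a large, accurate one. The sketch below is only an illustration of that general cascade idea: the small model's energy score, E(x) = -logsumexp(logits), decides when to defer to the large model. The threshold, the models, and the routing rule itself are assumptions rather than the paper's exact joint-reasoning procedure.

```python
# Hedged sketch of adaptive inference: run the small model first and fall back
# to the large model only when the small model looks unsure (high energy).
import torch


def cascade_predict(small_model, large_model, x, energy_threshold=-5.0):
    # small_model, large_model, and energy_threshold are placeholders.
    with torch.no_grad():
        logits = small_model(x)
        energy = -torch.logsumexp(logits, dim=-1)  # higher energy ~ less confident
        if energy.mean() > energy_threshold:
            logits = large_model(x)  # defer the hard input to the big model
    return logits.argmax(dim=-1)
```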
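The RANK-NOSH entry above revolves around terminating the training of underperforming architectures early. As a rough illustration, the sketch below implements plain (uniform) successive halving, the scheduling idea that NOSH generalises with a non-uniform budget schedule; `train_for` and `validate` are hypothetical hooks, and RANK-NOSH's pairwise ranking predictor is not modelled here.

```python
# Hedged sketch of uniform successive halving: train all candidates for a short
# budget, keep the best half, double the budget, and repeat.
def successive_halving(candidates, train_for, validate, initial_budget=1, rounds=4):
    budget = initial_budget
    survivors = list(candidates)
    for _ in range(rounds):
        # Give every surviving architecture the same extra training budget,
        # then score it on validation data (higher score assumed better).
        scores = {arch: validate(train_for(arch, budget)) for arch in survivors}
        # Terminate the worse-performing half early to avoid wasting budget.
        survivors = sorted(survivors, key=lambda a: scores[a], reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]
        budget *= 2
    return survivors[0]
```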