Robustifying and Boosting Training-Free Neural Architecture Search
- URL: http://arxiv.org/abs/2403.07591v1
- Date: Tue, 12 Mar 2024 12:24:11 GMT
- Title: Robustifying and Boosting Training-Free Neural Architecture Search
- Authors: Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low
- Abstract summary: We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of our RoBoT can be theoretically guaranteed and improves over existing training-free NAS under mild conditions.
- Score: 49.828875134088904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) has become a key component of AutoML and a
standard tool to automate the design of deep neural networks. Recently,
training-free NAS as an emerging paradigm has successfully reduced the search
costs of standard training-based NAS by estimating the true architecture
performance with only training-free metrics. Nevertheless, the estimation
ability of these metrics typically varies across different tasks, making it
challenging to achieve robust and consistently good search performance on
diverse tasks with only a single training-free metric. Meanwhile, the
estimation gap between training-free metrics and the true architecture
performance prevents training-free NAS from achieving superior performance. To
address these challenges, we propose the robustifying and boosting
training-free NAS (RoBoT) algorithm which (a) employs the optimized combination
of existing training-free metrics explored from Bayesian optimization to
develop a robust and consistently better-performing metric on diverse tasks,
and (b) applies greedy search, i.e., the exploitation, on the newly developed
metric to bridge the aforementioned gap and consequently to boost the search
performance of standard training-free NAS further. Remarkably, the expected
performance of our RoBoT can be theoretically guaranteed and, under mild
conditions, improves over existing training-free NAS, yielding additional
interesting insights. Our extensive experiments on various NAS benchmark tasks
yield substantial empirical evidence to support our theoretical results.
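The abstract describes a two-stage procedure: (a) explore weightings of several training-free metrics (the paper uses Bayesian optimization for this step), and (b) exploit the resulting combined metric by greedily evaluating its top-ranked architectures. The minimal sketch below illustrates that idea on a synthetic benchmark; the metric scores, the random-search stand-in for Bayesian optimization, and the budget split are illustrative assumptions, not the authors' implementation.

```python
# A minimal, self-contained sketch of the two-stage idea described in the
# abstract: (a) search for a weighted combination of training-free metrics
# that ranks architectures well, then (b) greedily evaluate ("exploit") the
# top architectures under that combined metric with the remaining budget.
# NOTE: the synthetic scores, the random-search stand-in for Bayesian
# optimization, and the budget split are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(0)

# Toy benchmark: n candidate architectures, m training-free metric scores each
# (standing in for zero-cost proxies), plus a hidden "true" performance that
# is expensive to query.
n_archs, n_metrics = 200, 3
metric_scores = rng.normal(size=(n_archs, n_metrics))          # hypothetical proxy scores
true_perf = 0.6 * metric_scores[:, 0] + 0.3 * metric_scores[:, 1] \
            + rng.normal(scale=0.5, size=n_archs)              # hidden ground truth

def combined_rank(weights):
    """Rank architectures by the weighted sum of standardized metric scores (best first)."""
    z = (metric_scores - metric_scores.mean(0)) / metric_scores.std(0)
    return np.argsort(-(z @ weights))

def query(arch_idx, evaluated):
    """Query the expensive true performance of one architecture (cached)."""
    evaluated[arch_idx] = true_perf[arch_idx]
    return evaluated[arch_idx]

total_budget, explore_budget = 20, 10
evaluated = {}

# (a) Exploration: tune the metric weights. The paper uses Bayesian
# optimization; plain random search is used here only to keep the sketch
# dependency-free.
best_w, best_obs = None, -np.inf
for _ in range(explore_budget):
    w = rng.dirichlet(np.ones(n_metrics))                      # candidate weighting
    top1 = combined_rank(w)[0]
    obs = query(top1, evaluated)                               # spend one query
    if obs > best_obs:
        best_w, best_obs = w, obs

# (b) Exploitation: greedily evaluate the top-ranked architectures under the
# optimized combined metric until the query budget is exhausted.
for arch in combined_rank(best_w):
    if len(evaluated) >= total_budget:
        break
    if arch not in evaluated:
        query(arch, evaluated)

best_arch = max(evaluated, key=evaluated.get)
print(f"best architecture found: #{best_arch}, true perf {evaluated[best_arch]:.3f}")
```

In the paper, the exploration stage is driven by a Bayesian optimization loop over the weight vector, with the observed true performances of the queried architectures serving as the objective; the sketch above only mimics that query-budget structure.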
Related papers
- Efficient Multi-Objective Neural Architecture Search via Pareto Dominance-based Novelty Search [0.0]
Neural Architecture Search (NAS) aims to automate the discovery of high-performing deep neural network architectures.
Traditional NAS approaches typically optimize a certain performance metric (e.g., prediction accuracy), overlooking large parts of the architecture search space that potentially contain interesting network configurations.
This paper presents a novelty search for multi-objective NAS with Multiple Training-Free metrics (MTF-PDNS).
arXiv Detail & Related papers (2024-07-30T08:52:10Z)
- Robust NAS under adversarial training: benchmark, theory, and beyond [55.51199265630444]
We release a comprehensive data set that encompasses both clean accuracy and robust accuracy for a vast array of adversarially trained networks.
We also establish a generalization theory for searching architecture in terms of clean accuracy and robust accuracy under multi-objective adversarial training.
arXiv Detail & Related papers (2024-03-19T20:10:23Z)
- Generalizable Lightweight Proxy for Robust NAS against Diverse Perturbations [59.683234126055694]
Recent neural architecture search (NAS) frameworks have been successful in finding optimal architectures for given conditions.
We propose a novel lightweight robust zero-cost proxy that considers the consistency across features, parameters, and gradients of both clean and perturbed images.
Our approach facilitates an efficient and rapid search for neural architectures capable of learning generalizable features that exhibit robustness across diverse perturbations.
arXiv Detail & Related papers (2023-06-08T08:34:26Z)
- Training-free Neural Architecture Search for RNNs and Transformers [0.0]
We develop a new training-free metric, named hidden covariance, that predicts the trained performance of an RNN architecture.
We find that the current search space paradigm for transformer architectures is not optimized for training-free neural architecture search.
arXiv Detail & Related papers (2023-06-01T02:06:13Z)
- Generalization Properties of NAS under Activation and Skip Connection Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z)
- Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search [30.986396610873626]
Neural architecture search (NAS) has gained immense popularity owing to its ability to automate neural architecture design.
A number of training-free metrics have recently been proposed to realize NAS without training, making NAS more scalable.
Despite their competitive empirical performances, a unified theoretical understanding of these training-free metrics is lacking.
arXiv Detail & Related papers (2022-01-24T16:26:11Z)
- Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been studied extensively to automate the discovery of top-performing neural networks, but it suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z)
- Speedy Performance Estimation for Neural Architecture Search [47.683124540824515]
We propose to estimate the final test performance based on a simple measure of training speed.
Our estimator is theoretically motivated by the connection between generalisation and training speed.
arXiv Detail & Related papers (2020-06-08T11:48:09Z)