An Analysis of Super-Net Heuristics in Weight-Sharing NAS
- URL: http://arxiv.org/abs/2110.01154v1
- Date: Mon, 4 Oct 2021 02:18:44 GMT
- Title: An Analysis of Super-Net Heuristics in Weight-Sharing NAS
- Authors: Kaicheng Yu, René Ranftl, Mathieu Salzmann
- Abstract summary: We show that simple random search achieves competitive performance to complex state-of-the-art NAS algorithms when the super-net is properly trained.
- Score: 70.57382341642418
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Weight sharing promises to make neural architecture search (NAS) tractable
even on commodity hardware. Existing methods in this space rely on a diverse
set of heuristics to design and train the shared-weight backbone network,
a.k.a. the super-net. Since heuristics substantially vary across different
methods and have not been carefully studied, it is unclear to what extent they
impact super-net training and hence the weight-sharing NAS algorithms. In this
paper, we disentangle super-net training from the search algorithm, isolate 14
frequently-used training heuristics, and evaluate them over three benchmark
search spaces. Our analysis uncovers that several commonly-used heuristics
negatively impact the correlation between super-net and stand-alone
performance, whereas simple but often overlooked factors, such as proper
hyper-parameter settings, are key to achieving strong performance. Equipped with
this knowledge, we show that simple random search achieves competitive
performance to complex state-of-the-art NAS algorithms when the super-net is
properly trained.
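To make the two quantities in the abstract concrete, here is a minimal Python sketch (not the authors' code): simple random search that scores sampled architectures with inherited super-net weights, and the Kendall tau rank correlation between super-net (proxy) accuracies and stand-alone accuracies. The callables `sample_architecture` and `supernet_accuracy` and the accuracy lists are hypothetical placeholders rather than part of any specific NAS codebase.

```python
from typing import Callable, Sequence

from scipy.stats import kendalltau  # standard rank-correlation utility


def random_search(sample_architecture: Callable[[], dict],
                  supernet_accuracy: Callable[[dict], float],
                  num_samples: int = 100) -> dict:
    """Sample architectures uniformly and keep the one whose inherited
    super-net weights score highest on the validation set (proxy accuracy)."""
    best_arch, best_acc = None, float("-inf")
    for _ in range(num_samples):
        arch = sample_architecture()      # draw a random architecture
        acc = supernet_accuracy(arch)     # evaluate with shared super-net weights
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch


def ranking_correlation(proxy_acc: Sequence[float],
                        standalone_acc: Sequence[float]) -> float:
    """Kendall tau between super-net proxy accuracies and the accuracies of
    the same architectures trained from scratch (stand-alone)."""
    tau, _ = kendalltau(proxy_acc, standalone_acc)
    return tau
```

The paper's point is that when the super-net is trained with sound heuristics, the proxy accuracies used by `random_search` rank architectures well enough that this simple loop is competitive with more complex search algorithms.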
Related papers
- Prior-Guided One-shot Neural Architecture Search [11.609732776776982]
We present Prior-Guided One-shot NAS (PGONAS) to strengthen the ranking correlation of supernets.
Our PGONAS ranks 3rd place in the supernet track of the CVPR 2022 Second Lightweight NAS Challenge.
arXiv Detail & Related papers (2022-06-27T14:19:56Z)
- Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching criterion significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z)
- Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics [117.4281417428145]
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS).
NAS has been studied intensively to automate the discovery of top-performing neural networks, but it suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations.
We present a unified framework to understand and accelerate NAS by disentangling the "TEG" (Trainability, Expressivity, Generalization) characteristics of searched networks.
arXiv Detail & Related papers (2021-08-26T17:52:07Z)
- Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search [70.57382341642418]
Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware.
Recent works have empirically shown a ranking disorder between the performance of stand-alone architectures and that of the corresponding shared-weight networks.
We propose a regularization term that aims to maximize the correlation between the performance ranking of the shared-weight network and that of the stand-alone architectures (a minimal pairwise sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-04-12T09:32:33Z)
- Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap [90.93522795555724]
Neural architecture search (NAS) has attracted increasing attention in both academia and industry.
Weight-sharing methods were proposed in which exponentially many architectures share weights in the same super-network.
This paper provides a literature review on NAS, in particular the weight-sharing methods.
arXiv Detail & Related papers (2020-08-04T11:57:03Z)
- How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS [64.50415611717057]
We show that some commonly-used baselines for super-net training negatively impact the correlation between super-net and stand-alone performance.
Our code and experiments set a strong and reproducible baseline that future works can build on.
arXiv Detail & Related papers (2020-03-09T17:34:32Z)
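As a companion to the Landmark Regularization entry above, the following is a minimal, hedged sketch of a pairwise ranking penalty in the same spirit: for a small set of landmark architectures with known stand-alone accuracies, the super-net loss of a better architecture should be lower than that of a worse one. The inputs `supernet_losses` and `standalone_accs` and the margin value are illustrative assumptions, not the authors' exact formulation.

```python
from typing import Sequence

import torch


def landmark_ranking_penalty(supernet_losses: torch.Tensor,
                             standalone_accs: Sequence[float],
                             margin: float = 0.1) -> torch.Tensor:
    """Hinge penalty that grows whenever a pair of landmark architectures is
    ranked inconsistently: the architecture with the higher stand-alone
    accuracy should have the lower super-net validation loss."""
    penalty = supernet_losses.new_zeros(())
    n = len(standalone_accs)
    for i in range(n):
        for j in range(n):
            if standalone_accs[i] > standalone_accs[j]:
                # desired ordering: loss_i + margin <= loss_j
                penalty = penalty + torch.relu(
                    supernet_losses[i] - supernet_losses[j] + margin)
    return penalty
```

Such a term would be added to the usual super-net training loss with a weighting coefficient; the published method may differ in how landmark pairs are selected and weighted.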
This list is automatically generated from the titles and abstracts of the papers on this site.