K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
- URL: http://arxiv.org/abs/2106.06442v1
- Date: Fri, 11 Jun 2021 14:57:36 GMT
- Title: K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
- Authors: Xiu Su, Shan You, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang,
Chang Xu
- Abstract summary: We introduce $K$-shot supernets and take their weights for each operation as a dictionary.
A simplex-net is introduced to produce an architecture-customized code for each path.
Experiments on benchmark datasets validate that K-shot NAS significantly improves the evaluation accuracy of paths.
- Score: 52.983810997539486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In one-shot weight sharing for NAS, the weights of each operation (at each
layer) are supposed to be identical for all architectures (paths) in the
supernet. However, this rules out the possibility of adjusting operation
weights to cater for different paths, which limits the reliability of the
evaluation results. In this paper, instead of counting on a single supernet, we
introduce $K$-shot supernets and take their weights for each operation as a
dictionary. The operation weight for each path is represented as a convex
combination of items in a dictionary with a simplex code. This enables a matrix
approximation of the stand-alone weight matrix with a higher rank ($K>1$). A
\textit{simplex-net} is introduced to produce architecture-customized code for
each path. As a result, all paths can adaptively learn how to share weights in
the $K$-shot supernets and acquire corresponding weights for better evaluation.
$K$-shot supernets and simplex-net can be iteratively trained, and we further
extend the search to the channel dimension. Extensive experiments on benchmark
datasets validate that K-shot NAS significantly improves the evaluation
accuracy of paths and thus brings in impressive performance improvements.
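Below is a minimal sketch of the core idea in the abstract: each operation keeps K weight tensors as a dictionary, and a small simplex-net maps a path's architecture encoding to a convex-combination code that mixes them into a path-customized weight. All class names, shapes, and the softmax-based simplex-net here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class KShotConv(nn.Module):
    """Sketch of one K-shot operation (a 3x3 convolution here).

    Instead of a single shared weight, the operation stores K weight
    tensors ("dictionary items"). For a sampled path, a simplex code
    lambda (non-negative, summing to 1) mixes them:
        W(path) = sum_k lambda_k * W_k
    """

    def __init__(self, in_ch, out_ch, k_shots=4, kernel_size=3):
        super().__init__()
        # K copies of the operation weight form the dictionary.
        self.weights = nn.Parameter(
            torch.randn(k_shots, out_ch, in_ch, kernel_size, kernel_size) * 0.01
        )

    def forward(self, x, simplex_code):
        # simplex_code: (K,) convex coefficients for this path.
        w = torch.einsum("k,koihw->oihw", simplex_code, self.weights)
        return F.conv2d(x, w, padding=self.weights.shape[-1] // 2)


class SimplexNet(nn.Module):
    """Tiny stand-in for the simplex-net: maps a path's architecture
    encoding to a K-dimensional simplex code via softmax."""

    def __init__(self, arch_dim, k_shots=4):
        super().__init__()
        self.fc = nn.Linear(arch_dim, k_shots)

    def forward(self, arch_encoding):
        return F.softmax(self.fc(arch_encoding), dim=-1)


# Usage: a sampled path provides an architecture encoding; the simplex-net
# turns it into a code that customizes the operation weight for that path.
op = KShotConv(16, 32, k_shots=4)
simplex_net = SimplexNet(arch_dim=8, k_shots=4)
arch_encoding = torch.randn(8)        # placeholder path encoding
code = simplex_net(arch_encoding)     # convex coefficients, sum to 1
out = op(torch.randn(1, 16, 32, 32), code)
```

In this reading, K=1 recovers ordinary one-shot weight sharing, while K>1 lets different paths draw different mixtures from the dictionary, which is what the abstract refers to as a higher-rank approximation of the stand-alone weights.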
Related papers
- Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z) - Prior-Guided One-shot Neural Architecture Search [11.609732776776982]
We present Prior-Guided One-shot NAS (PGONAS) to strengthen the ranking correlation of supernets.
Our PGONAS ranks 3rd place in the supernet track of the CVPR 2022 second lightweight NAS challenge.
arXiv Detail & Related papers (2022-06-27T14:19:56Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed method significantly outperforms its Few-Shot counterparts and surpasses previous comparable methods in the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - An Analysis of Super-Net Heuristics in Weight-Sharing NAS [70.57382341642418]
We show that simple random search achieves competitive performance to complex state-of-the-art NAS algorithms when the super-net is properly trained.
arXiv Detail & Related papers (2021-10-04T02:18:44Z) - Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift [128.32670289503025]
Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy.
However, the ranking correlation between the architectures' predicted accuracy and their actual capability is often poor, which causes the dilemma of existing NAS methods.
We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.
We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS.
arXiv Detail & Related papers (2021-08-22T09:08:48Z) - How Does Supernet Help in Neural Architecture Search? [3.8348281160758027]
We conduct a comprehensive analysis on five search spaces, including NAS-Bench-101, NAS-Bench-201, DARTS-CIFAR10, DARTS-PTB, and ProxylessNAS.
We find that weight sharing works well on some search spaces but fails on others.
Our work is expected to inspire future NAS researchers to better leverage the power of weight sharing.
arXiv Detail & Related papers (2020-10-16T08:07:03Z) - GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet [63.96959854429752]
GreedyNAS is easy to follow, and experimental results on the ImageNet dataset indicate that it achieves better Top-1 accuracy under the same search space and FLOPs or latency level.
By searching on a larger space, our GreedyNAS can also obtain new state-of-the-art architectures.
arXiv Detail & Related papers (2020-03-25T06:54:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.