K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
- URL: http://arxiv.org/abs/2106.06442v1
- Date: Fri, 11 Jun 2021 14:57:36 GMT
- Title: K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
- Authors: Xiu Su, Shan You, Mingkai Zheng, Fei Wang, Chen Qian, Changshui Zhang,
Chang Xu
- Abstract summary: We introduce $K$-shot supernets and take their weights for each operation as a dictionary.
A simplex-net is introduced to produce an architecture-customized code for each path.
Experiments on benchmark datasets validate that K-shot NAS significantly improves the evaluation accuracy of paths.
- Score: 52.983810997539486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In one-shot weight sharing for NAS, the weights of each operation (at each
layer) are supposed to be identical for all architectures (paths) in the
supernet. However, this rules out the possibility of adjusting operation
weights to cater for different paths, which limits the reliability of the
evaluation results. In this paper, instead of counting on a single supernet, we
introduce $K$-shot supernets and take their weights for each operation as a
dictionary. The operation weight for each path is represented as a convex
combination of items in a dictionary with a simplex code. This enables a matrix
approximation of the stand-alone weight matrix with a higher rank ($K>1$). A
\textit{simplex-net} is introduced to produce architecture-customized code for
each path. As a result, all paths can adaptively learn how to share weights in
the $K$-shot supernets and acquire corresponding weights for better evaluation.
$K$-shot supernets and simplex-net can be iteratively trained, and we further
extend the search to the channel dimension. Extensive experiments on benchmark
datasets validate that K-shot NAS significantly improves the evaluation
accuracy of paths and thus brings in impressive performance improvements.
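Below is a minimal sketch of the core idea in the abstract: each operation keeps K weight tensors as a dictionary, and a small simplex-net maps a path's architecture encoding to a convex-combination code that mixes them into a path-customized weight. All class names, shapes, and the softmax-based simplex-net here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class KShotConv(nn.Module):
    """Sketch of one K-shot operation (a 3x3 convolution here).

    Instead of a single shared weight, the operation stores K weight
    tensors ("dictionary items"). For a sampled path, a simplex code
    lambda (non-negative, summing to 1) mixes them:
        W(path) = sum_k lambda_k * W_k
    """

    def __init__(self, in_ch, out_ch, k_shots=4, kernel_size=3):
        super().__init__()
        # K copies of the operation weight form the dictionary.
        self.weights = nn.Parameter(
            torch.randn(k_shots, out_ch, in_ch, kernel_size, kernel_size) * 0.01
        )

    def forward(self, x, simplex_code):
        # simplex_code: (K,) convex coefficients for this path.
        w = torch.einsum("k,koihw->oihw", simplex_code, self.weights)
        return F.conv2d(x, w, padding=self.weights.shape[-1] // 2)


class SimplexNet(nn.Module):
    """Tiny stand-in for the simplex-net: maps a path's architecture
    encoding to a K-dimensional simplex code via softmax."""

    def __init__(self, arch_dim, k_shots=4):
        super().__init__()
        self.fc = nn.Linear(arch_dim, k_shots)

    def forward(self, arch_encoding):
        return F.softmax(self.fc(arch_encoding), dim=-1)


# Usage: a sampled path provides an architecture encoding; the simplex-net
# turns it into a code that customizes the operation weight for that path.
op = KShotConv(16, 32, k_shots=4)
simplex_net = SimplexNet(arch_dim=8, k_shots=4)
arch_encoding = torch.randn(8)        # placeholder path encoding
code = simplex_net(arch_encoding)     # convex coefficients, sum to 1
out = op(torch.randn(1, 16, 32, 32), code)
```

In this reading, K=1 recovers ordinary one-shot weight sharing, while K>1 lets different paths draw different mixtures from the dictionary, which is what the abstract refers to as a higher-rank approximation of the stand-alone weights.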
Related papers
- Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z) - Prior-Guided One-shot Neural Architecture Search [11.609732776776982]
We present Prior-Guided One-shot NAS (PGONAS) to strengthen the ranking correlation of supernets.
Our PGONAS ranks 3rd place in the supernet track of the CVPR 2022 second lightweight NAS challenge.
arXiv Detail & Related papers (2022-06-27T14:19:56Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed method significantly outperforms its Few-Shot counterparts and surpasses previous comparable methods in the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - An Analysis of Super-Net Heuristics in Weight-Sharing NAS [70.57382341642418]
We show that simple random search achieves competitive performance to complex state-of-the-art NAS algorithms when the super-net is properly trained.
arXiv Detail & Related papers (2021-10-04T02:18:44Z) - Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift [128.32670289503025]
Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy.
However, the ranking correlation between the architectures' predicted accuracy and their actual capability is often poor, which causes the dilemma of existing NAS methods.
We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.
We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS.
arXiv Detail & Related papers (2021-08-22T09:08:48Z) - How Does Supernet Help in Neural Architecture Search? [3.8348281160758027]
We conduct a comprehensive analysis on five search spaces, including NAS-Bench-101, NAS-Bench-201, DARTS-CIFAR10, DARTS-PTB, and ProxylessNAS.
We find that weight sharing works well on some search spaces but fails on others.
Our work is expected to inspire future NAS researchers to better leverage the power of weight sharing.
arXiv Detail & Related papers (2020-10-16T08:07:03Z) - GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet [63.96959854429752]
GreedyNAS is easy to follow, and experimental results on the ImageNet dataset indicate that it achieves better Top-1 accuracy under the same search space and FLOPs or latency level.
By searching on a larger space, our GreedyNAS can also obtain new state-of-the-art architectures.
arXiv Detail & Related papers (2020-03-25T06:54:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.