Improving Ranking Correlation of Supernet with Candidates Enhancement
and Progressive Training
- URL: http://arxiv.org/abs/2108.05866v1
- Date: Thu, 12 Aug 2021 17:27:10 GMT
- Title: Improving Ranking Correlation of Supernet with Candidates Enhancement
and Progressive Training
- Authors: Ziwei Yang, Ruyi Zhang, Zhi Yang, Xubo Yang, Lei Wang and Zheyang Li
- Abstract summary: One-shot neural architecture search (NAS) applies a weight-sharing supernet to reduce the prohibitive computation overhead of automated architecture design.
We propose a candidates enhancement method and a progressive training pipeline to improve the ranking correlation of the supernet.
Our method ranked 1st in the Supernet Track of the CVPR 2021 1st Lightweight NAS Challenge.
- Score: 8.373420721376739
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One-shot neural architecture search (NAS) applies a weight-sharing
supernet to reduce the prohibitive computation overhead of automated
architecture design. However, the weight-sharing technique worsens the ranking
consistency of performance due to interference between different candidate
networks. To address this issue, we propose a candidates enhancement method and
a progressive training pipeline to improve the ranking correlation of the
supernet. Specifically, we carefully redesign the sub-networks in the supernet
and map the original supernet to a new one of high capacity. In addition, we
gradually add narrow branches to the supernet to reduce the degree of weight
sharing, which effectively alleviates the mutual interference between
sub-networks. Finally, our method ranked 1st in the Supernet Track of the
CVPR 2021 1st Lightweight NAS Challenge.
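
The candidates enhancement and progressive pipeline described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, assumed illustration rather than the authors' released code: a hypothetical ProgressiveChoiceBlock starts fully weight-shared and gains extra narrow branches on a schedule, so that progressively fewer sampled sub-networks update the same weights.

```python
# Minimal sketch (not the authors' code) of the progressive-training idea:
# every choice block starts fully weight-shared, and narrow branches (extra
# copies of its operation) are added on a schedule so that fewer sampled
# sub-networks update the same weights. All names here are illustrative.
import copy
import random

import torch
import torch.nn as nn


class ProgressiveChoiceBlock(nn.Module):
    """One supernet layer whose candidate op gains extra branches over time."""

    def __init__(self, op: nn.Module):
        super().__init__()
        self.branches = nn.ModuleList([op])  # start with a single shared branch

    def add_branch(self):
        # New branch initialised from the shared weights; afterwards it is
        # trained by only a fraction of sub-networks, reducing interference.
        self.branches.append(copy.deepcopy(self.branches[0]))

    def forward(self, x, branch_idx):
        return self.branches[branch_idx](x)


def train_progressively(blocks, loader, epochs, add_at):
    """Uniformly sample one branch per block; widen the supernet on schedule."""

    def make_opt():
        return torch.optim.SGD(
            [p for b in blocks for p in b.parameters()], lr=0.05, momentum=0.9)

    opt, loss_fn = make_opt(), nn.CrossEntropyLoss()
    for epoch in range(epochs):
        if epoch in add_at:                 # progressively reduce weight sharing
            for b in blocks:
                b.add_branch()
            opt = make_opt()                # pick up the newly added parameters
        for x, y in loader:
            out = x
            for b in blocks:                # one randomly sampled sub-network
                out = b(out, random.randrange(len(b.branches)))
            loss = loss_fn(out, y)
            opt.zero_grad()
            loss.backward()
            opt.step()


# Toy usage (illustrative shapes only):
# blocks = [ProgressiveChoiceBlock(nn.Sequential(nn.Linear(16, 16), nn.ReLU())),
#           ProgressiveChoiceBlock(nn.Linear(16, 10))]
# train_progressively(blocks, loader, epochs=12, add_at={4, 8})
```

In this sketch, widening on a schedule keeps the early supernet small and fully shared, while later stages reduce how many sub-networks update the same weights.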
Related papers
- Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach [57.175488207316654]
We propose a novel concept, Supernet Shifting, a refined search strategy that combines architecture search with supernet fine-tuning.
We show that Supernet Shifting can transfer a supernet to a new dataset.
Comprehensive experiments show that our method has better order-preserving ability and can find a dominating architecture.
arXiv Detail & Related papers (2024-03-18T00:13:41Z) - Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z) - Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts [55.470959564665705]
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks.
The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models.
It excels in NAS for building memory-efficient task-agnostic BERT models.
arXiv Detail & Related papers (2023-06-08T00:35:36Z) - CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot
NAS [19.485514022334844]
One-shot Neural Architecture Search (NAS) has been widely used to discover architectures due to its efficiency.
Previous studies reveal that one-shot performance estimations of architectures might not be well correlated with their performances in stand-alone training.
We propose Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively.
arXiv Detail & Related papers (2022-07-16T07:45:17Z) - Prior-Guided One-shot Neural Architecture Search [11.609732776776982]
We present Prior-Guided One-shot NAS (PGONAS) to strengthen the ranking correlation of supernets.
Our PGONAS ranked 3rd in the Supernet Track of the CVPR 2022 2nd Lightweight NAS Challenge.
arXiv Detail & Related papers (2022-06-27T14:19:56Z) - Improve Ranking Correlation of Super-net through Training Scheme from
One-shot NAS to Few-shot NAS [13.390484379343908]
We propose a step-by-step super-net training scheme that moves from one-shot NAS to few-shot NAS.
In this scheme, we first train the super-net in a one-shot way, and then disentangle the weights of the super-net.
Our method ranked 4th in Track 1 of the CVPR 2022 3rd Lightweight NAS Challenge.
arXiv Detail & Related papers (2022-06-13T04:02:12Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching criterion significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - Pi-NAS: Improving Neural Architecture Search by Reducing Supernet
Training Consistency Shift [128.32670289503025]
Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy.
However, the ranking correlation between the architectures' predicted accuracy and their actual capability is poor, which causes the dilemma of existing NAS methods.
We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.
We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS.
arXiv Detail & Related papers (2021-08-22T09:08:48Z) - AlphaNet: Improved Training of Supernet with Alpha-Divergence [28.171262066145616]
We propose to improve the supernet training with a more generalized alpha-divergence.
We apply the proposed alpha-divergence based supernet training to both slimmable neural networks and weight-sharing NAS.
Specifically, our discovered model family, AlphaNet, outperforms prior-art models on a wide range of FLOPs regimes.
arXiv Detail & Related papers (2021-02-16T04:23:55Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
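
For the last entry above (fitting the search space of weight-sharing NAS with graph convolutional networks), a rough sketch of how such a performance predictor could look is given below. It is an assumption for illustration, not the paper's implementation; the names GCNLayer, GCNPredictor, and fit_predictor are hypothetical, and each candidate architecture is encoded as an adjacency matrix plus one-hot operation features.

```python
# Minimal sketch (an assumption, not the paper's code) of a GCN-based
# performance predictor: regress the accuracy that a sampled sub-network
# obtains with supernet weights from its graph encoding.
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, adj, feat):
        # Mean-aggregate neighbour features, then project and apply ReLU.
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin((adj @ feat) / deg))


class GCNPredictor(nn.Module):
    def __init__(self, num_ops, hidden=64):
        super().__init__()
        self.gc1 = GCNLayer(num_ops, hidden)
        self.gc2 = GCNLayer(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, adj, ops_onehot):
        # adj: (B, N, N) adjacency, ops_onehot: (B, N, num_ops) node features.
        h = self.gc2(adj, self.gc1(adj, ops_onehot))
        return self.head(h.mean(dim=1)).squeeze(-1)  # (B,) predicted accuracy


def fit_predictor(model, adjs, ops, accs, epochs=200, lr=1e-3):
    """Fit the predictor to (architecture, supernet-accuracy) pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        pred = model(adjs, ops)
        loss = nn.functional.mse_loss(pred, accs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

Candidates would then be ranked by the predictor's output rather than by raw one-shot estimates.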