Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
- URL: http://arxiv.org/abs/2403.11380v1
- Date: Mon, 18 Mar 2024 00:13:41 GMT
- Title: Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
- Authors: Beichen Zhang, Xiaoxing Wang, Xiaohan Qin, Junchi Yan
- Abstract summary: We propose a novel concept of Supernet Shifting, a refined search strategy combining architecture searching with supernet fine-tuning.
We show that Supernet Shifting enables transferring the supernet to a new dataset.
Comprehensive experiments show that our method has better order-preserving ability and can find a dominating architecture.
- Score: 57.175488207316654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supernet is a core component in many recent Neural Architecture Search (NAS) methods. It not only helps embody the search space but also provides a (relative) estimation of the final performance of candidate architectures. Thus, it is critical that the top architectures ranked by a supernet should be consistent with those ranked by true performance, which is known as the order-preserving ability. In this work, we analyze the order-preserving ability on the whole search space (global) and on a sub-space of top architectures (local), and empirically show that the local order-preserving ability of current two-stage NAS methods still needs to be improved. To rectify this, we propose a novel concept of Supernet Shifting, a refined search strategy combining architecture searching with supernet fine-tuning. Specifically, apart from evaluating candidate architectures, the training loss is also accumulated during search and the supernet is updated at every iteration. Since superior architectures are sampled more frequently in evolutionary searching, the supernet is encouraged to focus on top architectures, thus improving local order-preserving ability. Besides, a pre-trained supernet is often not reusable for one-shot methods. We show that Supernet Shifting enables transferring the supernet to a new dataset. Specifically, the last classifier layer is re-initialized and trained through evolutionary searching. Comprehensive experiments show that our method has better order-preserving ability and can find a dominating architecture. Moreover, the pre-trained supernet can be easily transferred to a new dataset with no loss of performance.
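The search-with-fine-tuning loop described in the abstract lends itself to a short sketch. Below is a minimal, illustrative PyTorch rendering of the Supernet Shifting idea, assuming a toy weight-sharing supernet; names such as `TinySupernet`, `supernet_shifting_step`, and `reset_classifier` are placeholders and not the authors' implementation.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySupernet(nn.Module):
    """Toy weight-sharing supernet: each layer picks one of two shared ops."""

    def __init__(self, width=16, num_classes=10):
        super().__init__()
        self.stem = nn.Linear(32, width)
        self.ops = nn.ModuleList([
            nn.ModuleList([nn.Linear(width, width), nn.Linear(width, width)])
            for _ in range(3)
        ])
        self.classifier = nn.Linear(width, num_classes)

    def forward(self, x, arch):
        h = F.relu(self.stem(x))
        for layer, choice in zip(self.ops, arch):
            h = F.relu(layer[choice](h))
        return self.classifier(h)


def supernet_shifting_step(supernet, optimizer, population, train_batch):
    """One search iteration that also fine-tunes (shifts) the supernet.

    Because evolutionary search samples superior architectures more often,
    the accumulated training loss concentrates the shared weights on them.
    """
    x, y = train_batch
    arch = random.choice(population)             # sampled candidate (biased toward top archs)
    loss = F.cross_entropy(supernet(x, arch), y)
    optimizer.zero_grad()
    loss.backward()                              # accumulate training signal while searching
    optimizer.step()                             # shift the supernet
    return loss.item()


def reset_classifier(supernet, new_num_classes):
    """For transfer: re-initialize the last classifier layer for the new dataset
    (a plausible reading of the abstract, not the authors' exact code)."""
    supernet.classifier = nn.Linear(supernet.classifier.in_features, new_num_classes)


if __name__ == "__main__":
    net = TinySupernet()
    opt = torch.optim.SGD(net.parameters(), lr=0.01)
    population = [[random.randint(0, 1) for _ in range(3)] for _ in range(8)]
    batch = (torch.randn(4, 32), torch.randint(0, 10, (4,)))
    print(supernet_shifting_step(net, opt, population, batch))
```

The point mirrored from the abstract is that the same batches used to evaluate candidates also update the shared weights, so the more frequently sampled (better) architectures dominate the fine-tuning signal.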
Related papers
- The devil is in discretization discrepancy. Robustifying Differentiable NAS with Single-Stage Searching Protocol [2.4300749758571905]
Gradient-based methods suffer from the discretization error, which can severely damage the process of obtaining the final architecture.
We introduce a novel single-stage searching protocol, which is not reliant on decoding a continuous architecture.
Our results demonstrate that this approach outperforms other DNAS methods by achieving 75.3% in the searching stage on the Cityscapes validation dataset.
arXiv Detail & Related papers (2024-05-26T15:44:53Z) - Improving Differentiable Architecture Search via Self-Distillation [20.596850268316565]
Differentiable Architecture Search (DARTS) is a simple yet efficient Neural Architecture Search (NAS) method.
We propose Self-Distillation Differentiable Neural Architecture Search (SD-DARTS) to alleviate the discretization gap.
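The summary gives little algorithmic detail, so the following is only a generic, hedged sketch of a self-distillation term that could be added to DARTS-style supernet training, with a frozen earlier copy of the supernet acting as teacher; the function name and hyperparameters are illustrative, not taken from the SD-DARTS paper.

```python
import torch
import torch.nn.functional as F


def self_distillation_loss(student_logits, teacher_logits, targets,
                           temperature=2.0, alpha=0.5):
    """Cross-entropy on labels plus KL divergence to a frozen earlier supernet."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1 - alpha) * ce + alpha * kd


if __name__ == "__main__":
    student = torch.randn(4, 10, requires_grad=True)   # current supernet outputs
    teacher = torch.randn(4, 10)                       # earlier (frozen) supernet outputs
    labels = torch.randint(0, 10, (4,))
    print(self_distillation_loss(student, teacher, labels).item())
```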
arXiv Detail & Related papers (2023-02-11T08:58:55Z) - Novelty Driven Evolutionary Neural Architecture Search [6.8129169853808795]
Evolutionary algorithms (EA) based neural architecture search (NAS) involves evaluating each architecture by training it from scratch, which is extremely time-consuming.
We propose a method called NEvoNAS wherein the NAS problem is posed as a multi-objective problem with 2 objectives: (i) maximize architecture novelty, (ii) maximize architecture fitness/accuracy.
NSGA-II is used to find the Pareto-optimal front for the NAS problem, and the best architecture on the Pareto front is returned as the searched architecture.
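As a rough illustration of the two-objective formulation, the snippet below hand-rolls Pareto-front selection over toy novelty and fitness scores; the paper itself uses NSGA-II, and the `novelty`/`fitness` functions here are placeholders.

```python
import random


def novelty(arch, archive):
    """Toy novelty: mean Hamming distance to previously seen architectures."""
    if not archive:
        return 0.0
    return sum(sum(a != b for a, b in zip(arch, other)) for other in archive) / len(archive)


def fitness(arch):
    """Placeholder for (proxy) accuracy of an architecture."""
    return sum(arch) / len(arch) + random.random() * 0.01


def pareto_front(population, archive):
    """Keep the non-dominated candidates under (novelty, fitness) maximization."""
    scored = [(novelty(a, archive), fitness(a), a) for a in population]
    front = []
    for n1, f1, a1 in scored:
        dominated = any(
            (n2 >= n1 and f2 >= f1) and (n2 > n1 or f2 > f1)
            for n2, f2, _ in scored
        )
        if not dominated:
            front.append((n1, f1, a1))
    return front


if __name__ == "__main__":
    random.seed(0)
    population = [[random.randint(0, 3) for _ in range(6)] for _ in range(20)]
    front = pareto_front(population, archive=population[:5])
    best = max(front, key=lambda t: t[1])   # best-fitness architecture on the front
    print(len(front), round(best[1], 3))
```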
arXiv Detail & Related papers (2022-04-01T03:32:55Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching approach significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
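A hedged sketch of the gradient-matching intuition follows: for each supernet edge, compare the gradients that different candidate operations induce on the shared weights, and split the edge whose candidates disagree most. The gradient vectors below are synthetic stand-ins rather than the paper's actual procedure.

```python
import torch
import torch.nn.functional as F


def gradient_match_score(grads_per_op):
    """Mean pairwise cosine similarity of per-operation gradient vectors."""
    sims = []
    for i in range(len(grads_per_op)):
        for j in range(i + 1, len(grads_per_op)):
            sims.append(F.cosine_similarity(grads_per_op[i], grads_per_op[j], dim=0))
    return torch.stack(sims).mean().item()


def pick_edge_to_split(edge_grads):
    """Split the edge whose candidate operations agree least on the gradient."""
    scores = {edge: gradient_match_score(grads) for edge, grads in edge_grads.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    torch.manual_seed(0)
    # Synthetic per-operation gradients for three edges with four candidates each.
    edge_grads = {f"edge{k}": [torch.randn(128) for _ in range(4)] for k in range(3)}
    edge, scores = pick_edge_to_split(edge_grads)
    print(edge, {k: round(v, 3) for k, v in scores.items()})
```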
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - Neural Architecture Search using Progressive Evolution [6.8129169853808795]
We propose a method called pEvoNAS for neural architecture search using evolutionary algorithms.
The whole neural architecture search space is progressively reduced to smaller search space regions with good architectures.
pEvoNAS gives better results on CIFAR-10 and CIFAR-100 while using significantly less computational resources as compared to previous EA-based methods.
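The progressive-reduction idea can be sketched as follows, with a placeholder scoring function standing in for the paper's architecture evaluation; the shrinking rule (keep the operations most common among top architectures) is illustrative only.

```python
import random


def toy_score(arch):
    """Placeholder for a proxy accuracy of an architecture."""
    return sum(arch) + random.random()


def reduce_space(space, rounds=3, samples=50, keep_top=10):
    """Progressively shrink each layer's candidate set toward good choices."""
    for _ in range(rounds):
        population = [[random.choice(ops) for ops in space] for _ in range(samples)]
        top = sorted(population, key=toy_score, reverse=True)[:keep_top]
        new_space = []
        for layer, ops in enumerate(space):
            counts = {op: sum(arch[layer] == op for arch in top) for op in ops}
            kept = sorted(ops, key=counts.get, reverse=True)[:max(1, len(ops) // 2)]
            new_space.append(kept)
        space = new_space
    return space


if __name__ == "__main__":
    random.seed(0)
    search_space = [[0, 1, 2, 3] for _ in range(5)]   # four candidate ops per layer
    print(reduce_space(search_space))
```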
arXiv Detail & Related papers (2022-03-03T08:15:14Z) - Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search is one of the most popular NAS methods for its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
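A small sketch of perturbation-based selection, under the assumption that `evaluate` stands in for supernet validation accuracy: each candidate operation on an edge is masked out in turn, and the operation whose removal hurts the score most is selected.

```python
import random


def evaluate(active_ops):
    """Placeholder for supernet validation accuracy with the given ops active."""
    return sum(max(ops) if ops else 0 for ops in active_ops.values()) + random.random() * 0.1


def select_by_perturbation(candidate_ops):
    """Pick, per edge, the operation whose removal hurts the supernet most."""
    active = {edge: list(ops) for edge, ops in candidate_ops.items()}
    selection = {}
    for edge, ops in candidate_ops.items():
        base = evaluate(active)
        drops = {}
        for op in ops:
            active[edge] = [o for o in ops if o != op]   # mask out one candidate
            drops[op] = base - evaluate(active)
        active[edge] = list(ops)                          # restore the edge
        selection[edge] = max(drops, key=drops.get)       # largest drop wins
    return selection


if __name__ == "__main__":
    random.seed(0)
    print(select_by_perturbation({"edge1": [1, 2, 3], "edge2": [0, 2]}))
```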
arXiv Detail & Related papers (2021-08-10T00:53:39Z) - Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method costs fewer samples to find the top-performance architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves the state-of-the-art ImageNet performance on the NASNet search space.
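The progressive weak-predictor loop can be sketched compactly: fit a simple predictor on the architectures evaluated so far, pick the next batch from its predicted-best region, evaluate, and refit. The linear predictor, the encoding, and the `true_accuracy` oracle below are toy stand-ins for a tabular benchmark such as NAS-Bench-201.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_accuracy(arch):
    """Toy oracle standing in for a tabular benchmark lookup."""
    return float(arch @ np.array([0.3, 0.1, 0.4, 0.2]) + rng.normal(0, 0.02))


def weak_predictor_search(num_rounds=4, pool_size=200, batch=10):
    pool = rng.random((pool_size, 4))                     # toy architecture encodings
    evaluated_x, evaluated_y = [], []
    candidates = pool[rng.choice(pool_size, batch, replace=False)]
    for _ in range(num_rounds):
        evaluated_x.extend(candidates)
        evaluated_y.extend(true_accuracy(a) for a in candidates)
        X = np.c_[np.array(evaluated_x), np.ones(len(evaluated_x))]
        w, *_ = np.linalg.lstsq(X, np.array(evaluated_y), rcond=None)
        preds = np.c_[pool, np.ones(pool_size)] @ w       # weak linear predictor
        candidates = pool[np.argsort(preds)[-batch:]]     # move toward the top region
    return max(evaluated_y)


if __name__ == "__main__":
    print(round(weak_predictor_search(), 3))
```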
arXiv Detail & Related papers (2021-02-21T01:58:43Z) - GOLD-NAS: Gradual, One-Level, Differentiable [100.12492801459105]
We propose a novel algorithm named Gradual One-Level Differentiable Neural Architecture Search (GOLD-NAS).
It introduces a variable resource constraint to one-level optimization so that the weak operators are gradually pruned out from the super-network.
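A schematic sketch of the gradual-pruning idea: each candidate operator carries a trainable gate, a resource penalty with a gradually increasing coefficient pushes unneeded gates toward zero, and operators whose gate drops below a threshold are pruned from the super-network. The model, data, and schedule are toy placeholders, not the GOLD-NAS implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedLayer(nn.Module):
    """Candidate operators mixed by trainable gates; weak ones get pruned."""

    def __init__(self, dim=16, num_ops=4):
        super().__init__()
        self.ops = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_ops))
        self.gate_logits = nn.Parameter(torch.zeros(num_ops))
        self.alive = [True] * num_ops

    def forward(self, x):
        gates = torch.sigmoid(self.gate_logits)
        return sum(g * op(x) for g, op, keep in zip(gates, self.ops, self.alive) if keep)

    def prune(self, threshold=0.05):
        gates = torch.sigmoid(self.gate_logits).detach()
        keep = [bool(g > threshold) for g in gates]
        if not any(keep):                     # always retain the strongest operator
            keep[int(gates.argmax())] = True
        self.alive = keep


if __name__ == "__main__":
    torch.manual_seed(0)
    layer = GatedLayer()
    opt = torch.optim.Adam(layer.parameters(), lr=0.05)
    x, y = torch.randn(8, 16), torch.randn(8, 16)
    for step in range(50):
        lam = 0.001 * step                    # gradually increasing resource weight
        resource = torch.sigmoid(layer.gate_logits).sum()
        loss = F.mse_loss(layer(x), y) + lam * resource
        opt.zero_grad()
        loss.backward()
        opt.step()
        if step % 10 == 9:
            layer.prune()
    print(layer.alive)
```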
arXiv Detail & Related papers (2020-07-07T10:37:49Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)