Few-shot Neural Architecture Search
- URL: http://arxiv.org/abs/2006.06863v9
- Date: Mon, 2 Aug 2021 03:41:45 GMT
- Title: Few-shot Neural Architecture Search
- Authors: Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, Tian Guo
- Abstract summary: We propose few-shot NAS that uses multiple supernetworks, called sub-supernets, each covering a different region of the search space to alleviate the undesired co-adaptation.
With only up to 7 sub-supernets, few-shot NAS establishes new SoTAs: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MFLOPs and 77.5% top-1 accuracy at 238 MFLOPs.
- Score: 35.28010196935195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient evaluation of a network architecture drawn from a large search
space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS
evaluates each architecture by training from scratch, which gives the true
performance but is extremely time-consuming. Recently, one-shot NAS
substantially reduces the computation cost by training only one supernetwork,
a.k.a. supernet, to approximate the performance of every architecture in the
search space via weight-sharing. However, the performance estimation can be
very inaccurate due to the co-adaptation among operations. In this paper, we
propose few-shot NAS, which uses multiple supernetworks, called sub-supernets,
each covering a different region of the search space, to alleviate the undesired
co-adaptation. Compared to one-shot NAS, few-shot NAS improves the accuracy of
architecture evaluation at a small increase in evaluation cost. With only up
to 7 sub-supernets, few-shot NAS establishes new SoTAs: on ImageNet, it finds
models that reach 80.5% top-1 accuracy at 600 MFLOPs and 77.5% top-1 accuracy
at 238 MFLOPs; on CIFAR-10, it reaches 98.72% top-1 accuracy without using extra
data or transfer learning. In Auto-GAN, few-shot NAS outperforms the previously
published results by up to 20%. Extensive experiments show that few-shot NAS
significantly improves various one-shot methods, including 4 gradient-based and
6 search-based methods on 3 different tasks in NAS-Bench-201 and
NAS-Bench-1Shot1.
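To make the supernet-splitting idea concrete, here is a minimal PyTorch sketch (not the authors' code) of how a one-shot supernet over a toy two-edge search space can be split into sub-supernets by fixing the operation on the first edge. The candidate operations, the two-edge cell, and the split-on-one-edge rule are illustrative placeholders, not the paper's actual search space or splitting heuristic.

```python
# Illustrative sketch only: a toy one-shot supernet and its split into
# sub-supernets (few-shot NAS style). All ops, shapes, and the split rule
# are placeholders, not the paper's implementation.
import torch
import torch.nn as nn

CANDIDATE_OPS = {
    'conv3x3': lambda c: nn.Conv2d(c, c, 3, padding=1),
    'conv1x1': lambda c: nn.Conv2d(c, c, 1),
    'skip':    lambda c: nn.Identity(),
}

class MixedOp(nn.Module):
    """One edge of the supernet: holds every candidate op; a sampled
    architecture picks exactly one of them, so every architecture that
    uses this edge shares these weights."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleDict({name: make(channels) for name, make in CANDIDATE_OPS.items()})

    def forward(self, x, choice):
        return self.ops[choice](x)

class Supernet(nn.Module):
    """Toy two-edge supernet. With `fixed_first` set, it acts as a
    sub-supernet covering only the architectures whose first edge uses
    that operation, which limits weight sharing (and co-adaptation)
    to that region of the search space."""
    def __init__(self, channels=8, num_classes=10, fixed_first=None):
        super().__init__()
        self.fixed_first = fixed_first
        self.edge1 = MixedOp(channels)
        self.edge2 = MixedOp(channels)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x, arch):
        first, second = arch
        if self.fixed_first is not None and first != self.fixed_first:
            raise ValueError('architecture lies outside this sub-supernet region')
        h = self.edge2(self.edge1(x, first), second)
        return self.head(h.mean(dim=(2, 3)))  # global average pool -> logits

# One-shot NAS: a single supernet approximates all 3 x 3 = 9 architectures.
one_shot = Supernet()

# Few-shot NAS: split on the first edge -> one sub-supernet per candidate op,
# each warm-started from the one-shot supernet's weights and then trained
# only on architectures from its own region.
sub_supernets = {}
for op_name in CANDIDATE_OPS:
    sub = Supernet(fixed_first=op_name)
    sub.load_state_dict(one_shot.state_dict())
    sub_supernets[op_name] = sub

# An architecture is evaluated with the sub-supernet that covers it.
x = torch.randn(2, 8, 16, 16)
logits = sub_supernets['conv3x3'](x, ('conv3x3', 'skip'))
print(logits.shape)  # torch.Size([2, 10])
```

In the paper, the same splitting principle is applied to real one-shot search spaces and capped at a handful of sub-supernets (up to 7 in the reported results), so the extra supernet-training cost stays small while the ranking of candidate architectures improves.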
Related papers
- SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss
Landscape [14.550053893504764]
We introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS.
In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up".
We present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy.
arXiv Detail & Related papers (2023-11-22T05:25:24Z) - Are Neural Architecture Search Benchmarks Well Designed? A Deeper Look
Into Operation Importance [5.065947993017157]
We conduct an empirical analysis of the widely used NAS-Bench-101, NAS-Bench-201 and TransNAS-Bench-101 benchmarks.
We found that only a subset of the operation pool is required to generate architectures close to the upper-bound of the performance range.
We consistently found convolution layers to have the highest impact on the architecture's performance.
arXiv Detail & Related papers (2023-03-29T18:03:28Z) - ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients [17.139381064317778]
We propose a new zero-shot proxy, ZiCo, that works consistently better than the #Params proxy (a toy sketch of such a gradient-statistics proxy follows this list).
ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively.
arXiv Detail & Related papers (2023-01-26T18:38:56Z) - RD-NAS: Enhancing One-shot Supernet Ranking Ability via Ranking
Distillation from Zero-cost Proxies [20.076610051602618]
We propose Ranking Distillation one-shot NAS (RD-NAS) to enhance ranking consistency.
Our evaluation on the NAS-Bench-201 and ResNet-based search spaces demonstrates that RD-NAS achieves 10.7% and 9.65% improvements in ranking ability.
arXiv Detail & Related papers (2023-01-24T07:49:04Z) - When NAS Meets Trees: An Efficient Algorithm for Neural Architecture
Search [117.89827740405694]
A key challenge in neural architecture search (NAS) is how to explore the huge search space wisely.
We propose a new NAS method called TNAS (NAS with trees), which improves search efficiency by exploring only a small number of architectures.
In NAS-Bench-201, TNAS finds the globally optimal architecture on CIFAR-10, with a test accuracy of 94.37%, in four GPU hours.
arXiv Detail & Related papers (2022-04-11T07:34:21Z) - BaLeNAS: Differentiable Architecture Search via the Bayesian Learning
Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates the neural architecture search as a distribution learning problem through relaxing the architecture weights into Gaussian distributions.
We demonstrate how the differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability.
arXiv Detail & Related papers (2021-11-25T18:13:42Z) - TND-NAS: Towards Non-differentiable Objectives in Progressive
Differentiable NAS Framework [6.895590095853327]
Differentiable architecture search has gradually become the mainstream research topic in the field of Neural Architecture Search (NAS).
Recent differentiable NAS also aims at further improving the search performance and reducing the GPU-memory consumption.
We propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with compatibility with non-differentiable metrics in multi-objective NAS.
arXiv Detail & Related papers (2021-11-06T14:19:36Z) - Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z) - Powering One-shot Topological NAS with Stabilized Share-parameter Proxy [65.09967910722932]
The one-shot NAS method has attracted much interest from the research community due to its remarkable training efficiency and its capacity to discover high-performance models.
In this work, we try to enhance one-shot NAS by exploring high-performing network architectures in our large-scale Topology Augmented Search Space.
The proposed method achieves state-of-the-art performance under Multiply-Adds (MAdds) constraint on ImageNet.
arXiv Detail & Related papers (2020-05-21T08:18:55Z) - DSNAS: Direct Neural Architecture Search without Parameter Retraining [112.02966105995641]
Based on this observation, we propose a new problem definition for NAS that is task-specific and end-to-end.
We propose DSNAS, an efficient differentiable NAS framework that simultaneously optimizes architecture and parameters with a low-biased Monte Carlo estimate.
DSNAS successfully discovers networks with comparable accuracy (74.4%) on ImageNet in 420 GPU hours, reducing the total time by more than 34%.
arXiv Detail & Related papers (2020-02-21T04:41:47Z) - NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture
Search [55.12928953187342]
We propose an extension to NAS-Bench-101: NAS-Bench-201 with a different search space, results on multiple datasets, and more diagnostic information.
NAS-Bench-201 has a fixed search space and provides a unified benchmark for almost any up-to-date NAS algorithm.
We provide additional diagnostic information, such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms.
arXiv Detail & Related papers (2020-01-02T05:28:26Z)
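Several entries above (ZiCo, RD-NAS) rank untrained architectures with zero-cost proxies computed from gradient statistics. The sketch below is a toy, self-contained paraphrase of that idea: it scores a network by the inverse coefficient of variation of its gradients (mean absolute gradient divided by gradient standard deviation, aggregated per parameter tensor). The exact ZiCo definition, normalization, and layer grouping are given in the ZiCo paper; the tiny model and random data here are placeholders.

```python
# Toy sketch of a ZiCo-style zero-cost proxy: score an *untrained* network by
# the inverse coefficient of variation of its gradients across a few random
# mini-batches. The model, data, and epsilon constants are placeholders; see
# the ZiCo paper for the exact proxy definition.
import torch
import torch.nn as nn

def gradient_statistics_proxy(model, loss_fn, batches):
    """Collect per-parameter gradients over several batches, then combine
    mean(|g|) / std(g), summed per parameter tensor and log-aggregated."""
    per_batch_grads = {name: [] for name, p in model.named_parameters() if p.requires_grad}
    for inputs, targets in batches:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                per_batch_grads[name].append(p.grad.detach().flatten().clone())

    score = 0.0
    for name, grads in per_batch_grads.items():
        if len(grads) < 2:
            continue  # need at least two batches to estimate a standard deviation
        g = torch.stack(grads)                      # [num_batches, num_params]
        ratio = g.abs().mean(dim=0) / (g.std(dim=0) + 1e-8)
        score += torch.log(ratio.sum() + 1e-8).item()
    return score

if __name__ == '__main__':
    torch.manual_seed(0)
    net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 32), nn.ReLU(), nn.Linear(32, 10))
    batches = [(torch.randn(16, 3, 8, 8), torch.randint(0, 10, (16,))) for _ in range(4)]
    print(gradient_statistics_proxy(net, nn.CrossEntropyLoss(), batches))
```

Higher scores are taken to indicate more trainable architectures, so candidates can be ranked without training a supernet or any individual architecture.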
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.