DSNAS: Direct Neural Architecture Search without Parameter Retraining
- URL: http://arxiv.org/abs/2002.09128v2
- Date: Wed, 1 Apr 2020 00:31:37 GMT
- Title: DSNAS: Direct Neural Architecture Search without Parameter Retraining
- Authors: Shoukang Hu, Sirui Xie, Hehui Zheng, Chunxiao Liu, Jianping Shi,
Xunying Liu, Dahua Lin
- Abstract summary: We propose a new problem definition for NAS, task-specific end-to-end, based on this observation.
We propose DSNAS, an efficient differentiable NAS framework that simultaneously optimize architecture and parameters with a low-biased Monte Carlo estimate.
DSNAS successfully discovers networks with comparable accuracy (74.4%) on ImageNet in 420 GPU hours, reducing the total time by more than 34%.
- Score: 112.02966105995641
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: If NAS methods are solutions, what is the problem? Most existing NAS methods
require two-stage parameter optimization. However, performance of the same
architecture in the two stages correlates poorly. In this work, we propose a
new problem definition for NAS, task-specific end-to-end, based on this
observation. We argue that given a computer vision task for which a NAS method
is expected, this definition can reduce the vaguely-defined NAS evaluation to
i) accuracy of this task and ii) the total computation consumed to finally
obtain a model with satisfying accuracy. Seeing that most existing methods do
not solve this problem directly, we propose DSNAS, an efficient differentiable
NAS framework that simultaneously optimizes architecture and parameters with a
low-biased Monte Carlo estimate. Child networks derived from DSNAS can be
deployed directly without parameter retraining. Compared with two-stage
methods, DSNAS successfully discovers networks with comparable accuracy (74.4%)
on ImageNet in 420 GPU hours, reducing the total time by more than 34%. Our
implementation is available at https://github.com/SNAS-Series/SNAS-Series.
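To make the joint optimization concrete, here is a minimal PyTorch sketch of the single-path idea: sample one candidate operation per edge from a softmax-parameterized categorical distribution, run only that sub-network, and let a straight-through one-hot carry gradients to both the sampled operation's weights and the architecture logits, so architecture and parameters are updated in the same step. This is an illustrative sketch under those assumptions, not the authors' implementation (the repository above holds that), and the exact low-biased Monte Carlo gradient is derived in the paper; names such as `MixedEdge` and `search_step` are made up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedEdge(nn.Module):
    """One supernet edge: candidate ops plus learnable architecture logits.

    A single op is sampled per forward pass (single-path), and a straight-through
    one-hot lets gradients reach both the sampled op's weights and the logits,
    so architecture and parameters can be optimized in the same step.
    Illustrative stand-in, not the paper's exact low-biased estimator.
    """

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.arch_logits = nn.Parameter(torch.zeros(len(ops)))

    def forward(self, x):
        probs = F.softmax(self.arch_logits, dim=-1)
        index = torch.multinomial(probs.detach(), 1).item()   # discrete sample
        hard = torch.zeros_like(probs)
        hard[index] = 1.0
        # Straight-through: forward value is the hard one-hot, gradient flows via probs.
        weight = (hard - probs).detach() + probs
        return weight[index] * self.ops[index](x)              # only the sampled op runs


def search_step(model, batch, labels, weight_opt, arch_opt):
    """One task-specific end-to-end step: weights and architecture updated together."""
    weight_opt.zero_grad()
    arch_opt.zero_grad()
    loss = F.cross_entropy(model(batch), labels)
    loss.backward()
    weight_opt.step()
    arch_opt.step()
    return loss.item()
```

Because the supernet weights are trained in exactly this single-path regime, the argmax child network can, at least in the paper's setting, be deployed directly without a separate retraining stage.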
Related papers
- Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets [55.2118691522524]
Distillation-aware Neural Architecture Search (DaNAS) aims to search for an optimal student architecture.
We propose a distillation-aware meta accuracy prediction model, DaSS (Distillation-aware Student Search), which can predict a given architecture's final performance on a dataset.
arXiv Detail & Related papers (2023-05-26T14:00:35Z)
- When NAS Meets Trees: An Efficient Algorithm for Neural Architecture Search [117.89827740405694]
A key challenge in neural architecture search (NAS) is how to explore the huge search space wisely.
We propose a new NAS method called TNAS (NAS with trees), which improves search efficiency by exploring only a small number of architectures.
TNAS finds the globally optimal architecture on CIFAR-10 in NAS-Bench-201, reaching a test accuracy of 94.37% in four GPU hours.
arXiv Detail & Related papers (2022-04-11T07:34:21Z)
- TND-NAS: Towards Non-differentiable Objectives in Progressive Differentiable NAS Framework [6.895590095853327]
Differentiable architecture search has gradually become the mainstream research topic in the field of Neural Architecture Search (NAS).
Recent differentiable NAS work also aims at further improving search performance and reducing GPU-memory consumption.
We propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with compatibility with non-differentiable metrics in multi-objective NAS.
arXiv Detail & Related papers (2021-11-06T14:19:36Z)
- Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift [128.32670289503025]
Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy.
However, the ranking correlation between the architectures' predicted accuracy and their actual capability is poor, which causes the dilemma of existing NAS methods.
We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.
We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS.
arXiv Detail & Related papers (2021-08-22T09:08:48Z)
- Accelerating Neural Architecture Search via Proxy Data [17.86463546971522]
We propose a novel proxy data selection method tailored for neural architecture search (NAS).
Executing DARTS with the proposed selection requires only 40 minutes on CIFAR-10 and 7.5 hours on ImageNet with a single GPU.
When the architecture searched on ImageNet using the proposed selection is inversely transferred to CIFAR-10, a state-of-the-art test error of 2.4% is yielded.
arXiv Detail & Related papers (2021-06-09T03:08:53Z)
- AdvantageNAS: Efficient Neural Architecture Search with Credit Assignment [23.988393741948485]
We propose a novel search strategy for one-shot and sparse propagation NAS, namely AdvantageNAS.
AdvantageNAS is a gradient-based approach that improves the search efficiency by introducing credit assignment in gradient estimation for architecture updates.
Experiments on the NAS-Bench-201 and PTB datasets show that AdvantageNAS discovers an architecture with higher performance under a limited time budget.
arXiv Detail & Related papers (2020-12-11T05:45:03Z)
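To illustrate the credit-assignment idea in the AdvantageNAS entry above, the sketch below applies a generic advantage-based (baseline-subtracted) score-function update to architecture logits. It is a hedged sketch of the general technique rather than the exact AdvantageNAS estimator; `advantage_step`, `sampled_ops`, and the running baseline are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def advantage_step(arch_logits, sampled_ops, losses, baseline, arch_opt, beta=0.9):
    """Advantage-based (credit-assignment) update for architecture logits.

    sampled_ops[k][e] is the op index chosen on edge e in the k-th sampled
    architecture, and losses[k] is its one-shot validation loss. Subtracting a
    running baseline turns raw losses into advantages, so each decision is
    credited only with its improvement over the average architecture.
    Generic sketch of the technique, not the exact AdvantageNAS estimator.
    """
    arch_opt.zero_grad()
    log_probs = F.log_softmax(arch_logits, dim=-1)     # shape: (edges, ops)
    advantages = baseline - losses.detach()            # lower loss => positive credit
    surrogate = torch.zeros(())
    for arch, adv in zip(sampled_ops, advantages):
        logp = sum(log_probs[e, op] for e, op in enumerate(arch))
        surrogate = surrogate - adv * logp              # gradient ascent on advantage
    (surrogate / len(losses)).backward()
    arch_opt.step()
    # Update the running baseline for the next step.
    return beta * baseline + (1 - beta) * losses.mean().item()
```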
- Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients [17.501966450686282]
We develop an efficient Neural Architecture Search (NAS) method via Straight-Through (ST) gradients, called ST-NAS.
Experiments on the widely benchmarked 80-hour WSJ and 300-hour Switchboard datasets show that ST-NAS-induced architectures significantly outperform the human-designed architecture on both datasets.
Strengths of ST-NAS, such as architecture transferability and low memory and time cost, are also reported.
arXiv Detail & Related papers (2020-11-11T09:18:58Z)
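The Straight-Through (ST) gradient mentioned in the ST-NAS entry above can be sketched in a few lines: the forward pass commits to a single hard operation per edge, while the backward pass behaves as if the soft softmax weights had been used, so the architecture logits still receive gradients. This is a generic illustration of ST gradients, not ST-NAS's actual code; `straight_through_weights` and the usage comment are assumptions.

```python
import torch
import torch.nn.functional as F

def straight_through_weights(arch_logits):
    """Straight-Through (ST) selection weights for one edge.

    Forward: a hard one-hot selects a single candidate op (cheap, discrete).
    Backward: gradients flow as if the soft softmax weights had been used, so
    the architecture logits remain trainable by backpropagation.
    Generic ST sketch; not the exact ST-NAS implementation.
    """
    probs = F.softmax(arch_logits, dim=-1)
    hard = F.one_hot(probs.argmax(dim=-1), num_classes=probs.shape[-1]).float()
    return (hard - probs).detach() + probs   # value == hard, gradient == d probs

# Hypothetical usage: weight the pre-computed outputs of the candidate ops.
# op_outputs has shape (num_ops, batch, features); only the argmax op matters
# in the forward value, but every logit receives a gradient.
# edge_output = torch.einsum('o,obf->bf', straight_through_weights(logits), op_outputs)
```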
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce the huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- Few-shot Neural Architecture Search [35.28010196935195]
We propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate undesired co-adaptation.
With only up to 7 sub-supernets, few-shot NAS establishes new state-of-the-art results: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600M FLOPs and 77.5% top-1 accuracy at 238M FLOPs.
arXiv Detail & Related papers (2020-06-11T22:36:01Z)
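The sub-supernet idea in the few-shot NAS entry above can be illustrated by splitting a search space along one decision: each sub-supernet fixes that decision to a single candidate and keeps the full choice set everywhere else, so the sub-supernets jointly cover the space without overlap. A toy sketch; the op names and the choice of split edge are made up.

```python
from itertools import product

def split_into_sub_supernets(ops_per_edge, split_edge=0):
    """Partition a one-shot search space into sub-supernets by fixing one edge.

    Each sub-supernet pins `split_edge` to a single candidate op and keeps every
    other edge's full choice set, so together the sub-supernets cover the whole
    space without overlap (toy illustration of the few-shot NAS idea).
    """
    sub_supernets = []
    for fixed_op in ops_per_edge[split_edge]:
        space = list(ops_per_edge)
        space[split_edge] = [fixed_op]      # this sub-supernet's fixed decision
        sub_supernets.append(space)
    return sub_supernets

# Toy example: 3 edges with 3 candidate ops each -> 3 sub-supernets of 9 architectures.
ops = [["skip", "conv3x3", "conv5x5"]] * 3
subs = split_into_sub_supernets(ops)
assert sum(len(list(product(*s))) for s in subs) == len(list(product(*ops)))  # 27 total
```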
- NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search [55.12928953187342]
We propose NAS-Bench-201, an extension of NAS-Bench-101 with a different search space, results on multiple datasets, and more diagnostic information.
NAS-Bench-201 has a fixed search space and provides a unified benchmark for almost any up-to-date NAS algorithms.
We provide additional diagnostic information, such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms.
arXiv Detail & Related papers (2020-01-02T05:28:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.