DSNAS: Direct Neural Architecture Search without Parameter Retraining
- URL: http://arxiv.org/abs/2002.09128v2
- Date: Wed, 1 Apr 2020 00:31:37 GMT
- Title: DSNAS: Direct Neural Architecture Search without Parameter Retraining
- Authors: Shoukang Hu, Sirui Xie, Hehui Zheng, Chunxiao Liu, Jianping Shi,
Xunying Liu, Dahua Lin
- Abstract summary: We propose a new problem definition for NAS, task-specific end-to-end, based on this observation.
We propose DSNAS, an efficient differentiable NAS framework that simultaneously optimize architecture and parameters with a low-biased Monte Carlo estimate.
DSNAS successfully discovers networks with comparable accuracy (74.4%) on ImageNet in 420 GPU hours, reducing the total time by more than 34%.
- Score: 112.02966105995641
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: If NAS methods are solutions, what is the problem? Most existing NAS methods
require two-stage parameter optimization. However, performance of the same
architecture in the two stages correlates poorly. In this work, we propose a
new problem definition for NAS, task-specific end-to-end, based on this
observation. We argue that given a computer vision task for which a NAS method
is expected, this definition can reduce the vaguely-defined NAS evaluation to
i) accuracy of this task and ii) the total computation consumed to finally
obtain a model with satisfying accuracy. Seeing that most existing methods do
not solve this problem directly, we propose DSNAS, an efficient differentiable
NAS framework that simultaneously optimizes architecture and parameters with a
low-biased Monte Carlo estimate. Child networks derived from DSNAS can be
deployed directly without parameter retraining. Compared with two-stage
methods, DSNAS successfully discovers networks with comparable accuracy (74.4%)
on ImageNet in 420 GPU hours, reducing the total time by more than 34%. Our
implementation is available at https://github.com/SNAS-Series/SNAS-Series.
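To make the joint optimization concrete, here is a minimal PyTorch sketch of the single-path idea: sample one candidate operation per edge from a softmax-parameterized categorical distribution, run only that sub-network, and let a straight-through one-hot carry gradients to both the sampled operation's weights and the architecture logits, so architecture and parameters are updated in the same step. This is an illustrative sketch under those assumptions, not the authors' implementation (the repository above holds that), and the exact low-biased Monte Carlo gradient is derived in the paper; names such as `MixedEdge` and `search_step` are made up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedEdge(nn.Module):
    """One supernet edge: candidate ops plus learnable architecture logits.

    A single op is sampled per forward pass (single-path), and a straight-through
    one-hot lets gradients reach both the sampled op's weights and the logits,
    so architecture and parameters can be optimized in the same step.
    Illustrative stand-in, not the paper's exact low-biased estimator.
    """

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.arch_logits = nn.Parameter(torch.zeros(len(ops)))

    def forward(self, x):
        probs = F.softmax(self.arch_logits, dim=-1)
        index = torch.multinomial(probs.detach(), 1).item()   # discrete sample
        hard = torch.zeros_like(probs)
        hard[index] = 1.0
        # Straight-through: forward value is the hard one-hot, gradient flows via probs.
        weight = (hard - probs).detach() + probs
        return weight[index] * self.ops[index](x)              # only the sampled op runs


def search_step(model, batch, labels, weight_opt, arch_opt):
    """One task-specific end-to-end step: weights and architecture updated together."""
    weight_opt.zero_grad()
    arch_opt.zero_grad()
    loss = F.cross_entropy(model(batch), labels)
    loss.backward()
    weight_opt.step()
    arch_opt.step()
    return loss.item()
```

Because the supernet weights are trained in exactly this single-path regime, the argmax child network can, at least in the paper's setting, be deployed directly without a separate retraining stage.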
Related papers
- Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets [55.2118691522524]
Distillation-aware Neural Architecture Search (DaNAS) aims to search for an optimal student architecture.
We propose a distillation-aware meta accuracy prediction model, DaSS (Distillation-aware Student Search), which can predict a given architecture's final performance on a dataset.
arXiv Detail & Related papers (2023-05-26T14:00:35Z)
- When NAS Meets Trees: An Efficient Algorithm for Neural Architecture Search [117.89827740405694]
A key challenge in neural architecture search (NAS) is how to explore the huge search space wisely.
We propose a new NAS method called TNAS (NAS with trees), which improves search efficiency by exploring only a small number of architectures.
TNAS finds the globally optimal architecture on CIFAR-10 in NAS-Bench-201, reaching a test accuracy of 94.37% in four GPU hours.
arXiv Detail & Related papers (2022-04-11T07:34:21Z)
- TND-NAS: Towards Non-differentiable Objectives in Progressive Differentiable NAS Framework [6.895590095853327]
Differentiable architecture search has gradually become the mainstream research topic in the field of Neural Architecture Search (NAS).
Recent differentiable NAS work also aims at further improving search performance and reducing GPU-memory consumption.
We propose TND-NAS, which combines the high efficiency of the differentiable NAS framework with compatibility with non-differentiable metrics in multi-objective NAS.
arXiv Detail & Related papers (2021-11-06T14:19:36Z)
- Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift [128.32670289503025]
Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy.
However, the ranking correlation between the architectures' predicted accuracy and their actual capability is poor, which causes the dilemma of existing NAS methods.
We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.
We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS.
arXiv Detail & Related papers (2021-08-22T09:08:48Z)
- Accelerating Neural Architecture Search via Proxy Data [17.86463546971522]
We propose a novel proxy data selection method tailored for neural architecture search (NAS).
Executing DARTS with the proposed selection requires only 40 minutes on CIFAR-10 and 7.5 hours on ImageNet with a single GPU.
When the architecture searched on ImageNet using the proposed selection is inversely transferred to CIFAR-10, a state-of-the-art test error of 2.4% is yielded.
arXiv Detail & Related papers (2021-06-09T03:08:53Z)
- AdvantageNAS: Efficient Neural Architecture Search with Credit Assignment [23.988393741948485]
We propose a novel search strategy for one-shot and sparse propagation NAS, namely AdvantageNAS.
AdvantageNAS is a gradient-based approach that improves the search efficiency by introducing credit assignment in gradient estimation for architecture updates.
Experiments on the NAS-Bench-201 and PTB datasets show that AdvantageNAS discovers an architecture with higher performance under a limited time budget.
arXiv Detail & Related papers (2020-12-11T05:45:03Z)
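To illustrate the credit-assignment idea in the AdvantageNAS entry above, the sketch below applies a generic advantage-based (baseline-subtracted) score-function update to architecture logits. It is a hedged sketch of the general technique rather than the exact AdvantageNAS estimator; `advantage_step`, `sampled_ops`, and the running baseline are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def advantage_step(arch_logits, sampled_ops, losses, baseline, arch_opt, beta=0.9):
    """Advantage-based (credit-assignment) update for architecture logits.

    sampled_ops[k][e] is the op index chosen on edge e in the k-th sampled
    architecture, and losses[k] is its one-shot validation loss. Subtracting a
    running baseline turns raw losses into advantages, so each decision is
    credited only with its improvement over the average architecture.
    Generic sketch of the technique, not the exact AdvantageNAS estimator.
    """
    arch_opt.zero_grad()
    log_probs = F.log_softmax(arch_logits, dim=-1)     # shape: (edges, ops)
    advantages = baseline - losses.detach()            # lower loss => positive credit
    surrogate = torch.zeros(())
    for arch, adv in zip(sampled_ops, advantages):
        logp = sum(log_probs[e, op] for e, op in enumerate(arch))
        surrogate = surrogate - adv * logp              # gradient ascent on advantage
    (surrogate / len(losses)).backward()
    arch_opt.step()
    # Update the running baseline for the next step.
    return beta * baseline + (1 - beta) * losses.mean().item()
```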
- Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients [17.501966450686282]
We develop an efficient Neural Architecture Search (NAS) method via Straight-Through (ST) gradients, called ST-NAS.
Experiments on the widely benchmarked 80-hour WSJ and 300-hour Switchboard datasets show that ST-NAS-induced architectures significantly outperform the human-designed architecture on both datasets.
Strengths of ST-NAS, such as architecture transferability and low memory and time cost, are also reported.
arXiv Detail & Related papers (2020-11-11T09:18:58Z)
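The Straight-Through (ST) gradient mentioned in the ST-NAS entry above can be sketched in a few lines: the forward pass commits to a single hard operation per edge, while the backward pass behaves as if the soft softmax weights had been used, so the architecture logits still receive gradients. This is a generic illustration of ST gradients, not ST-NAS's actual code; `straight_through_weights` and the usage comment are assumptions.

```python
import torch
import torch.nn.functional as F

def straight_through_weights(arch_logits):
    """Straight-Through (ST) selection weights for one edge.

    Forward: a hard one-hot selects a single candidate op (cheap, discrete).
    Backward: gradients flow as if the soft softmax weights had been used, so
    the architecture logits remain trainable by backpropagation.
    Generic ST sketch; not the exact ST-NAS implementation.
    """
    probs = F.softmax(arch_logits, dim=-1)
    hard = F.one_hot(probs.argmax(dim=-1), num_classes=probs.shape[-1]).float()
    return (hard - probs).detach() + probs   # value == hard, gradient == d probs

# Hypothetical usage: weight the pre-computed outputs of the candidate ops.
# op_outputs has shape (num_ops, batch, features); only the argmax op matters
# in the forward value, but every logit receives a gradient.
# edge_output = torch.einsum('o,obf->bf', straight_through_weights(logits), op_outputs)
```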
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce the huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- Few-shot Neural Architecture Search [35.28010196935195]
We propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate undesired co-adaptation.
With only up to 7 sub-supernets, few-shot NAS establishes new state-of-the-art results: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600M FLOPs and 77.5% top-1 accuracy at 238M FLOPs.
arXiv Detail & Related papers (2020-06-11T22:36:01Z)
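The sub-supernet idea in the few-shot NAS entry above can be illustrated by splitting a search space along one decision: each sub-supernet fixes that decision to a single candidate and keeps the full choice set everywhere else, so the sub-supernets jointly cover the space without overlap. A toy sketch; the op names and the choice of split edge are made up.

```python
from itertools import product

def split_into_sub_supernets(ops_per_edge, split_edge=0):
    """Partition a one-shot search space into sub-supernets by fixing one edge.

    Each sub-supernet pins `split_edge` to a single candidate op and keeps every
    other edge's full choice set, so together the sub-supernets cover the whole
    space without overlap (toy illustration of the few-shot NAS idea).
    """
    sub_supernets = []
    for fixed_op in ops_per_edge[split_edge]:
        space = list(ops_per_edge)
        space[split_edge] = [fixed_op]      # this sub-supernet's fixed decision
        sub_supernets.append(space)
    return sub_supernets

# Toy example: 3 edges with 3 candidate ops each -> 3 sub-supernets of 9 architectures.
ops = [["skip", "conv3x3", "conv5x5"]] * 3
subs = split_into_sub_supernets(ops)
assert sum(len(list(product(*s))) for s in subs) == len(list(product(*ops)))  # 27 total
```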
- NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search [55.12928953187342]
We propose NAS-Bench-201, an extension of NAS-Bench-101 with a different search space, results on multiple datasets, and more diagnostic information.
NAS-Bench-201 has a fixed search space and provides a unified benchmark for almost any up-to-date NAS algorithms.
We provide additional diagnostic information, such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms.
arXiv Detail & Related papers (2020-01-02T05:28:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.