ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse
Coding
- URL: http://arxiv.org/abs/2010.06176v1
- Date: Tue, 13 Oct 2020 04:34:24 GMT
- Title: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse
Coding
- Authors: Yibo Yang, Hongyang Li, Shan You, Fei Wang, Chen Qian, Zhouchen Lin
- Abstract summary: We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performance on both CIFAR-10 and ImageNet at the cost of only the evaluation time.
- Score: 86.40042104698792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) aims to produce the optimal sparse solution
from a high-dimensional space spanned by all candidate connections. Current
gradient-based NAS methods commonly ignore the constraint of sparsity in the
search phase, but project the optimized solution onto a sparse one by
post-processing. As a result, the dense super-net used for search is inefficient to
train and exhibits a gap from the projected architecture used for evaluation. In this
paper, we formulate neural architecture search as a sparse coding problem. We
perform the differentiable search on a compressed lower-dimensional space that
has the same validation loss as the original sparse solution space, and recover
an architecture by solving the sparse coding problem. The differentiable search
and architecture recovery are optimized in an alternate manner. By doing so,
our network for search at each update satisfies the sparsity constraint and is
efficient to train. In order to also eliminate the depth and width gap between
the network in search and the target-net in evaluation, we further propose a
method to search and evaluate in one stage under the target-net settings. When
training finishes, architecture variables are absorbed into network weights.
Thus we get the searched architecture and optimized parameters in a single run.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for
search. Our one-stage method produces state-of-the-art performance on both
CIFAR-10 and ImageNet at the cost of only the evaluation time.
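To make the sparse-coding view concrete, below is a minimal NumPy sketch of plain ISTA (iterative shrinkage-thresholding), the generic recovery step the method's name refers to: it solves min_z 0.5*||A z - b||^2 + lam*||z||_1, where a nonzero entry of z would correspond to a kept candidate connection. The matrix A, observation b, sparsity weight lam, and toy dimensions are illustrative assumptions, not the paper's actual compression matrix or alternating search-and-recovery procedure.

import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the l1 norm: shrink each entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, b, lam, num_iters=500):
    """Minimize 0.5*||A z - b||^2 + lam*||z||_1 by iterative shrinkage-thresholding.

    Illustrative only: A, b, and lam stand in for a compression matrix,
    compressed observations, and a sparsity weight; they are not the paper's
    actual quantities.
    """
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the gradient.
    L = np.linalg.norm(A, 2) ** 2
    z = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ z - b)                    # gradient of the smooth term
        z = soft_threshold(z - grad / L, lam / L)   # proximal (shrinkage) step
    return z

# Toy usage: recover a 3-sparse vector from a random 20x50 compression matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
z_true = np.zeros(50)
z_true[[3, 17, 42]] = [1.0, -0.5, 0.8]
b = A @ z_true
z_hat = ista(A, b, lam=0.05)
print(np.nonzero(np.abs(z_hat) > 1e-3)[0])          # indices of selected "connections"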
Related papers
- Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation [27.94898516315886]
We search for an optimal network structure that can run in real time for this problem.
A novel Solution Space Regularization (SSR) loss is first proposed to effectively encourage the supernet to converge to its discrete counterpart.
A new Hierarchical and Progressive Solution Space Shrinking method is presented to further improve search efficiency.
arXiv Detail & Related papers (2022-08-10T11:07:33Z)
- DASS: Differentiable Architecture Search for Sparse neural networks [0.5735035463793009]
We find that the architectures designed for dense networks by differentiable architecture search methods are ineffective when pruning mechanisms are applied to them.
In this paper, we propose a new method to search for sparsity-friendly neural architectures.
We do this by adding two new sparse operations to the search space and modifying the search objective.
arXiv Detail & Related papers (2022-07-14T14:53:50Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method to search out desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
- Making Differentiable Architecture Search less local [9.869449181400466]
Differentiable neural architecture search (DARTS) is a promising NAS approach that dramatically increases search efficiency.
It has been shown to suffer from performance collapse, where the search often leads to detrimental architectures.
We develop a more global optimisation scheme that is able to better explore the space without changing the DARTS problem formulation.
arXiv Detail & Related papers (2021-04-21T10:36:43Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present the HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search [84.4140192638394]
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation.
In this paper, we introduce EnTranNAS that is composed of Engine-cells and Transit-cells.
Our method also saves considerable memory and computation cost, which speeds up the search process.
arXiv Detail & Related papers (2021-01-27T12:16:47Z)
- DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS).
We present DA-NAS, which can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods, while the accuracy is currently state-of-the-art at 76.2% under a small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z)