DrNAS: Dirichlet Neural Architecture Search
- URL: http://arxiv.org/abs/2006.10355v4
- Date: Tue, 16 Mar 2021 02:32:55 GMT
- Title: DrNAS: Dirichlet Neural Architecture Search
- Authors: Xiangning Chen, Ruochen Wang, Minhao Cheng, Xiaocheng Tang, Cho-Jui
Hsieh
- Abstract summary: We treat the continuously relaxed architecture mixing weights as random variables modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with a gradient-based optimizer.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
- Score: 88.56953713817545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel differentiable architecture search method by
formulating it as a distribution learning problem. We treat the continuously
relaxed architecture mixing weights as random variables modeled by a Dirichlet
distribution. With recently developed pathwise derivatives, the Dirichlet
parameters can be easily optimized with a gradient-based optimizer in an
end-to-end manner. This formulation improves the generalization ability and
induces stochasticity that naturally encourages exploration in the search
space. Furthermore, to alleviate the large memory consumption of differentiable
NAS, we propose a simple yet effective progressive learning scheme that enables
searching directly on large-scale tasks, eliminating the gap between search and
evaluation phases. Extensive experiments demonstrate the effectiveness of our
method. Specifically, we obtain test errors of 2.46% on CIFAR-10 and 23.7% on
ImageNet under the mobile setting. On NAS-Bench-201, we also achieve
state-of-the-art results on all three datasets and provide insights for the
effective design of neural architecture search algorithms.
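As a rough illustration of the formulation above, here is a minimal sketch, assuming a PyTorch-style setup (this is not the authors' released code): the mixing weights over candidate operations are drawn from a learnable Dirichlet distribution, and rsample() supplies the pathwise derivatives that let a standard gradient-based optimizer update the concentration parameters end to end. The operation count, learning rate, and placeholder loss below are illustrative assumptions.

```python
# Hedged sketch of Dirichlet-distributed architecture mixing weights (PyTorch).
# `num_ops`, the learning rate, and the toy loss are illustrative assumptions.
import torch
from torch.distributions import Dirichlet

num_ops = 8  # assumed number of candidate operations on one supernet edge

# Learnable unconstrained parameters; softplus keeps the concentration positive.
log_concentration = torch.zeros(num_ops, requires_grad=True)
optimizer = torch.optim.Adam([log_concentration], lr=3e-4)

def sample_mixing_weights() -> torch.Tensor:
    """Draw mixing weights on the simplex with a pathwise-differentiable sample."""
    concentration = torch.nn.functional.softplus(log_concentration) + 1e-4
    # rsample() uses the pathwise (reparameterized) derivative, so gradients of
    # any downstream loss flow back into the concentration parameters.
    return Dirichlet(concentration).rsample()

# One toy search step; in an actual search the loss would come from the
# weight-sharing supernet evaluated with these mixing weights.
weights = sample_mixing_weights()
loss = (weights * torch.randn(num_ops)).sum()
loss.backward()
optimizer.step()
```

In a full search, one such distribution would typically be kept per supernet edge; the memory-saving progressive scheme mentioned in the abstract is orthogonal to this sampling step.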
Related papers
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that searches for desired sub-networks automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z) - $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture
Search [85.84110365657455]
We propose a simple-but-efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS search process.
Experimental results on NAS-Bench-201 show that our proposed method can help stabilize the search process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z) - BaLeNAS: Differentiable Architecture Search via the Bayesian Learning
Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions.
We demonstrate how differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability (a minimal Gaussian-relaxation sketch appears after this list).
arXiv Detail & Related papers (2021-11-25T18:13:42Z) - ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the gradient approximation used in DARTS.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z) - iDARTS: Differentiable Architecture Search with Stochastic Implicit
Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z) - Smooth Variational Graph Embeddings for Efficient Neural Architecture
Search [41.62970837629573]
We propose a two-sided variational graph autoencoder that can smoothly encode and accurately reconstruct neural architectures from various search spaces.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.