MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search
- URL: http://arxiv.org/abs/2009.09209v2
- Date: Mon, 15 Mar 2021 08:58:01 GMT
- Title: MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search
- Authors: Kengo Machida, Kuniaki Uto, Koichi Shinoda and Taiji Suzuki
- Abstract summary: In neural architecture search (NAS), differentiable architecture search (DARTS) has recently attracted much attention due to its high efficiency.
We propose a method called minimum stable rank DARTS (MSR-DARTS) for finding a model with the best generalization error.
MSR-DARTS achieves an error rate of 2.54% with 4.0M parameters within 0.3 GPU-days on CIFAR-10, and a top-1 error rate of 23.9% on ImageNet.
- Score: 45.09936304802425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In neural architecture search (NAS), differentiable architecture search
(DARTS) has recently attracted much attention due to its high efficiency. It
defines an over-parameterized network with mixed edges, each of which
represents all operator candidates, and jointly optimizes the weights of the
network and its architecture in an alternating manner. However, this method
tends to find the model whose weights converge fastest, and such fast
convergence often leads to overfitting; the resulting model therefore does not
always generalize well. To overcome this problem, we propose minimum stable
rank DARTS (MSR-DARTS), which finds a model with better generalization by
replacing architecture optimization with a selection process based on the
minimum stable rank criterion.
Specifically, a convolution operator is represented by a matrix, and MSR-DARTS
selects the one with the smallest stable rank. We evaluated MSR-DARTS on
CIFAR-10 and ImageNet datasets. It achieves an error rate of 2.54% with 4.0M
parameters within 0.3 GPU-days on CIFAR-10, and a top-1 error rate of 23.9% on
ImageNet. The official code is available at
https://github.com/mtaecchhi/msrdarts.git.
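For intuition, the stable rank of a matrix W is commonly defined as ||W||_F^2 / ||W||_2^2, i.e., the squared Frobenius norm divided by the squared spectral norm (the largest singular value squared). The snippet below is a minimal sketch of minimum-stable-rank selection, not the official implementation (that is in the repository above); the operator names and kernel shapes are illustrative only.

```python
import numpy as np

def stable_rank(weight: np.ndarray, eps: float = 1e-12) -> float:
    """Stable rank = ||W||_F^2 / ||W||_2^2 of a conv kernel flattened to 2-D."""
    mat = weight.reshape(weight.shape[0], -1)      # (out_ch, in_ch * kh * kw)
    fro_sq = float(np.sum(mat ** 2))               # squared Frobenius norm
    spec = float(np.linalg.norm(mat, ord=2))       # largest singular value
    return fro_sq / (spec ** 2 + eps)

def select_min_stable_rank(candidates: dict) -> str:
    """Return the name of the candidate operator with the smallest stable rank."""
    return min(candidates, key=lambda name: stable_rank(candidates[name]))

# Illustrative candidates for one mixed edge (names and shapes are hypothetical).
rng = np.random.default_rng(0)
ops = {
    "conv_3x3": rng.normal(size=(16, 16, 3, 3)),
    "conv_5x5": rng.normal(size=(16, 16, 5, 5)),
}
print(select_min_stable_rank(ops))
```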
Related papers
- Differentiable Architecture Search with Random Features [80.31916993541513]
Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness but suffers from performance collapse.
In this paper, we alleviate the performance collapse problem of DARTS by training only the BatchNorm layers.
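A minimal PyTorch sketch of the "train only BatchNorm" idea (not the paper's code); the helper name and the toy network are illustrative.

```python
import torch.nn as nn

def freeze_all_but_batchnorm(model: nn.Module) -> None:
    """Keep only BatchNorm affine parameters trainable; freeze everything else."""
    for module in model.modules():
        is_bn = isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d))
        for p in module.parameters(recurse=False):
            p.requires_grad = is_bn

# Hypothetical usage on a small stack of layers standing in for a supernet.
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
freeze_all_but_batchnorm(net)
print([name for name, p in net.named_parameters() if p.requires_grad])  # only BN weight/bias
```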
arXiv Detail & Related papers (2022-08-18T13:55:27Z)
- ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, that searches without enforcing the approximation used by DARTS.
In particular, results on 12 benchmarks verify the robustness of ZARTS in settings where DARTS collapses due to its known instability issue.
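As background on the zero-order idea, the sketch below shows a generic two-point (finite-difference) gradient estimator that updates parameters from loss evaluations alone. It illustrates zeroth-order optimization in general, not the ZARTS algorithm, and the toy quadratic loss stands in for the validation loss of a sampled architecture.

```python
import numpy as np

def zeroth_order_grad(loss_fn, alpha: np.ndarray, mu: float = 1e-2, n_samples: int = 8) -> np.ndarray:
    """Estimate d loss / d alpha from function values only (no backprop through alpha)."""
    grad = np.zeros_like(alpha)
    for _ in range(n_samples):
        u = np.random.randn(*alpha.shape)            # random probe direction
        delta = loss_fn(alpha + mu * u) - loss_fn(alpha - mu * u)
        grad += (delta / (2.0 * mu)) * u             # two-point estimate along u
    return grad / n_samples

# Toy usage: a quadratic "validation loss" standing in for network evaluation.
target = np.array([0.3, -0.7, 1.1])
loss = lambda a: float(np.sum((a - target) ** 2))
alpha = np.zeros(3)
for _ in range(200):
    alpha -= 0.05 * zeroth_order_grad(loss, alpha)
print(np.round(alpha, 2))                            # approaches the target
```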
arXiv Detail & Related papers (2021-10-10T09:35:15Z)
- D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at the cell level.
arXiv Detail & Related papers (2021-08-20T09:07:01Z)
- Neural Architecture Search using Covariance Matrix Adaptation Evolution Strategy [6.8129169853808795]
We propose CMANAS, a framework that applies the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to the neural architecture search problem.
The architectures are modelled using a normal distribution, which is updated by CMA-ES based on the fitness of the sampled population.
CMANAS completed the architecture search on CIFAR-10 with a top-1 test accuracy of 97.44% in 0.45 GPU-days, and on CIFAR-100 with a top-1 test accuracy of 83.24% in 0.6 GPU-days, on a single GPU.
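As a rough illustration of that sample-evaluate-update loop (not the CMANAS implementation), the `cma` package can drive a CMA-ES search over a flat vector of architecture parameters; the cell size and the toy fitness below are placeholders for actually training and evaluating each sampled architecture.

```python
import cma          # pip install cma
import numpy as np

N_EDGES, N_OPS = 14, 8                                   # illustrative DARTS-like cell size

def fitness(flat_alpha: np.ndarray) -> float:
    """Placeholder for (1 - validation accuracy) of the architecture sampled from
    flat_alpha; a real run would train/evaluate a one-shot model here."""
    alpha = flat_alpha.reshape(N_EDGES, N_OPS)
    return float(np.sum((alpha - 1.0) ** 2))             # toy objective to minimize

es = cma.CMAEvolutionStrategy(np.zeros(N_EDGES * N_OPS), 0.5)
for _ in range(20):
    population = es.ask()                                 # sample architectures from N(mean, C)
    es.tell(population, [fitness(x) for x in population]) # update mean/covariance from fitness
print(es.result.xbest.reshape(N_EDGES, N_OPS).round(2))
```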
arXiv Detail & Related papers (2021-07-15T11:41:23Z)
- Evolving Neural Architecture Using One Shot Model [5.188825486231326]
We propose EvNAS (Evolving Neural Architecture using One Shot Model), a novel way of applying a simple genetic algorithm to the NAS problem.
EvNAS searches for the architecture on a proxy dataset (CIFAR-10) in 4.4 GPU-days on a single GPU and achieves a top-1 test error of 2.47%.
Results show the potential of evolutionary methods in solving the architecture search problem.
arXiv Detail & Related papers (2020-12-23T08:40:53Z)
- Single-level Optimization For Differential Architecture Search [6.3531384587183135]
In differential architecture search (DARTS), bi-level optimization biases the gradient of the architecture parameters with respect to the network weights.
We propose replacing bi-level optimization with single-level optimization, and replacing softmax with a non-competitive activation function such as sigmoid.
Experiments on NAS-Bench-201 validate our hypothesis and stably find a nearly optimal architecture.
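A minimal sketch of that idea (assumed details, not the paper's code): each candidate operation on a mixed edge is gated by an independent sigmoid rather than a competitive softmax, and the architecture parameters are optimized jointly with the weights on a single training loss.

```python
import torch
import torch.nn as nn

class MixedEdge(nn.Module):
    """Mixed edge whose candidate ops are gated by per-op sigmoids (non-competitive)."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture parameters

    def forward(self, x):
        gates = torch.sigmoid(self.alpha)                       # independent gates in (0, 1)
        return sum(g * op(x) for g, op in zip(gates, self.ops))

# Single-level optimization: weights and alpha share one optimizer and one loss.
edge = MixedEdge(8)
opt = torch.optim.SGD(edge.parameters(), lr=0.01)
x, target = torch.randn(4, 8, 16, 16), torch.randn(4, 8, 16, 16)
for _ in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(edge(x), target)
    loss.backward()
    opt.step()
print(torch.sigmoid(edge.alpha).detach())
```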
arXiv Detail & Related papers (2020-12-15T18:40:33Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-days for the search.
Our one-stage method produces state-of-the-art performance on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
- Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches, we outperform the reported performance of the class-dependent BigGAN on CIFAR-10, obtaining an FID of 12.19 without using class labels.
arXiv Detail & Related papers (2020-06-25T17:13:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.