On Constrained Optimization in Differentiable Neural Architecture Search
- URL: http://arxiv.org/abs/2106.11655v1
- Date: Tue, 22 Jun 2021 10:22:31 GMT
- Title: On Constrained Optimization in Differentiable Neural Architecture Search
- Authors: Kaitlin Maile, Erwan Lecarpentier, Hervé Luga, Dennis G. Wilson
- Abstract summary: Differentiable Architecture Search (DARTS) is a recently proposed neural architecture search (NAS) method based on a differentiable relaxation.
We propose and analyze three improvements to architectural weight competition, update scheduling, and regularization towards discretization.
- Score: 3.0682439731292592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentiable Architecture Search (DARTS) is a recently proposed neural
architecture search (NAS) method based on a differentiable relaxation. Due to
its success, numerous variants analyzing and improving parts of the DARTS
framework have recently been proposed. By considering the problem as a
constrained bilevel optimization, we propose and analyze three improvements to
architectural weight competition, update scheduling, and regularization towards
discretization. First, we introduce a new approach to the activation of
architecture weights, which prevents confounding competition within an edge and
allows for fair comparison across edges to aid in discretization. Next, we
propose a dynamic schedule based on per-minibatch network information to make
architecture updates more informed. Finally, we consider two regularizations,
based on proximity to discretization and the Alternating Directions Method of
Multipliers (ADMM) algorithm, to promote early discretization. Our results show
that this new activation scheme reduces final architecture size and the
regularizations improve reliability in search results while maintaining
comparable performance to state-of-the-art in NAS, especially when used with
our new dynamic informed schedule.
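The abstract names the goal of the new activation (no confounding competition within an edge, comparability across edges) but not its exact form. Below is a minimal sketch, assuming the change replaces the usual per-edge softmax with an independent per-operation activation such as a sigmoid; the shapes and the `edge_logits` name are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

# Architecture logits for one cell: one row per edge, one column per candidate
# operation. Shapes and names are illustrative assumptions for this sketch.
num_edges, num_ops = 14, 8
edge_logits = torch.randn(num_edges, num_ops, requires_grad=True)

# Standard DARTS mixing: softmax couples all operations on an edge, so raising
# one operation's weight necessarily lowers the others (competition within the edge).
softmax_weights = F.softmax(edge_logits, dim=-1)

# Independent activation (one possible reading of the proposed scheme): each
# operation gets its own weight in (0, 1), so operations on the same edge do not
# compete through normalization and magnitudes stay comparable across edges,
# which simplifies thresholding at discretization time.
independent_weights = torch.sigmoid(edge_logits)
```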
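Similarly, the abstract names but does not define the ADMM-based regularization toward early discretization. As a hedged sketch of how an ADMM splitting toward a discrete set is typically set up (an assumption about the general pattern, not the paper's exact objective), the continuous architecture weights are tied to a discrete copy constrained to a discrete set, and the updates alternate between a penalized architecture step, a projection, and a dual update:

```latex
% Hedged sketch: generic ADMM splitting pulling continuous architecture weights
% \alpha toward a discrete set \mathcal{D} (e.g. one-hot per edge); \rho is the
% penalty weight, u the scaled dual variable, \Pi_{\mathcal{D}} the projection.
\[
\begin{aligned}
&\min_{\alpha}\ \mathcal{L}_{\mathrm{val}}\big(w^{*}(\alpha), \alpha\big)
  \quad \text{s.t.}\quad \alpha = z,\ z \in \mathcal{D},\\
&\alpha^{t+1} = \arg\min_{\alpha}\ \mathcal{L}_{\mathrm{val}}\big(w^{*}(\alpha), \alpha\big)
  + \tfrac{\rho}{2}\,\lVert \alpha - z^{t} + u^{t} \rVert_{2}^{2},\\
&z^{t+1} = \Pi_{\mathcal{D}}\big(\alpha^{t+1} + u^{t}\big),
  \qquad u^{t+1} = u^{t} + \alpha^{t+1} - z^{t+1}.
\end{aligned}
\]
```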
Related papers
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search is one of the most popular NAS methods for its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet (see the sketch after this list).
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
arXiv Detail & Related papers (2021-08-10T00:53:39Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem (the hypergradient is written out after this list).
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
- Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search [70.57382341642418]
Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware.
Recent works have empirically shown a ranking disorder between the performance of stand-alone architectures and that of the corresponding shared-weight networks.
We propose a regularization term that aims to maximize the correlation between the performance rankings of the shared-weight network and those of the standalone architectures (see the sketch after this list).
arXiv Detail & Related papers (2021-04-12T09:32:33Z)
- Smooth Variational Graph Embeddings for Efficient Neural Architecture Search [41.62970837629573]
We propose a two-sided variational graph autoencoder, which allows us to smoothly encode and accurately reconstruct neural architectures from various search spaces.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z)
- Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery [54.60327265077322]
We study two important problems in the automated design of neural networks through the lens of sparse recovery methods.
In the first part of this paper, we establish a novel connection between HPO and structured sparse recovery.
In the second part of this paper, we establish a connection between NAS and structured sparse recovery.
arXiv Detail & Related papers (2020-07-07T00:57:09Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers (see the sampling sketch after this list).
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
- Geometry-Aware Gradient Algorithms for Neural Architecture Search [41.943045315986744]
We argue for the study of single-level empirical risk minimization to understand NAS with weight-sharing.
We present a geometry-aware framework that exploits the underlying structure of this optimization to return sparse architectural parameters (see the update sketch after this list).
We achieve state-of-the-art accuracy on the latest NAS benchmarks in computer vision.
arXiv Detail & Related papers (2020-04-16T17:46:39Z)
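For the perturbation-based selection summarized above under "Rethinking Architecture Selection in Differentiable NAS": a minimal sketch of the idea, assuming access to a trained supernet, a held-out evaluation routine, and a way to temporarily disable one candidate operation on one edge. The `evaluate` and `mask_op` helpers are hypothetical stand-ins for this sketch, not that paper's API.

```python
def select_op_by_perturbation(supernet, edge, candidate_ops, evaluate, mask_op):
    """Keep the operation whose removal hurts held-out accuracy the most.

    evaluate(supernet) -> validation accuracy (float); mask_op(supernet, edge, op)
    -> context manager that temporarily disables one candidate op on one edge.
    Both helpers are hypothetical stand-ins used only for this sketch.
    """
    base_acc = evaluate(supernet)
    drops = {}
    for op in candidate_ops:
        with mask_op(supernet, edge, op):  # perturb: remove just this op
            drops[op] = base_acc - evaluate(supernet)
    # The operation the supernet relies on most (largest drop when removed)
    # is the one retained in the discretized architecture for this edge.
    return max(drops, key=drops.get)
```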
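For the implicit-gradient view summarized above under "iDARTS": the hypergradient that implicit-function-theorem approaches target is the standard bilevel expression below, where w*(α) minimizes the training loss; iDARTS's particular stochastic approximation of the inverse-Hessian product is not reproduced here.

```latex
% Implicit-function-theorem hypergradient for the bilevel NAS objective,
% with w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{\mathrm{train}}(w, \alpha).
\[
\nabla_{\alpha}\,\mathcal{L}_{\mathrm{val}}\big(w^{*}(\alpha), \alpha\big)
  = \nabla_{\alpha}\mathcal{L}_{\mathrm{val}}
  - \nabla_{w}\mathcal{L}_{\mathrm{val}}
    \left[\nabla^{2}_{w}\mathcal{L}_{\mathrm{train}}\right]^{-1}
    \nabla^{2}_{w,\alpha}\mathcal{L}_{\mathrm{train}}.
\]
```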
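For the ranking regularizer summarized above under "Landmark Regularization": one simple way to penalize ranking disorder is a pairwise hinge over "landmark" architectures whose stand-alone accuracies are known. This particular loss form and its names are illustrative assumptions, not that paper's exact term.

```python
import torch

def ranking_regularizer(supernet_scores, standalone_scores, margin=0.0):
    """Pairwise hinge encouraging shared-weight scores to respect the
    stand-alone ranking of a set of landmark architectures.

    Both arguments are 1-D tensors with one entry per landmark architecture.
    The loss form here is an illustrative assumption, not the paper's term.
    """
    loss = supernet_scores.new_zeros(())
    n = standalone_scores.numel()
    for i in range(n):
        for j in range(n):
            if standalone_scores[i] > standalone_scores[j]:
                # If architecture i truly outperforms j, its supernet score should too.
                loss = loss + torch.relu(margin - (supernet_scores[i] - supernet_scores[j]))
    return loss
```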
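For the Dirichlet-distributed mixing weights summarized above under "DrNAS": a minimal sketch using PyTorch's reparameterized Dirichlet sampler, whose rsample() provides the pathwise derivatives that let the concentration parameters be trained by backpropagation; the shapes and the toy loss are illustrative.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet

num_edges, num_ops = 14, 8  # illustrative cell size
# Learnable parameters mapped to positive Dirichlet concentrations.
beta = torch.zeros(num_edges, num_ops, requires_grad=True)
concentration = F.softplus(beta) + 1e-4

# rsample() draws pathwise-differentiable mixing weights (rows sum to 1),
# so gradients flow back to the concentration parameters.
mixing_weights = Dirichlet(concentration).rsample()  # shape: (num_edges, num_ops)

# Toy stand-in for a downstream search loss, just to show gradients flow.
loss = (mixing_weights ** 2).sum()
loss.backward()
print(beta.grad.shape)  # torch.Size([14, 8])
```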
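For the geometry-aware, sparsity-promoting updates summarized above under "Geometry-Aware Gradient Algorithms": one update consistent with that description is exponentiated gradient (mirror descent with an entropic regularizer) over each edge's operation simplex, which keeps iterates on the simplex and concentrates mass on few operations; treating this as that paper's exact algorithm is an assumption.

```python
import torch

def exponentiated_gradient_step(weights, grad, lr=0.1):
    """One mirror-descent step per edge on the probability simplex.

    weights: (num_edges, num_ops) with each row on the simplex;
    grad: gradient of the search loss w.r.t. those weights.
    The multiplicative update stays on the simplex and tends to drive
    each row toward a sparse corner, i.e. sparser architecture weights.
    """
    updated = weights * torch.exp(-lr * grad)
    return updated / updated.sum(dim=-1, keepdim=True)

# Toy usage: start uniform and take one step against a random gradient.
w = torch.full((14, 8), 1.0 / 8)
g = torch.randn(14, 8)
w = exponentiated_gradient_step(w, g)
print(w.sum(dim=-1))  # each row still sums to 1
```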
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.