L$^{2}$NAS: Learning to Optimize Neural Architectures via
Continuous-Action Reinforcement Learning
- URL: http://arxiv.org/abs/2109.12425v1
- Date: Sat, 25 Sep 2021 19:26:30 GMT
- Title: L$^{2}$NAS: Learning to Optimize Neural Architectures via
Continuous-Action Reinforcement Learning
- Authors: Keith G. Mills, Fred X. Han, Mohammad Salameh, Seyed Saeed Changiz
Rezaei, Linglong Kong, Wei Lu, Shuo Lian, Shangling Jui and Di Niu
- Abstract summary: Neural architecture search (NAS) has achieved remarkable results in deep neural network design.
We show that L$^{2}$NAS achieves state-of-the-art results on the NAS-Bench-201 benchmark as well as the DARTS and Once-for-All MobileNetV3 search spaces.
- Score: 23.25155249879658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) has achieved remarkable results in deep
neural network design. Differentiable architecture search converts the search
over discrete architectures into a hyperparameter optimization problem which
can be solved by gradient descent. However, questions have been raised
regarding the effectiveness and generalizability of gradient methods for
solving non-convex architecture hyperparameter optimization problems. In this
paper, we propose L$^{2}$NAS, which learns to intelligently optimize and update
architecture hyperparameters via an actor neural network based on the
distribution of high-performing architectures in the search history. We
introduce a quantile-driven training procedure which efficiently trains
L$^{2}$NAS in an actor-critic framework via continuous-action reinforcement
learning. Experiments show that L$^{2}$NAS achieves state-of-the-art results on
NAS-Bench-201 benchmark as well as DARTS search space and Once-for-All
MobileNetV3 search space. We also show that search policies generated by
L$^{2}$NAS are generalizable and transferable across different training
datasets with minimal fine-tuning.
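To make the abstract's description concrete, the sketch below assumes a DDPG-style actor-critic loop in PyTorch: an actor proposes continuous architecture hyperparameters, each proposal is evaluated, and the reward is derived from whether the result beats a running quantile of the accuracies in the search history. The dimensions, the `evaluate_architecture` stub, the fixed state vector, and the one-step update are illustrative assumptions; they are not the paper's exact procedure.

```python
# Hedged sketch only: a DDPG-style actor-critic with a quantile-based reward,
# standing in for the L^2NAS procedure summarized above. Dimensions, the
# evaluation stub, and the update rules are assumptions for illustration.
import random
import torch
import torch.nn as nn

N_EDGES, N_OPS, QUANTILE = 6, 5, 0.9            # toy search-space dimensions
DIM = N_EDGES * N_OPS

actor = nn.Sequential(nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(2 * DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def evaluate_architecture(alpha: torch.Tensor) -> float:
    """Placeholder: discretize alpha (e.g. argmax per edge) and query accuracy."""
    return random.random()

history = []                                    # accuracies seen during search
state = torch.zeros(1, DIM)                     # search-history summary (kept fixed here for brevity)
for step in range(200):
    alpha = actor(state) + 0.1 * torch.randn(1, DIM)   # continuous action + exploration noise
    acc = evaluate_architecture(alpha)
    history.append(acc)

    # Quantile-driven reward: +1 if this architecture beats the running
    # 90th percentile of the search history, small penalty otherwise.
    threshold = torch.quantile(torch.tensor(history), QUANTILE)
    reward = 1.0 if acc >= float(threshold) else -0.1

    q = critic(torch.cat([state, alpha.detach()], dim=1))
    critic_loss = ((q - reward) ** 2).mean()    # one-step (bandit-style) target
    opt_c.zero_grad()
    critic_loss.backward()
    opt_c.step()

    actor_loss = -critic(torch.cat([state, actor(state)], dim=1)).mean()
    opt_a.zero_grad()
    actor_loss.backward()
    opt_a.step()
```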
Related papers
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit
CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
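As background for the blurb above, the hedged sketch below shows what binary weights and activations typically look like in code: a sign() binarization with a straight-through estimator (STE) so gradients still flow. It illustrates a generic 1-bit convolution, not the DCP-NAS child-parent search procedure itself.

```python
# Generic 1-bit convolution with a straight-through estimator; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                        # 1-bit values in {-1, 0, +1}
    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()    # pass gradient only inside [-1, 1]

class BinaryConv2d(nn.Conv2d):
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)      # binary weights
        x_bin = BinarizeSTE.apply(x)                # binary activations
        return F.conv2d(x_bin, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

layer = BinaryConv2d(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 32, 32))
out.sum().backward()                                # STE makes this differentiable
print(out.shape)
```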
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z)
- LayerNAS: Neural Architecture Search in Polynomial Complexity [18.36070437021082]
We propose LayerNAS to address the challenge of multi-objective NAS.
LayerNAS groups model candidates based on one objective, such as model size or latency, and searches for the optimal model based on another objective.
Our experiments show that LayerNAS is able to consistently discover superior models across a variety of search spaces.
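A hedged toy illustration of the grouping idea as stated in the summary: bucket candidates by one objective (latency) and keep the most accurate model per bucket, so the second objective is optimized within each constraint level. The candidate tuples are made up, and LayerNAS's actual layer-wise search is more involved.

```python
# Group candidates by a latency bucket, keep the best-accuracy model per bucket.
# (name, latency_ms, accuracy) -- hypothetical candidates for illustration.
candidates = [("a", 12.0, 0.71), ("b", 14.5, 0.74), ("c", 19.0, 0.73),
              ("d", 21.0, 0.77), ("e", 33.0, 0.78), ("f", 36.0, 0.76)]

best_per_bucket = {}
for name, latency, acc in candidates:
    bucket = int(latency // 10) * 10                 # e.g. 10-20 ms, 20-30 ms, ...
    if bucket not in best_per_bucket or acc > best_per_bucket[bucket][2]:
        best_per_bucket[bucket] = (name, latency, acc)

for bucket in sorted(best_per_bucket):
    print(f"<= {bucket + 10} ms: {best_per_bucket[bucket]}")
```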
arXiv Detail & Related papers (2023-04-23T02:08:00Z)
- Novelty Driven Evolutionary Neural Architecture Search [6.8129169853808795]
Evolutionary algorithms (EA) based neural architecture search (NAS) involves evaluating each architecture by training it from scratch, which is extremely time-consuming.
We propose a method called NEvoNAS wherein the NAS problem is posed as a multi-objective problem with 2 objectives: (i) maximize architecture novelty, (ii) maximize architecture fitness/accuracy.
NSGA-II is used for finding the Pareto-optimal front for the NAS problem, and the best architecture in the Pareto front is returned as the searched architecture.
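For the Pareto-front step mentioned above, a minimal sketch follows: given candidates scored on the two objectives to maximize (novelty, accuracy), keep the non-dominated set and return its most accurate member. The full NSGA-II machinery (non-dominated sorting, crowding distance, mutation) is omitted and the scores are invented.

```python
# Extract the non-dominated (Pareto) set over two objectives to maximize.
def pareto_front(points):
    """points: list of (novelty, accuracy); return the non-dominated subset."""
    front = []
    for p in points:
        dominated = any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

archs = [(0.9, 0.70), (0.4, 0.74), (0.6, 0.73), (0.2, 0.72), (0.5, 0.71)]
front = pareto_front(archs)
best = max(front, key=lambda p: p[1])   # most accurate architecture on the front
print(front, best)
```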
arXiv Detail & Related papers (2022-04-01T03:32:55Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
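For context, the hypergradient that the implicit function theorem yields for the standard DARTS bilevel problem is shown below; how iDARTS stochastically approximates the inverse-Hessian term is not specified in this summary.

```latex
% Bilevel setup: weights w, architecture parameters \alpha, with
% w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{train}(w, \alpha).
% The implicit function theorem gives the hypergradient of the outer objective:
\nabla_{\alpha} \mathcal{L}_{val}\big(w^{*}(\alpha), \alpha\big)
  = \frac{\partial \mathcal{L}_{val}}{\partial \alpha}
  - \frac{\partial^{2} \mathcal{L}_{train}}{\partial \alpha\, \partial w}
    \left( \frac{\partial^{2} \mathcal{L}_{train}}{\partial w\, \partial w^{\top}} \right)^{-1}
    \frac{\partial \mathcal{L}_{val}}{\partial w}
```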
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
- Memory-Efficient Hierarchical Neural Architecture Search for Image Restoration [68.6505473346005]
We propose a memory-efficient hierarchical NAS framework, HiNAS, for image denoising and image super-resolution tasks.
With a single GTX 1080 Ti GPU, it takes only about 1 hour to search for the denoising network on BSD500 and 3.5 hours to search for the super-resolution structure on DIV2K.
arXiv Detail & Related papers (2020-12-24T12:06:17Z)
- Smooth Variational Graph Embeddings for Efficient Neural Architecture Search [41.62970837629573]
We propose a two-sided variational graph autoencoder, which allows us to smoothly encode and accurately reconstruct neural architectures from various search spaces.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z)
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables, modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
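A minimal sketch of the sampling idea in this summary, assuming PyTorch's reparameterized (pathwise) Dirichlet sampler: the per-edge operation mixing weights are drawn from a learnable Dirichlet, and gradients of a stand-in loss flow back into the concentration parameters. The toy objective replaces the real supernet loss, and the progressive learning scheme is omitted.

```python
# Learnable Dirichlet over per-edge operation mixing weights; illustrative only.
import torch
from torch.distributions import Dirichlet

N_EDGES, N_OPS = 6, 5
beta = torch.zeros(N_EDGES, N_OPS, requires_grad=True)        # concentration logits
target = torch.eye(N_OPS)[torch.randint(N_OPS, (N_EDGES,))]   # pretend-best op per edge
opt = torch.optim.Adam([beta], lr=0.05)

for step in range(300):
    conc = torch.nn.functional.softplus(beta) + 1e-4           # keep concentrations positive
    mix = Dirichlet(conc).rsample()          # pathwise gradient flows back into beta
    loss = ((mix - target) ** 2).mean()      # stand-in for the supernet validation loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print(Dirichlet(torch.nn.functional.softplus(beta)).mean.argmax(dim=1))  # chosen op per edge
```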
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
- ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search [6.458169480971417]
We propose an Architecture-Driven Weight Prediction (ADWP) approach for neural architecture search (NAS).
In our approach, we first design an architecture-intensive search space and then train a HyperNetwork by inputting encoded architecture parameters.
Results show that one search procedure can be completed in 4.0 GPU hours on CIFAR-10.
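A hedged sketch of the weight-prediction idea: a HyperNetwork maps an encoding of architecture parameters to the weights of a candidate layer, so candidates can be scored without training each from scratch. The encoding size, layer shape, and MLP below are illustrative choices, not the paper's architecture.

```python
# HyperNetwork predicting the weights of a candidate layer from an architecture encoding.
import torch
import torch.nn as nn
import torch.nn.functional as F

ENC_DIM, IN_F, OUT_F = 16, 32, 10            # encoding and layer sizes (assumed)

class HyperNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ENC_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, OUT_F * IN_F + OUT_F))
    def forward(self, arch_encoding):
        flat = self.mlp(arch_encoding)                  # predicted parameters
        weight = flat[: OUT_F * IN_F].view(OUT_F, IN_F)
        bias = flat[OUT_F * IN_F:]
        return weight, bias

hyper = HyperNetwork()
arch_encoding = torch.randn(ENC_DIM)          # stand-in for an encoded candidate
w, b = hyper(arch_encoding)
x = torch.randn(4, IN_F)                      # small batch of inputs
logits = F.linear(x, w, b)                    # run the candidate with predicted weights
print(logits.shape)                           # torch.Size([4, 10])
```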
arXiv Detail & Related papers (2020-03-03T05:06:20Z)
- DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints.
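A hedged sketch of the loop described above: maintain a categorical distribution per edge over candidate operations, sample architectures from the joint distribution, update the probabilities from observed rewards, and periodically prune the lowest-probability operation from each edge. The random reward stub and the simple reward-weighted update are illustrative, not the exact DDPNAS estimator.

```python
# Sample from per-edge categorical distributions, update them, and prune periodically.
import random

N_EDGES, N_OPS, PRUNE_EVERY = 4, 5, 10
ops = [list(range(N_OPS)) for _ in range(N_EDGES)]          # surviving ops per edge
probs = [[1.0 / N_OPS] * N_OPS for _ in range(N_EDGES)]     # categorical per edge

def evaluate(arch):
    return random.random()                                  # stand-in for accuracy

for step in range(1, 61):
    arch = [random.choices(ops[e], weights=probs[e])[0] for e in range(N_EDGES)]
    reward = evaluate(arch)
    for e, op in enumerate(arch):                           # reward-weighted update
        i = ops[e].index(op)
        probs[e][i] += 0.1 * reward
        total = sum(probs[e])
        probs[e] = [p / total for p in probs[e]]
    if step % PRUNE_EVERY == 0:                             # dynamic pruning
        for e in range(N_EDGES):
            if len(ops[e]) > 1:
                worst = min(range(len(probs[e])), key=probs[e].__getitem__)
                ops[e].pop(worst)
                probs[e].pop(worst)
                total = sum(probs[e])
                probs[e] = [p / total for p in probs[e]]

print([ops[e][max(range(len(probs[e])), key=probs[e].__getitem__)] for e in range(N_EDGES)])
```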
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.