CR-LSO: Convex Neural Architecture Optimization in the Latent Space of
Graph Variational Autoencoder with Input Convex Neural Networks
- URL: http://arxiv.org/abs/2211.05950v1
- Date: Fri, 11 Nov 2022 01:55:11 GMT
- Title: CR-LSO: Convex Neural Architecture Optimization in the Latent Space of
Graph Variational Autoencoder with Input Convex Neural Networks
- Authors: Xuan Rao, Bo Zhao, Xiaosong Yi and Derong Liu
- Abstract summary: In neural architecture search (NAS) methods based on latent space optimization (LSO), a deep generative model is trained to embed discrete neural architectures into a continuous latent space.
This paper develops a convexity regularized latent space optimization (CR-LSO) method, which aims to regularize the learning of the latent space in order to obtain a convex architecture performance mapping.
Experimental results on three popular NAS benchmarks show that CR-LSO achieves competitive evaluation results in terms of both computational complexity and performance.
- Score: 7.910915721525413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In neural architecture search (NAS) methods based on latent space
optimization (LSO), a deep generative model is trained to embed discrete neural
architectures into a continuous latent space. In this case, different
optimization algorithms that operate in the continuous space can be implemented
to search neural architectures. However, the optimization of latent variables
is challenging for gradient-based LSO since the mapping from the latent space
to the architecture performance is generally non-convex. To tackle this
problem, this paper develops a convexity regularized latent space optimization
(CR-LSO) method, which aims to regularize the learning process of the latent space
in order to obtain a convex architecture performance mapping. Specifically,
CR-LSO trains a graph variational autoencoder (G-VAE) to learn the continuous
representations of discrete architectures. Simultaneously, the learning process
of the latent space is regularized by the guaranteed convexity of input convex
neural networks (ICNNs). In this way, the G-VAE is forced to learn a convex
mapping from the architecture representation to the architecture performance.
Thereafter, CR-LSO approximates the performance mapping using the ICNN and
leverages the estimated gradient to optimize neural architecture
representations. Experimental results on three popular NAS benchmarks show that
CR-LSO achieves competitive evaluation results in terms of both computational
complexity and architecture performance.
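The abstract's two central ingredients, a surrogate whose convexity in its input is guaranteed by construction (an ICNN) and gradient-based search over latent architecture representations, can be illustrated with a short sketch. The code below is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the layer sizes, softplus activation, Adam optimizer, and placeholder inputs are assumptions, and the G-VAE encoder/decoder that would map architectures to and from the latent space is omitted.

```python
# Minimal sketch (not the authors' code) of an input convex neural network (ICNN)
# surrogate and gradient-based optimization of latent architecture representations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ICNN(nn.Module):
    """Input convex neural network: the output is convex in the input z because
    the weights acting on hidden activations are kept non-negative and the
    activations (softplus) are convex and non-decreasing."""

    def __init__(self, latent_dim: int, hidden_dim: int = 128, n_layers: int = 3):
        super().__init__()
        # Unconstrained "passthrough" weights from the input to every layer.
        self.input_layers = nn.ModuleList(
            [nn.Linear(latent_dim, hidden_dim) for _ in range(n_layers)]
        )
        # Weights on the previous hidden state; clamped non-negative in forward().
        self.hidden_layers = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim, bias=False) for _ in range(n_layers - 1)]
        )
        self.out_input = nn.Linear(latent_dim, 1)
        self.out_hidden = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        h = F.softplus(self.input_layers[0](z))
        for inp, hid in zip(self.input_layers[1:], self.hidden_layers):
            # Clamping enforces the non-negativity needed for convexity in z.
            h = F.softplus(inp(z) + F.linear(h, hid.weight.clamp(min=0.0)))
        return self.out_input(z) + F.linear(h, self.out_hidden.weight.clamp(min=0.0))


def optimize_latent(surrogate: ICNN, z_init: torch.Tensor,
                    steps: int = 100, lr: float = 0.01) -> torch.Tensor:
    """Gradient ascent on the predicted performance in latent space.
    In CR-LSO the optimized z would then be decoded back into a discrete
    architecture by the G-VAE decoder (omitted here)."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -surrogate(z).sum()   # maximize predicted performance
        loss.backward()
        opt.step()
    return z.detach()


if __name__ == "__main__":
    surrogate = ICNN(latent_dim=16)
    z0 = torch.randn(4, 16)          # stand-in for encoded architectures
    z_star = optimize_latent(surrogate, z0)
    print(z_star.shape)              # torch.Size([4, 16])
```

In the paper's setting the surrogate is trained on benchmark performance labels while it regularizes the G-VAE's latent space; the sketch only shows why the resulting optimization problem is convex in z and therefore amenable to simple gradient ascent.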
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the demands of real-time visual inference in IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z) - An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture [0.0]
This work presents a two-stage adaptive framework for developing deep neural network (DNN) architectures that generalize well for a given training data set.
In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers.
We introduce an epsilon-delta stability-promoting concept as a desirable property for a learning algorithm and show that employing manifold regularization yields an epsilon-delta stability-promoting algorithm.
arXiv Detail & Related papers (2022-11-13T09:51:16Z) - Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis into the real batch setting, NIO is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z) - iDARTS: Differentiable Architecture Search with Stochastic Implicit
Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream approach to neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z) - Differentiable Neural Architecture Learning for Efficient Neural Network
Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z) - Trilevel Neural Architecture Search for Efficient Single Image
Super-Resolution [127.92235484598811]
This paper proposes a trilevel neural architecture search (NAS) method for efficient single image super-resolution (SR).
To model the discrete search space, we apply a new continuous relaxation that builds a hierarchical mixture of network paths, cell operations, and kernel widths.
An efficient search algorithm is proposed to perform optimization in a hierarchical supernet manner.
arXiv Detail & Related papers (2021-01-17T12:19:49Z) - Smooth Variational Graph Embeddings for Efficient Neural Architecture
Search [41.62970837629573]
We propose a two-sided variational graph autoencoder, which allows neural architectures from various search spaces to be smoothly encoded and accurately reconstructed.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z) - Neural Architecture Optimization with Graph VAE [21.126140965779534]
We propose an efficient NAS approach to optimize network architectures in a continuous space.
The framework jointly learns four components: the encoder, the performance predictor, the complexity predictor and the decoder (a minimal sketch of such a joint objective appears after this list).
arXiv Detail & Related papers (2020-06-18T07:05:48Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
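The "Neural Architecture Optimization with Graph VAE" entry above lists four jointly trained components. The sketch below is a hypothetical, heavily simplified illustration of such a joint objective; the referenced framework operates on architecture graphs with graph neural networks, whereas the placeholder linear modules, dimensions, and loss weights here are assumptions.

```python
# Minimal sketch (not the authors' code) of a joint objective over an encoder,
# a decoder, a performance predictor, and a complexity predictor.
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointGraphVAE(nn.Module):
    def __init__(self, arch_dim: int = 64, latent_dim: int = 16):
        super().__init__()
        # Placeholder MLP-style modules standing in for graph encoders/decoders.
        self.encoder = nn.Linear(arch_dim, 2 * latent_dim)   # mean and log-variance
        self.decoder = nn.Linear(latent_dim, arch_dim)
        self.perf_predictor = nn.Linear(latent_dim, 1)
        self.complexity_predictor = nn.Linear(latent_dim, 1)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return (self.decoder(z), self.perf_predictor(z),
                self.complexity_predictor(z), mu, logvar)


def joint_loss(model, x, perf, complexity, beta=1e-3, w_perf=1.0, w_cplx=1.0):
    """Reconstruction + KL + performance regression + complexity regression."""
    recon, perf_hat, cplx_hat, mu, logvar = model(x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return (F.mse_loss(recon, x)
            + beta * kl
            + w_perf * F.mse_loss(perf_hat.squeeze(-1), perf)
            + w_cplx * F.mse_loss(cplx_hat.squeeze(-1), complexity))
```

Training all four components against one loss is what lets the latent space carry both reconstruction and performance/complexity information, which is the same design idea the main CR-LSO paper builds on with its convexity regularization.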