Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery
- URL: http://arxiv.org/abs/2007.04087v1
- Date: Tue, 7 Jul 2020 00:57:09 GMT
- Title: Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery
- Authors: Minsu Cho, Mohammadreza Soltani, and Chinmay Hegde
- Abstract summary: We study two important problems in the automated design of neural networks through the lens of sparse recovery methods.
In the first part of this paper, we establish a novel connection between HPO and structured sparse recovery.
In the second part of this paper, we establish a connection between NAS and structured sparse recovery.
- Score: 54.60327265077322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study two important problems in the automated design of
neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture
Search (NAS) -- through the lens of sparse recovery methods. In the first part
of this paper, we establish a novel connection between HPO and structured
sparse recovery. In particular, we show that a special encoding of the
hyperparameter space enables a natural group-sparse recovery formulation, which
when coupled with HyperBand (a multi-armed bandit strategy), leads to
improvement over existing hyperparameter optimization methods. Experimental
results on image datasets such as CIFAR-10 confirm the benefits of our
approach. In the second part of this paper, we establish a connection between
NAS and structured sparse recovery. Building upon "one-shot" approaches in
NAS, we propose a novel algorithm that we call CoNAS by merging ideas from
one-shot approaches with techniques for learning low-degree sparse Boolean
polynomials. We provide theoretical analysis on the number of validation error
measurements. Finally, we validate our approach on several datasets and
discover novel architectures hitherto unreported, achieving competitive (or
better) results in both performance and search time compared to the existing
NAS approaches.
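The abstract's first part describes encoding the hyperparameter space so that identifying influential hyperparameters becomes a group-sparse recovery problem, coupled with HyperBand. The paper's exact encoding and solver are not reproduced here; the following is a minimal sketch, assuming a toy discretized search space, in which each hyperparameter is one-hot encoded as a group and a group lasso is fit from sampled configurations to their measured validation losses. The search space, sample count, and the simple proximal-gradient solver are illustrative assumptions, not the authors' implementation.
```python
# Minimal sketch of group-sparse recovery over a one-hot encoded hyperparameter space.
import numpy as np

rng = np.random.default_rng(0)

# Toy search space (assumed): each hyperparameter contributes one group of one-hot columns.
space = {"lr": [1e-3, 1e-2, 1e-1], "batch_size": [32, 64, 128, 256], "dropout": [0.0, 0.3, 0.5]}
groups, cols = [], 0
for name, choices in space.items():
    groups.append(np.arange(cols, cols + len(choices)))
    cols += len(choices)

def encode(config_indices):
    """One-hot (group) encoding of a configuration given per-hyperparameter choice indices."""
    x = np.zeros(cols)
    for g, idx in zip(groups, config_indices):
        x[g[idx]] = 1.0
    return x

# Pretend we cheaply evaluated n random configurations (e.g. short runs inside HyperBand brackets).
n = 40
configs = [[int(rng.integers(len(c))) for c in space.values()] for _ in range(n)]
X = np.stack([encode(c) for c in configs])
y = rng.normal(size=n)  # placeholder for measured validation losses

def group_lasso(X, y, lam=0.1, lr=1e-2, iters=2000):
    """Proximal gradient descent for 0.5*||y - Xw||^2 + lam * sum_g ||w_g||_2."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w -= lr * X.T @ (X @ w - y)           # gradient step on the squared loss
        for g in groups:                       # group soft-thresholding (proximal step)
            norm = np.linalg.norm(w[g])
            w[g] = 0.0 if norm <= lam * lr else w[g] * (1 - lam * lr / norm)
    return w

w = group_lasso(X, y)
for (name, _), g in zip(space.items(), groups):
    print(name, "group norm:", round(float(np.linalg.norm(w[g])), 4))
```
Groups with nonzero norm would indicate hyperparameters whose settings measurably affect validation loss, which could then guide how a HyperBand-style scheduler allocates its budget.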
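The second part pairs one-shot NAS with learning a low-degree sparse Boolean polynomial from a limited number of validation-error measurements. The following is a hedged sketch of that general idea, not the authors' CoNAS implementation: sub-architectures are treated as ±1 vectors, parity (Fourier) features up to a chosen degree form the design matrix, and the Lasso recovers a sparse polynomial from synthetic measurements. The dimensions, degree, measurement count, and synthetic ground truth are all assumptions for illustration.
```python
# Sketch: sparse recovery of a low-degree Boolean (Fourier/parity) polynomial from few measurements.
from itertools import combinations
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_edges, degree, n_meas = 10, 2, 60  # assumed sizes

# Parity features up to the chosen degree: products over subsets of coordinates.
subsets = [()] + [s for d in range(1, degree + 1) for s in combinations(range(n_edges), d)]

def features(alpha):
    return np.array([np.prod(alpha[list(s)]) if s else 1.0 for s in subsets])

# Random +/-1 sub-architectures and their (here: synthetic) validation errors.
A = rng.choice([-1.0, 1.0], size=(n_meas, n_edges))
true_coef = np.zeros(len(subsets))
true_coef[[1, 5, 20]] = [0.5, -0.3, 0.2]          # a sparse ground-truth polynomial
Phi = np.stack([features(a) for a in A])
errs = Phi @ true_coef + 0.01 * rng.normal(size=n_meas)

# Lasso as the sparse-recovery step; the largest coefficients identify influential terms.
model = Lasso(alpha=0.01, fit_intercept=False).fit(Phi, errs)
for i in np.argsort(-np.abs(model.coef_))[:5]:
    print(subsets[i], round(float(model.coef_[i]), 3))
```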
Related papers
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
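For readers unfamiliar with 1-bit CNNs, the sketch below shows what binary weights and activations typically look like: a sign function in the forward pass with a straight-through estimator for gradients. This is a generic binarization idiom, not the DCP-NAS search procedure itself.
```python
# Generic 1-bit convolution sketch: sign binarization with a straight-through estimator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                       # +/-1 values in the forward pass
    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()   # straight-through gradient, clipped

class BinaryConv2d(nn.Conv2d):
    def forward(self, x):
        w = BinarizeSTE.apply(self.weight)         # 1-bit weights
        a = BinarizeSTE.apply(x)                   # 1-bit activations
        return F.conv2d(a, w, self.bias, self.stride, self.padding)

layer = BinaryConv2d(3, 16, kernel_size=3, padding=1)
print(layer(torch.randn(2, 3, 32, 32)).shape)      # torch.Size([2, 16, 32, 32])
```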
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream approach to neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
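The iDARTS summary refers to computing the DARTS hypergradient via the implicit function theorem. For reference, the standard implicit-gradient identity it builds on is written out below; the stochastic approximation that iDARTS actually uses is not reproduced here.
```latex
% Implicit-function-theorem form of the DARTS hypergradient, with w^*(\alpha)
% denoting the (locally) optimal network weights for architecture parameters \alpha.
\nabla_{\alpha}\, \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
  = \nabla_{\alpha} \mathcal{L}_{\mathrm{val}}
  - \nabla^{2}_{\alpha, w} \mathcal{L}_{\mathrm{train}}
    \bigl[\nabla^{2}_{w} \mathcal{L}_{\mathrm{train}}\bigr]^{-1}
    \nabla_{w} \mathcal{L}_{\mathrm{val}},
\qquad \text{evaluated at } w = w^{*}(\alpha).
```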
- Bag of Baselines for Multi-objective Joint Neural Architecture Search and Hyperparameter Optimization [29.80410614305851]
Neural architecture search (NAS) and hyperparameter optimization (HPO) make deep learning accessible to non-experts.
We propose a set of methods that extend current approaches to jointly optimize neural architectures and hyperparameters with respect to multiple objectives.
These methods will serve as simple baselines for future research on multi-objective joint NAS + HPO.
arXiv Detail & Related papers (2021-05-03T17:04:56Z)
- Trilevel Neural Architecture Search for Efficient Single Image Super-Resolution [127.92235484598811]
This paper proposes a trilevel neural architecture search (NAS) method for efficient single image super-resolution (SR).
To model the discrete search space, we apply a new continuous relaxation that builds a hierarchical mixture of network paths, cell operations, and kernel widths.
An efficient search algorithm is proposed to perform optimization in a hierarchical supernet manner.
arXiv Detail & Related papers (2021-01-17T12:19:49Z)
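The trilevel method relies on a continuous relaxation of discrete architectural choices. The sketch below illustrates the general relaxation idiom (a DARTS-style softmax mixture over candidate operations); the paper's hierarchical mixture over network paths, cell operations, and kernel widths is not reproduced here, and the candidate operations listed are assumptions.
```python
# Generic continuous relaxation of a discrete operation choice (softmax mixture).
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """Weighted mixture of candidate operations with learnable architecture logits."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),   # candidate: 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),   # candidate: 5x5 conv
            nn.Identity(),                                  # candidate: skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture logits

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)          # continuous relaxation
        return sum(w * op(x) for w, op in zip(weights, self.ops))

op = MixedOp(8)
print(op(torch.randn(1, 8, 16, 16)).shape)  # torch.Size([1, 8, 16, 16])
```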
- Effective, Efficient and Robust Neural Architecture Search [4.273005643715522]
Recent advances in adversarial attacks show the vulnerability of deep neural networks searched by Neural Architecture Search (NAS).
We propose an Effective, Efficient, and Robust Neural Architecture Search (E2RNAS) method to search a neural network architecture by taking the performance, robustness, and resource constraint into consideration.
Experiments on benchmark datasets show that the proposed E2RNAS method can find adversarially robust architectures with optimized model size and comparable classification accuracy.
arXiv Detail & Related papers (2020-11-19T13:46:23Z)
- Smooth Variational Graph Embeddings for Efficient Neural Architecture Search [41.62970837629573]
We propose a two-sided variational graph autoencoder, which allows smooth encoding and accurate reconstruction of neural architectures from various search spaces.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z)
- Neural Architecture Search as Sparse Supernet [78.09905626281046]
This paper aims to extend the problem of Neural Architecture Search (NAS) from Single-Path and Multi-Path Search to automated Mixed-Path Search.
We model the NAS problem as a sparse supernet using a new continuous architecture representation with a mixture of sparsity constraints.
The sparse supernet enables us to automatically achieve sparsely-mixed paths upon a compact set of nodes.
arXiv Detail & Related papers (2020-07-31T14:51:52Z)
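The sparse-supernet summary describes architecture parameters under sparsity constraints so that only a few mixed paths survive. The toy sketch below illustrates that general idea with an L1 penalty and a proximal (soft-thresholding) update on per-path weights; the objective, penalty form, and constants are illustrative assumptions rather than the paper's mixture of sparsity constraints.
```python
# Toy sketch: continuous per-path weights driven sparse by an L1 proximal update.
import torch

n_paths = 6
arch = torch.zeros(n_paths, requires_grad=True)   # one continuous weight per candidate path
lr, lam = 0.1, 0.05

# Stand-in for the supernet's validation loss: it prefers two of the six paths.
target = torch.tensor([0.8, 0.4, 0.0, 0.0, 0.0, 0.0])

for _ in range(300):
    loss = ((arch - target) ** 2).sum()           # placeholder differentiable objective
    loss.backward()
    with torch.no_grad():
        arch -= lr * arch.grad                    # gradient step
        arch.copy_(torch.sign(arch) * torch.clamp(arch.abs() - lr * lam, min=0.0))  # L1 prox
        arch.grad.zero_()

print("surviving paths:", (arch.abs() > 1e-3).nonzero().flatten().tolist())  # expected: [0, 1]
```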
- Geometry-Aware Gradient Algorithms for Neural Architecture Search [41.943045315986744]
We argue for the study of single-level empirical risk minimization to understand NAS with weight-sharing.
We present a geometry-aware framework that exploits the underlying structure of this optimization to return sparse architectural parameters.
We achieve state-of-the-art accuracy on the latest NAS benchmarks in computer vision.
arXiv Detail & Related papers (2020-04-16T17:46:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.