Bayesian Neural Architecture Search using A Training-Free Performance Metric
- URL: http://arxiv.org/abs/2001.10726v2
- Date: Fri, 23 Apr 2021 07:48:42 GMT
- Title: Bayesian Neural Architecture Search using A Training-Free Performance Metric
- Authors: Andrés Camero, Hao Wang, Enrique Alba, Thomas Bäck
- Abstract summary: Recurrent neural networks (RNNs) are a powerful approach for time series prediction.
This paper proposes to tackle the architecture optimization problem with a variant of the Bayesian Optimization (BO) algorithm.
Also, we propose three fixed-length encoding schemes to cope with the variable-length architecture representation.
- Score: 7.775212462771685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks (RNNs) are a powerful approach for time series
prediction. However, their performance is strongly affected by their
architecture and hyperparameter settings. The architecture optimization of RNNs
is a time-consuming task, where the search space is typically a mixture of
real, integer and categorical values. To allow for shrinking and expanding the
size of the network, the representation of architectures often has a variable
length. In this paper, we propose to tackle the architecture optimization
problem with a variant of the Bayesian Optimization (BO) algorithm. To reduce
the evaluation time of candidate architectures, the Mean Absolute Error Random
Sampling (MRS), a training-free method to estimate the network performance, is
adopted as the objective function for BO. Also, we propose three fixed-length
encoding schemes to cope with the variable-length architecture representation.
The result is a new perspective on accurate and efficient design of RNNs, which
we validate on three problems. Our findings show that 1) the BO algorithm can
explore different network architectures using the proposed encoding schemes and
successfully design well-performing architectures, and 2) the optimization
time is significantly reduced by using MRS, without compromising performance
compared to the architectures obtained from the actual training procedure.
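To make the approach concrete, below is a minimal, illustrative sketch (not the authors' code) of the idea in the abstract: a fixed-length integer encoding of a variable-depth stacked RNN, an MRS-style training-free objective that averages the MAE of randomly initialised networks on a toy sine series, and a Bayesian Optimization loop driven by scikit-optimize's gp_minimize. The toy data, the "0 means layer absent" encoding, and the mean-MAE aggregation are assumptions for illustration; the paper uses its own BO variant and a different MRS aggregation.

```python
import numpy as np
from skopt import gp_minimize            # assumed dependency: scikit-optimize
from skopt.space import Integer

# Toy univariate time series and sliding windows for one-step-ahead prediction.
t = np.linspace(0.0, 8.0 * np.pi, 400)
series = np.sin(t)
LOOKBACK, MAX_LAYERS = 10, 3             # fixed-length genotype: MAX_LAYERS slots

X = np.stack([series[i:i + LOOKBACK] for i in range(len(series) - LOOKBACK)])
y = series[LOOKBACK:]

def random_rnn_mae(hidden_sizes, rng):
    """MAE of a stacked Elman RNN whose weights are sampled at random (no training)."""
    sizes = [1] + list(hidden_sizes)
    Wx = [rng.normal(0, 1 / np.sqrt(sizes[i]), (sizes[i], sizes[i + 1]))
          for i in range(len(hidden_sizes))]
    Wh = [rng.normal(0, 1 / np.sqrt(sizes[i + 1]), (sizes[i + 1], sizes[i + 1]))
          for i in range(len(hidden_sizes))]
    Wo = rng.normal(0, 1 / np.sqrt(sizes[-1]), (sizes[-1], 1))
    preds = []
    for window in X:
        h = [np.zeros(s) for s in hidden_sizes]
        for x_t in window:
            inp = np.array([x_t])
            for layer in range(len(hidden_sizes)):
                h[layer] = np.tanh(inp @ Wx[layer] + h[layer] @ Wh[layer])
                inp = h[layer]
        preds.append((h[-1] @ Wo).item())
    return float(np.mean(np.abs(np.asarray(preds) - y)))

def mrs_score(hidden_sizes, n_samples=5):
    """Training-free proxy in the spirit of MRS: aggregate the MAE over several
    random weight samples (here simply the mean, a simplification of the paper)."""
    return float(np.mean([random_rnn_mae(hidden_sizes, np.random.default_rng(s))
                          for s in range(n_samples)]))

def objective(code):
    """Decode the fixed-length genotype: each slot is a hidden-layer size, 0 = absent."""
    hidden_sizes = [h for h in code if h > 0] or [1]
    return mrs_score(hidden_sizes)

# Mixed-integer search space and a plain Gaussian-process BO loop.
space = [Integer(0, 32, name=f"h{i}") for i in range(MAX_LAYERS)]
result = gp_minimize(objective, space, n_calls=15, random_state=0)
print("best encoding:", result.x, "-> estimated MAE:", result.fun)
```

Encoding absent layers as zeros keeps the genotype length fixed, so the surrogate model always sees vectors of the same dimension even though the decoded RNN can shrink or grow; this mirrors the role played by the paper's fixed-length encoding schemes.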
Related papers
- Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally [2.645067871482715]
In machine learning tasks, one searches for an optimal function within a certain functional space.
This choice constrains the evolution of the function during training to lie within what is expressible with the chosen architecture.
We show that information about desirable architectural changes due to expressivity bottlenecks can be extracted from backpropagation.
arXiv Detail & Related papers (2024-05-30T08:23:56Z)
- Time Elastic Neural Networks [2.1756081703276]
We introduce and detail an atypical neural network architecture, called the time elastic neural network (teNN).
The novelty compared to classical neural network architecture is that it explicitly incorporates time warping ability.
We demonstrate that, during the training process, the teNN succeeds in reducing the number of neurons required within each cell.
arXiv Detail & Related papers (2024-05-27T09:01:30Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z)
- EmProx: Neural Network Performance Estimation For Neural Architecture Search [0.0]
This study proposes a new method, EmProx Score (Embedding Proximity Score) to map architectures to a continuous embedding space.
The performance of candidates is then estimated using a weighted kNN based on the embedding vectors of architectures whose performance is known.
Performance estimations of this method are comparable to the performance predictor used in NAO in terms of accuracy, while being nearly nine times faster to train compared to NAO.
arXiv Detail & Related papers (2022-06-13T08:35:52Z)
- Learning Interpretable Models Through Multi-Objective Neural Architecture Search [0.9990687944474739]
We propose a framework to optimize for both task performance and "introspectability," a surrogate metric for aspects of interpretability.
We demonstrate that jointly optimizing for task error and introspectability leads to more disentangled and debuggable architectures that perform within error.
arXiv Detail & Related papers (2021-12-16T05:50:55Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
- Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
- Neural Architecture Optimization with Graph VAE [21.126140965779534]
We propose an efficient NAS approach to optimize network architectures in a continuous space.
The framework jointly learns four components: the encoder, the performance predictor, the complexity predictor and the decoder.
arXiv Detail & Related papers (2020-06-18T07:05:48Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 makes up a family of state-of-the-art compact neural networks that outperform both automatically and manually-designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)