AutoLR: An Evolutionary Approach to Learning Rate Policies
- URL: http://arxiv.org/abs/2007.04223v1
- Date: Wed, 8 Jul 2020 16:03:44 GMT
- Title: AutoLR: An Evolutionary Approach to Learning Rate Policies
- Authors: Pedro Carvalho, Nuno Lourenço, Filipe Assunção, Penousal Machado
- Abstract summary: This work presents AutoLR, a framework that evolves Learning Rate Schedulers for a specific Neural Network Architecture.
Results show that training performed using certain evolved policies is more efficient than the established baseline.
- Score: 2.3577368017815705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The choice of a proper learning rate is paramount for good Artificial Neural
Network training and performance. In the past, one had to rely on experience
and trial-and-error to find an adequate learning rate. Presently, a plethora of
state-of-the-art automatic methods exist that make the search for a good
learning rate easier. While these techniques are effective and have yielded
good results over the years, they are general solutions. This means the
optimization of the learning rate for specific network topologies remains largely
unexplored. This work presents AutoLR, a framework that evolves Learning Rate
Schedulers for a specific Neural Network Architecture using Structured
Grammatical Evolution. The system was used to evolve learning rate policies
that were compared with a commonly used baseline value for learning rate.
Results show that training performed using certain evolved policies is more
efficient than the established baseline and suggest that this approach is a
viable means of improving a neural network's performance.
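As a rough illustration of the approach described in the abstract, the sketch below runs a tiny grammar-based evolution of learning rate schedules: integer genotypes are mapped through a grammar to schedule expressions, and a small population is evolved against a fitness score. The grammar, the codon mapping, and the stand-in fitness are invented here for illustration; AutoLR's actual grammar, its Structured Grammatical Evolution operators, and its fitness measure are considerably richer.

```python
import random

# A toy grammar for learning-rate policies, loosely in the spirit of
# (Structured) Grammatical Evolution.  Invented for illustration only.
GRAMMAR = {
    "<policy>": [["<expr>"]],
    "<expr>": [
        ["<const>"],                                              # constant LR
        ["<const>", " * (0.5 ** (epoch // ", "<int>", "))"],      # step decay
        ["<const>", " / (1.0 + ", "<const>", " * epoch)"],        # inverse decay
    ],
    "<const>": [["0.1"], ["0.01"], ["0.001"]],
    "<int>": [["5"], ["10"], ["20"]],
}

def map_genotype(genotype, start="<policy>"):
    """Map a list of integer codons to a learning-rate expression string."""
    codons = iter(genotype)
    def expand(symbol):
        if symbol not in GRAMMAR:                  # terminal: emit as-is
            return symbol
        options = GRAMMAR[symbol]
        production = options[next(codons) % len(options)]
        return "".join(expand(s) for s in production)
    return expand(start)

def fitness(expression):
    """Stand-in fitness.  In AutoLR the fitness comes from actually training
    the target architecture with the candidate schedule."""
    def lr_at(epoch):
        return eval(expression, {}, {"epoch": epoch})
    return lr_at(0) - lr_at(30)                    # dummy preference for decaying schedules

random.seed(0)
population = [[random.randint(0, 255) for _ in range(10)] for _ in range(20)]
for generation in range(5):
    ranked = sorted(population, key=lambda g: fitness(map_genotype(g)), reverse=True)
    parents = ranked[:5]
    # Next generation: keep parents, fill up with mutated copies of random parents.
    population = parents + [
        [c if random.random() > 0.1 else random.randint(0, 255)
         for c in random.choice(parents)]
        for _ in range(15)
    ]

print("best evolved policy: lr(epoch) =", map_genotype(ranked[0]))
```

In the paper's setting, evaluating a single candidate means training the target architecture with that schedule, which is where most of the cost of the search lies.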
Related papers
- A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time".
It is based on differential equations that: (i) can be integrated without the need for external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open up novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple reparameterization which we call Normalize-and-Project.
arXiv Detail & Related papers (2024-07-01T20:58:01Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like FF, does not rely on backpropagation (BP) for optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
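A minimal sketch of the block-local training idea in the CaFo summary above: each block applies a forward transform and fits its own softmax predictor with a local cross-entropy loss, so no gradients cross block boundaries and no negative samples are generated. The toy data, the fixed random feature transforms, and the final averaging of per-block label distributions are simplifying assumptions, not the paper's actual blocks or training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples, 20 features, 3 classes.
X = rng.normal(size=(200, 20))
y = rng.integers(0, 3, size=200)
Y = np.eye(3)[y]                                   # one-hot labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_local_predictor(feats, lr=0.1, steps=300):
    """Train a linear softmax predictor on this block's features only
    (local cross-entropy, no gradient flows to earlier blocks)."""
    W = np.zeros((feats.shape[1], 3))
    for _ in range(steps):
        p = softmax(feats @ W)
        W -= lr * feats.T @ (p - Y) / len(feats)
    return W

# Two cascaded blocks: fixed random feature transforms, each with its own
# independently trained predictor -- no backpropagation across blocks.
feats, predictions = X, []
for width in (64, 64):
    B = rng.normal(size=(feats.shape[1], width)) / np.sqrt(feats.shape[1])
    feats = np.maximum(feats @ B, 0.0)             # block's forward transform (ReLU)
    W = train_local_predictor(feats)
    predictions.append(softmax(feats @ W))
    acc = (predictions[-1].argmax(1) == y).mean()
    print(f"block with width {width}: train accuracy {acc:.2f}")

# At inference, the per-block label distributions can be aggregated,
# e.g. by simple averaging.
ensemble_acc = (np.mean(predictions, axis=0).argmax(1) == y).mean()
print(f"averaged blocks: train accuracy {ensemble_acc:.2f}")
```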
- Evolving Learning Rate Optimizers for Deep Neural Networks [2.6498598849144472]
We propose a framework called AutoLR to automatically design Learning Rate Optimizers.
The system evolved an optimizer, ADES, that appears to be novel since, to the best of our knowledge, its structure differs from state-of-the-art methods.
arXiv Detail & Related papers (2021-03-23T15:23:57Z)
- Interleaving Learning, with Application to Neural Architecture Search [12.317568257671427]
We propose a novel machine learning framework referred to as interleaving learning (IL).
In our framework, a set of models collaboratively learn a data encoder in an interleaving fashion.
We apply interleaving learning to search neural architectures for image classification on CIFAR-10, CIFAR-100, and ImageNet.
arXiv Detail & Related papers (2021-03-12T00:54:22Z)
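The interleaving idea above, sketched in the simplest possible setting: two toy regression tasks share a linear encoder, and each learner trains the shared encoder plus its own head for a while before handing the encoder to the next learner, round after round. The linear models, the tasks, and the hand-off schedule are invented for illustration; the paper interleaves full neural models and uses the framework for neural architecture search.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy regression tasks that share a common 8-d input -> 4-d encoder.
tasks = []
for _ in range(2):
    X = rng.normal(size=(100, 8))
    w_true = rng.normal(size=(8, 1))
    y = X @ w_true + 0.1 * rng.normal(size=(100, 1))
    tasks.append((X, y))

encoder = rng.normal(size=(8, 4)) * 0.1            # shared data encoder
heads = [rng.normal(size=(4, 1)) * 0.1 for _ in tasks]

def train_step(X, y, enc, head, lr=0.01):
    """One gradient step on a linear encoder + task head (0.5 * mean squared error)."""
    h = X @ enc
    err = h @ head - y
    g_head = h.T @ err / len(X)
    g_enc = X.T @ (err @ head.T) / len(X)
    return enc - lr * g_enc, head - lr * g_head

# Interleaving: learner k trains the shared encoder for a while, then hands it
# to learner k+1; after the last learner, the cycle starts again.
for round_ in range(5):
    for k, (X, y) in enumerate(tasks):
        for _ in range(50):
            encoder, heads[k] = train_step(X, y, encoder, heads[k])
    losses = [float(np.mean((X @ encoder @ heads[k] - y) ** 2))
              for k, (X, y) in enumerate(tasks)]
    print(f"round {round_}: task losses {losses}")
```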
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Deep Reinforcement Learning for Adaptive Learning Systems [4.8685842576962095]
We formulate the problem of how to find an individualized learning plan based on the learner's latent traits.
We apply a model-free deep reinforcement learning algorithm that can effectively find the optimal learning policy.
We also develop a transition model estimator that emulates the learner's learning process using neural networks.
arXiv Detail & Related papers (2020-04-17T18:04:03Z)
- Applying Cyclical Learning Rate to Neural Machine Translation [6.715895949288471]
We show how a cyclical learning rate can be applied to train transformer-based neural networks for neural machine translation.
We establish guidelines for applying cyclical learning rates to neural machine translation tasks.
arXiv Detail & Related papers (2020-04-06T04:45:49Z)
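For reference, the triangular cyclical learning rate of Smith (2017) is the kind of schedule the entry above applies to Transformer-based NMT training; the base_lr, max_lr, and step_size values below are arbitrary placeholders rather than the paper's recommended settings.

```python
import math

def triangular_clr(step, base_lr=1e-4, max_lr=1e-3, step_size=4000):
    """Triangular cyclical learning rate (Smith, 2017): the LR ramps linearly
    from base_lr to max_lr and back down, once every 2 * step_size steps."""
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

for step in range(0, 16001, 2000):
    print(f"step {step:>6}: lr = {triangular_clr(step):.2e}")
```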
- Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need? [72.00712736992618]
We show that a simple baseline, learning a supervised or self-supervised representation on the meta-training set, outperforms state-of-the-art few-shot learning methods.
An additional boost can be achieved through the use of self-distillation.
We believe that our findings motivate a rethinking of few-shot image classification benchmarks and the associated role of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-25T17:58:42Z)
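The "good embedding plus simple classifier" baseline above reduces to: embed the support and query examples with a network trained once on the meta-training set, then fit an ordinary linear classifier on the few support embeddings. In the toy sketch below the embeddings are synthetic Gaussian clusters standing in for a real encoder's outputs, and the L2 normalization and logistic-regression choices are assumptions about the baseline's details; no episodic meta-learning is involved.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_way, dim = 5, 64

# Stand-in for embeddings produced by a network pre-trained on the meta-training
# set: each novel class is a Gaussian cluster around a fixed center.
centers = rng.normal(size=(n_way, dim))

def embed(n_per_class):
    X = np.concatenate([c + 0.3 * rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_way), n_per_class)
    return X / np.linalg.norm(X, axis=1, keepdims=True), y    # L2-normalized features

# A 5-way 5-shot episode: fit a plain linear classifier on the frozen support
# embeddings and evaluate on the queries -- no episodic meta-learning involved.
support_X, support_y = embed(5)
query_X, query_y = embed(15)
clf = LogisticRegression(max_iter=1000).fit(support_X, support_y)
print("5-way 5-shot query accuracy:", (clf.predict(query_X) == query_y).mean())
```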
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
- Biologically-Motivated Deep Learning Method using Hierarchical Competitive Learning [0.0]
I propose to introduce unsupervised competitive learning, which requires only forward-propagating signals, as a pre-training method for CNNs.
The proposed method could be useful for a variety of poorly labeled data, for example, time series or medical data.
arXiv Detail & Related papers (2020-01-04T20:07:36Z)
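As a reminder of what forward-only competitive learning looks like in its plainest form, the sketch below runs classic winner-take-all updates on toy data: a forward pass picks the closest unit and only that unit's weights move toward the input. This is textbook competitive learning rather than the paper's hierarchical scheme; the unit count, learning rate, and data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs drawn around a few underlying prototypes (think: image patches).
prototypes = rng.normal(size=(4, 16))
data = np.concatenate([p + 0.2 * rng.normal(size=(200, 16)) for p in prototypes])
rng.shuffle(data)

# Winner-take-all competitive learning: a forward pass picks the winning unit,
# and only the winner's weights are pulled toward the current input.
n_units, lr = 4, 0.05
W = data[rng.choice(len(data), size=n_units, replace=False)].copy()  # init from data
for x in data:
    winner = np.argmin(np.linalg.norm(W - x, axis=1))   # "competition" (forward only)
    W[winner] += lr * (x - W[winner])                    # pull winner toward input

# Units drift toward the prototypes of the inputs they win; in the paper's setting,
# units learned this way would serve as pre-trained filters for a CNN layer.
dists = np.linalg.norm(prototypes[:, None, :] - W[None, :, :], axis=2)
print("distance from each prototype to its nearest unit:", np.round(dists.min(axis=1), 2))
```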
This list is automatically generated from the titles and abstracts of the papers on this site.