Faster Improvement Rate Population Based Training
- URL: http://arxiv.org/abs/2109.13800v1
- Date: Tue, 28 Sep 2021 15:30:55 GMT
- Title: Faster Improvement Rate Population Based Training
- Authors: Valentin Dalibard, Max Jaderberg
- Abstract summary: This paper presents Faster Improvement Rate PBT (FIRE PBT), which addresses a shortcoming of Population Based Training (PBT): its greedy decision mechanisms favour short-term improvements over long-term performance.
We derive a novel fitness metric and use it to make some of the population members focus on long-term performance.
Experiments show that FIRE PBT is able to outperform PBT on the ImageNet benchmark and match the performance of networks that were trained with a hand-tuned learning rate schedule.
- Score: 7.661301899629696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The successful training of neural networks typically involves careful and
time-consuming hyperparameter tuning. Population Based Training (PBT) has
recently been proposed to automate this process. PBT trains a population of
neural networks concurrently, frequently mutating their hyperparameters
throughout their training. However, the decision mechanisms of PBT are greedy
and favour short-term improvements which can, in some cases, lead to poor
long-term performance. This paper presents Faster Improvement Rate PBT (FIRE
PBT) which addresses this problem. Our method is guided by an assumption: given
two neural networks with similar performance and training with similar
hyperparameters, the network showing the faster rate of improvement will lead
to a better final performance. Using this, we derive a novel fitness metric and
use it to make some of the population members focus on long-term performance.
Our experiments show that FIRE PBT is able to outperform PBT on the ImageNet
benchmark and match the performance of networks that were trained with a
hand-tuned learning rate schedule. We apply FIRE PBT to reinforcement learning
tasks and show that it leads to faster learning and higher final performance
than both PBT and random hyperparameter search.
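As a rough illustration of the mechanism described above, the following minimal Python sketch runs one PBT exploit/explore round in which fitness is a smoothed rate of improvement (the slope of recent evaluation scores) rather than the raw current score. The worker dictionary layout, the slope-based estimate, and the mutation scheme are illustrative assumptions, not the authors' actual FIRE PBT algorithm.

```python
import copy
import random
import numpy as np


def improvement_rate(history, window=5):
    """Estimate the recent rate of improvement as the slope of a linear fit
    over the last `window` evaluation scores (an illustrative stand-in for
    the paper's fitness metric)."""
    recent = history[-window:]
    if len(recent) < 2:
        return 0.0
    slope, _ = np.polyfit(np.arange(len(recent)), np.asarray(recent, dtype=float), 1)
    return float(slope)


def mutate(hparams, factor=1.2):
    """Randomly perturb each hyperparameter up or down (standard PBT explore step)."""
    return {k: v * random.choice([1.0 / factor, factor]) for k, v in hparams.items()}


def pbt_round(workers, use_improvement_rate=True, frac=0.25):
    """One exploit/explore round over a population of workers.

    Each worker is assumed to be a dict with:
      'history' - list of evaluation scores (higher is better)
      'hparams' - dict of hyperparameters
      'weights' - model parameters, copied wholesale on exploit
    """
    if use_improvement_rate:
        fitness = [improvement_rate(w['history']) for w in workers]   # FIRE-style fitness
    else:
        fitness = [w['history'][-1] for w in workers]                 # greedy PBT fitness

    order = sorted(range(len(workers)), key=lambda i: fitness[i], reverse=True)
    n = max(1, int(len(workers) * frac))
    top, bottom = order[:n], order[-n:]

    for i in bottom:
        j = random.choice(top)
        workers[i]['weights'] = copy.deepcopy(workers[j]['weights'])  # exploit
        workers[i]['hparams'] = mutate(workers[j]['hparams'])         # explore
    return workers
```

The paper uses its fitness metric to make some of the population members focus on long-term performance; the single flag above merely toggles between a greedy and an improvement-rate-based fitness for the whole population.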
Related papers
- Simultaneous Training of First- and Second-Order Optimizers in Population-Based Reinforcement Learning [0.0]
Population-based training (PBT) continuously tunes hyperparameters throughout training.
We propose an enhancement to PBT that simultaneously uses both first- and second-order optimizers within a single population.
arXiv Detail & Related papers (2024-08-27T21:54:26Z)
- Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning [10.164982368785854]
We propose Generalized Population-Based Training (GPBT) and Pairwise Learning (PL).
PL employs a comprehensive pairwise strategy to identify performance differentials and provide holistic guidance to underperforming agents.
arXiv Detail & Related papers (2024-04-12T04:23:20Z)
- Efficient Stagewise Pretraining via Progressive Subnetworks [53.00045381931778]
The prevailing view suggests that stagewise dropping strategies, such as layer dropping, are ineffective when compared to stacking-based approaches.
This paper challenges this notion by demonstrating that, with proper design, dropping strategies can be competitive, if not better, than stacking methods.
We propose an instantiation of this framework - Random Part Training (RAPTR) - that selects and trains only a random subnetwork at each step, progressively increasing the size in stages.
arXiv Detail & Related papers (2024-02-08T18:49:09Z)
- Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search [62.997667081978825]
We show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS).
We propose PBT-NAS, an adaptation of PBT to NAS where architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones and inheriting the weights using the shrink-perturb technique.
arXiv Detail & Related papers (2023-07-28T15:29:52Z)
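The shrink-perturb weight inheritance mentioned in the PBT-NAS entry above can be sketched generically: inherited weights are scaled down and mixed with a small, freshly initialized perturbation. The coefficients and tensor shapes below are illustrative assumptions, not the PBT-NAS implementation.

```python
import torch


def shrink_perturb(inherited, shrink=0.4, perturb=0.1):
    """Inherit a weight tensor from a parent network: shrink it toward zero
    and add a small random reinitialization (illustrative coefficients)."""
    fresh = torch.randn_like(inherited)   # stand-in for a fresh initialization
    return shrink * inherited + perturb * fresh


# Hypothetical usage: a child layer inherits a mixed parent's weights.
parent_weight = torch.randn(128, 64)
child_weight = shrink_perturb(parent_weight)
```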
- Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient models for neuromorphic computing.
We propose the Spatial Learning Through Time (SLTT) method that can achieve high performance while greatly improving training efficiency.
Our method achieves state-of-the-art accuracy on ImageNet, while the memory cost and training time are reduced by more than 70% and 50%, respectively, compared with BPTT.
arXiv Detail & Related papers (2023-02-28T05:01:01Z)
- Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
- Test-time Batch Normalization [61.292862024903584]
Deep neural networks often suffer from distribution shift between training and test data.
We revisit the batch normalization (BN) in the training process and reveal two key insights benefiting test-time optimization.
We propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing entropy loss.
arXiv Detail & Related papers (2022-05-20T14:33:39Z)
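For context on the entry above, test-time adaptation of normalization layers is commonly done by minimizing the entropy of the model's predictions on each incoming test batch. The sketch below shows that generic pattern for a PyTorch classifier, updating only BatchNorm affine parameters; it is not the GpreBN layer itself, and the optimizer and learning rate are assumptions.

```python
import torch
import torch.nn as nn


def entropy_loss(logits):
    """Mean prediction entropy over a batch of logits."""
    log_probs = logits.log_softmax(dim=1)
    return -(log_probs.exp() * log_probs).sum(dim=1).mean()


def adapt_bn_at_test_time(model, test_loader, lr=1e-3):
    """Freeze all weights except BatchNorm affine parameters and minimize
    prediction entropy on each test batch (a generic test-time BN scheme)."""
    for p in model.parameters():
        p.requires_grad_(False)

    bn_params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)) and m.affine:
            m.train()  # use current batch statistics at test time
            m.weight.requires_grad_(True)
            m.bias.requires_grad_(True)
            bn_params += [m.weight, m.bias]

    optimizer = torch.optim.SGD(bn_params, lr=lr)
    for x, _ in test_loader:
        loss = entropy_loss(model(x))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```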
- How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than random search.
We then apply the proposed benchmarking framework to 7 optimizers and various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z)
- Regularized Evolutionary Population-Based Training [11.624954122221562]
This paper presents an algorithm called Evolutionary Population-Based Training (EPBT) that interleaves the training of a DNN's weights with the metalearning of loss functions.
EPBT results in faster, more accurate learning on image classification benchmarks.
arXiv Detail & Related papers (2020-02-11T06:28:13Z)
- Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits [12.525529586816955]
We introduce Population-Based Bandits (PB2), the first provably efficient PBT-style hyperparameter optimization algorithm.
PB2 uses a probabilistic model to guide the search in an efficient way.
We show in a series of RL experiments that PB2 is able to achieve high performance with a modest computational budget.
arXiv Detail & Related papers (2020-02-06T21:27:04Z)
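To make concrete what "a probabilistic model to guide the search" can look like in practice, the sketch below fits a Gaussian-process surrogate to previously observed (hyperparameter, reward) pairs and suggests the next candidate with an upper-confidence-bound rule. This is a generic model-guided search pattern, not PB2's actual time-varying bandit formulation; the candidate range and UCB coefficient are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def suggest_next_lr(observed_lrs, observed_rewards, n_candidates=256, kappa=2.0):
    """Fit a GP to (log10 learning rate, reward) observations and return the
    candidate maximizing mean + kappa * std (purely illustrative)."""
    X = np.log10(np.asarray(observed_lrs, dtype=float)).reshape(-1, 1)
    y = np.asarray(observed_rewards, dtype=float)

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)

    candidates = np.random.uniform(-5.0, -1.0, size=(n_candidates, 1))  # log10(lr) grid
    mean, std = gp.predict(candidates, return_std=True)
    best = candidates[np.argmax(mean + kappa * std)]
    return float(10.0 ** best[0])


# Example with made-up observations of learning rate vs. episode reward.
next_lr = suggest_next_lr([1e-3, 3e-4, 1e-2], [0.61, 0.58, 0.42])
```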
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.