Faster Improvement Rate Population Based Training
- URL: http://arxiv.org/abs/2109.13800v1
- Date: Tue, 28 Sep 2021 15:30:55 GMT
- Title: Faster Improvement Rate Population Based Training
- Authors: Valentin Dalibard, Max Jaderberg
- Abstract summary: This paper presents Faster Improvement Rate PBT (FIRE PBT), which addresses the tendency of Population Based Training (PBT) to greedily favour short-term improvements at the expense of long-term performance.
We derive a novel fitness metric and use it to make some of the population members focus on long-term performance.
Experiments show that FIRE PBT is able to outperform PBT on the ImageNet benchmark and match the performance of networks that were trained with a hand-tuned learning rate schedule.
- Score: 7.661301899629696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The successful training of neural networks typically involves careful and
time consuming hyperparameter tuning. Population Based Training (PBT) has
recently been proposed to automate this process. PBT trains a population of
neural networks concurrently, frequently mutating their hyperparameters
throughout their training. However, the decision mechanisms of PBT are greedy
and favour short-term improvements which can, in some cases, lead to poor
long-term performance. This paper presents Faster Improvement Rate PBT (FIRE
PBT) which addresses this problem. Our method is guided by an assumption: given
two neural networks with similar performance and training with similar
hyperparameters, the network showing the faster rate of improvement will lead
to a better final performance. Using this, we derive a novel fitness metric and
use it to make some of the population members focus on long-term performance.
Our experiments show that FIRE PBT is able to outperform PBT on the ImageNet
benchmark and match the performance of networks that were trained with a
hand-tuned learning rate schedule. We apply FIRE PBT to reinforcement learning
tasks and show that it leads to faster learning and higher final performance
than both PBT and random hyperparameter search.
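The abstract's core assumption, that of two similarly performing networks the one improving faster will reach better final performance, suggests a fitness based on the slope of recent evaluation scores. The following is a minimal illustrative sketch of that idea; the class and function names are assumptions for illustration, not the paper's actual fitness metric or implementation.

```python
class Member:
    """One population member: its hyperparameters and evaluation history."""

    def __init__(self, hyperparams):
        self.hyperparams = hyperparams
        self.history = []  # evaluation scores recorded over training

    def record(self, score):
        self.history.append(score)


def improvement_rate(member, window=3):
    """Average per-step change over the last `window` scores.

    A crude proxy for the paper's 'faster rate of improvement':
    the slope of recent performance rather than its absolute level.
    """
    recent = member.history[-window:]
    if len(recent) < 2:
        return 0.0
    return (recent[-1] - recent[0]) / (len(recent) - 1)
```

Under this sketch, a member with scores [0.1, 0.2, 0.4] would be ranked above a member stuck at [0.5, 0.5, 0.5], even though the latter currently scores higher, which is the kind of decision a greedy PBT fitness would get wrong.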
Related papers
- Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning [10.164982368785854]
Generalized Population-Based Training (GPBT) and Pairwise Learning (PL)
PL employs a comprehensive pairwise strategy to identify performance differentials and provide holistic guidance to underperforming agents.
arXiv Detail & Related papers (2024-04-12T04:23:20Z)
- Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search [62.997667081978825]
We show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS)
We propose PBT-NAS, an adaptation of PBT to NAS where architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones and inheriting the weights using the shrink-perturb technique.
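The shrink-perturb technique mentioned here (due to Ash & Adams, 2020) scales inherited weights toward zero and adds small Gaussian noise, which helps a network train on from weights it did not learn itself. A minimal sketch, assuming a flat list of weights; the parameter values are illustrative defaults, not those used by PBT-NAS:

```python
import random


def shrink_perturb(weights, shrink=0.4, sigma=0.1):
    """Shrink inherited weights toward zero and perturb with Gaussian noise.

    `shrink` scales each weight; `sigma` is the noise standard deviation.
    """
    return [shrink * w + random.gauss(0.0, sigma) for w in weights]
```

In PBT-NAS this would be applied to the weights a poorly-performing network inherits after mixing well-performing parents, so it resumes training from a softened copy rather than the parents' exact weights.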
arXiv Detail & Related papers (2023-07-28T15:29:52Z)
- Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient models for neuromorphic computing.
We propose the Spatial Learning Through Time (SLTT) method that can achieve high performance while greatly improving training efficiency.
Our method achieves state-of-the-art accuracy on ImageNet, while the memory cost and training time are reduced by more than 70% and 50%, respectively, compared with BPTT.
arXiv Detail & Related papers (2023-02-28T05:01:01Z)
- Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
- Test-time Batch Normalization [61.292862024903584]
Deep neural networks often suffer the data distribution shift between training and testing.
We revisit the batch normalization (BN) in the training process and reveal two key insights benefiting test-time optimization.
We propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing Entropy loss.
arXiv Detail & Related papers (2022-05-20T14:33:39Z)
- Accurate online training of dynamical spiking neural networks through Forward Propagation Through Time [1.8515971640245998]
We show how a recently developed alternative to BPTT can be applied in spiking neural networks.
FPTT attempts to minimize an ongoing dynamically regularized risk on the loss.
We show that SNNs trained with FPTT outperform online BPTT approximations, and approach or exceed offline BPTT accuracy on temporal classification tasks.
arXiv Detail & Related papers (2021-12-20T13:44:20Z)
- Towards Evaluating and Training Verifiably Robust Neural Networks [81.39994285743555]
We study the relationship between IBP and CROWN, and prove that CROWN is always tighter than IBP when choosing appropriate bounding lines.
We propose a relaxed version of CROWN, linear bound propagation (LBP), that can be used to verify large networks to obtain lower verified errors.
arXiv Detail & Related papers (2021-04-01T13:03:48Z)
- How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers [86.36020260204302]
We propose a new benchmarking protocol to evaluate both end-to-end efficiency and data-addition training efficiency.
A human study is conducted to show that our evaluation protocol matches human tuning behavior better than the random search.
We then apply the proposed benchmarking framework to 7 optimizers across various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining.
arXiv Detail & Related papers (2020-10-19T21:46:39Z)
- Regularized Evolutionary Population-Based Training [11.624954122221562]
This paper presents an algorithm called Evolutionary Population-Based Training (EPBT) that interleaves the training of a DNN's weights with the metalearning of loss functions.
EPBT results in faster, more accurate learning on image classification benchmarks.
arXiv Detail & Related papers (2020-02-11T06:28:13Z)
- Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits [12.525529586816955]
We introduce the first provably efficient Population-Based Bandits algorithm.
PB2 uses a probabilistic model to guide the search in an efficient way.
We show in a series of RL experiments that PB2 is able to achieve high performance with a modest computational budget.
arXiv Detail & Related papers (2020-02-06T21:27:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.