Bayesian Generational Population-Based Training
- URL: http://arxiv.org/abs/2207.09405v1
- Date: Tue, 19 Jul 2022 16:57:38 GMT
- Title: Bayesian Generational Population-Based Training
- Authors: Xingchen Wan, Cong Lu, Jack Parker-Holder, Philip J. Ball, Vu Nguyen,
Binxin Ru, Michael A. Osborne
- Abstract summary: Population-Based Training (PBT) has led to impressive performance in several large-scale settings.
We introduce two new innovations in PBT-style methods.
We show that these innovations lead to large performance gains.
- Score: 35.70338636901159
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) offers the potential for training generally
capable agents that can interact autonomously in the real world. However, one
key limitation is the brittleness of RL algorithms to core hyperparameters and
network architecture choice. Furthermore, non-stationarities such as evolving
training data and increased agent complexity mean that different
hyperparameters and architectures may be optimal at different points of
training. This motivates AutoRL, a class of methods seeking to automate these
design choices. One prominent class of AutoRL methods is Population-Based
Training (PBT), which has led to impressive performance in several large-scale
settings. In this paper, we introduce two new innovations in PBT-style methods.
First, we employ trust-region based Bayesian Optimization, enabling full
coverage of the high-dimensional mixed hyperparameter search space. Second, we
show that using a generational approach, we can also learn both architectures
and hyperparameters jointly on-the-fly in a single training run. Leveraging the
new highly parallelizable Brax physics engine, we show that these innovations
lead to large performance gains, significantly outperforming the tuned baseline
while learning entire configurations on the fly. Code is available at
https://github.com/xingchenwan/bgpbt.
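For readers unfamiliar with the PBT loop the abstract builds on, the following is a minimal sketch of the generic exploit/explore cycle on a toy problem. It is not the authors' BG-PBT method: the random perturbation used for exploration here is the step the abstract says is replaced by trust-region Bayesian Optimization, and architecture search is omitted. The objective, the function names (`train_step`, `evaluate`, `pbt`), the quartile cut-off, and the 0.8/1.2 perturbation factors are all illustrative assumptions.

```python
import random


def train_step(theta, lr):
    # Toy "training": one gradient-descent step on f(theta) = theta**2.
    return theta - lr * 2.0 * theta


def evaluate(theta):
    # Higher is better; the optimum is theta = 0.
    return -theta ** 2


def pbt(pop_size=8, n_rounds=20, steps_per_round=5, seed=0):
    rng = random.Random(seed)
    # Each member carries its own weights (theta) and hyperparameter (lr).
    population = [
        {"theta": 5.0, "lr": 10 ** rng.uniform(-4, -1)}
        for _ in range(pop_size)
    ]
    for _ in range(n_rounds):
        # Train every member for a short interval, then score it.
        for member in population:
            for _ in range(steps_per_round):
                member["theta"] = train_step(member["theta"], member["lr"])
            member["score"] = evaluate(member["theta"])
        ranked = sorted(population, key=lambda m: m["score"], reverse=True)
        n_cut = max(1, pop_size // 4)
        top, bottom = ranked[:n_cut], ranked[-n_cut:]
        for loser in bottom:
            winner = rng.choice(top)
            # Exploit: copy the weights of a well-performing member.
            loser["theta"] = winner["theta"]
            # Explore: randomly perturb the copied hyperparameter
            # (assumed factors; BG-PBT would query a BO model here instead).
            loser["lr"] = winner["lr"] * rng.choice([0.8, 1.2])
    return max(population, key=lambda m: m["score"])


if __name__ == "__main__":
    best = pbt()
    print(best["lr"], best["score"])
```

Because weights are copied rather than retrained from scratch, the hyperparameter schedule is adapted within a single training run, which is the property the abstract refers to as learning configurations on the fly.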
Related papers
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings can be easily changed by better training networks in benchmarks.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- BiERL: A Meta Evolutionary Reinforcement Learning Framework via Bilevel Optimization [34.24884427152513]
We propose a general meta ERL framework via bilevel optimization (BiERL).
We design an elegant meta-level architecture that embeds the inner-level's evolving experience into an informative population representation.
We perform extensive experiments in MuJoCo and Box2D tasks to verify that as a general framework, BiERL outperforms various baselines and consistently improves the learning performance for a diversity of ERL algorithms.
arXiv Detail & Related papers (2023-08-01T09:31:51Z)
- AutoRL Hyperparameter Landscapes [69.15927869840918]
Reinforcement Learning (RL) has been shown to be capable of producing impressive results, but its use is limited by the impact of its hyperparameters on performance.
We propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training.
This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses.
arXiv Detail & Related papers (2023-04-05T12:14:41Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that more generations yield optimal solutions that require fewer training episodes and are computationally cheap while being more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z)
- Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation [0.0]
We develop an approximate hypergradient-based hyperparameter optimiser.
It requires only one training episode, with no restarts.
We also provide a motivating argument for convergence to the true hypergradient.
arXiv Detail & Related papers (2021-10-20T09:57:57Z)
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)
- On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning [27.36718899899319]
Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner.
MBRL typically requires significant human expertise before it can be applied to new problems and domains.
arXiv Detail & Related papers (2021-02-26T18:57:47Z)
- Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms.
By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization.
We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
arXiv Detail & Related papers (2020-09-03T10:04:06Z)
- Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.