Multiple-Frequencies Population-Based Training
- URL: http://arxiv.org/abs/2506.03225v2
- Date: Thu, 17 Jul 2025 15:41:03 GMT
- Title: Multiple-Frequencies Population-Based Training
- Authors: Waël Doulazmi, Auguste Lehuger, Marin Toromanoff, Valentin Charraut, Thibault Buhet, Fabien Moutarde
- Abstract summary: We propose a novel HPO algorithm that addresses greediness by employing sub-populations. MF-PBT introduces a migration process to transfer information between sub-populations.
- Score: 2.691655918692203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement Learning's high sensitivity to hyperparameters is a source of instability and inefficiency, creating significant challenges for practitioners. Hyperparameter Optimization (HPO) algorithms have been developed to address this issue; among them, Population-Based Training (PBT) stands out for its ability to generate hyperparameter schedules instead of fixed configurations. PBT trains a population of agents, each with its own hyperparameters, frequently ranking them and replacing the worst performers with mutations of the best agents. These intermediate selection steps can cause PBT to focus on short-term improvements, leading it to get stuck in local optima and eventually fall behind vanilla Random Search over longer timescales. This paper studies how this greediness issue is connected to the choice of evolution frequency, the rate at which selection steps occur. We propose Multiple-Frequencies Population-Based Training (MF-PBT), a novel HPO algorithm that addresses greediness by employing sub-populations, each evolving at a distinct frequency. MF-PBT introduces a migration process to transfer information between sub-populations, with an asymmetric design that balances short- and long-term optimization. Extensive experiments on the Brax suite demonstrate that MF-PBT improves sample efficiency and long-term performance, even without actually tuning hyperparameters.
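Below is a minimal, self-contained Python sketch of the MF-PBT loop described in the abstract. The toy training signal, the doubling evolution periods, and the exact migration rule are illustrative assumptions, not the authors' implementation.

```python
import random

def train_step(agent):
    # Stand-in for one unit of RL training: score grows noisily with the lr.
    agent["score"] += agent["lr"] * random.gauss(1.0, 0.5)

def exploit_explore(population):
    # Classic PBT step: the worst agent copies the best, then mutates its lr.
    population.sort(key=lambda a: a["score"], reverse=True)
    best, worst = population[0], population[-1]
    worst["score"] = best["score"]
    worst["lr"] = best["lr"] * random.choice([0.8, 1.25])

def mf_pbt(num_subpops=3, agents_per_subpop=4, base_period=10, steps=120):
    # Sub-population i evolves every base_period * 2**i steps (assumed schedule).
    subpops = [[{"lr": 10 ** random.uniform(-4, -2), "score": 0.0}
                for _ in range(agents_per_subpop)] for _ in range(num_subpops)]
    for t in range(1, steps + 1):
        for i, pop in enumerate(subpops):
            for agent in pop:
                train_step(agent)
            if t % (base_period * 2 ** i) == 0:
                exploit_explore(pop)
        # Asymmetric migration (assumed rule): a slower sub-population adopts a
        # faster one's best agent only if it beats the slower one's own best.
        if t % base_period == 0:
            for i in range(num_subpops - 1):
                fast_best = max(subpops[i], key=lambda a: a["score"])
                slow = subpops[i + 1]
                if fast_best["score"] > max(a["score"] for a in slow):
                    min(slow, key=lambda a: a["score"]).update(fast_best)
    return max((a for pop in subpops for a in pop), key=lambda a: a["score"])

print(mf_pbt())
```

The asymmetry here is that information flows from faster-evolving sub-populations to slower ones only when it strictly improves on the slower sub-population's best, which protects long-horizon agents from greedy overwrites.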
Related papers
- Simultaneous Training of First- and Second-Order Optimizers in Population-Based Reinforcement Learning [0.0]
Population-based training (PBT) provides a method to achieve this by continuously tuning hyperparameters throughout the training.
We propose an enhancement to PBT by simultaneously utilizing both first- and second-order optimizers within a single population.
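As a rough illustration of the idea, the sketch below runs first- and second-order updates side by side in one PBT-style population on a toy quadratic; the pairing and selection rule are assumptions, not the paper's implementation.

```python
def loss(x):
    return (x - 3.0) ** 2          # toy objective, minimum at x = 3

def grad(x):
    return 2.0 * (x - 3.0)

def hess(x):
    return 2.0                     # constant curvature for a quadratic

population = [
    {"x": 10.0, "order": 1, "lr": 0.05},   # first-order: gradient descent
    {"x": -8.0, "order": 2, "lr": 1.0},    # second-order: Newton-style step
]

for step in range(50):
    for agent in population:
        g = grad(agent["x"])
        if agent["order"] == 1:
            agent["x"] -= agent["lr"] * g
        else:
            agent["x"] -= agent["lr"] * g / hess(agent["x"])
    # PBT-style exchange across optimizer families: the worse agent copies the
    # better agent's parameters but keeps its own optimizer type.
    population.sort(key=lambda a: loss(a["x"]))
    population[-1]["x"] = population[0]["x"]

print([(a["order"], round(a["x"], 3)) for a in population])
```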
arXiv Detail & Related papers (2024-08-27T21:54:26Z) - Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning [10.164982368785854]
This work proposes Generalized Population-Based Training (GPBT) and Pairwise Learning (PL).
PL employs a comprehensive pairwise strategy to identify performance differentials and provide holistic guidance to underperforming agents.
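A hedged sketch of the pairwise idea: each underperformer is paired with a stronger agent and nudged toward it rather than outright replaced. The halving split and geometric-mean step below are assumptions.

```python
import random

agents = [{"lr": 10 ** random.uniform(-4, -1), "score": random.random()}
          for _ in range(8)]
agents.sort(key=lambda a: a["score"], reverse=True)
half = len(agents) // 2
for good, bad in zip(agents[:half], agents[half:]):
    # Nudge the weaker agent's lr toward its partner's (geometric midpoint),
    # then perturb, instead of replacing the agent wholesale.
    bad["lr"] = (bad["lr"] * good["lr"]) ** 0.5 * random.choice([0.9, 1.1])
```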
arXiv Detail & Related papers (2024-04-12T04:23:20Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We demonstrate its robustness across a range of DL benchmarks, its gains under informative expert input, and its resilience to poor expert beliefs.
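A minimal sketch of prior-guided sampling in this spirit: mix draws from an expert prior, uniform random search, and local search around the incumbent. The mixing weights and search space are assumptions, not PriorBand's actual schedule.

```python
import random

def sample_config(incumbent=None):
    r = random.random()
    if r < 0.4:                          # expert prior: belief that lr ~ 1e-3
        return 10 ** random.gauss(-3, 0.3)
    if r < 0.8 or incumbent is None:     # uniform search guards against bad priors
        return 10 ** random.uniform(-6, 0)
    return incumbent * random.choice([0.5, 2.0])   # local search near incumbent

print(sorted(sample_config(incumbent=1e-3) for _ in range(5)))
```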
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Multi-Objective Population Based Training [62.997667081978825]
Population Based Training (PBT) is an efficient hyperparameter optimization algorithm.
In this work, we introduce a multi-objective version of PBT, MO-PBT.
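A sketch of what multi-objective selection can look like: rank by Pareto non-domination over two objectives instead of a single scalar score. The objective names are placeholders.

```python
def dominates(a, b):
    # a dominates b if it is no worse on both objectives and better on one.
    return (a["acc"] >= b["acc"] and a["speed"] >= b["speed"]
            and (a["acc"] > b["acc"] or a["speed"] > b["speed"]))

def pareto_front(population):
    return [a for a in population
            if not any(dominates(b, a) for b in population if b is not a)]

pop = [{"acc": 0.9, "speed": 0.2}, {"acc": 0.7, "speed": 0.8},
       {"acc": 0.6, "speed": 0.1}]
print(pareto_front(pop))   # the third config is dominated and drops out
```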
arXiv Detail & Related papers (2023-06-02T10:54:24Z) - Massively Parallel Genetic Optimization through Asynchronous Propagation of Populations [50.591267188664666]
Propulate is an evolutionary optimization algorithm and software package for global optimization.
We provide an MPI-based implementation of our algorithm, which features variants of selection, mutation, crossover, and migration.
We find that Propulate is up to three orders of magnitude faster without sacrificing solution accuracy.
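Propulate itself is MPI-parallel; the serial island-model loop below is only an analogy for evolutionary optimization with ring-topology migration, with all details assumed.

```python
import random

def evolve_island(island, migrants):
    # Absorb incoming migrants, select the best half, refill by mutation.
    island.extend(migrants)
    island.sort(key=lambda x: x * x)             # minimize f(x) = x**2 (toy)
    del island[len(island) // 2:]
    island.extend(x + random.gauss(0, 0.1) for x in list(island))
    return [island[0]]                           # emigrate the current best

islands = [[random.uniform(-5, 5) for _ in range(8)] for _ in range(4)]
migrants = [[] for _ in islands]
for generation in range(20):
    migrants = [evolve_island(isl, mig) for isl, mig in zip(islands, migrants)]
    migrants = migrants[1:] + migrants[:1]       # ring-topology migration
print(min((x for isl in islands for x in isl), key=abs))
```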
arXiv Detail & Related papers (2023-01-20T18:17:34Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
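The core interface idea, sketched under assumptions: serialize the trial history as text and ask a sequence model for the next hyperparameter. `language_model` below is a hypothetical placeholder, not OptFormer's API.

```python
def serialize_history(trials):
    # Render past trials as plain text, ending mid-record so a sequence model
    # can complete the next hyperparameter value.
    lines = [f"lr={t['lr']:.2e}, score={t['score']:.3f}" for t in trials]
    return "\n".join(lines) + "\nlr="

history = [{"lr": 1e-3, "score": 0.71}, {"lr": 3e-4, "score": 0.78}]
prompt = serialize_history(history)
print(prompt)
# next_lr = language_model.generate(prompt)   # hypothetical model call
```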
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation [48.821062916381685]
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing.
In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL.
The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset and two real-world medical image segmentation datasets.
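A hedged sketch of server-side RL for hyperparameter search: a REINFORCE-style update on a discrete learning-rate distribution. The reward and update rule are toy assumptions, not Auto-FedRL's.

```python
import math
import random

lrs = [1e-4, 1e-3, 1e-2]          # discrete server-side search space
logits = [0.0, 0.0, 0.0]          # learned sampling distribution

for fed_round in range(30):
    z = [math.exp(l) for l in logits]
    probs = [v / sum(z) for v in z]
    i = random.choices(range(len(lrs)), weights=probs)[0]
    # Toy reward peaking at lr = 1e-3, standing in for validation improvement.
    reward = -abs(math.log10(lrs[i]) + 3) + random.gauss(0, 0.1)
    for j in range(len(logits)):  # REINFORCE: d log p_i / d logit_j = 1[i=j] - p_j
        logits[j] += 0.5 * reward * ((1.0 if j == i else 0.0) - probs[j])

print(max(zip(probs, lrs)))
```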
arXiv Detail & Related papers (2022-03-12T04:11:42Z) - Faster Improvement Rate Population Based Training [7.661301899629696]
This paper presents Faster Improvement Rate PBT (FIRE PBT), which addresses the greediness problem of Population Based Training (PBT).
We derive a novel fitness metric and use it to make some of the population members focus on long-term performance.
Experiments show that FIRE PBT is able to outperform PBT on the ImageNet benchmark and match the performance of networks that were trained with a hand-tuned learning rate schedule.
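A sketch of an improvement-rate fitness in this spirit: rank agents by the slope of recent scores rather than the latest score, so slow-but-steady learners are not culled. The window size is an assumption.

```python
def improvement_rate(scores, window=5):
    # Average per-step improvement over the recent window.
    recent = scores[-window:]
    if len(recent) < 2:
        return 0.0
    return (recent[-1] - recent[0]) / (len(recent) - 1)

steady = [0.1 * t for t in range(10)]   # slow but consistent progress
plateaued = [0.9] * 10                  # high score, no recent progress
assert improvement_rate(steady) > improvement_rate(plateaued)
```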
arXiv Detail & Related papers (2021-09-28T15:30:55Z) - Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
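For context, a minimal Gaussian-mechanism sketch of client-level DP in FL: clip each client update and add calibrated noise before aggregation. The clip norm and sigma are placeholders; the paper's delay-aware wireless scheduling is not modeled here.

```python
import random

def dp_sanitize(update, clip=1.0, sigma=0.5):
    # Clip the client update to a bounded L2 norm, then add Gaussian noise.
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, clip / max(norm, 1e-12))
    return [u * scale + random.gauss(0, sigma * clip) for u in update]

client_update = [0.8, -1.9, 0.3]
print(dp_sanitize(client_update))
```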
arXiv Detail & Related papers (2021-06-20T13:51:18Z) - A Population-based Hybrid Approach to Hyperparameter Optimization for Neural Networks [0.0]
HBRKGA is a hybrid approach that combines the Biased Random Key Genetic Algorithm with a Random Walk technique to search the hyperparameter space efficiently.
Results showed that HBRKGA could find hyperparameter configurations that outperformed the baseline methods in six out of eight datasets.
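A hedged sketch of the two ingredients: biased random-key crossover plus a random-walk local refinement. The key encoding, probabilities, and toy fitness are assumptions.

```python
import random

def biased_crossover(elite, non_elite, p_elite=0.7):
    # Each random key is inherited from the elite parent with probability p_elite.
    return [e if random.random() < p_elite else n
            for e, n in zip(elite, non_elite)]

def random_walk(keys, fitness, step=0.05, tries=10):
    # Local refinement: keep small perturbations that improve fitness.
    best = keys
    for _ in range(tries):
        cand = [min(1.0, max(0.0, k + random.uniform(-step, step))) for k in best]
        if fitness(cand) > fitness(best):
            best = cand
    return best

child = biased_crossover([0.9, 0.1, 0.5], [0.2, 0.8, 0.4])
print(random_walk(child, fitness=lambda k: -sum(k)))
```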
arXiv Detail & Related papers (2020-11-22T17:12:31Z) - Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits [12.525529586816955]
We introduce Population-Based Bandits (PB2), the first provably efficient PBT-style algorithm.
PB2 uses a probabilistic model to guide the search in an efficient way.
We show in a series of RL experiments that PB2 is able to achieve high performance with a modest computational budget.
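A toy stand-in for PB2's probabilistic guidance: score candidate hyperparameters by a kernel-smoothed mean reward plus an exploration bonus, UCB-style. PB2 uses a Gaussian-process bandit; this kernel and bonus form are assumptions.

```python
import math
import random

observations = [(1e-3, 0.8), (1e-2, 0.5)]       # observed (lr, reward) pairs

def ucb(candidate, beta=1.0, bandwidth=1.0):
    # Kernel-smoothed mean reward plus a bonus that shrinks near observed points.
    weights = [math.exp(-((math.log10(candidate) - math.log10(lr)) / bandwidth) ** 2)
               for lr, _ in observations]
    mean = sum(w * r for w, (_, r) in zip(weights, observations)) / (sum(weights) + 1e-9)
    return mean + beta / (1.0 + sum(weights))

candidates = [10 ** random.uniform(-4, -1) for _ in range(50)]
print(max(candidates, key=ucb))
```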
arXiv Detail & Related papers (2020-02-06T21:27:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.