Related papers: ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning

ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning

URL: http://arxiv.org/abs/2409.18827v1
Date: Fri, 27 Sep 2024 15:22:28 GMT
Title: ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning
Authors: Jannis Becktepe, Julian Dierkes, Carolin Benjamins, Aditya Mohan, David Salinas, Raghu Rajan, Frank Hutter, Holger Hoos, Marius Lindauer, Theresa Eimer,
Abstract summary: ARLBench is a benchmark for hyperparameter optimization (HPO) in reinforcement learning (RL) It allows comparisons of diverse HPO approaches while being highly efficient in evaluation. ARLBench is an efficient, flexible, and future-oriented foundation for research on AutoRL.
Score: 42.33815055388433
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hyperparameters are a critical factor in reliably training well-performing reinforcement learning (RL) agents. Unfortunately, developing and evaluating automated approaches for tuning such hyperparameters is both costly and time-consuming. As a result, such approaches are often only evaluated on a single domain or algorithm, making comparisons difficult and limiting insights into their generalizability. We propose ARLBench, a benchmark for hyperparameter optimization (HPO) in RL that allows comparisons of diverse HPO approaches while being highly efficient in evaluation. To enable research into HPO in RL, even in settings with low compute resources, we select a representative subset of HPO tasks spanning a variety of algorithm and environment combinations. This selection allows for generating a performance profile of an automated RL (AutoRL) method using only a fraction of the compute previously necessary, enabling a broader range of researchers to work on HPO in RL. With the extensive and large-scale dataset on hyperparameter landscapes that our selection is based on, ARLBench is an efficient, flexible, and future-oriented foundation for research on AutoRL. Both the benchmark and the dataset are available at https://github.com/automl/arlbench.

Related papers

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning [50.53705050673944]
We propose ULTHO, an ultra-lightweight yet powerful framework for fast HPO in deep RL within single runs. Specifically, we formulate the HPO process as a multi-armed bandit with clustered arms (MABC) and link it directly to long-term return optimization. We test ULTHO on benchmarks including ALE, Procgen, MiniGrid, and PyBullet.
arXiv Detail & Related papers (2025-03-08T07:03:43Z)
Value-Based Deep RL Scales Predictably [100.21834069400023]
We show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior. We validate our approach using three algorithms: SAC, BRO, and PQL on DeepMind Control, OpenAI gym, and IsaacGym.
arXiv Detail & Related papers (2025-02-06T18:59:47Z)
Efficient Transformer-based Hyper-parameter Optimization for Resource-constrained IoT Environments [9.72257571115249]
We propose a novel approach that combines transformer architecture and actor-critic Reinforcement Learning model, TRL-HPO. The results show that TRL-HPO outperforms the classification results of these approaches by 6.8% within the same time frame. This paper identifies new avenues for improving RL-based HPO processes in resource-constrained environments.
arXiv Detail & Related papers (2024-03-18T20:35:35Z)
Hyperparameters in Reinforcement Learning and How To Tune Them [25.782420501870295]
We show that hyper parameter choices in deep reinforcement learning can significantly affect the agent's final performance and sample efficiency. We propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds. We support this by comparing state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts.
arXiv Detail & Related papers (2023-06-02T07:48:18Z)
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called textttMEX. textttMEX integrates estimation and planning components while balancing exploration exploitation automatically. It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
AutoRL Hyperparameter Landscapes [69.15927869840918]
Reinforcement Learning (RL) has shown to be capable of producing impressive results, but its use is limited by the impact of its hyperparameters on performance. We propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training. This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses.
arXiv Detail & Related papers (2023-04-05T12:14:41Z)
Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset [0.15420205433587747]
We present a two-step HPO method as a strategic solution to curbing computational demands and wait times. We present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation.
arXiv Detail & Related papers (2023-02-08T02:38:26Z)
Automating DBSCAN via Deep Reinforcement Learning [73.82740568765279]
We propose a novel Deep Reinforcement Learning guided automatic DBSCAN parameters search framework, namely DRL-DBSCAN. The framework models the process of adjusting the parameter search direction by perceiving the clustering environment as a Markov decision process. The framework consistently improves DBSCAN clustering accuracy by up to 26% and 25% respectively.
arXiv Detail & Related papers (2022-08-09T04:40:11Z)
Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction. Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation [48.821062916381685]
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. In this work, we propose an efficient reinforcement learning(RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset and two real-world medical image segmentation datasets.
arXiv Detail & Related papers (2022-03-12T04:11:42Z)
Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL [12.135280422000635]
We introduce a new efficient hierarchical approach for optimizing both continuous and categorical variables. We show that explicitly modelling dependence between data augmentation and other hyper parameters improves generalization.
arXiv Detail & Related papers (2021-06-30T08:15:59Z)
Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms. By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization. We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
arXiv Detail & Related papers (2020-09-03T10:04:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.