A Framework for History-Aware Hyperparameter Optimisation in
Reinforcement Learning
- URL: http://arxiv.org/abs/2303.05186v1
- Date: Thu, 9 Mar 2023 11:30:40 GMT
- Title: A Framework for History-Aware Hyperparameter Optimisation in
Reinforcement Learning
- Authors: Juan Marcelo Parra-Ullauri, Chen Zhen, Antonio García-Domínguez,
Nelly Bencomo, Changgang Zheng, Juan Boubeta-Puig, Guadalupe Ortiz, Shufan
Yang
- Abstract summary: A Reinforcement Learning (RL) system depends on a set of initial conditions that affect the system's performance.
We propose a framework that integrates complex event processing and temporal models to alleviate these trade-offs.
We tested the proposed approach in a 5G mobile communications case study that uses DQN, a deep RL algorithm, for its decision-making.
- Score: 8.659973888018781
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A Reinforcement Learning (RL) system depends on a set of initial conditions
(hyperparameters) that affect the system's performance. However, defining a
good choice of hyperparameters is a challenging problem.
Hyperparameter tuning often requires manual or automated searches to find
optimal values. Nonetheless, a noticeable limitation is the high cost of
algorithm evaluation for complex models, making the tuning process
computationally expensive and time-consuming.
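To make this cost argument concrete, the toy calculation below counts how many full training runs an exhaustive grid search would need over a handful of DQN-style hyperparameters; the grids and the per-run cost are invented purely for illustration.

    from itertools import product

    # Hypothetical search grids for a few common DQN-style hyperparameters
    # (values chosen only to illustrate the combinatorial cost).
    grid = {
        "learning_rate": [1e-4, 5e-4, 1e-3],
        "discount_factor": [0.95, 0.99],
        "epsilon_decay": [0.995, 0.999],
        "batch_size": [32, 64, 128],
    }

    configs = list(product(*grid.values()))
    hours_per_run = 6  # assumed cost of training and evaluating one configuration
    print(f"{len(configs)} configurations, roughly {len(configs) * hours_per_run} compute-hours")
    # 3 x 2 x 2 x 3 = 36 full training runs, each paid at the full cost of training the agent.

Each added hyperparameter multiplies this count, which is why the abstract turns to monitoring a single running agent and adjusting its hyperparameters at runtime instead.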
In this paper, we propose a framework that integrates complex event
processing and temporal models to alleviate these trade-offs. Through this
combination, it is possible to gain insights into a running RL system
efficiently and unobtrusively through data stream monitoring, and to create
abstract representations that allow reasoning about the historical behaviour
of the RL system. The obtained knowledge is exploited to provide feedback to
the RL system for optimising its hyperparameters while making effective use
of parallel resources.
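As a rough illustration of this monitoring idea, the sketch below summarises an agent's reward stream over fixed-size windows, in the spirit of the unobtrusive, stream-based observation described above; the window size, the per-window statistics, and all names are assumptions for illustration, not the paper's implementation.

    from collections import deque
    from statistics import mean, pstdev

    class RewardStreamMonitor:
        """Illustrative tumbling-window monitor over an agent's reward stream."""

        def __init__(self, window_size=100):
            self.window = deque(maxlen=window_size)
            self.history = []  # summaries of completed windows

        def observe(self, episode_reward):
            """Consume one reward event from the monitored data stream."""
            self.window.append(episode_reward)
            if len(self.window) == self.window.maxlen:
                self.history.append(self._summarise())
                self.window.clear()

        def _summarise(self):
            rewards = list(self.window)
            return {"mean_reward": mean(rewards), "reward_std": pstdev(rewards)}

        def improving(self):
            """Simple temporal query: did the last window improve on the previous one?"""
            if len(self.history) < 2:
                return True  # not enough history yet
            return self.history[-1]["mean_reward"] > self.history[-2]["mean_reward"]

The per-window summaries stand in for the abstract's "abstract representations"; queries over them, such as improving(), are a stand-in for the temporal reasoning the actual framework performs.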
We introduce a novel history-aware epsilon-greedy logic for hyperparameter
optimisation: instead of using static hyperparameters that are kept fixed for
the whole training, it adjusts the hyperparameters at runtime based on the
analysis of the agent's performance over time windows within a single agent's
lifetime. We tested the proposed approach in a 5G mobile communications case
study that uses DQN, a deep RL algorithm, for its decision-making. Our
experiments demonstrated the effect of history-based hyperparameter tuning on
training stability and reward values. The encouraging results show that the
proposed history-aware framework significantly improved performance compared
to traditional hyperparameter tuning approaches.
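To make the runtime adjustment concrete, the following sketch shows one way a history-aware epsilon update could be driven by per-window summaries such as those produced by the monitor above; the update rule, thresholds, and step size are assumptions for illustration, not the authors' published logic.

    import random

    def adjust_epsilon(epsilon, window_history, min_eps=0.01, max_eps=1.0, step=0.05):
        """Illustrative history-aware epsilon update.

        If the mean reward improved between the last two windows, decay epsilon
        towards exploitation; otherwise increase it to encourage exploration.
        """
        if len(window_history) < 2:
            return epsilon
        prev, last = window_history[-2], window_history[-1]
        if last["mean_reward"] > prev["mean_reward"]:
            return max(min_eps, epsilon - step)  # progress: exploit more
        return min(max_eps, epsilon + step)      # stagnation: explore more

    def epsilon_greedy(q_values, epsilon):
        """Standard epsilon-greedy action selection over a list of Q-values."""
        if random.random() < epsilon:
            return random.randrange(len(q_values))
        return max(range(len(q_values)), key=q_values.__getitem__)

In a DQN-style training loop, epsilon_greedy would select actions as usual while adjust_epsilon is called at each window boundary, so exploration reacts to the agent's recent history instead of following a fixed decay schedule.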
Related papers
- Efficient Hyperparameter Importance Assessment for CNNs [1.7778609937758323]
This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF.
We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
arXiv Detail & Related papers (2024-10-11T15:47:46Z)
- Optimization Hyper-parameter Laws for Large Language Models [56.322914260197734]
We present Opt-Laws, a framework that captures the relationship between hyper-parameters and training outcomes.
Our validation across diverse model sizes and data scales demonstrates Opt-Laws' ability to accurately predict training loss.
This approach significantly reduces computational costs while enhancing overall model performance.
arXiv Detail & Related papers (2024-09-07T09:37:19Z)
- Combining Automated Optimisation of Hyperparameters and Reward Shape [7.407166175374958]
We propose a methodology for the combined optimisation of hyperparameters and the reward function.
We conducted extensive experiments using Proximal Policy Optimisation and Soft Actor-Critic.
Our results show that combined optimisation significantly improves over baseline performance in half of the environments and achieves competitive performance in the others.
arXiv Detail & Related papers (2024-06-26T12:23:54Z)
- AutoRL Hyperparameter Landscapes [69.15927869840918]
Reinforcement Learning (RL) has been shown to be capable of producing impressive results, but its use is limited by the impact of its hyperparameters on performance.
We propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training.
This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential for more insights on AutoRL problems that can be gained through landscape analyses.
arXiv Detail & Related papers (2023-04-05T12:14:41Z)
- Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits [55.03293214439741]
In contextual bandits, an agent sequentially takes actions from a time-dependent action set based on past experience.
We propose the first online continuous hyperparameter tuning framework for contextual bandits.
We show that it can achieve sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.
arXiv Detail & Related papers (2023-02-18T23:31:20Z)
- No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL [28.31529154045046]
We propose a new approach to tune hyperparameters from offline logs of data.
We first learn a model of the environment from the offline data, which we call a calibration model, and then simulate learning in the calibration model.
We empirically investigate the method in a variety of settings to identify when it is effective and when it fails.
arXiv Detail & Related papers (2022-05-18T04:26:23Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that, with more generations, the framework finds optimal solutions that require fewer training episodes, are computationally cheaper, and are more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z)
- Automatic tuning of hyper-parameters of reinforcement learning
algorithms using Bayesian optimization with behavioral cloning [0.0]
In reinforcement learning (RL), the information content of data gathered by the learning agent depends on the setting of many hyper-parameters.
In this work, a novel approach for autonomous hyper-parameter setting using Bayesian optimization is proposed.
Experiments reveal promising results compared to manual tweaking and other optimization-based approaches.
arXiv Detail & Related papers (2021-12-15T13:10:44Z)
- Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for
Hyperparameter Recommendation [83.85021205445662]
We propose amortized auto-tuning (AT2) to speed up the tuning of machine learning models.
A thorough analysis of the multi-task multi-fidelity Bayesian optimization framework leads to AT2 as its best instantiation.
arXiv Detail & Related papers (2021-06-17T00:01:18Z)
- Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance than standard methods, at a fraction of the wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z)