On the Importance of Hyperparameter Optimization for Model-based
Reinforcement Learning
- URL: http://arxiv.org/abs/2102.13651v1
- Date: Fri, 26 Feb 2021 18:57:47 GMT
- Title: On the Importance of Hyperparameter Optimization for Model-based
Reinforcement Learning
- Authors: Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra
- Abstract summary: Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner.
MBRL typically requires significant human expertise before it can be applied to new problems and domains.
- Score: 27.36718899899319
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model-based Reinforcement Learning (MBRL) is a promising framework for
learning control in a data-efficient manner. MBRL algorithms can be fairly
complex due to the separate dynamics modeling and the subsequent planning
algorithm, and as a result, they often possess tens of hyperparameters and
architectural choices. For this reason, MBRL typically requires significant
human expertise before it can be applied to new problems and domains. To
alleviate this problem, we propose to use automatic hyperparameter optimization
(HPO). We demonstrate that automated HPO can tackle this problem effectively,
yielding significantly improved performance compared to human experts. In
addition, we show that tuning several MBRL hyperparameters dynamically, i.e.,
during training itself, further improves performance compared to using static
hyperparameters that are kept fixed for the whole training. Finally, our
experiments provide valuable insights into the effects of several
hyperparameters, such as the planning horizon or the learning rate, and their
influence on the stability of training and the resulting rewards.
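The paper's central findings, automated HPO beating hand-tuning and dynamic schedules beating static ones, can be illustrated with a short sketch. The following is a minimal, hypothetical Python example of population-based-style dynamic tuning, in which hyperparameters named in the abstract (planning horizon, learning rate) are perturbed during training instead of being fixed up front; `train_segment` and its toy objective are illustrative stand-ins, not the paper's actual training code.

```python
# A minimal sketch of dynamic hyperparameter tuning in the style of
# population-based training (PBT). All names, ranges, and the surrogate
# objective below are illustrative assumptions, not the paper's setup.
import random

def sample_config():
    return {
        "plan_horizon": random.randint(5, 50),     # planning horizon (steps)
        "model_lr": 10 ** random.uniform(-4, -2),  # dynamics-model learning rate
    }

def train_segment(config, score):
    """Hypothetical stand-in for a short MBRL training phase. A toy
    surrogate objective keeps the sketch self-contained and runnable."""
    target = -abs(config["plan_horizon"] - 30) - 1e4 * abs(config["model_lr"] - 1e-3)
    return score + 0.5 * (target - score)  # move the score partway toward the target

def perturb(config):
    """Explore step: jitter hyperparameters so they change during training."""
    new = dict(config)
    new["plan_horizon"] = max(1, new["plan_horizon"] + random.choice([-2, 2]))
    new["model_lr"] *= random.choice([0.8, 1.25])
    return new

def pbt(pop_size=8, num_rounds=10):
    population = [{"config": sample_config(), "score": 0.0} for _ in range(pop_size)]
    for _ in range(num_rounds):
        for member in population:
            member["score"] = train_segment(member["config"], member["score"])
        population.sort(key=lambda m: m["score"], reverse=True)
        # Exploit/explore: the bottom quartile copies a top performer's
        # configuration and then perturbs it before the next training round.
        for loser in population[-(pop_size // 4):]:
            winner = random.choice(population[: pop_size // 4])
            loser["config"] = perturb(winner["config"])
            loser["score"] = winner["score"]
    return population[0]

if __name__ == "__main__":
    best = pbt()
    print("best config:", best["config"], "score:", round(best["score"], 2))
```

In a real setup, `train_segment` would train an MBRL agent (e.g., PETS) for a few episodes under the current configuration and report its mean return; the exploit/explore step is what allows hyperparameter values to change over the course of training, which is the behavior the abstract reports as outperforming static settings.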
Related papers
- Efficient Hyperparameter Importance Assessment for CNNs [1.7778609937758323]
This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF.
We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets.
arXiv Detail & Related papers (2024-10-11T15:47:46Z)
- Optimization Hyper-parameter Laws for Large Language Models [56.322914260197734]
We present Opt-Laws, a framework that captures the relationship between hyper-parameters and training outcomes.
Our validation across diverse model sizes and data scales demonstrates Opt-Laws' ability to accurately predict training loss.
This approach significantly reduces computational costs while enhancing overall model performance.
arXiv Detail & Related papers (2024-09-07T09:37:19Z)
- Interactive Hyperparameter Optimization in Multi-Objective Problems via Preference Learning [65.51668094117802]
We propose a human-centered interactive HPO approach tailored towards multi-objective machine learning (ML).
Instead of relying on the user to guess the most suitable indicator for their needs, our approach automatically learns an appropriate indicator.
arXiv Detail & Related papers (2023-09-07T09:22:05Z)
- AutoRL Hyperparameter Landscapes [69.15927869840918]
Reinforcement Learning (RL) has been shown to be capable of producing impressive results, but its use is limited by the impact of its hyperparameters on performance.
We propose an approach to build and analyze these hyperparameter landscapes not just for one point in time but at multiple points in time throughout training.
This supports the theory that hyperparameters should be dynamically adjusted during training and shows the potential of landscape analyses to yield further insights into AutoRL problems.
arXiv Detail & Related papers (2023-04-05T12:14:41Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3x-30x.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that, with more generations, the algorithm finds solutions that require fewer training episodes, are computationally cheaper, and are more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z)
- On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
We first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
We then propose a framework named AutoMBPO to automatically schedule the real data ratio.
arXiv Detail & Related papers (2021-11-16T15:24:59Z)
- To tune or not to tune? An Approach for Recommending Important Hyperparameters [2.121963121603413]
We model the relationship between the performance of machine learning models and their hyperparameters to discover trends and gain insights.
Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy.
arXiv Detail & Related papers (2021-08-30T08:54:58Z)
- Efficient Hyperparameter Optimization for Physics-based Character Animation [1.2183405753834562]
We propose a novel Curriculum-based Multi-Fidelity Bayesian Optimization framework (CMFBO) for efficient hyperparameter optimization of DRL-based character control systems.
We show that our algorithm yields at least a 5x efficiency gain compared to the author-released settings in DeepMimic.
arXiv Detail & Related papers (2021-04-26T06:46:36Z)
- Sample-Efficient Automated Deep Reinforcement Learning [33.53903358611521]
We propose a population-based automated RL framework to meta-optimize arbitrary off-policy RL algorithms.
By sharing the collected experience across the population, we substantially increase the sample efficiency of the meta-optimization.
We demonstrate the capabilities of our sample-efficient AutoRL approach in a case study with the popular TD3 algorithm in the MuJoCo benchmark suite.
arXiv Detail & Related papers (2020-09-03T10:04:06Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)