Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning
- URL: http://arxiv.org/abs/2412.03767v1
- Date: Wed, 04 Dec 2024 23:12:41 GMT
- Title: Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning
- Authors: Yiran Wang, Chenshu Liu, Yunfan Li, Sanae Amani, Bolei Zhou, Lin F. Yang
- Abstract summary: Hyper is provably efficient under the function approximation setting, and empirical results demonstrate its appealing performance and robustness in various environments.
Hyper extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training.
- Score: 48.81121647322492
- License:
- Abstract: The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate extensive hyperparameter tuning on different environments, which heavily limits the applicability and accessibility of this line of methods. In this paper, we characterize this problem via analysis of the agent behavior, concluding that choosing a proper hyperparameter is fundamentally difficult. We then identify the difficulty and the instability of the optimization when the agent learns with curiosity. We propose our method, hyperparameter robust exploration (Hyper), which extensively mitigates the problem by effectively regularizing the visitation of the exploration and decoupling the exploitation to ensure stable training. We theoretically justify that Hyper is provably efficient under the function approximation setting and empirically demonstrate its appealing performance and robustness in various environments.
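The abstract names two mechanisms, regularized exploration visitation and decoupled exploitation, without giving an algorithm. Below is a minimal tabular sketch of that decoupling; the class name, the count-based bonus, and the clipping regularizer are illustrative assumptions for this summary, not the paper's actual method.

```python
import numpy as np

class DecoupledAgent:
    """Tabular sketch: one critic learns from a regularized curiosity
    bonus, a separate critic learns only from extrinsic reward.
    Illustrative assumption, not the paper's algorithm."""

    def __init__(self, n_states, n_actions, beta=0.1):
        self.visits = np.zeros(n_states)                   # state visitation counts
        self.q_explore = np.zeros((n_states, n_actions))   # driven by intrinsic reward
        self.q_exploit = np.zeros((n_states, n_actions))   # driven by extrinsic reward
        self.beta = beta                                   # curiosity scale: the fragile hyperparameter

    def intrinsic_reward(self, s):
        # Count-based curiosity bonus; the clip is a crude stand-in for
        # the visitation regularization described in the abstract.
        return min(self.beta / np.sqrt(self.visits[s] + 1.0), 1.0)

    def update(self, s, a, r_ext, s_next, alpha=0.5, gamma=0.99):
        self.visits[s] += 1
        targets = ((self.q_explore, self.intrinsic_reward(s)),
                   (self.q_exploit, r_ext))
        # The exploitation critic never sees the curiosity term, so a
        # poorly chosen beta cannot destabilize its training signal.
        for q, r in targets:
            q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
```

Because exploitation is decoupled from the curiosity term, sweeping `beta` only changes how exploration is paced, which is the kind of hyperparameter robustness the abstract claims.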
Related papers
- Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function [24.457000214575245]
We introduce a new technique to characterize the discontinuities and oscillations of the utility function on any fixed problem instance.
This can be used to show that the learning theoretic complexity of the corresponding family of utility functions is bounded.
arXiv Detail & Related papers (2025-01-23T15:10:51Z)
- Variable-Agnostic Causal Exploration for Reinforcement Learning [56.52768265734155]
We introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL).
Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms.
It constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion (a rough sketch follows below).
arXiv Detail & Related papers (2024-07-17T09:45:27Z)
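The summary describes the pipeline (attention-based scoring of observation-action steps, a causal graph over the crucial ones, graph-guided exploration) but no implementation. A rough sketch under stated assumptions: `score_fn` stands in for the attention mechanism, and consecutive-step edges stand in for learned causal edges.

```python
from collections import defaultdict

def build_step_graph(trajectories, score_fn, top_k=20):
    """Rank (obs, action) steps by an attention-style score and link
    consecutive top-scoring steps into a directed graph, a stand-in
    for VACERL's causal-graph construction."""
    scores = {}
    for traj in trajectories:
        for step in traj:                                  # step = (obs, action), hashable
            scores[step] = max(scores.get(step, 0.0), score_fn(step))
    crucial = set(sorted(scores, key=scores.get, reverse=True)[:top_k])
    graph = defaultdict(set)
    for traj in trajectories:
        prev = None
        for step in traj:
            if step in crucial:
                if prev is not None:
                    graph[prev].add(step)                  # edge: prev precedes (and may influence) step
                prev = step
    return graph

def exploration_bonus(step, graph, scale=0.1):
    # Steer the agent towards steps with many outgoing edges,
    # i.e., greater (approximate) causal influence.
    return scale * len(graph[step]) if step in graph else 0.0
```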
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results (a minimal PPO baseline is sketched below).
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
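The aquatic environment itself is not included in this summary, but the PPO baseline the paper builds on can be reproduced with a standard implementation. A minimal sketch using Stable-Baselines3, with a placeholder Gymnasium environment standing in for the game-engine-based aquatic benchmark:

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Placeholder environment; the paper's aquatic-navigation benchmark
# would be registered and used here instead.
env = gym.make("Pendulum-v1")

model = PPO("MlpPolicy", env, verbose=1)   # default PPO hyperparameters
model.learn(total_timesteps=100_000)       # the paper layers advanced training techniques on top
model.save("ppo_navigation_baseline")
```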
- WESE: Weak Exploration to Strong Exploitation for LLM Agents [95.6720931773781]
This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks.
WESE decouples the exploration and exploitation processes, employing a cost-effective weak agent to perform exploration tasks and gather global knowledge.
A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, improving the stronger agent's success rate and efficiency on the exploitation task (see the sketch below).
arXiv Detail & Related papers (2024-04-11T03:31:54Z)
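A minimal sketch of the weak-explore/strong-exploit decoupling described above. The agent and environment interfaces, the exploration budget, and the dict standing in for the knowledge graph are all assumptions for illustration, not the paper's implementation.

```python
def wese_episode(weak_agent, strong_agent, env, task, budget=50):
    """Phase 1: a cheap weak agent explores and records what it sees.
    Phase 2: a strong agent exploits only the task-relevant knowledge.
    All interfaces here are hypothetical."""
    knowledge = {}                                   # stand-in for the knowledge graph
    obs = env.reset()
    for _ in range(budget):                          # phase 1: weak exploration
        obs, info = env.step(weak_agent.act(obs))
        knowledge[info["entity"]] = info["fact"]     # accumulate global knowledge
    # Phase 2: hand only task-relevant knowledge to the strong agent.
    relevant = {k: v for k, v in knowledge.items() if task.mentions(k)}
    return strong_agent.solve(task, context=relevant)
```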
- Adaptive Hyperparameter Optimization for Continual Learning Scenarios [19.151871846937738]
This paper aims to explore the role of hyperparameter selection in continual learning.
Using functional analysis of variance (fANOVA)-based techniques, we identify the hyperparameters with the greatest impact on performance.
We demonstrate empirically that this approach, agnostic to continual scenarios and strategies, allows us to speed up hyperparameter optimization continually across tasks and exhibits robustness even in the face of varying sequential task orders (see the sketch below).
arXiv Detail & Related papers (2024-03-09T16:47:42Z)
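The fANOVA-based selection mentioned above can be approximated by fitting a surrogate model over observed (hyperparameter configuration, performance) pairs and attributing performance variance to each hyperparameter. The sketch below uses scikit-learn's permutation importance as a pragmatic stand-in for full fANOVA; the data and hyperparameter names are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic (hyperparameter, performance) records stand in for real
# continual-learning runs; the three columns are hypothetical hyperparameters.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))                          # [lr, replay_size, reg_strength]
y = -((X[:, 0] - 0.3) ** 2) + 0.1 * X[:, 2] + rng.normal(0.0, 0.01, size=200)

# Fit a surrogate of performance, then attribute variance per hyperparameter.
surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(surrogate, X, y, n_repeats=10, random_state=0)
for name, score in zip(["lr", "replay_size", "reg_strength"], result.importances_mean):
    print(f"{name}: importance {score:.3f}")            # lr should dominate here
```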
- Adaptive trajectory-constrained exploration strategy for deep reinforcement learning [6.589742080994319]
Deep reinforcement learning (DRL) faces significant challenges in addressing hard-exploration problems in tasks with sparse or deceptive rewards and large state spaces.
We propose an efficient adaptive trajectory-constrained exploration strategy for DRL.
We conduct experiments on two large 2D grid world mazes and several MuJoCo tasks.
arXiv Detail & Related papers (2023-12-27T07:57:15Z)
- Hyperparameter Optimization for Multi-Objective Reinforcement Learning [0.27309692684728615]
Reinforcement learning (RL) has emerged as a powerful approach for tackling complex problems.
The recent introduction of multi-objective reinforcement learning (MORL) has further expanded the scope of RL.
In practice, tuning the hyperparameters of these algorithms often proves challenging, leading to unsuccessful deployments of these techniques.
arXiv Detail & Related papers (2023-10-25T09:17:25Z)
- Improve Noise Tolerance of Robust Loss via Noise-Awareness [60.34670515595074]
We propose a meta-learning method capable of adaptively learning a hyperparameter prediction function, called the Noise-Aware-Robust-Loss-Adjuster (NARL-Adjuster for brevity).
Four SOTA robust loss functions are integrated with our algorithm, and comprehensive experiments substantiate the general availability and effectiveness of the proposed method in terms of both noise tolerance and performance (a sketch follows below).
arXiv Detail & Related papers (2023-01-18T04:54:58Z)
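A sketch in the spirit of the hyperparameter prediction function described above: a small meta-network maps per-sample noise features to the hyperparameter of a robust loss. The architecture, input features, and the choice of generalized cross-entropy are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class HyperparamAdjuster(nn.Module):
    """Hypothetical meta-network: maps noise-related features to the
    hyperparameter q of a robust loss (NARL-Adjuster-style sketch)."""

    def __init__(self, n_features=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16), nn.ReLU(),
            nn.Linear(16, 1), nn.Sigmoid(),            # keeps q in (0, 1)
        )

    def forward(self, noise_feats):
        return self.net(noise_feats).squeeze(-1)       # per-sample hyperparameter q

def generalized_ce(logits, target, q):
    # Generalized cross-entropy, (1 - p_y^q) / q: one of several robust
    # losses whose hyperparameter such an adjuster could predict.
    p_y = torch.softmax(logits, dim=-1).gather(-1, target.unsqueeze(-1)).squeeze(-1)
    return ((1.0 - p_y.clamp_min(1e-6) ** q) / q).mean()
```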
- Task-Optimal Exploration in Linear Dynamical Systems [29.552894877883883]
We study task-guided exploration and determine what precisely an agent must learn about its environment in order to complete a task.
We provide instance- and task-dependent lower bounds which explicitly quantify the difficulty of completing a task of interest.
We show that the proposed algorithm optimally explores the environment, collecting precisely the information needed to complete the task, and provide finite-time bounds guaranteeing that it achieves the instance- and task-optimal sample complexity.
arXiv Detail & Related papers (2021-02-10T01:42:22Z)
- Learning Adaptive Loss for Robust Learning with Noisy Labels [59.06189240645958]
Robust losses are an important strategy for handling the robust learning issue.
We propose a meta-learning method capable of robustly tuning their hyperparameters.
Four kinds of SOTA robust loss functions are integrated with our method, and experiments substantiate its general availability and effectiveness.
arXiv Detail & Related papers (2020-02-16T00:53:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.