HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem
- URL: http://arxiv.org/abs/2002.04238v2
- Date: Sat, 5 Jun 2021 06:36:21 GMT
- Title: HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem
- Authors: Yun Hua, Xiangfeng Wang, Bo Jin, Wenhao Li, Junchi Yan, Xiaofeng He,
Hongyuan Zha
- Abstract summary: We develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse reward RL problems.
It consists of three modules, including a cross-environment meta state embedding module that constructs a common meta state space to adapt to different environments.
Experiments with sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
- Score: 107.52043871875898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In spite of the success of existing meta reinforcement learning
methods, they still have difficulty learning a meta policy effectively for RL
problems with sparse reward. In this respect, we develop a novel meta
reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse reward
RL problems. It consists of three modules, including the cross-environment
meta state embedding module, which constructs a common meta state space to
adapt to different environments, and the meta-state-based, environment-specific
meta reward shaping, which effectively extends the original sparse reward
trajectory through cross-environmental knowledge complementarity; as a
consequence, the meta policy achieves better generalization and efficiency
with the shaped meta reward. Experiments with sparse-reward environments show
the superiority of HMRL in both transferability and policy learning efficiency.
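The paper's code is not included here, so the following is only a minimal, illustrative sketch of the two mechanisms the abstract names: a shared meta state embedding across environments and an environment-specific shaping term that densifies the sparse environment reward. All module names, network sizes, and the additive shaping form are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (assumed architecture, not the authors' code).
import torch
import torch.nn as nn

class MetaStateEmbedding(nn.Module):
    """Maps environment-specific observations into a shared meta state space."""
    def __init__(self, obs_dim: int, meta_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, meta_dim))

    def forward(self, obs):
        return self.net(obs)

class MetaRewardShaper(nn.Module):
    """Environment-specific shaping term defined on the shared meta state."""
    def __init__(self, meta_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(meta_dim + act_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, meta_state, action):
        return self.net(torch.cat([meta_state, action], dim=-1)).squeeze(-1)

def shaped_reward(sparse_r, meta_state, action, shaper, coef=0.1):
    # Assumed additive form: the sparse environment reward is densified by a
    # learned shaping term on the meta state (one plausible reading of
    # "extends the original sparse reward trajectory").
    return sparse_r + coef * shaper(meta_state, action)

# Toy usage with random tensors in place of real trajectories.
obs_dim, act_dim = 8, 2
embed = MetaStateEmbedding(obs_dim)
shaper = MetaRewardShaper(32, act_dim)
obs = torch.randn(16, obs_dim)
act = torch.randn(16, act_dim)
sparse_r = torch.zeros(16)          # mostly-zero sparse reward
r = shaped_reward(sparse_r, embed(obs), act, shaper)
print(r.shape)                      # torch.Size([16])
```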
Related papers
- MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning [18.82398325614491]
We propose a new model-based approach to meta-RL, based on elements from existing state-of-the-art model-based and meta-RL methods.
We demonstrate the effectiveness of our approach on common meta-RL benchmark domains, attaining greater return with better sample efficiency.
In addition, we validate our approach on a slate of more challenging, higher-dimensional domains, taking a step towards real-world generalizing agents.
arXiv Detail & Related papers (2024-03-14T20:40:36Z)
- Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698]
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled robustness level.
The data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML).
arXiv Detail & Related papers (2023-01-26T14:54:39Z)
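One common way to make the meta objective robust at a controlled level is to average the return over only the worst-performing fraction of tasks (a CVaR-style objective). The sketch below illustrates that generic idea; the function name, the use of CVaR, and the parameter alpha are assumptions for illustration, not code from the paper.

```python
# Illustrative CVaR-style task objective (assumed; not code from the paper).
import numpy as np

def robust_meta_objective(task_returns, alpha=0.3):
    """Average return over the worst alpha-fraction of tasks.

    alpha plays the role of the controlled level: alpha=1.0 recovers the
    standard average-over-tasks objective, smaller alpha focuses on hard tasks.
    """
    returns = np.asarray(task_returns, dtype=float)
    k = max(1, int(np.ceil(alpha * len(returns))))
    worst = np.sort(returns)[:k]          # the k lowest task returns
    return worst.mean()

returns = [9.0, 7.5, 1.2, 8.8, 0.4, 6.9]
print(robust_meta_objective(returns, alpha=1.0))  # plain average over tasks
print(robust_meta_objective(returns, alpha=0.3))  # average over the two hardest tasks
```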
- Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments [10.360491332190433]
We develop a class of algorithms entitled Enhanced Meta-RL using Demonstrations (EMRLD).
We show how EMRLD jointly utilizes RL and supervised learning over the offline data to generate a meta-policy.
We also show that our EMRLD algorithms significantly outperform existing approaches in a variety of sparse reward environments.
arXiv Detail & Related papers (2022-09-26T22:01:12Z)
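The summary says EMRLD jointly uses RL and supervised learning over offline demonstration data. A common way to combine the two is to add a behaviour-cloning term on the demonstrations to the policy loss; the sketch below shows that generic combination only, with the network, loss forms, and weighting coefficient being assumptions rather than the EMRLD update itself.

```python
# Generic RL + behaviour-cloning loss on demonstrations (assumed combination,
# not the EMRLD algorithm itself).
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 2))

def combined_loss(obs, actions, advantages, demo_obs, demo_actions, bc_coef=0.5):
    # Policy-gradient style term on collected rollouts.
    logits = policy(obs)
    logp = torch.distributions.Categorical(logits=logits).log_prob(actions)
    rl_loss = -(logp * advantages).mean()
    # Supervised (behaviour-cloning) term on offline demonstration data.
    bc_loss = nn.functional.cross_entropy(policy(demo_obs), demo_actions)
    return rl_loss + bc_coef * bc_loss

# Toy usage with random stand-ins for rollouts and demonstrations.
obs = torch.randn(8, 4); actions = torch.randint(0, 2, (8,))
adv = torch.randn(8)
demo_obs = torch.randn(8, 4); demo_actions = torch.randint(0, 2, (8,))
print(combined_loss(obs, actions, adv, demo_obs, demo_actions))
```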
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
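The action translator is described as a mapping learned among training tasks so that a policy from a source task can be reused on a target task. A minimal reading is a function h(state, source_action) -> target_action trained so the target task transitions end up close to the source task transitions; the sketch below only illustrates that interface, and all shapes, the dynamics-matching loss, and the placeholder dynamics model are assumptions.

```python
# Minimal action-translator interface (assumed shapes and loss; illustrative only).
import torch
import torch.nn as nn

class ActionTranslator(nn.Module):
    """Maps (state, source-task action) to an action for the target task."""
    def __init__(self, state_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + act_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, state, src_action):
        return self.net(torch.cat([state, src_action], dim=-1))

def translation_loss(translator, state, src_action, src_next_state, target_dynamics):
    # Assumed training signal: the translated action should lead the target
    # task to (roughly) the same next state the source task reached.
    tgt_action = translator(state, src_action)
    pred_next = target_dynamics(state, tgt_action)
    return ((pred_next - src_next_state) ** 2).mean()

# Toy usage with a differentiable stand-in for the target-task dynamics.
state_dim, act_dim = 6, 2
translator = ActionTranslator(state_dim, act_dim)
dyn = lambda s, a: s + 0.1 * a.sum(dim=-1, keepdim=True)   # placeholder dynamics
s = torch.randn(4, state_dim); a = torch.randn(4, act_dim); s_next = torch.randn(4, state_dim)
print(translation_loss(translator, s, a, s_next, dyn))
```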
arXiv Detail & Related papers (2022-07-19T04:58:06Z) - Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL [91.26538493552817]
We present a formulation of hindsight relabeling for meta-RL, which relabels experience during meta-training to enable learning to learn entirely using sparse reward.
We demonstrate the effectiveness of our approach on a suite of challenging sparse reward goal-reaching environments.
arXiv Detail & Related papers (2021-12-02T00:51:17Z)
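Hindsight relabeling for sparse-reward goal reaching typically rewrites a failed trajectory as if the goal had been a state the agent actually reached, so the sparse reward becomes informative. The sketch below shows that generic relabeling step only; it is not the paper's exact scheme, which relabels at the task level during meta-training, and the tolerance and reward definition are assumptions.

```python
# Generic hindsight relabeling of a goal-reaching trajectory (illustrative;
# the paper relabels at the task level during meta-training).
import numpy as np

def relabel_with_hindsight(states, actions, goal, tol=0.05):
    """Replace the original goal with a state actually reached, then recompute
    the sparse reward so at least the final step succeeds."""
    achieved_goal = states[-1]              # treat the final state as the new goal
    rewards = [1.0 if np.linalg.norm(s - achieved_goal) < tol else 0.0
               for s in states[1:]]
    return list(zip(states[:-1], actions, rewards)), achieved_goal

# Toy 2-D trajectory that never reaches the original goal.
states = [np.array([0.0, 0.0]), np.array([0.3, 0.1]), np.array([0.6, 0.2])]
actions = [np.array([1, 0]), np.array([1, 0])]
transitions, new_goal = relabel_with_hindsight(states, actions, goal=np.array([5.0, 5.0]))
print(new_goal, [r for _, _, r in transitions])    # final transition now has reward 1.0
```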
arXiv Detail & Related papers (2021-12-02T00:51:17Z) - MetaCURE: Meta Reinforcement Learning with Empowerment-Driven
Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL that is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)
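The summary states that MetaCURE learns separate context-aware exploration and exploitation policies. The sketch below only illustrates that structural separation: two policies conditioned on a task-context vector, with exploration used to gather information about the task and exploitation acting on the inferred context. The empowerment-driven exploration objective itself is omitted, and all names and shapes here are assumptions.

```python
# Structural sketch of separate context-aware exploration/exploitation policies
# (assumed interfaces; the empowerment-based exploration objective is omitted).
import torch
import torch.nn as nn

class ContextPolicy(nn.Module):
    def __init__(self, obs_dim, ctx_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + ctx_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs, ctx):
        return self.net(torch.cat([obs, ctx], dim=-1))

obs_dim, ctx_dim, act_dim = 8, 16, 2
explore_policy = ContextPolicy(obs_dim, ctx_dim, act_dim)   # gathers task information
exploit_policy = ContextPolicy(obs_dim, ctx_dim, act_dim)   # maximizes task reward

obs = torch.randn(1, obs_dim)
ctx = torch.zeros(1, ctx_dim)            # empty context at the start of a new task
a_explore = explore_policy(obs, ctx)     # act to identify the task...
ctx = torch.randn(1, ctx_dim)            # ...context inferred from exploration data
a_exploit = exploit_policy(obs, ctx)     # then act to solve it
print(a_explore.shape, a_exploit.shape)
```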
arXiv Detail & Related papers (2020-06-15T06:56:18Z) - Curriculum in Gradient-Based Meta-Reinforcement Learning [10.447238563837173]
We show that gradient-based meta-learners are sensitive to task distributions.
With the wrong curriculum, agents suffer the effects of meta-overfitting, shallow adaptation, and adaptation instability.
arXiv Detail & Related papers (2020-02-19T01:40:45Z)