MetaCURE: Meta Reinforcement Learning with Empowerment-Driven
Exploration
- URL: http://arxiv.org/abs/2006.08170v5
- Date: Fri, 12 Nov 2021 03:15:55 GMT
- Title: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven
Exploration
- Authors: Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie
Fan and Chongjie Zhang
- Abstract summary: We model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
- Score: 52.48362697163477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks
and achieves fast adaptation to new tasks. Despite recent progress, efficient
exploration in meta-RL remains a key challenge in sparse-reward tasks, as it
requires quickly finding informative task-relevant experiences in both
meta-training and adaptation. To address this challenge, we explicitly model an
exploration policy learning problem for meta-RL, which is separated from
exploitation policy learning, and introduce a novel empowerment-driven
exploration objective, which aims to maximize information gain for task
identification. We derive a corresponding intrinsic reward and develop a new
off-policy meta-RL framework, which efficiently learns separate context-aware
exploration and exploitation policies by sharing the knowledge of task
inference. Experimental evaluation shows that our meta-RL method significantly
outperforms state-of-the-art baselines on various sparse-reward MuJoCo
locomotion tasks and more complex sparse-reward Meta-World tasks.
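As a rough, hypothetical illustration of the empowerment-driven idea (maximizing information gain for task identification), the sketch below keeps a discrete belief over candidate tasks and rewards a transition by how much it sharpens that belief. The Bayes-filter belief, the entropy-reduction reward, and the toy likelihoods are illustrative assumptions, not the intrinsic reward actually derived in the paper.

import numpy as np

def belief_update(belief, likelihoods):
    # Bayes update of a discrete task belief given the per-task likelihood
    # of a newly observed transition.
    posterior = belief * likelihoods
    return posterior / posterior.sum()

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def intrinsic_reward(belief, likelihoods):
    # Information gain for task identification: the reduction in entropy of
    # the task belief after observing one transition (an illustrative proxy
    # for an empowerment-driven intrinsic reward).
    new_belief = belief_update(belief, likelihoods)
    return entropy(belief) - entropy(new_belief), new_belief

# Toy usage: three candidate tasks; the observed transition is twice as
# likely under task 0 as under the others, so the belief sharpens and the
# intrinsic reward is positive.
belief = np.ones(3) / 3.0
likelihoods = np.array([0.6, 0.3, 0.3])
r_int, belief = intrinsic_reward(belief, likelihoods)
print(r_int, belief)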
Related papers
- Learning Action Translator for Meta Reinforcement Learning on
Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
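A hypothetical sketch of the action-translator idea: learn a mapping from a state and a source-task action to a target-task action such that the resulting transitions in the two tasks line up. The toy linear dynamics, the squared-error loss, and the hand-written translator below are assumptions for illustration, not the objective function proposed in the paper.

import numpy as np

def translator_loss(translate, dynamics_src, dynamics_tgt, states, actions):
    # Mean squared distance between the next state produced by the source
    # action under the source-task dynamics and by the translated action
    # under the target-task dynamics (illustrative objective only).
    loss = 0.0
    for s, a in zip(states, actions):
        a_translated = translate(s, a)
        loss += np.sum((dynamics_src(s, a) - dynamics_tgt(s, a_translated)) ** 2)
    return loss / len(states)

# Toy usage: the target task halves the effect of actions, so a translator
# that doubles them drives the loss to zero.
dynamics_src = lambda s, a: s + a
dynamics_tgt = lambda s, a: s + 0.5 * a
translate = lambda s, a: 2.0 * a
states = [np.zeros(2), np.ones(2)]
actions = [np.array([0.1, -0.2]), np.array([0.3, 0.0])]
print(translator_loss(translate, dynamics_src, dynamics_tgt, states, actions))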
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Robust Meta-Reinforcement Learning with Curriculum-Based Task Sampling [0.0]
In order to achieve robust meta-RL, it is necessary not only to intensively sample tasks with poor scores, but also to restrict and then expand the region of tasks from which samples are drawn.
We show that Robust Meta Reinforcement Learning with Guided Task Sampling (RMRL-GTS), which restricts task sampling based on scores and epochs, is an effective method.
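As a loose illustration of score- and epoch-based task sampling, the sketch below draws tasks with probability that grows as their recent score drops, from a candidate region that widens with the training epoch. The softmax weighting and the linear region growth are assumptions, not the RMRL-GTS algorithm itself.

import numpy as np

def sample_task(scores, epoch, max_epoch, rng=np.random.default_rng(0)):
    # Draw a task index, favouring tasks with poor recent scores while
    # restricting the candidate region early in training (illustrative
    # stand-in for score- and epoch-based task sampling).
    n = len(scores)
    region = max(1, int(np.ceil(n * (epoch + 1) / max_epoch)))  # expands with epoch
    candidate = np.asarray(scores[:region], dtype=float)
    logits = -candidate                      # lower score -> higher weight
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(region, p=probs)

# Toy usage: six tasks, halfway through training; only the first three are
# eligible and the poorly scoring task 2 is drawn most often.
print(sample_task(scores=[0.9, 0.8, 0.1, 0.7, 0.5, 0.6], epoch=2, max_epoch=6))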
arXiv Detail & Related papers (2022-03-31T05:16:24Z)
- CoMPS: Continual Meta Policy Search [113.33157585319906]
We develop a new continual meta-learning method to address challenges in sequential multi-task learning.
We find that CoMPS outperforms prior continual learning and off-policy meta-reinforcement learning methods on several sequences of challenging continuous control tasks.
arXiv Detail & Related papers (2021-12-08T18:53:08Z)
- Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL [91.26538493552817]
We present a formulation of hindsight relabeling for meta-RL, which relabels experience during meta-training to enable learning to learn entirely using sparse reward.
We demonstrate the effectiveness of our approach on a suite of challenging sparse reward goal-reaching environments.
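The sketch below gives a rough sense of hindsight relabeling in a sparse-reward goal-reaching setting: a trajectory collected for one goal is relabeled with a goal it actually reached, so its sparse rewards become informative. The goal representation (final state) and the distance-threshold reward are illustrative assumptions, not the paper's exact relabeling scheme.

import numpy as np

def relabel_trajectory(states, original_goal, tol=0.1):
    # Relabel a goal-reaching trajectory with a goal it actually reached
    # (here simply its final state) and recompute the sparse reward.
    original_rewards = [float(np.linalg.norm(s - original_goal) < tol) for s in states]
    new_goal = states[-1]
    new_rewards = [float(np.linalg.norm(s - new_goal) < tol) for s in states]
    return new_goal, original_rewards, new_rewards

# Toy usage: a 1D trajectory that never reaches the original goal at 5.0
# becomes a successful trajectory once relabeled with the goal it did reach.
states = [np.array([0.0]), np.array([0.1]), np.array([0.3])]
print(relabel_trajectory(states, original_goal=np.array([5.0])))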
arXiv Detail & Related papers (2021-12-02T00:51:17Z)
- Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices [132.49849640628727]
Meta-reinforcement learning (meta-RL) builds agents that can quickly learn new tasks by leveraging prior experience on related tasks.
In principle, optimal exploration and exploitation can be learned end-to-end by simply maximizing task performance.
We present DREAM, which avoids local optima in end-to-end training, without sacrificing optimal exploration.
arXiv Detail & Related papers (2020-08-06T17:57:36Z)
- Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning [29.125234093368732]
We propose a novel meta-RL strategy to achieve human-level efficiency in learning novel tasks.
We decompose the meta-RL problem into three sub-tasks: task-exploration, task-inference, and task-fulfillment.
Our algorithm effectively performs exploration for task inference, improves sample efficiency during both training and testing, and mitigates the meta-overfitting problem.
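As a structural sketch of that three-part decomposition, the loop below first collects context with an exploration policy, infers a task embedding from the context, and then acts with a task-conditioned fulfillment policy. Every component is a placeholder callable and the gym-style environment interface is an assumption; only the decomposition itself comes from the abstract.

def adapt_and_act(env, explore_policy, infer_task, fulfill_policy,
                  explore_steps=20, act_steps=100):
    # 1) Task-exploration: gather informative context transitions.
    context, obs = [], env.reset()
    for _ in range(explore_steps):
        action = explore_policy(obs, context)
        next_obs, reward, done, info = env.step(action)
        context.append((obs, action, reward, next_obs))
        obs = env.reset() if done else next_obs

    # 2) Task-inference: summarize the context into a task embedding.
    task_embedding = infer_task(context)

    # 3) Task-fulfillment: act conditioned on the inferred task.
    total_reward, obs = 0.0, env.reset()
    for _ in range(act_steps):
        action = fulfill_policy(obs, task_embedding)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward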
arXiv Detail & Related papers (2020-03-03T07:38:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.