Learning Action Translator for Meta Reinforcement Learning on
Sparse-Reward Tasks
- URL: http://arxiv.org/abs/2207.09071v2
- Date: Wed, 20 Jul 2022 14:30:50 GMT
- Title: Learning Action Translator for Meta Reinforcement Learning on
Sparse-Reward Tasks
- Authors: Yijie Guo, Qiucheng Wu, Honglak Lee
- Abstract summary: This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
- Score: 56.63855534940827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta reinforcement learning (meta-RL) aims to learn a policy that solves a set
of training tasks simultaneously and quickly adapts to new tasks. It requires
massive amounts of data drawn from the training tasks to infer the common structure
shared among tasks. Without heavy reward engineering, the sparse rewards in
long-horizon tasks exacerbate the problem of sample efficiency in meta-RL.
Another challenge in meta-RL is the discrepancy in difficulty level among
tasks, which might cause one easy task to dominate learning of the shared policy
and thus preclude policy adaptation to new tasks. This work introduces a novel
objective function to learn an action translator among training tasks. We
theoretically verify that the value of the transferred policy with the action
translator can be close to the value of the source policy and our objective
function (approximately) upper bounds the value difference. We propose to
combine the action translator with context-based meta-RL algorithms for better
data collection and more efficient exploration during meta-training. Our
approach empirically improves the sample efficiency and performance of meta-RL
algorithms on sparse-reward tasks.
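To make the transfer mechanism concrete, here is a minimal PyTorch sketch of an action translator. The abstract does not spell out the objective function, so the transition-matching loss below (the translated action should reproduce, in the target task, the next state that the source action produced in the source task) is an assumption, and the names ActionTranslator, target_dynamics, and translator_loss are illustrative rather than the paper's API.

```python
import torch
import torch.nn as nn

class ActionTranslator(nn.Module):
    """Maps a (state, source-task action) pair to an action for a target task.

    Illustrative sketch only: the exact architecture and objective are not
    given in the abstract.
    """

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # assumes actions in [-1, 1]
        )

    def forward(self, state: torch.Tensor, source_action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, source_action], dim=-1))


def translator_loss(translator: ActionTranslator,
                    target_dynamics: nn.Module,
                    state: torch.Tensor,
                    source_action: torch.Tensor,
                    source_next_state: torch.Tensor) -> torch.Tensor:
    """Assumed transition-matching objective.

    target_dynamics is a learned model of the target task,
    target_dynamics(state, action) -> predicted next state. The loss pushes
    the translated action toward reaching the same next state that the source
    action reached in the source task, which is one way to keep the value of
    the transferred policy close to the value of the source policy.
    """
    translated_action = translator(state, source_action)
    predicted_next_state = target_dynamics(state, translated_action)
    return ((predicted_next_state - source_next_state) ** 2).mean()
```

In a context-based meta-RL loop, such a translator could map actions chosen by a policy for a well-learned source task into actions for harder target tasks, which is one plausible way to realize the improved data collection and exploration the abstract describes.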
Related papers
- Meta-Reinforcement Learning Based on Self-Supervised Task Representation
Learning [23.45043290237396]
MoSS is a context-based Meta-reinforcement learning algorithm based on Self-Supervised task representation learning.
On MuJoCo and Meta-World benchmarks, MoSS outperforms prior methods in terms of performance, sample efficiency (3-50x faster), adaptation efficiency, and generalization.
arXiv Detail & Related papers (2023-04-29T15:46:19Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Skill-based Meta-Reinforcement Learning [65.31995608339962]
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z)
- Robust Meta-Reinforcement Learning with Curriculum-Based Task Sampling [0.0]
We show that Robust Meta Reinforcement Learning with Guided Task Sampling (RMRL-GTS) is an effective method that restricts task sampling based on scores and epochs.
To achieve robust meta-RL, it is necessary not only to intensively sample tasks with poor scores, but also to restrict and expand the regions of tasks to be sampled.
arXiv Detail & Related papers (2022-03-31T05:16:24Z)
- MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much bigger models with nearly 8x more parameters.
arXiv Detail & Related papers (2021-10-29T17:42:08Z)
- MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)
- Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning [29.125234093368732]
We propose a novel meta-RL strategy to achieve human-level efficiency in learning novel tasks.
We decompose the meta-RL problem into three sub-tasks: task-exploration, task-inference, and task-fulfillment.
Our algorithm effectively performs exploration for task inference, improves sample efficiency during both training and testing, and mitigates the meta-overfitting problem.
arXiv Detail & Related papers (2020-03-03T07:38:53Z)