Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning
- URL: http://arxiv.org/abs/2009.13891v3
- Date: Tue, 15 Dec 2020 08:48:23 GMT
- Title: Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning
- Authors: Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu
- Abstract summary: We propose a novel Meta-RL framework called CCM (Contrastive learning augmented Context-based Meta-RL).
We first focus on the contrastive nature behind different tasks and leverage it to train a compact and sufficient context encoder.
We derive a new information-gain-based objective which aims to collect informative trajectories in a few steps.
- Score: 33.19862944149082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context, the embedding of previously collected trajectories, is a powerful
construct for Meta-Reinforcement Learning (Meta-RL) algorithms. By conditioning
on an effective context, Meta-RL policies can easily generalize to new tasks
within a few adaptation steps. We argue that improving the quality of context
involves answering two questions: 1. How to train a compact and sufficient
encoder that can embed the task-specific information contained in prior
trajectories? 2. How to collect informative trajectories whose corresponding
context reflects the task specification? To this end, we
propose a novel Meta-RL framework called CCM (Contrastive learning augmented
Context-based Meta-RL). We first focus on the contrastive nature behind
different tasks and leverage it to train a compact and sufficient context
encoder. Further, we train a separate exploration policy and theoretically
derive a new information-gain-based objective which aims to collect informative
trajectories in a few steps. Empirically, we evaluate our approaches on common
benchmarks as well as several complex sparse-reward environments. The
experimental results show that CCM outperforms state-of-the-art algorithms by
addressing the two problems above.
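
The abstract does not include code; the following PyTorch-style sketch illustrates the two ingredients described above, an InfoNCE-style contrastive loss over task contexts and an information-gain intrinsic reward. All names, shapes, and the exact form of each objective are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_context_loss(z_query, z_key, temperature=0.1):
    """InfoNCE over task contexts.

    z_query, z_key: (n_tasks, d) context embeddings of two trajectory
    batches, where row i of both tensors comes from the same task.
    """
    z_query = F.normalize(z_query, dim=1)
    z_key = F.normalize(z_key, dim=1)
    logits = z_query @ z_key.t() / temperature   # (n_tasks, n_tasks) similarities
    labels = torch.arange(z_query.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

def information_gain_reward(belief_before, belief_after):
    """Intrinsic reward for the exploration policy: how much a new
    transition sharpens the posterior belief over the task (assumed
    Gaussian posteriors from the context encoder)."""
    return torch.distributions.kl_divergence(belief_after, belief_before)
```

In the contrastive loss, two trajectory batches from the same task form a positive pair while the other tasks in the batch act as negatives; the information-gain reward pays the exploration policy for transitions that move the task belief, which is one natural reading of the "information-gain-based objective" in the abstract.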
Related papers
- Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning [48.79569442193824]
We show that COMRL algorithms are essentially optimizing the same mutual information objective between the task variable $M$ and its latent representation $Z$ by implementing various approximate bounds.
This work lays the information theoretic foundation for COMRL methods, leading to a better understanding of task representation learning in the context of reinforcement learning.
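In symbols, the shared objective described above is the mutual information between the task variable $M$ and its latent representation $Z$; one standard instance of the "approximate bounds" the paper refers to is the variational lower bound below (our paraphrase, not the paper's notation):

```latex
I(M; Z) \;\ge\; \mathbb{E}_{M,\; Z \sim q_\phi(Z \mid \tau_M)}\!\left[\log p_\psi(M \mid Z)\right] + H(M)
```

Here $q_\phi$ is the context encoder, $\tau_M$ is a batch of transitions from task $M$, $p_\psi$ is a variational decoder, and $H(M)$ is constant with respect to the encoder.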
arXiv Detail & Related papers (2024-02-04T09:58:42Z)
- On Context Distribution Shift in Task Representation Learning for Offline Meta RL [7.8317653074640186]
We focus on context-based OMRL, specifically on the challenge of learning task representations for OMRL.
To overcome the context distribution shift problem, we present a hard-sampling-based strategy to train a robust task context encoder.
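The abstract does not spell out the sampling rule; as a purely illustrative sketch, one common form of hard sampling keeps the negatives the encoder currently confuses with the anchor:

```python
import torch
import torch.nn.functional as F

def hardest_negatives(anchor, candidates, k=16):
    """anchor: (d,) embedding; candidates: (n, d) embeddings from other tasks.
    Returns the k candidates most similar to the anchor (the "hard" negatives)."""
    k = min(k, candidates.size(0))
    sims = F.cosine_similarity(candidates, anchor.unsqueeze(0), dim=1)  # (n,)
    return candidates[sims.topk(k).indices]
```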
arXiv Detail & Related papers (2023-04-01T16:21:55Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves performance competitive with existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but can also quickly adapt to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
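A minimal sketch of what such an action translator could look like (the module shape and the reconstruction-style loss below are assumptions; the paper derives its own objective and the accompanying value bound):

```python
import torch
import torch.nn as nn

class ActionTranslator(nn.Module):
    """Hypothetical translator: maps a source-task action, given the state,
    to an action for the target task."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # assumes actions in [-1, 1]
        )

    def forward(self, state, action_src):
        return self.net(torch.cat([state, action_src], dim=-1))

def translation_loss(translator, dynamics_tgt, state, action_src, next_state_src):
    """dynamics_tgt: a learned model of the target task, s' = f(s, a).
    Train the translator so the translated action reproduces, in the target
    task, the transition observed in the source task."""
    action_tgt = translator(state, action_src)
    next_state_tgt = dynamics_tgt(state, action_tgt)
    return ((next_state_tgt - next_state_src) ** 2).mean()
```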
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning in which a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task's training data, and outperforms much bigger models with nearly 8x more parameters.
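As a rough illustration of the meta-training setup described above (the exact prompt template is an assumption; see the paper for details), each training instance concatenates k examples from one task with a query input, and the model is trained to produce the query's output:

```python
def build_instance(examples, query_input, query_output, sep="\n"):
    """examples: list of (input, output) pairs sampled from a single task.
    Returns a (prompt, target) pair; the language model is trained to
    maximize p(target | prompt)."""
    prompt = sep.join(f"{x} {y}" for x, y in examples)
    prompt = f"{prompt}{sep}{query_input}"
    return prompt, query_output
```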
arXiv Detail & Related papers (2021-10-29T17:42:08Z)
- Improving Context-Based Meta-Reinforcement Learning with Self-Supervised Trajectory Contrastive Learning [32.112504515457445]
We propose Trajectory Contrastive Learning (TCL) to improve meta-training.
TCL trains a context encoder to predict whether two transition windows are sampled from the same trajectory.
It accelerates the training of context encoders and improves meta-training overall.
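A minimal sketch of this pretext task, with hypothetical module names (the paper's architecture may differ): embed two transition windows and train a binary classifier on whether they come from the same trajectory.

```python
import torch
import torch.nn as nn

class WindowDiscriminator(nn.Module):
    def __init__(self, encoder, embed_dim):
        super().__init__()
        self.encoder = encoder                   # the context encoder being trained
        self.head = nn.Linear(2 * embed_dim, 1)  # same-trajectory logit

    def forward(self, window_a, window_b):
        za, zb = self.encoder(window_a), self.encoder(window_b)
        return self.head(torch.cat([za, zb], dim=-1)).squeeze(-1)

def tcl_loss(model, window_a, window_b, same_traj):
    """same_traj: float tensor of 0/1 labels, one per window pair."""
    logits = model(window_a, window_b)
    return nn.functional.binary_cross_entropy_with_logits(logits, same_traj)
```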
arXiv Detail & Related papers (2021-03-10T23:31:19Z)
- MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)