Generalizable Task Representation Learning for Offline
Meta-Reinforcement Learning with Data Limitations
- URL: http://arxiv.org/abs/2312.15909v1
- Date: Tue, 26 Dec 2023 07:02:12 GMT
- Title: Generalizable Task Representation Learning for Offline
Meta-Reinforcement Learning with Data Limitations
- Authors: Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang, Yang Yu
- Abstract summary: We propose a novel algorithm called GENTLE for learning generalizable task representations in the face of data limitations.
GENTLE employs a Task Auto-Encoder (TAE), an encoder-decoder architecture that extracts the characteristics of the tasks.
To alleviate the effect of limited behavior diversity, we construct pseudo-transitions to align the data distribution used to train TAE with the data distribution encountered during testing.
- Score: 22.23114883485924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization and sample efficiency have been long-standing issues
concerning reinforcement learning, and thus the field of Offline
Meta-Reinforcement Learning (OMRL) has gained increasing attention due to its
potential of solving a wide range of problems with static and limited offline
data. Existing OMRL methods often assume sufficient training tasks and data
coverage to apply contrastive learning to extract task representations.
However, such assumptions are not applicable in several real-world applications
and thus undermine the generalization ability of the representations. In this
paper, we consider OMRL with two types of data limitations: limited training
tasks and limited behavior diversity, and propose a novel algorithm called
GENTLE for learning generalizable task representations in the face of data
limitations. GENTLE employs a Task Auto-Encoder (TAE), an encoder-decoder
architecture that extracts the characteristics of the tasks.
Unlike existing methods, TAE is optimized solely by reconstruction of the state
transition and reward, which captures the generative structure of the task
models and produces generalizable representations when training tasks are
limited. To alleviate the effect of limited behavior diversity, we consistently
construct pseudo-transitions to align the data distribution used to train TAE
with the data distribution encountered during testing. Empirically, GENTLE
significantly outperforms existing OMRL methods on both in-distribution tasks
and out-of-distribution tasks across both the given-context protocol and the
one-shot protocol.
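To make the reconstruction objective concrete, below is a minimal PyTorch sketch (not the authors' implementation) of a Task Auto-Encoder trained solely to reconstruct the reward and next state from the state, action, and inferred task embedding. All module and variable names (TaskEncoder, TaskDecoder, tae_loss, latent_dim) are illustrative assumptions; the pseudo-transitions described above would simply be mixed into the context and target batches fed to this loss.

```python
# Minimal sketch of a Task Auto-Encoder (TAE) trained purely by reconstruction,
# as described in the abstract. Names and architecture details are assumptions.
import torch
import torch.nn as nn


class TaskEncoder(nn.Module):
    """Maps a context set of transitions (s, a, r, s') to a task embedding z."""

    def __init__(self, obs_dim, act_dim, latent_dim, hidden=128):
        super().__init__()
        in_dim = obs_dim + act_dim + 1 + obs_dim  # (s, a, r, s') concatenated
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, context):                    # context: [B, N, in_dim]
        # Permutation-invariant aggregation over the N context transitions.
        return self.net(context).mean(dim=1)       # z: [B, latent_dim]


class TaskDecoder(nn.Module):
    """Predicts (r, s') from (s, a, z), i.e. the generative structure of the task."""

    def __init__(self, obs_dim, act_dim, latent_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1 + obs_dim),         # reward + next state
        )

    def forward(self, s, a, z):
        return self.net(torch.cat([s, a, z], dim=-1))


def tae_loss(encoder, decoder, context, s, a, r, s_next):
    """Pure reconstruction objective: no contrastive terms across tasks."""
    z = encoder(context)                                   # [B, latent_dim]
    z = z.unsqueeze(1).expand(-1, s.shape[1], -1)          # broadcast to each target transition
    pred = decoder(s, a, z)                                # predicted (r, s')
    target = torch.cat([r, s_next], dim=-1)
    return ((pred - target) ** 2).mean()
```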
Related papers
- Disentangling Policy from Offline Task Representation Learning via
Adversarial Data Augmentation [29.49883684368039]
Offline meta-reinforcement learning (OMRL) proficiently allows an agent to tackle novel tasks while relying on a static dataset.
We introduce a novel algorithm to disentangle the impact of the behavior policy from task representation learning.
arXiv Detail & Related papers (2024-03-12T02:38:36Z)
- Offline Multi-task Transfer RL with Representational Penalization [26.114893629771736]
We study the problem of representation transfer in offline Reinforcement Learning (RL).
We propose an algorithm to compute pointwise uncertainty measures for the learnt representation.
arXiv Detail & Related papers (2024-02-19T21:52:44Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning [23.45043290237396]
MoSS is a context-based meta-reinforcement learning algorithm based on self-supervised task representation learning.
On MuJoCo and Meta-World benchmarks, MoSS outperforms prior methods in terms of performance, sample efficiency (3-50x faster), adaptation efficiency, and generalization.
arXiv Detail & Related papers (2023-04-29T15:46:19Z)
- Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning [1.0765359420035392]
We analyze sequential dependence MTL from a rigorous mathematical perspective.
We propose a Task Aware Feature Extraction (TAFE) framework for sequential dependence MTL.
arXiv Detail & Related papers (2023-01-06T13:12:59Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)