SMART: Self-supervised Multi-task pretrAining with contRol Transformers
- URL: http://arxiv.org/abs/2301.09816v1
- Date: Tue, 24 Jan 2023 05:01:23 GMT
- Title: SMART: Self-supervised Multi-task pretrAining with contRol Transformers
- Authors: Yanchao Sun, Shuang Ma, Ratnesh Madaan, Rogerio Bonatti, Furong Huang,
Ashish Kapoor
- Abstract summary: Self-supervised pretraining has been extensively studied in language and vision domains.
It is difficult to properly design such a pretraining approach for sequential decision-making tasks.
We propose a generic pretraining framework for sequential decision making.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised pretraining has been extensively studied in language and
vision domains, where a unified model can be easily adapted to various
downstream tasks by pretraining representations without explicit labels. When
it comes to sequential decision-making tasks, however, it is difficult to
properly design such a pretraining approach that can cope with both
high-dimensional perceptual information and the complexity of sequential
control over long interaction horizons. The challenge becomes combinatorially
more complex if we want to pretrain representations amenable to a large variety
of tasks. To tackle this problem, in this work, we formulate a general
pretraining-finetuning pipeline for sequential decision making, under which we
propose a generic pretraining framework, Self-supervised Multi-task
pretrAining with contRol Transformer (SMART). By systematically investigating
pretraining regimes, we carefully design a Control Transformer (CT) coupled
with a novel control-centric pretraining objective in a self-supervised manner.
SMART encourages the representation to capture the common essential information
relevant to both short-term and long-term control, which is transferable
across tasks. Extensive experiments on the DeepMind Control Suite show that
SMART significantly improves learning efficiency on seen and unseen
downstream tasks and domains, in both Imitation Learning (IL) and
Reinforcement Learning (RL) settings. Thanks to the proposed control-centric
objective, SMART is resilient to distribution shift between pretraining and
finetuning, and even works well with low-quality, randomly collected
pretraining datasets.
Related papers
- Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment (arXiv, 2024-04-28)
  Harmonized Transfer Learning and Modality Alignment (HarMA) is a method that simultaneously satisfies task constraints, modality alignment, and single-modality uniform alignment. HarMA achieves state-of-the-art performance in two popular multimodal retrieval tasks in the field of remote sensing.
- Decision Transformer as a Foundation Model for Partially Observable Continuous Control (arXiv, 2024-04-03)
  The Decision Transformer (DT) architecture is used to predict the optimal action based on past observations, actions, and rewards. DT exhibits remarkable zero-shot generalization to completely new tasks. These findings highlight the potential of DT as a foundational controller for general control applications.
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting (arXiv, 2023-07-13)
  Introduces a self-regularization framework for prompting called PromptSRC, which guides the prompts to optimize for both task-specific and task-agnostic general representations.
- Supervised Pretraining Can Learn In-Context Reinforcement Learning (arXiv, 2023-06-26)
  Studies the in-context learning capabilities of transformers in decision-making problems. Introduces the Decision-Pretrained Transformer (DPT), a supervised pretraining method in which the transformer predicts an optimal action. The pretrained transformer can solve a range of RL problems in-context, exhibiting both online exploration and offline conservatism.
- Self-Supervised Reinforcement Learning that Transfers using Random Features (arXiv, 2023-05-26)
  Proposes a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards. The method is self-supervised in that it can be trained on offline datasets without reward labels, yet can be quickly deployed on new tasks.
- Self-Supervised Representation Learning from Temporal Ordering of Automated Driving Sequences (arXiv, 2023-02-17)
  Proposes TempO, a temporal-ordering pretext task for pre-training region-level feature representations for perception tasks. Each frame is embedded as an unordered set of proposal feature vectors, a representation that is natural for object detection and tracking systems. Extensive evaluations on the BDD100K, nuImages, and MOT17 datasets show that TempO pre-training outperforms single-frame self-supervised learning methods.
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving (arXiv, 2022-09-19)
  Investigates the transfer performance of various self-supervised methods, including MoCo and SimCLR, on three downstream tasks, finding that they are sub-optimal or even lag far behind the single-task baseline. Proposes a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach (arXiv, 2022-07-13)
  Proposes a two-stage training paradigm for continual learning that intertwines task-agnostic and task-specific learning, and shows that it can be easily added to memory- or regularization-based approaches.
- Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing (arXiv, 2022-05-26)
  A common practice for self-supervised pre-training is to use as much data as possible; for a specific downstream task, however, involving irrelevant data in pre-training may degrade downstream performance. It is burdensome and infeasible to use different downstream-task-customized datasets in pre-training for different tasks.
- UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (arXiv, 2021-01-20)
  A first attempt to explore a universal multi-agent reinforcement learning pipeline, designing a single architecture to fit different tasks. Unlike previous RNN-based models, it uses a transformer-based model to generate a flexible policy. The proposed Universal Policy Decoupling Transformer (UPDeT) further relaxes action restrictions and makes the multi-agent decision process more explainable.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.