SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
- URL: http://arxiv.org/abs/2306.11886v3
- Date: Mon, 29 Jan 2024 17:28:20 GMT
- Title: SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
- Authors: Jesse Zhang and Karl Pertsch and Jiahui Zhang and Joseph J. Lim
- Abstract summary: We propose SPRINT, a scalable offline policy pre-training approach.
Our method uses two core ideas to automatically expand a base set of pre-training tasks.
Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks.
- Score: 28.380226726781082
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-training robot policies with a rich set of skills can substantially
accelerate the learning of downstream tasks. Prior works have defined
pre-training tasks via natural language instructions, but doing so requires
tedious human annotation of hundreds of thousands of instructions. Thus, we
propose SPRINT, a scalable offline policy pre-training approach which
substantially reduces the human effort needed for pre-training a diverse set of
skills. Our method uses two core ideas to automatically expand a base set of
pre-training tasks: instruction relabeling via large language models and
cross-trajectory skill chaining through offline reinforcement learning. As a
result, SPRINT pre-training equips robots with a much richer repertoire of
skills. Experimental results in a household simulator and on a real robot
kitchen manipulation task show that SPRINT leads to substantially faster
learning of new long-horizon tasks than previous pre-training approaches.
Website at https://clvrai.com/sprint.
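The abstract names the two core ideas only at a high level. Below is a minimal, illustrative Python sketch of what the data-construction side of both ideas might look like, assuming a hypothetical `query_llm` helper and a simple `Segment` record for language-annotated sub-trajectories; it is not the authors' implementation, and the offline RL training itself is omitted.
```python
# Illustrative sketch only: `query_llm` and `Segment` are hypothetical stand-ins,
# not SPRINT's actual code.
from dataclasses import dataclass

@dataclass
class Segment:
    states: list       # observations of one annotated sub-trajectory
    actions: list      # corresponding actions
    instruction: str   # natural-language annotation

def relabel_with_llm(segments, query_llm):
    """Idea 1: instruction relabeling. An LLM summarizes several consecutive
    annotated sub-trajectories into one higher-level instruction, producing a
    new, longer-horizon pre-training task without extra human labeling."""
    prompt = ("Summarize these step-by-step instructions as one task:\n"
              + "\n".join(f"- {s.instruction}" for s in segments))
    merged = query_llm(prompt)
    return Segment(
        states=[st for seg in segments for st in seg.states],
        actions=[ac for seg in segments for ac in seg.actions],
        instruction=merged,
    )

def chained_task(seg_a, seg_b, query_llm):
    """Idea 2: cross-trajectory skill chaining. Two segments from *different*
    trajectories are paired under one composite instruction; offline RL
    (e.g. a language-conditioned Q-function with bootstrapped targets at the
    segment boundary) is what makes the chained task learnable. Only the
    task/label construction is sketched here."""
    instruction = query_llm("Summarize these step-by-step instructions as one task:\n"
                            f"- {seg_a.instruction}\n- {seg_b.instruction}")
    return seg_a, seg_b, instruction
```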
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data [22.471559284344462]
Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces.
While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks.
We demonstrate through experiments in sparse, image-based robot manipulation environments that EXTRACT can learn new tasks more quickly than prior works.
arXiv Detail & Related papers (2024-06-25T17:50:03Z)
- Instruction Pre-Training: Language Models are Supervised Multitask Learners [115.95022434390181]
In this paper, we propose a framework that augments massive raw corpora with instruction-response pairs to pre-train language models (LMs).
In our experiments, we synthesize 200M instruction-response pairs covering 40+ task categories to verify the effectiveness of Instruction Pre-Training.
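As a rough illustration of the data construction described above, the sketch below augments a raw document with synthesized instruction-response pairs before ordinary next-token pre-training; `synthesize_pairs` is a hypothetical stand-in for the paper's instruction synthesizer, not its released code.
```python
def build_pretraining_example(raw_doc: str, synthesize_pairs) -> str:
    """Hypothetical sketch: augment one raw document with synthesized
    instruction-response pairs and return the concatenated text that would
    be fed to a standard next-token pre-training objective."""
    pairs = synthesize_pairs(raw_doc)  # e.g. [("Summarize the passage.", "..."), ...]
    augmented = [raw_doc]
    for instruction, response in pairs:
        augmented.append(f"Instruction: {instruction}\nResponse: {response}")
    return "\n\n".join(augmented)
```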
arXiv Detail & Related papers (2024-06-20T16:55:33Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- When Prompt-based Incremental Learning Does Not Meet Strong Pretraining [36.0889029038102]
In this work, we develop a learnable Adaptive Prompt Generator (APG).
The key is to unify the prompt retrieval and prompt learning processes into a learnable prompt generator.
Our method significantly outperforms advanced methods in exemplar-free incremental learning without (strong) pretraining.
arXiv Detail & Related papers (2023-08-21T03:33:21Z)
- COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
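The "chaining through dynamic programming" idea above amounts to running an offline Q-learning backup over the union of prior-skill data and new-task data, so that value propagates backwards from the new task's rewards into states reached by prior behaviors. A minimal PyTorch-style sketch of one such update follows; it is a generic Q-learning step for illustration, not the COG codebase.
```python
import torch

def offline_q_update(q_net, target_q_net, batch, optimizer, gamma=0.99):
    """One illustrative Q-learning step over a batch drawn from the *union*
    of prior-skill data and new-task data. Bellman backups propagate value
    from rewarding new-task states into prior-data states that can reach
    them, so earlier behaviors get chained onto the new task without an
    explicit planner (hypothetical sketch)."""
    s, a, r, s_next, done = batch
    with torch.no_grad():
        next_q = target_q_net(s_next).max(dim=-1).values
        target = r + gamma * (1.0 - done) * next_q
    q = q_net(s).gather(-1, a.unsqueeze(-1)).squeeze(-1)
    loss = torch.nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```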
arXiv Detail & Related papers (2020-10-27T17:57:29Z)
- Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide open-source RecAdam, which integrates the proposed mechanisms into Adam, to facilitate the NLP community.
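The recall-and-learn idea can be pictured as blending the downstream loss with a quadratic penalty that pulls weights back toward their pretrained values, with an annealing coefficient that gradually shifts emphasis from recalling to learning. The sketch below is an illustrative loss under those assumptions; it is not the released RecAdam optimizer.
```python
import torch

def recall_and_learn_loss(model, pretrained_params, downstream_loss,
                          step, k=0.1, t0=1000, recall_coeff=1.0):
    """Illustrative loss: combine the downstream objective with a quadratic
    'recall' penalty toward the pretrained weights, blended by an annealing
    coefficient that shifts from recalling pretraining knowledge to learning
    the downstream task as training proceeds (hypothetical sketch)."""
    recall = sum(((p - p0.detach()) ** 2).sum()
                 for p, p0 in zip(model.parameters(), pretrained_params))
    anneal = torch.sigmoid(torch.tensor(k * (step - t0)))  # -> 1 over time
    return anneal * downstream_loss + (1 - anneal) * recall_coeff * recall
```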
arXiv Detail & Related papers (2020-04-27T08:59:57Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)