PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
- URL: http://arxiv.org/abs/2402.10450v3
- Date: Thu, 6 Jun 2024 04:47:52 GMT
- Title: PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
- Authors: Ruijie Zheng, Ching-An Cheng, Hal Daumé III, Furong Huang, Andrey Kolobov
- Abstract summary: Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making.
We propose a novel view that treats inducing temporal action abstractions as a sequence compression problem.
We introduce an approach that combines continuous action quantization with byte pair encoding to learn powerful action abstractions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making. In this work, we propose a novel view that treats inducing temporal action abstractions as a sequence compression problem. To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) -- to the seemingly distant task of learning skills of variable time span in continuous control domains. We introduce an approach called Primitive Sequence Encoding (PRISE) that combines continuous action quantization with BPE to learn powerful action abstractions. We empirically show that high-level skills discovered by PRISE from a multitask set of robotic manipulation demonstrations significantly boost the performance of both multitask imitation learning and few-shot imitation learning on unseen tasks. Our code is released at https://github.com/FrankZheng2022/PRISE.
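To make the recipe concrete, below is a minimal sketch of the two-stage idea described in the abstract: quantize continuous actions into a discrete vocabulary, then run BPE over the resulting token sequences so that frequently repeated action patterns merge into multi-step skill tokens. The k-means quantizer, codebook size, and merge count here are illustrative assumptions, not the paper's exact components.

```python
# Toy sketch of the PRISE recipe: (1) quantize continuous actions into
# discrete tokens, (2) run byte pair encoding (BPE) over the token
# sequences so frequent multi-step patterns become single "skill" tokens.
# The k-means quantizer and merge count are illustrative stand-ins for
# the components used in the paper.
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def bpe_merges(sequences, num_merges):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair."""
    next_id = max(t for seq in sequences for t in seq) + 1
    merges = {}
    for _ in range(num_merges):
        pair_counts = Counter()
        for seq in sequences:
            pair_counts.update(zip(seq, seq[1:]))
        if not pair_counts:
            break
        pair, _ = pair_counts.most_common(1)[0]
        merges[pair] = next_id
        # Rewrite every sequence, replacing each occurrence of the pair
        # with the newly minted skill token.
        for i, seq in enumerate(sequences):
            merged, j = [], 0
            while j < len(seq):
                if j + 1 < len(seq) and (seq[j], seq[j + 1]) == pair:
                    merged.append(next_id)
                    j += 2
                else:
                    merged.append(seq[j])
                    j += 1
            sequences[i] = merged
        next_id += 1
    return merges, sequences


# Stage 1: quantize demonstration trajectories of continuous actions with
# a shared codebook, turning each trajectory into a sequence of token ids.
rng = np.random.default_rng(0)
demos = [rng.normal(size=(50, 2)) for _ in range(4)]  # 4 toy trajectories
quantizer = KMeans(n_clusters=8, n_init=10).fit(np.concatenate(demos))
token_seqs = [quantizer.predict(traj).tolist() for traj in demos]

# Stage 2: BPE discovers variable-length skill tokens from the corpus.
skills, compressed = bpe_merges(token_seqs, num_merges=5)
print(f"{len(skills)} skill tokens; first trajectory -> {compressed[0][:10]}")
```

Each merge corresponds to a fixed subsequence of action tokens, so the learned vocabulary amounts to a set of variable-length primitives that a high-level policy can select among, which is the sense in which BPE induces temporal action abstractions.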
Related papers
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Less is more: Summarizing Patch Tokens for efficient Multi-Label Class-Incremental Learning [38.36863497458095]
We propose Multi-Label class incremental learning via summarising pAtch tokeN Embeddings (MULTI-LANE), a new class-incremental learning method.
MULTI-LANE enables learning disentangled task-specific representations in MLCIL while ensuring fast inference.
arXiv Detail & Related papers (2024-05-24T15:18:27Z)
- Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss [61.355272240758]
Premier-TACO is a multitask feature representation learning approach.
It is designed to improve few-shot policy learning efficiency in sequential decision-making tasks.
arXiv Detail & Related papers (2024-02-09T05:04:40Z)
- Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts.
This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals.
We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z)
- Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework divides the ICL process into two distinct stages: a Deep-Thinking stage and a test stage.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z)
- Discrete State-Action Abstraction via the Successor Representation [3.453310639983932]
Abstraction is one approach that provides the agent with an intrinsic reward for transitioning in a latent space.
Our approach is the first to automatically learn a discrete abstraction of the underlying environment.
Our proposed algorithm, Discrete State-Action Abstraction (DSAA), iteratively swaps between training these options and using them to efficiently explore more of the environment.
arXiv Detail & Related papers (2022-06-07T17:37:30Z)
- Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations [13.864448233719598]
This paper describes a new neural network-based framework for simultaneously learning low-level and high-level policies.
A key feature of the proposed approach is that the policies are learned directly from raw videos of task demonstrations.
Empirical results on object manipulation tasks with a robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks.
arXiv Detail & Related papers (2022-03-08T01:36:48Z)
- Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks [17.13584584844048]
This work introduces MAnipulation Primitive-augmented reinforcement LEarning (MAPLE), a learning framework that augments standard reinforcement learning algorithms with a pre-defined library of behavior primitives.
We develop a hierarchical policy that invokes the primitives and instantiates their executions with input parameters.
We demonstrate that MAPLE outperforms baseline approaches by a significant margin on a suite of simulated manipulation tasks.
arXiv Detail & Related papers (2021-10-07T17:44:33Z)
- Augmenting Policy Learning with Routines Discovered from a Demonstration [86.9307760606403]
We propose routine-augmented policy learning (RAPL), which discovers routines composed of primitive actions from a single demonstration.
We show that RAPL improves the state-of-the-art imitation learning method SQIL and the reinforcement learning method A2C.
arXiv Detail & Related papers (2020-12-23T03:15:21Z)