Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation
- URL: http://arxiv.org/abs/2408.16228v1
- Date: Thu, 29 Aug 2024 03:03:35 GMT
- Title: Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation
- Authors: Vivek Myers, Bill Chunyuan Zheng, Oier Mees, Sergey Levine, Kuan Fang
- Abstract summary: We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition.
Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions.
We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies.
- Score: 49.43094200366251
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learned language-conditioned robot policies often struggle to effectively adapt to new real-world tasks even when pre-trained across a diverse set of instructions. We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition provided by vision-language models (VLMs). Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions sampled from a VLM to enable rapid nonparametric adaptation, avoiding the need for a larger fine-tuning dataset. We evaluate PALO on extensive real-world experiments consisting of challenging unseen, long-horizon robot manipulation tasks. We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies and methods that have access to the same demonstrations.
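The abstract describes adaptation as a search over language rather than over policy weights: candidate decompositions are scored against a few demonstrations, and the pre-trained policy itself stays fixed. The sketch below is a toy illustration of that idea only, not the authors' implementation. The names `decomposition_cost`, `select_decomposition`, and `toy_policy`, the scalar observations and actions, the equal-length split of each demonstration across subtasks, and the squared-error cost are all assumptions made for illustration; in the paper's setting the candidates would come from a VLM.

```python
from typing import Callable, Sequence

# A demonstration is a sequence of (observation, action) pairs (scalars in this toy setup).
Demo = Sequence[tuple[float, float]]
# Hypothetical stand-in for a frozen language-conditioned policy: (observation, instruction) -> action.
Policy = Callable[[float, str], float]


def decomposition_cost(policy: Policy, demos: Sequence[Demo], subtasks: Sequence[str]) -> float:
    """Mean squared action error of the frozen policy on the demos under one decomposition.

    Assumption: each demo is split into equal-length segments, one per subtask.
    """
    total, count = 0.0, 0
    for demo in demos:
        n = len(demo)
        for t, (obs, act) in enumerate(demo):
            # Index of the subtask instruction that is active at step t.
            idx = min(t * len(subtasks) // n, len(subtasks) - 1)
            pred = policy(obs, subtasks[idx])
            total += (pred - act) ** 2
            count += 1
    return total / count


def select_decomposition(
    policy: Policy, demos: Sequence[Demo], candidates: Sequence[Sequence[str]]
) -> Sequence[str]:
    """Pick the candidate decomposition with the lowest cost; policy weights are never updated."""
    return min(candidates, key=lambda c: decomposition_cost(policy, demos, c))


def toy_policy(obs: float, instruction: str) -> float:
    """Toy policy: action +1 for instructions mentioning 'right', otherwise -1."""
    return 1.0 if "right" in instruction else -1.0


# One toy demonstration: move left for two steps, then right for two steps.
demo = [(0.0, -1.0), (1.0, -1.0), (2.0, 1.0), (3.0, 1.0)]
# Hypothetical candidate decompositions (these would be sampled from a VLM).
candidates = [
    ["move right", "move left"],
    ["move left", "move right"],
    ["move right"],
]
best = select_decomposition(toy_policy, [demo], candidates)
print(best)
```

Because only the instruction sequence is selected, the procedure is nonparametric in the sense the abstract uses: the few demonstrations serve to rank decompositions, not to fine-tune the policy.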
Related papers
- MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation.
We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents.
We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
- Active Fine-Tuning of Generalist Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.
We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers [20.857692296678632]
For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks.
Recent advances in large language models have shown promise for translating natural language into robot action sequences.
We show that our approach outperforms several methods using LLMs as planners in complex task domains.
arXiv Detail & Related papers (2023-06-10T21:58:29Z)
- Grounding Language with Visual Affordances over Unstructured Data [26.92329260907805]
We propose a novel approach to efficiently learn language-conditioned robot skills from unstructured, offline and reset-free data.
We exploit a self-supervised visuo-lingual affordance model, which requires as little as 1% of the total data with language.
We find that our method is capable of completing long-horizon, multi-tier tasks in the real world, while requiring an order of magnitude less data than previous approaches.
arXiv Detail & Related papers (2022-10-04T21:16:48Z)
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)