Related papers: Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation

Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation

URL: http://arxiv.org/abs/2408.16228v1
Date: Thu, 29 Aug 2024 03:03:35 GMT
Title: Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation
Authors: Vivek Myers, Bill Chunyuan Zheng, Oier Mees, Sergey Levine, Kuan Fang,
Abstract summary: We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition. Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions. We find that PALO is able of consistently complete long-horizon, multi-tier tasks in the real world, outperforming state of the art pre-trained generalist policies.
Score: 49.43094200366251
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learned language-conditioned robot policies often struggle to effectively adapt to new real-world tasks even when pre-trained across a diverse set of instructions. We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition provided by vision-language models (VLMs). Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions sampled from a VLM to quickly enable rapid nonparametric adaptation, avoiding the need for a larger fine-tuning dataset. We evaluate PALO on extensive real-world experiments consisting of challenging unseen, long-horizon robot manipulation tasks. We find that PALO is able of consistently complete long-horizon, multi-tier tasks in the real world, outperforming state of the art pre-trained generalist policies, and methods that have access to the same demonstrations.

Related papers

LangPert: Detecting and Handling Task-level Perturbations for Robust Object Rearrangement [21.236557779562794]
LangPert is a language-based framework designed to detect and mitigate Task-Level Perturbations (TLP) LangPert integrates a Visual Language Model (VLM) to comprehensively monitor policy's skill execution and environmental TLP. Our experimental results demonstrate that LangPert handles diverse TLP situations more effectively than baseline methods.
arXiv Detail & Related papers (2025-04-14T05:39:15Z)
LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements [50.544186914115045]
This paper presents TEDUO, a novel training pipeline for offline language-conditioned policy learning. TEDUO operates on easy-to-obtain, unlabeled datasets and is suited for the so-called in-the-wild evaluation, wherein the agent encounters previously unseen goals and states.
arXiv Detail & Related papers (2024-12-09T18:43:56Z)
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation [52.739500459903724]
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotics manipulation and navigation. We propose a novel multi-agent LLM framework that distributes high-level planning and low-level control code generation across specialized LLM agents. We evaluate our approach on nine RLBench tasks, including long-horizon tasks, and demonstrate its ability to solve robotics manipulation in a zero-shot setting.
arXiv Detail & Related papers (2024-11-26T17:53:44Z)
Active Fine-Tuning of Generalist Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget. We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z)
Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-ization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks. Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers [20.857692296678632]
For effective human-robot interaction, robots need to understand, plan, and execute complex, long-horizon tasks. Recent advances in large language models have shown promise for translating natural language into robot action sequences. We show that our approach outperforms several methods using LLMs as planners in complex task domains.
arXiv Detail & Related papers (2023-06-10T21:58:29Z)
Grounding Language with Visual Affordances over Unstructured Data [26.92329260907805]
We propose a novel approach to efficiently learn language-conditioned robot skills from unstructured, offline and reset-free data. We exploit a self-supervised visuo-lingual affordance model, which requires as little as 1% of the total data with language. We find that our method is capable of completing long-horizon, multi-tier tasks in the real world, while requiring an order of magnitude less data than previous approaches.
arXiv Detail & Related papers (2022-10-04T21:16:48Z)
Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks. We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy. We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments. To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command. We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models. It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters. Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.