Self-Recovery Prompting: Promptable General Purpose Service Robot System
with Foundation Models and Self-Recovery
- URL: http://arxiv.org/abs/2309.14425v2
- Date: Wed, 27 Sep 2023 02:46:20 GMT
- Title: Self-Recovery Prompting: Promptable General Purpose Service Robot System
with Foundation Models and Self-Recovery
- Authors: Mimo Shirasaka, Tatsuya Matsushima, Soshi Tsunashima, Yuya Ikeda, Aoi Horo, So Ikoma, Chikaha Tsuji, Hikaru Wada, Tsunekazu Omija, Dai Komukai, Yutaka Matsuo, Yusuke Iwasawa
- Abstract summary: A general-purpose service robot (GPSR) can execute diverse tasks in various environments.
We first developed a top-level GPSR system for a worldwide competition (RoboCup@Home 2023) based on multiple foundation models.
We propose the self-recovery prompting pipeline, which explores the necessary information and modifies its prompts to recover from failure.
- Score: 1.2900354046626057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A general-purpose service robot (GPSR), which can execute diverse tasks in
various environments, requires a system with high generalizability and
adaptability to tasks and environments. In this paper, we first developed a
top-level GPSR system for a worldwide competition (RoboCup@Home 2023) based on
multiple foundation models. This system is both generalizable to variations and
adaptive by prompting each model. Then, by analyzing the performance of the
developed system, we found three types of failure in more realistic GPSR
application settings: insufficient information, incorrect plan generation, and
plan execution failure. We then propose the self-recovery prompting pipeline,
which explores the necessary information and modifies its prompts to recover
from failure. We experimentally confirm that the system with the self-recovery
mechanism can accomplish tasks by resolving various failure cases.
Supplementary videos are available at https://sites.google.com/view/srgpsr .
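As a rough illustration of the pipeline described in the abstract, the following Python sketch shows a retry loop that re-prompts a planner with newly gathered information after each of the three failure types (insufficient information, incorrect plan generation, and plan execution failure). It is a minimal sketch under assumed interfaces, not the authors' implementation: plan_with_llm, execute, and ask_or_explore are hypothetical stubs.
```python
# Minimal sketch (not the authors' implementation) of a self-recovery
# prompting loop. All function and class names here are hypothetical.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TaskContext:
    instruction: str                                        # natural-language command to the robot
    known_facts: List[str] = field(default_factory=list)    # information gathered so far


def build_prompt(ctx: TaskContext) -> str:
    """Compose a planning prompt from the instruction and the gathered facts."""
    facts = "\n".join(f"- {f}" for f in ctx.known_facts)
    return f"Instruction: {ctx.instruction}\nKnown facts:\n{facts}\nPlan:"


def self_recovery_loop(
    ctx: TaskContext,
    plan_with_llm: Callable[[str], List[str]],   # foundation-model planner (stub)
    execute: Callable[[str], bool],              # robot skill executor (stub)
    ask_or_explore: Callable[[str], str],        # gather missing information (stub)
    max_retries: int = 3,
) -> bool:
    """Retry planning and execution, modifying the prompt after each failure."""
    for _ in range(max_retries):
        prompt = build_prompt(ctx)
        plan = plan_with_llm(prompt)

        # Failure type 1: insufficient information -> explore or ask, then re-prompt.
        if not plan:
            ctx.known_facts.append(ask_or_explore(ctx.instruction))
            continue

        # Failure types 2 and 3: incorrect plan or execution failure -> record the
        # failed step as a new fact so the next prompt can avoid the same mistake.
        for step in plan:
            if not execute(step):
                ctx.known_facts.append(f"Step failed previously: {step}")
                break
        else:
            return True  # every step succeeded
    return False


if __name__ == "__main__":
    # Toy stubs so the sketch runs end to end without a robot or an LLM.
    demo_ctx = TaskContext(instruction="Bring the cup from the kitchen")
    ok = self_recovery_loop(
        demo_ctx,
        plan_with_llm=lambda p: ["go to kitchen", "grasp cup", "return"],
        execute=lambda step: True,
        ask_or_explore=lambda q: "the cup is on the counter",
    )
    print("task accomplished:", ok)
```
The key idea, matching the abstract, is that each failure is turned into additional prompt context, so the next planning attempt can resolve or avoid it.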
Related papers
- Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing [4.874077691069634]
Retrieval Augmented Generation (RAG) has shown strong capability in enhancing language models' knowledge and reducing AI generative hallucinations.
Current multi-round RAG systems may continue searching even when enough information has already been retrieved.
This paper introduces a new framework, SIM-RAG, to explicitly enhance RAG systems' self-awareness and multi-round retrieval capabilities.
arXiv Detail & Related papers (2025-05-05T17:39:35Z)
- Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive.
We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons.
The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
- REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation [57.628771707989166]
We propose an adaptive multi-agent planning framework, termed REMAC, that enables efficient, scene-agnostic multi-robot long-horizon task planning and execution.
REMAC incorporates two key modules: a self-reflection module performing pre-condition and post-condition checks in the loop to evaluate progress and refine plans, and a self-evolvement module dynamically adapting plans based on scene-specific reasoning.
arXiv Detail & Related papers (2025-03-28T03:51:40Z)
- STAR: A Foundation Model-driven Framework for Robust Task Planning and Failure Recovery in Robotic Systems [5.426894918217948]
STAR (Smart Task Adaptation and Recovery) is a novel framework that synergizes Foundation Models (FMs) with dynamically expanding Knowledge Graphs (KGs).
FMs offer remarkable generalization and contextual reasoning, but their limitations hinder reliable deployment.
We show that STAR demonstrates an 86% task planning accuracy and a 78% recovery success rate, a significant improvement over baseline methods.
arXiv Detail & Related papers (2025-03-08T05:05:21Z)
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks zero-shot after pre-training, follow language instructions from people, and acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Creating and Repairing Robot Programs in Open-World Domains [8.93008148936798]
We propose a system which traces the execution of a program up until an error occurs, and then runs an LLM-produced recovery program that minimizes repeated actions.
We create a benchmark consisting of eleven tasks with various error conditions that require the generation of a recovery program.
We compare the efficiency of the recovery program to a plan built with an oracle that has foreknowledge of future errors.
arXiv Detail & Related papers (2024-10-24T16:30:14Z)
- Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation [65.23793829741014]
Embodied-RAG is a framework that enhances the model of an embodied agent with a non-parametric memory system.
At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail.
We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 200 explanation and navigation queries.
arXiv Detail & Related papers (2024-09-26T21:44:11Z)
- Commonsense Reasoning for Legged Robot Adaptation with Vision-Language Models [81.55156507635286]
Legged robots are physically capable of navigating diverse environments and overcoming a wide range of obstructions.
Current learning methods often struggle with generalization to the long tail of unexpected situations without heavy human supervision.
We propose a system, VLM-Predictive Control (VLM-PC), combining two key components that we find to be crucial for eliciting on-the-fly, adaptive behavior selection.
arXiv Detail & Related papers (2024-07-02T21:00:30Z)
- ROBUST: 221 Bugs in the Robot Operating System [0.256557617522405]
We systematically curated a dataset of 221 bugs across 7 popular and diverse software systems.
We produce historically accurate recreations of each of the 221 defective software versions in the form of Docker images.
We use a grounded theory approach to examine and categorize their corresponding faults, failures, and fixes.
arXiv Detail & Related papers (2024-04-04T17:49:38Z)
- RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents [107.97394661147102]
The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system.
Recent progress in utilizing language models as high-level planners has demonstrated that the complexity of tasks can be reduced through decomposing them into primitive-level plans.
Despite the promising future, the community is not yet adequately prepared for composable generalization agents.
arXiv Detail & Related papers (2024-03-28T17:42:54Z)
- AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents [109.3804962220498]
AutoRT is a system to scale up the deployment of operational robots in completely unseen scenarios with minimal human supervision.
We demonstrate AutoRT proposing instructions to over 20 robots across multiple buildings and collecting 77k real robot episodes via both teleoperation and autonomous robot policies.
We experimentally show that such "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs allows for instruction-following data collection robots that can align with human preferences.
arXiv Detail & Related papers (2024-01-23T18:45:54Z)
- Seven Failure Points When Engineering a Retrieval Augmented Generation System [1.8776685617612472]
RAG systems aim to reduce the problem of hallucinated responses from large language models.
RAG systems suffer from limitations inherent to information retrieval systems.
We present an experience report on the failure points of RAG systems from three case studies.
arXiv Detail & Related papers (2024-01-11T12:04:11Z)
- Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis [82.59451639072073]
General-purpose robots operate seamlessly in any environment, with any object, and utilize various skills to complete diverse tasks.
As a community, we have been constraining most robotic systems by designing them for specific tasks, training them on specific datasets, and deploying them within specific environments.
Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models, we devote this survey to exploring how foundation models can be applied to general-purpose robotics.
arXiv Detail & Related papers (2023-12-14T10:02:55Z)
- Collective Intelligence for 2D Push Manipulations with Mobile Robots [18.937030864563752]
We show that by distilling a planner from a differentiable soft-body physics simulator into an attention-based neural network, our multi-robot push manipulation system achieves better performance than baselines.
Our system also generalizes to configurations not seen during training and is able to adapt toward task completion when external turbulence and environmental changes are applied.
arXiv Detail & Related papers (2022-11-28T08:48:58Z)
- Explainable AI for System Failures: Generating Explanations that Improve Human Assistance in Fault Recovery [15.359877013989228]
We develop automated, natural language explanations for failures encountered during an AI agent's plan execution.
These explanations are developed with a focus on helping non-expert users understand different points of failure.
We extend an existing sequence-to-sequence methodology to automatically generate our context-based explanations.
arXiv Detail & Related papers (2020-11-18T17:08:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.