InCoRo: In-Context Learning for Robotics Control with Feedback Loops
- URL: http://arxiv.org/abs/2402.05188v1
- Date: Wed, 7 Feb 2024 19:01:11 GMT
- Title: InCoRo: In-Context Learning for Robotics Control with Feedback Loops
- Authors: Jiaqiang Ye Zhu, Carla Gomez Cano, David Vazquez Bermudez and Michal
Drozdzal
- Abstract summary: InCoRo is a system that uses a classical robotic feedback loop composed of an LLM controller, a scene understanding unit, and a robot.
We highlight the generalization capabilities of our system and show that InCoRo surpasses the prior art in terms of the success rate.
This research paves the way towards building reliable, efficient, intelligent autonomous systems that adapt to dynamic environments.
- Score: 4.702566749969133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the challenges in robotics is to enable robotic units with the
reasoning capability that would be robust enough to execute complex tasks in
dynamic environments. Recent advances in LLMs have positioned them as go-to
tools for simple reasoning tasks, motivating the pioneering work of Liang et
al. [35] that uses an LLM to translate natural language commands into low-level
static execution plans for robotic units. Using LLMs inside robotics systems
brings their generalization to a new level, enabling zero-shot generalization
to new tasks. This paper extends this prior work to dynamic environments. We
propose InCoRo, a system that uses a classical robotic feedback loop composed
of an LLM controller, a scene understanding unit, and a robot. Our system
continuously analyzes the state of the environment and provides adapted
execution commands, enabling the robot to adjust to changing environmental
conditions and to correct for controller errors. Our system does not require
any iterative optimization to learn to accomplish a task as it leverages
in-context learning with an off-the-shelf LLM. Through an extensive
validation process involving two standardized industrial robotic units -- SCARA
and DELTA types -- we contribute knowledge about these robots, which are not
widely studied in the community. We highlight the generalization
capabilities of our system and show that (1) in-context learning in combination
with the current state-of-the-art LLMs is an effective way to implement a
robotic controller; (2) in static environments, InCoRo surpasses the prior art
in terms of the success rate; (3) in dynamic environments, we establish a new
state of the art for both the SCARA and DELTA units. This research
paves the way towards building reliable, efficient, intelligent autonomous
systems that adapt to dynamic environments.
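To make the loop concrete, here is a minimal sketch of the described architecture, assuming hypothetical interfaces (`llm_complete`, `describe_scene`, `robot`) rather than the authors' actual API: the scene-understanding unit summarizes the camera view, the LLM controller is driven purely by in-context (few-shot) examples, and the loop re-checks the environment after every execution so the robot can react to changes.

```python
# Hedged sketch of an InCoRo-style feedback loop; all names are placeholders.

FEW_SHOT_EXAMPLES = """\
Scene: cube at (0.30, 0.10), bin at (0.50, -0.20)
Command: put the cube in the bin
Plan: MOVE 0.30 0.10; GRIP CLOSE; MOVE 0.50 -0.20; GRIP OPEN
"""

def control_loop(llm_complete, describe_scene, robot, command, max_steps=50):
    for _ in range(max_steps):
        scene = describe_scene()                    # scene-understanding unit
        prompt = (FEW_SHOT_EXAMPLES +
                  f"Scene: {scene}\nCommand: {command}\nPlan:")
        plan = llm_complete(prompt)                 # in-context LLM controller
        for step in plan.split(";"):
            robot.execute(step.strip())             # low-level execution
        if robot.task_done(command):                # closed-loop feedback
            return
```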
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model on its ability to perform tasks zero-shot after pre-training, to follow language instructions from people, and to acquire new skills via fine-tuning.
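As a rough illustration of the flow-matching idea (not the released model), action generation amounts to integrating a learned velocity field from Gaussian noise to an action chunk; `velocity` below stands in for the trained network conditioned on VLM features:

```python
# Minimal flow-matching inference sketch: Euler integration of da/dt = v.
import numpy as np

def sample_actions(velocity, obs_embedding, action_dim=7, horizon=16, steps=10):
    a = np.random.randn(horizon, action_dim)       # a_0 ~ N(0, I)
    for k in range(steps):
        t = k / steps
        a = a + (1.0 / steps) * velocity(obs_embedding, a, t)
    return a                                       # a_1: denoised action chunk
```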
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Solving Robotics Problems in Zero-Shot with Vision-Language Models [0.0]
We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework designed to solve robotics problems in a zero-shot regime.
In our context, zero-shot means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description.
Our system showcases the ability to handle diverse tasks such as manipulation, goal-reaching, and visual reasoning -- all in a zero-shot manner.
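A minimal sketch of this zero-shot protocol, assuming a generic multimodal chat client `vllm_chat` (not the paper's framework): one workspace image plus the task description go in, a plan comes out.

```python
# Hypothetical zero-shot planning call; message schema is illustrative.
import base64

def plan_zero_shot(vllm_chat, image_path: str, task: str) -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    messages = [{
        "role": "user",
        "content": [
            {"type": "text",
             "text": f"Task: {task}\nReturn a numbered list of robot actions."},
            {"type": "image", "data": image_b64},
        ],
    }]
    return vllm_chat(messages)   # e.g. "1. move above the red block ..."
```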
arXiv Detail & Related papers (2024-07-26T21:18:57Z)
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts.
We leverage natural language prompts and contextual information from the Robot Operating System (ROS).
Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface.
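The pattern can be sketched as a small ROS node, assuming a placeholder `query_llm` helper; this is an illustration of the idea, not the ROS-LLM framework's interface:

```python
# Sketch: forward chat messages, plus context from the ROS graph, to an LLM.
import rospy
from std_msgs.msg import String

def query_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call to an LLM."""
    raise NotImplementedError

def on_chat(msg: String) -> None:
    context = ", ".join(t for t, _ in rospy.get_published_topics())
    prompt = f"Available ROS topics: {context}\nUser request: {msg.data}"
    task_pub.publish(String(data=query_llm(prompt)))

rospy.init_node("llm_chat_bridge")
task_pub = rospy.Publisher("/llm_tasks", String, queue_size=10)
rospy.Subscriber("/user_chat", String, on_chat)
rospy.spin()
```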
arXiv Detail & Related papers (2024-06-28T08:28:38Z)
- Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration [4.2460673279562755]
Large Language Models (LLMs) are gaining popularity in the field of robotics.
This paper proposes a novel approach to enhance the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC).
The approach involves using a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that can be executed by the robot.
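The decomposition step might look like the following sketch, where `chat` is an assumed chat-completion wrapper and the motion vocabulary is illustrative:

```python
# Hedged sketch: prompt a chat model to decompose a command into motions.
import json

def decompose(chat, command: str) -> list[dict]:
    prompt = (
        "Decompose the command into a JSON list of motions, each "
        '{"motion": <name>, "args": [...]}. '
        "Allowed motions: move_to, grasp, release.\n"
        f"Command: {command}"
    )
    reply = chat(prompt)
    return json.loads(reply)  # e.g. [{"motion": "move_to", "args": [0.3, 0.1]}]
```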
arXiv Detail & Related papers (2024-06-20T08:23:49Z)
- LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning [22.99690700210957]
We propose a novel HRL framework that leverages language instructions to generate a stationary reward function for a higher-level policy.
Since the language-guided reward is unaffected by the lower primitive behaviour, LGR2 mitigates non-stationarity.
Our approach attains success rates exceeding 70% in challenging, sparse-reward robotic navigation and manipulation environments.
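A sketch of the reward-relabeling idea under stated assumptions (`llm_to_goal` is a hypothetical instruction-to-goal mapping): because the reward depends only on the fixed language-derived goal, it stays stationary while the lower-level policy changes.

```python
# Sketch: build a stationary, language-guided reward for the higher level.
import numpy as np

def make_reward(llm_to_goal, instruction: str):
    goal = np.asarray(llm_to_goal(instruction))    # e.g. target (x, y, z)
    def reward(achieved_state: np.ndarray) -> float:
        return -float(np.linalg.norm(achieved_state - goal))
    return reward
```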
arXiv Detail & Related papers (2024-06-09T18:40:24Z)
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
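The pipeline's core pattern, sketched with assumed names (`llm_codegen`, a small `robot` API) rather than RoboScript's actual interfaces:

```python
# Sketch: the LLM writes Python against a small robot API, which is executed.
def run_free_form_task(llm_codegen, robot_api, instruction: str) -> None:
    code = llm_codegen(
        "You control a robot arm via: robot.move_to(x, y, z), "
        "robot.grasp(), robot.release().\n"
        f"Write Python for: {instruction}"
    )
    exec(code, {"robot": robot_api})   # sandboxing omitted for brevity
```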
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
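One way to picture this, as a hedged sketch with an assumed `llm_next_action` helper: the model commits to one action at a time and conditions on the growing observation history, rather than emitting a one-shot plan.

```python
# Sketch: interactive LLM planning under partial observability.
def interactive_plan(llm_next_action, robot, task: str, max_steps: int = 20):
    history: list[str] = []
    for _ in range(max_steps):
        action = llm_next_action(task, history)   # may choose to look/probe
        if action == "DONE":
            break
        observation = robot.execute(action)       # returns what was observed
        history.append(f"{action} -> {observation}")
    return history
```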
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer it to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z)
- Prompt a Robot to Walk with Large Language Models [18.214609570837403]
Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains.
We introduce a novel paradigm in which we use few-shot prompts collected from the physical environment.
Experiments across various robots and environments validate that our method can effectively prompt a robot to walk.
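A hedged sketch of such a prompting loop; the text serialization and `llm_complete` helper are assumptions, not the paper's exact format:

```python
# Sketch: serialize recent (observation, action) pairs as a few-shot prompt
# and parse the LLM's completion as the next low-level action.
from collections import deque

def llm_walk_step(llm_complete, obs_history: deque, obs) -> list[float]:
    prompt = "\n".join(f"obs: {o} act: {a}" for o, a in obs_history)
    prompt += f"\nobs: {list(obs)} act:"
    action = [float(x) for x in llm_complete(prompt).split()]
    obs_history.append((list(obs), action))   # sliding in-context window
    return action                             # joint targets for this step
```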
arXiv Detail & Related papers (2023-09-18T17:50:17Z)
- Language to Rewards for Robotic Skill Synthesis [37.21434094015743]
We introduce a new paradigm that harnesses large language models (LLMs) to define reward parameters that can be optimized to accomplish a variety of robotic tasks.
Using reward as the intermediate interface generated by LLMs, we can effectively bridge the gap between high-level language instructions or corrections and low-level robot actions.
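Schematically, and under assumed names (`llm_to_params`, the specific parameter keys), the reward interface could look like this; a low-level optimizer then maximizes the returned reward:

```python
# Sketch: LLM emits reward parameters; a generic reward is built from them.
import json

def build_reward(llm_to_params, instruction: str):
    params = json.loads(llm_to_params(instruction))
    # e.g. {"target_height": 0.5, "upright_weight": 2.0}
    def reward(state: dict) -> float:
        r = -abs(state["height"] - params["target_height"])
        r += params["upright_weight"] * state["uprightness"]
        return r
    return reward
```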
arXiv Detail & Related papers (2023-06-14T17:27:10Z)
- Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems [79.07468367923619]
We propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC).
We design an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards.
We show that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
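For intuition, a minimal numpy sketch of a layer-local, backprop-free update in the predictive-coding style (the single layer and shapes are illustrative, not the paper's full ActPC agent):

```python
# Sketch: weights move along the outer product of the local prediction
# error and the latent state; no global gradient is propagated.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(16, 32))   # generative weights: z -> x_hat

def local_update(x: np.ndarray, z: np.ndarray, lr: float = 0.01) -> np.ndarray:
    """One Hebbian-like step; x is the observation, z the latent code."""
    global W
    error = x - W @ z                       # layer-local prediction error
    W += lr * np.outer(error, z)            # no backprop through other layers
    return error
```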
arXiv Detail & Related papers (2022-09-19T16:49:32Z)