InCoRo: In-Context Learning for Robotics Control with Feedback Loops
- URL: http://arxiv.org/abs/2402.05188v1
- Date: Wed, 7 Feb 2024 19:01:11 GMT
- Title: InCoRo: In-Context Learning for Robotics Control with Feedback Loops
- Authors: Jiaqiang Ye Zhu, Carla Gomez Cano, David Vazquez Bermudez and Michal Drozdzal
- Abstract summary: InCoRo is a system that uses a classical robotic feedback loop composed of an LLM controller, a scene understanding unit, and a robot.
We highlight the generalization capabilities of our system and show that InCoRo surpasses the prior art in terms of the success rate.
This research paves the way towards building reliable, efficient, intelligent autonomous systems that adapt to dynamic environments.
- Score: 4.702566749969133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the challenges in robotics is to enable robotic units with the
reasoning capability that would be robust enough to execute complex tasks in
dynamic environments. Recent advances in LLMs have positioned them as go-to
tools for simple reasoning tasks, motivating the pioneering work of Liang et
al. [35] that uses an LLM to translate natural language commands into low-level
static execution plans for robotic units. Using LLMs inside robotics systems
brings their generalization to a new level, enabling zero-shot generalization
to new tasks. This paper extends this prior work to dynamic environments. We
propose InCoRo, a system that uses a classical robotic feedback loop composed
of an LLM controller, a scene understanding unit, and a robot. Our system
continuously analyzes the state of the environment and provides adapted
execution commands, enabling the robot to adjust to changing environmental
conditions and to correct for controller errors. Our system does not require
any iterative optimization to learn to accomplish a task, as it leverages
in-context learning with an off-the-shelf LLM. Through an extensive
validation process involving two standardized industrial robotic units -- SCARA
and DELTA types -- we contribute knowledge about these robots, which are not
widely studied in the community, thereby enriching it. We highlight the
generalization capabilities of our system and show that (1) in-context
learning in combination with the current state-of-the-art LLMs is an effective
way to implement a robotic controller; (2) in static environments, InCoRo
surpasses the prior art in terms of success rate; (3) in dynamic environments,
InCoRo establishes a new state of the art for both the SCARA and DELTA units.
This research paves the way towards building reliable, efficient, and
intelligent autonomous systems that adapt to dynamic environments.
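The loop the abstract describes can be pictured as a short control program. The sketch below is a minimal, hypothetical rendering of that perceive-prompt-act cycle; the interface names (`scene.describe`, `llm.complete`, `robot.execute`) and the prompt format are our assumptions, not the paper's published API.

```python
# Minimal sketch of an InCoRo-style feedback loop. All interfaces here
# (scene, llm, robot) are hypothetical stand-ins, not the paper's code.
import time

def run_feedback_loop(llm, scene, robot, task, hz=2.0):
    """Closed loop: perceive -> prompt the LLM controller -> act -> repeat."""
    while not robot.task_done(task):
        # Scene understanding unit: summarize the environment as text the
        # LLM controller can condition on (e.g., detected objects and poses).
        state = scene.describe()
        # In-context learning: a few worked examples plus the live state;
        # no fine-tuning or iterative optimization of the controller.
        prompt = (
            "You control a SCARA robot arm. Example:\n"
            "state: cube at (0.20, 0.10); task: pick cube -> "
            "move_to(0.20, 0.10); grasp()\n"
            f"state: {state}; task: {task} ->"
        )
        command = llm.complete(prompt)
        # Execute the adapted command; because the loop re-reads the scene,
        # controller errors and scene changes are corrected next iteration.
        robot.execute(command)
        time.sleep(1.0 / hz)
```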
Related papers
- One to rule them all: natural language to bind communication, perception and action [0.9302364070735682]
This paper presents an advanced architecture for robotic action planning that integrates communication, perception, and planning with Large Language Models (LLMs).
The Planner Module is the core of the system where LLMs embedded in a modified ReAct framework are employed to interpret and carry out user commands.
The modified ReAct framework further enhances the execution space by providing real-time environmental perception and the outcomes of physical actions; a minimal loop sketch follows this entry.
arXiv Detail & Related papers (2024-11-22T16:05:54Z)
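For readers unfamiliar with ReAct, the sketch below shows the generic thought-action-observation loop the entry above refers to, extended with a perception read-back at each step. The tool dispatch and prompt format are illustrative assumptions, not the paper's implementation.

```python
# Generic ReAct-style plan-act-observe loop with perception feedback.
# Names and prompt format are illustrative, not taken from the paper.

def parse_action(step):
    # Assumes the LLM ends its output with a line like
    # "Action: move_to[red cube]".
    line = step.strip().splitlines()[-1]
    name, _, arg = line.removeprefix("Action:").strip().partition("[")
    return name.strip(), arg.rstrip("]")

def react_plan_and_execute(llm, tools, perceive, goal, max_steps=10):
    trace = f"Goal: {goal}\n"
    for _ in range(max_steps):
        # Reason: the LLM produces a thought and the next action to take.
        step = llm.complete(trace + "Thought:")
        action, arg = parse_action(step)
        if action == "finish":
            break
        # Act, then observe both the action outcome and the live scene;
        # feeding perception back each step is the "modified" part.
        result = tools[action](arg)
        trace += (f"Thought:{step}\nObservation: {result}; "
                  f"scene: {perceive()}\n")
    return trace
```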
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model on its ability to perform tasks zero-shot after pre-training, to follow language instructions from people, and to acquire new skills via fine-tuning; a generic flow-matching sketch follows this entry.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
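Flow matching itself is a standard technique; the sketch below shows a textbook conditional flow-matching training step for an action head conditioned on VLM features. It illustrates the general idea only and is not the $π_0$ architecture.

```python
# Textbook conditional flow-matching loss for an action head; not the
# paper's implementation. velocity_net(x_t, t, cond) is a user-supplied
# network that predicts the velocity of the noise-to-action flow.
import torch
import torch.nn.functional as F

def flow_matching_loss(velocity_net, vlm_features, actions):
    noise = torch.randn_like(actions)                    # x at t = 0
    t = torch.rand(actions.shape[0], 1, device=actions.device)
    x_t = (1 - t) * noise + t * actions                  # straight path
    target = actions - noise                             # d(x_t)/dt
    pred = velocity_net(x_t, t, vlm_features)
    return F.mse_loss(pred, target)
```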
- Solving Robotics Problems in Zero-Shot with Vision-Language Models [0.0]
We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework designed to solve robotics problems in a zero-shot regime.
In our context, zero-shot means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description.
Our system showcases the ability to handle diverse tasks such as manipulation, goal-reaching, and visual reasoning -- all in a zero-shot manner; a sketch of the single-image setup follows this entry.
arXiv Detail & Related papers (2024-07-26T21:18:57Z)
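The zero-shot setup the entry describes reduces to a simple input contract: one image of the robot's surroundings plus a task description. The sketch below illustrates that contract with a hypothetical `vlm` client; the real framework coordinates multiple agents on top of it.

```python
# Illustration of the zero-shot input contract: one scene image plus a
# task description. The `vlm` client and its signature are hypothetical.
import base64

def propose_plan(vlm, image_path, task):
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    prompt = ("You are coordinating a robot arm. Given the attached image "
              f"of its surroundings, plan steps for the task: {task}")
    return vlm.complete(prompt=prompt, image=image_b64)
```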
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning [74.58666091522198]
We present a framework for intuitive robot programming by non-experts.
We leverage natural language prompts and contextual information from the Robot Operating System (ROS).
Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements through a chat interface; a minimal chat-to-ROS bridge is sketched after this entry.
arXiv Detail & Related papers (2024-06-28T08:28:38Z)
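The chat-to-ROS idea can be pictured with standard rospy primitives. The sketch below only illustrates the bridge; the topic name, message type, and `llm.complete` helper are our assumptions, and the actual ROS-LLM framework defines richer interfaces.

```python
# Minimal chat-to-ROS bridge using standard rospy primitives. The topic
# name and the llm.complete helper are assumptions for illustration.
import rospy
from std_msgs.msg import String

def chat_bridge(llm):
    rospy.init_node("llm_task_bridge")
    pub = rospy.Publisher("/robot/task_plan", String, queue_size=10)
    while not rospy.is_shutdown():
        request = input("Describe the task> ")  # natural-language input
        # The LLM turns the free-form request into a structured plan
        # string that downstream ROS nodes can consume.
        plan = llm.complete(f"Convert to a robot task plan: {request}")
        pub.publish(String(data=plan))
```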
- Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration [4.2460673279562755]
Large Language Models (LLMs) are gaining popularity in the field of robotics.
This paper proposes a novel approach to enhance the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC).
The approach involves using a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that can be executed by the robot; this decomposition step is sketched after this entry.
arXiv Detail & Related papers (2024-06-20T08:23:49Z)
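The decomposition step from the HRC entry above maps naturally onto a chat-completion call. The sketch below assumes a small motion-primitive vocabulary of our own choosing; the paper's actual prompt and primitive set may differ.

```python
# Sketch of decomposing a high-level command into motion primitives with
# GPT-4 (OpenAI Python SDK v1). The primitive set is our assumption.
from openai import OpenAI

PRIMITIVES = "move_to(x, y, z) | open_gripper() | close_gripper()"

def decompose(command):
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": ("Decompose the command into one primitive per "
                         f"line, using only: {PRIMITIVES}")},
            {"role": "user", "content": command},
        ],
    )
    return response.choices[0].message.content.strip().splitlines()
```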
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer it to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z)
- Prompt a Robot to Walk with Large Language Models [18.214609570837403]
Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains.
We introduce a novel paradigm in which we use few-shot prompts collected from the physical environment.
Experiments across various robots and environments validate that our method can effectively prompt a robot to walk; a minimal prompting sketch follows this entry.
arXiv Detail & Related papers (2023-09-18T17:50:17Z)
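The few-shot idea in the entry above amounts to turning recent observation-action pairs into the prompt itself. A minimal sketch, with text formatting choices that are ours rather than the paper's:

```python
# Few-shot prompting from the physical environment: recent observation-
# action pairs form the prompt; the LLM completes the next action.
# The text format below is our choice, not the paper's.

def next_action(llm, history, observation):
    """history: list of (observation, action) pairs already executed."""
    shots = "\n".join(f"obs: {o} -> act: {a}" for o, a in history[-10:])
    prompt = f"{shots}\nobs: {observation} -> act:"
    return llm.complete(prompt)  # e.g., joint targets encoded as text
```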
- Language to Rewards for Robotic Skill Synthesis [37.21434094015743]
We introduce a new paradigm that harnesses large language models (LLMs) to define reward parameters that can be optimized to accomplish a variety of robotic tasks.
Using reward as the intermediate interface generated by LLMs, we can effectively bridge the gap between high-level language instructions or corrections and low-level robot actions; this interface is sketched after this entry.
arXiv Detail & Related papers (2023-06-14T17:27:10Z)
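The reward-as-interface idea separates what the LLM writes (reward parameters) from what the controller optimizes. A minimal sketch under assumed parameter names; a low-level optimizer such as a model-predictive controller would maximize this signal, not the toy evaluator below.

```python
# Reward-as-interface sketch: the LLM emits reward parameters; a separate
# low-level optimizer maximizes the reward. Parameter names are ours.
import json

def instruction_to_reward_params(llm, instruction):
    prompt = ('Return JSON {"target_height": float, "forward_speed": float} '
              f"for the instruction: {instruction}")
    return json.loads(llm.complete(prompt))  # assumes well-formed JSON

def reward(state, params):
    # Higher is better; a controller (e.g., MPC) optimizes this signal,
    # so language never has to map directly onto low-level actions.
    return -(abs(state.height - params["target_height"])
             + abs(state.speed - params["forward_speed"]))
```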
- Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems [79.07468367923619]
We propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC).
We design an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards.
We show that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
arXiv Detail & Related papers (2022-09-19T16:49:32Z)