Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation
- URL: http://arxiv.org/abs/2407.13505v1
- Date: Thu, 18 Jul 2024 13:38:21 GMT
- Title: Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation
- Authors: Hassan Ali, Philipp Allgeuer, Carlo Mazzola, Giulia Belgiovine, Burak Can Kaplan, Stefan Wermter,
- Abstract summary: Large Language Models (LLMs) have been recently used in robot applications for grounding common-sense reasoning with the robot's perception and physical abilities.
In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions, while the robot effectively switches between tasks.
Our results show a significant improvement in performance over a baseline of five robotic tasks, demonstrating the potential of integrating memory with LLMs for combining the robot's action and perception for adaptive task execution.
- Score: 13.84245915608566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have been recently used in robot applications for grounding LLM common-sense reasoning with the robot's perception and physical abilities. In humanoid robots, memory also plays a critical role in fostering real-world embodiment and facilitating long-term interactive capabilities, especially in multi-task setups where the robot must remember previous task states, environment states, and executed actions. In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions, while the robot effectively switches between tasks. Our proposed dual-layered architecture features two LLMs, utilizing their complementary skills of reasoning and following instructions, combined with a memory model inspired by human cognition. Our results show a significant improvement in performance over a baseline of five robotic tasks, demonstrating the potential of integrating memory with LLMs for combining the robot's action and perception for adaptive task execution.
Related papers
- Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration [4.2460673279562755]
Large Language Models (LLMs) are gaining popularity in the field of robotics.
This paper proposes a novel approach to enhance the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC)
The approach involves using a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that can be executed by the robot.
arXiv Detail & Related papers (2024-06-20T08:23:49Z) - LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning [50.99807031490589]
We introduce LLARVA, a model trained with a novel instruction tuning method to unify a range of robotic learning tasks, scenarios, and environments.
We generate 8.5M image-visual trace pairs from the Open X-Embodiment dataset in order to pre-train our model.
Experiments yield strong performance, demonstrating that LLARVA performs well compared to several contemporary baselines.
arXiv Detail & Related papers (2024-06-17T17:55:29Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - Interactive Planning Using Large Language Models for Partially
Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models [23.945922720555146]
We propose a system to achieve incremental learning of complex behavior from natural interaction.
We integrate the system in the robot cognitive architecture of the humanoid robot ARMAR-6.
arXiv Detail & Related papers (2023-09-08T13:29:05Z) - Language to Rewards for Robotic Skill Synthesis [37.21434094015743]
We introduce a new paradigm that harnesses large language models (LLMs) to define reward parameters that can be optimized and accomplish variety of robotic tasks.
Using reward as the intermediate interface generated by LLMs, we can effectively bridge the gap between high-level language instructions or corrections to low-level robot actions.
arXiv Detail & Related papers (2023-06-14T17:27:10Z) - LLM as A Robotic Brain: Unifying Egocentric Memory and Control [77.0899374628474]
Embodied AI focuses on the study and development of intelligent systems that possess a physical or virtual embodiment (i.e. robots)
Memory and control are the two essential parts of an embodied system and usually require separate frameworks to model each of them.
We propose a novel framework called LLM-Brain: using Large-scale Language Model as a robotic brain to unify egocentric memory and control.
arXiv Detail & Related papers (2023-04-19T00:08:48Z) - Lifelong Robotic Reinforcement Learning by Retaining Experiences [61.79346922421323]
Many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times.
In this work, we study a practical sequential multi-task RL problem motivated by the practical constraints of physical robotic systems.
We derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set.
arXiv Detail & Related papers (2021-09-19T18:00:51Z) - Show Me What You Can Do: Capability Calibration on Reachable Workspace
for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.