RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
- URL: http://arxiv.org/abs/2412.00171v2
- Date: Tue, 10 Dec 2024 10:02:45 GMT
- Title: RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
- Authors: Weixin Mao, Weiheng Zhong, Zhou Jiang, Dong Fang, Zhongyue Zhang, Zihan Lan, Fan Jia, Tiancai Wang, Haoqiang Fan, Osamu Yoshie,
- Abstract summary: RoboMatrix is a skill-centric and hierarchical framework for scalable task planning and execution.
We first introduce a novel skill-centric paradigm that extracts the common meta-skills from different complex tasks.
To fully leverage meta-skills, we develop a hierarchical framework that decouples complex robot tasks into three interconnected layers.
- Score: 18.998643114600053
- License:
- Abstract: Existing policy learning methods predominantly adopt the task-centric paradigm, necessitating the collection of task data in an end-to-end manner. Consequently, the learned policy tends to fail to tackle novel tasks. Moreover, it is hard to localize the errors for a complex task with multiple stages due to end-to-end learning. To address these challenges, we propose RoboMatrix, a skill-centric and hierarchical framework for scalable task planning and execution. We first introduce a novel skill-centric paradigm that extracts the common meta-skills from different complex tasks. This allows for the capture of embodied demonstrations through a skill-centric approach, enabling the completion of open-world tasks by combining learned meta-skills. To fully leverage meta-skills, we further develop a hierarchical framework that decouples complex robot tasks into three interconnected layers: (1) a high-level modular scheduling layer; (2) a middle-level skill layer; and (3) a low-level hardware layer. Experimental results illustrate that our skill-centric and hierarchical framework achieves remarkable generalization performance across novel objects, scenes, tasks, and embodiments. This framework offers a novel solution for robot task planning and execution in open-world scenarios. Our software and hardware are available at https://github.com/WayneMao/RoboMatrix.
Related papers
- Hierarchical Meta-Reinforcement Learning via Automated Macro-Action Discovery [4.0847743592744905]
It is still challenging to learn performant policies across multiple complex and high-dimensional tasks.
We propose a novel architecture with three hierarchical levels for 1) learning task representations, 2) discovering task-agnostic macro-actions in an automated manner, and 3) learning primitive actions.
arXiv Detail & Related papers (2024-12-16T16:15:36Z) - COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems.
A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots.
The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z) - Agentic Skill Discovery [19.5703917813767]
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control.
A remaining challenge is to acquire a diverse set of fundamental skills.
We introduce a novel framework for skill discovery that is entirely driven by LLMs.
arXiv Detail & Related papers (2024-05-23T19:44:03Z) - Interactive Task Planning with Language Models [89.5839216871244]
An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals and distinct tasks, even during execution.
Recent large language model based approaches can allow for more open-ended planning but often require heavy prompt engineering or domain specific pretrained models.
We propose a simple framework that achieves interactive task planning with language models by incorporating both high-level planning and low-level skill execution.
arXiv Detail & Related papers (2023-10-16T17:59:12Z) - Towards an Interpretable Hierarchical Agent Framework using Semantic
Goals [6.677083312952721]
This work introduces an interpretable hierarchical agent framework by combining planning and semantic goal directed reinforcement learning.
We evaluate our framework on a robotic block manipulation task and show that it performs better than other methods.
arXiv Detail & Related papers (2022-10-16T02:04:13Z) - Deep Hierarchical Planning from Pixels [86.14687388689204]
Director is a method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model.
Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.
Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
arXiv Detail & Related papers (2022-06-08T18:20:15Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z) - Graph-based Reinforcement Learning meets Mixed Integer Programs: An
application to 3D robot assembly discovery [34.25379651790627]
We tackle the problem of building arbitrary, predefined target structures entirely from scratch using a set of Tetris-like building blocks and a robotic manipulator.
Our novel hierarchical approach aims at efficiently decomposing the overall task into three feasible levels that benefit mutually from each other.
arXiv Detail & Related papers (2022-03-08T14:44:51Z) - COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.