Related papers: DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

URL: http://arxiv.org/abs/2406.09953v3
Date: Fri, 11 Apr 2025 05:41:19 GMT
Title: DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning
Authors: Zeyu Gao, Yao Mu, Jinye Qu, Mengkang Hu, Shijia Peng, Chengkai Hou, Lingyue Guo, Ping Luo, Shanghang Zhang, Yanfeng Lu,
Abstract summary: DAG-Plan is a structured task planning framework tailored for dual-arm robots.<n>It decomposes intricate tasks into actionable sub-tasks represented as nodes within a directed acyclic graph.<n>It dynamically assigns these sub-tasks to the appropriate arm based on real-time environmental observations.
Score: 37.38735065914983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dual-arm robots offer enhanced versatility and efficiency over single-arm counterparts by enabling concurrent manipulation of multiple objects or cooperative execution of tasks using both arms. However, the coordination of dual-arm systems for long-horizon tasks continues to pose significant challenges, stemming from the intricate temporal and spatial dependencies among sub-tasks, necessitating intelligent decisions regarding the allocation of actions between arms and their optimal execution order. Existing task planning methods predominantly focus on single-arm robots or rely on predefined bimanual operations to use large language models (LLMs) generate task sequence with linear temporal dependency, failing to fully leverage the capabilities of dual-arm systems. To address this limitation, we introduce DAG-Plan, a structured task planning framework tailored for dual-arm robots. DAG-Plan harnesses LLMs to decompose intricate tasks into actionable sub-tasks represented as nodes within a directed acyclic graph (DAG). Critically, DAG-Plan dynamically assigns these sub-tasks to the appropriate arm based on real-time environmental observations, enabling parallel and adaptive execution. We evaluate DAG-Plan on the Dual-Arm Kitchen Benchmark, comprising 5 sequential tasks with 44 sub-tasks. Extensive experiments demonstrate the superiority of DAG-Plan over directly using LLM to generate linear task sequence, achieving 52.8% higher efficiency compared to the single-arm task planning and 48% higher success rate of the dual-arm task planning. Compared to iterative methods, DAG-Plan improving execution efficiency 84.1% due to its fewer query time. More demos and information are available on https://sites.google.com/view/dag-plan.

Related papers

MagicAgent: Towards Generalized Agent Planning [73.21129030631421]
We present textbfMagicAgent, a series of foundation models specifically designed for generalized agent planning.<n>We introduce a lightweight and scalable synthetic data framework that generates high-quality trajectories across diverse planning tasks.<n>We show that MagicAgent-32B and MagicAgent-30B-A3B achieve superior performance across diverse open-source benchmarks.
arXiv Detail & Related papers (2026-02-22T01:39:16Z)
H-AIM: Orchestrating LLMs, PDDL, and Behavior Trees for Hierarchical Multi-Robot Planning [3.2800662172795114]
H-AIM is a novel embodied multi-robot task planning framework.<n>It exploits large language models (LLMs) to parse instructions and generate Planning Domain Definition Language (PDDL) problem descriptions.<n>It compiles the resulting plan into behavior trees for reactive control.
arXiv Detail & Related papers (2026-01-16T07:59:50Z)
Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents [28.061156787350395]
Task-Decoupled Planning (TDP) is a training-free framework that replaces entangled reasoning with task decoupling.<n>TDP confines reasoning and replanning to the active sub-task without disrupting the workflow.<n>Results on TravelPlanner, ScienceWorld, and HotpotQA show that TDP outperforms strong baselines while reducing token consumption by up to 82%.
arXiv Detail & Related papers (2026-01-12T14:30:10Z)
GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning [20.75113227786218]
Graph-based Agent Planning (GAP) is a novel framework that explicitly models inter-task dependencies through graph-based planning.<n>Our approach trains agent foundation models to decompose complex tasks into dependency-aware sub-task graphs.<n>This dependency-aware orchestration achieves substantial improvements in both execution efficiency and task accuracy.
arXiv Detail & Related papers (2025-10-29T09:35:55Z)
Optimizing Sequential Multi-Step Tasks with Parallel LLM Agents [15.26802977779826]
M1-Parallel is a framework that concurrently runs multiple multi-agent teams in parallel to uncover distinct solution paths.<n>We show that M1-Parallel with early termination achieves up to $2.2times$ speedup while preserving accuracy.<n>We further investigate strategies aimed at encouraging diverse execution plans but observe no additional performance gains over repeated sampling.
arXiv Detail & Related papers (2025-07-11T18:09:22Z)
RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks [22.2075114201037]
We propose a novel large language model (LLM)-driven framework for dual-arm task parallelism planning.<n>RoboPARA employs a two-stage process: Dependency Graph-based Planning Candidates Generation and Graph Re-Traversal-based Dual-Arm Parallel Planning.<n>X-DAPT dataset is the first dataset specifically designed to evaluate dual-arm task parallelism across diverse scenarios and difficulty levels.
arXiv Detail & Related papers (2025-06-07T06:46:24Z)
RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation [80.20970723577818]
We introduce RoboCerebra, a benchmark for evaluating high-level reasoning in long-horizon robotic manipulation.<n>The dataset is constructed via a top-down pipeline, where GPT generates task instructions and decomposes them into subtask sequences.<n>Compared to prior benchmarks, RoboCerebra features significantly longer action sequences and denser annotations.
arXiv Detail & Related papers (2025-06-07T06:15:49Z)
HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking [109.09735490692202]
We propose HyperTree Planning (HTP), a novel reasoning paradigm that constructs hypertree-structured planning outlines for effective planning.<n> Experiments demonstrate the effectiveness of HTP, achieving state-of-the-art accuracy on the TravelPlanner benchmark with Gemini-1.5-Pro, resulting in a 3.6 times performance improvement over o1-preview.
arXiv Detail & Related papers (2025-05-05T02:38:58Z)
LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language [17.914580097058106]
Bimanual robotic manipulation presents an inherent challenge due to the complexity involved in the spatial and temporal coordination between two hands. Existing works predominantly focus on attaining human-level manipulation skills for robotic hands, yet little attention has been paid to task planning on long-horizon timescales. This paper introduces LLM+MAP, a bimanual planning framework that integrates LLM reasoning and multi-agent planning.
arXiv Detail & Related papers (2025-03-21T17:04:01Z)
A Task and Motion Planning Framework Using Iteratively Deepened AND/OR Graph Networks [3.635602838654497]
We present an approach for integrated task and motion planning based on an AND/OR graph network. We leverage it to implement different classes of task and motion planning problems (TAMP) The approach is evaluated and validated both in simulation and with a real dual-arm robot manipulator, that is, Baxter from Rethink Robotics.
arXiv Detail & Related papers (2025-03-10T17:28:22Z)
Plan-over-Graph: Towards Parallelable LLM Agent Schedule [53.834646147919436]
Large Language Models (LLMs) have demonstrated exceptional abilities in reasoning for task planning. This paper introduces a novel paradigm, plan-over-graph, in which the model first decomposes a real-life textual task into executable subtasks and constructs an abstract task graph. The model then understands this task graph as input and generates a plan for parallel execution.
arXiv Detail & Related papers (2025-02-20T13:47:51Z)
Robotouille: An Asynchronous Planning Benchmark for LLM Agents [7.574421886354134]
Asynchronous planning is essential for agents that must account for time delays, reason over diverse long-horizon tasks, and collaborate with other agents. We introduce Robotouille, a benchmark environment designed to test agents' ability to handle long-horizon asynchronous scenarios. Our results show that ReAct (gpt4-o) achieves 47% on synchronous tasks but only 11% on asynchronous tasks, highlighting significant room for improvement.
arXiv Detail & Related papers (2025-02-06T05:50:37Z)
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation [98.11670473661587]
CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution. Experimental results on the ThreeDworld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves much higher task completion rate and efficiency compared with state-of-the-arts.
arXiv Detail & Related papers (2024-11-07T13:08:04Z)
COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems. A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots. The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z)
Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows textcode-form plans -- pseudocode that outlines high-level, structured reasoning processes. CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks. It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z)
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought [95.37585041654535]
Embodied AI is capable of planning and executing action sequences for robots to accomplish long-horizon tasks in physical environments. In this work, we introduce EmbodiedGPT, an end-to-end multi-modal foundation model for embodied AI. Experiments show the effectiveness of EmbodiedGPT on embodied tasks, including embodied planning, embodied control, visual captioning, and visual question answering.
arXiv Detail & Related papers (2023-05-24T11:04:30Z)
Optimal task and motion planning and execution for human-robot multi-agent systems in dynamic environments [54.39292848359306]
We propose a combined task and motion planning approach to optimize sequencing, assignment, and execution of tasks. The framework relies on decoupling tasks and actions, where an action is one possible geometric realization of a symbolic task. We demonstrate the approach effectiveness in a collaborative manufacturing scenario, in which a robotic arm and a human worker shall assemble a mosaic.
arXiv Detail & Related papers (2023-03-27T01:50:45Z)
A Framework for Neurosymbolic Robot Action Planning using Large Language Models [3.0501524254444767]
We present a framework aimed at bridging the gap between symbolic task planning and machine learning approaches. The rationale is training Large Language Models (LLMs) into a neurosymbolic task planner compatible with the Planning Domain Definition Language (PDDL) Preliminary results in selected domains show that our method can: (i) solve 95.5% of problems in a test data set of 1,000 samples; (ii) produce plans up to 13.5% shorter than a traditional symbolic planner; (iii) reduce average overall waiting times for a plan availability by up to 61.4%.
arXiv Detail & Related papers (2023-03-01T11:54:22Z)
Toward Efficient Task Planning for Dual-Arm Tabletop Object Rearrangement [12.928188950810043]
In a non-monotone rearrangement task, complex object-object dependencies exist that require moving some objects multiple times to solve an instance. We develop effective task planning algorithms for scheduling the pick-n-place sequence that can be properly distributed between the two arms.
arXiv Detail & Related papers (2022-07-17T05:11:14Z)
A Feedback Scheme to Reorder a Multi-Agent Execution Schedule by Persistently Optimizing a Switchable Action Dependency Graph [65.70656676650391]
We consider multiple Automated Guided Vehicles (AGVs) navigating a common workspace to fulfill various intralogistics tasks. One approach is to construct an Action Dependency Graph (ADG) which encodes the ordering of AGVs as they proceed along their routes. If the workspace is shared by dynamic obstacles such as humans or third party robots, AGVs can experience large delays. We present an online method to repeatedly modify acyclic ADG to minimize route completion times of each AGV.
arXiv Detail & Related papers (2020-10-11T14:39:50Z)
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination. We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.