Related papers: Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning

URL: http://arxiv.org/abs/2412.19538v1
Date: Fri, 27 Dec 2024 09:07:11 GMT
Title: Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning
Authors: Xuan Zhou, Xiang Shi, Lele Zhang, Chen Chen, Hongbo Li, Lin Ma, Fang Deng, Jie Chen
Abstract summary: We construct an efficient multi-stage HRL-based multi-robot task planner for hyper scale MRTP in RMFS. To ensure optimality, the planner is designed with a centralized architecture, but it also brings the challenges of scaling up and generalization. Our planner can successfully scale up to hyper scale MRTP instances in RMFS with up to 200 robots and 1000 retrieval racks on unlearned maps.
Score: 17.989467671223043
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: To improve the efficiency of warehousing systems and meet huge customer orders, we aim to solve the challenges of dimension disaster and dynamic properties in hyper scale multi-robot task planning (MRTP) for the robotic mobile fulfillment system (RMFS). Existing research indicates that hierarchical reinforcement learning (HRL) is an effective method to reduce these challenges. Based on that, we construct an efficient multi-stage HRL-based multi-robot task planner for hyper scale MRTP in RMFS, and the planning process is represented with a special temporal graph topology. To ensure optimality, the planner is designed with a centralized architecture, but it also brings the challenges of scaling up and generalization that require policies to maintain performance for various unlearned scales and maps. To tackle these difficulties, we first construct a hierarchical temporal attention network (HTAN) to ensure the basic ability of handling inputs with unfixed lengths, and then design multi-stage curricula for hierarchical policy learning to further improve the scaling up and generalization ability while avoiding catastrophic forgetting. Additionally, we notice that policies with hierarchical structure suffer from unfair credit assignment similar to that in multi-agent reinforcement learning. Inspired by this, we propose a hierarchical reinforcement learning algorithm with a counterfactual rollout baseline to improve learning performance. Experimental results demonstrate that our planner outperforms other state-of-the-art methods on various MRTP instances in both simulated and real-world RMFS. Also, our planner can successfully scale up to hyper scale MRTP instances in RMFS with up to 200 robots and 1000 retrieval racks on unlearned maps while keeping superior performance over other methods.

Related papers

PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving [66.42260489147617]
We introduce PLAN-TUNING, a framework that distills synthetic task decompositions from large-scale language models. PLAN-TUNING fine-tunes smaller models via supervised and reinforcement-learning objectives to improve complex reasoning. Our analysis demonstrates how planning trajectories improve complex reasoning capabilities.
arXiv Detail & Related papers (2025-07-10T07:30:44Z)
Multimodal Fused Learning for Solving the Generalized Traveling Salesman Problem in Robotic Task Planning [11.697279328699489]
We propose a Multimodal Fused Learning framework to solve the Generalized Traveling Salesman Problem (GTSP). We first introduce a coordinate-based image builder that transforms GTSP instances into spatially informative representations. We then design an adaptive resolution scaling strategy to enhance adaptability across different problem scales, and develop a multimodal fusion module.
arXiv Detail & Related papers (2025-06-20T11:51:52Z)
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity. To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds. Experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking [109.09735490692202]
We propose HyperTree Planning (HTP), a novel reasoning paradigm that constructs hypertree-structured planning outlines for effective planning. Experiments demonstrate the effectiveness of HTP, achieving state-of-the-art accuracy on the TravelPlanner benchmark with Gemini-1.5-Pro, resulting in a 3.6 times performance improvement over o1-preview.
arXiv Detail & Related papers (2025-05-05T02:38:58Z)
Hierarchical Planning for Complex Tasks with Knowledge Graph-RAG and Symbolic Verification [5.727096041675994]
Large Language Models (LLMs) have shown promise as robotic planners but often struggle with long-horizon and complex tasks. We propose a neuro-symbolic approach that enhances LLM-based planners with Knowledge Graph-based RAG for hierarchical plan generation.
arXiv Detail & Related papers (2025-04-06T18:36:30Z)
Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion [62.91968752955649]
This paper tackles a novel problem, extendable long-horizon planning, which enables agents to plan trajectories longer than those in training data without compounding errors. We propose an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales.
arXiv Detail & Related papers (2025-03-25T22:52:46Z)
Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning [0.0]
We propose a method combining reinforcement learning and automated planning. Our approach uses short goal-conditioned policies organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). A single plan-tree, maintained during the agent's lifetime, holds knowledge about goal achievement.
arXiv Detail & Related papers (2025-01-03T09:37:54Z)
Encoding Reusable Multi-Robot Planning Strategies as Abstract Hypergraphs [27.791001793093805]
Multi-Robot Task Planning (MR-TP) is the search for a discrete-action plan a team of robots should take to complete a task. To accelerate MR-TP over a system's lifetime, this work looks at combining two recent advances.
arXiv Detail & Related papers (2024-09-16T19:39:52Z)
Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation [8.180994118420053]
Nl2Hltl2Plan is a framework that translates natural language commands into hierarchical Linear Temporal Logic (LTL). First, an LLM transforms instructions into a Hierarchical Task Tree, capturing logical and temporal relations. Next, a fine-tuned LLM converts sub-tasks into flat formulas, which are aggregated into hierarchical specifications.
arXiv Detail & Related papers (2024-08-15T14:46:13Z)
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver. We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications. We show that learned specifications expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
Simple Hierarchical Planning with Diffusion [54.48129192534653]
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. We introduce the Hierarchical diffuser, a fast, yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning. Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
arXiv Detail & Related papers (2024-01-05T05:28:40Z)
Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation Learning [7.919213739992465]
High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks. We propose the self-imitation learning by planning plus (SILP+) algorithm, which embeds experience-based planning into the learning architecture. Various experimental results show that SILP+ achieves better training efficiency and a higher, more stable success rate in complex motion planning tasks.
arXiv Detail & Related papers (2023-06-11T19:47:46Z)
PEAR: Primitive enabled Adaptive Relabeling for boosting Hierarchical Reinforcement Learning [25.84621883831624]
Hierarchical reinforcement learning has the potential to solve complex long horizon tasks using temporal abstraction and increased exploration. We present primitive enabled adaptive relabeling (PEAR). We first perform adaptive relabeling on a few expert demonstrations to generate efficient subgoal supervision. We then jointly optimize HRL agents by employing reinforcement learning (RL) and imitation learning (IL).
arXiv Detail & Related papers (2023-06-10T09:41:30Z)
Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL. We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories. We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning. In experiments, the algorithm excels at solving complex, long-horizon decision-making problems outperforming state-of-the-art.
arXiv Detail & Related papers (2023-01-30T15:04:39Z)
DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF). Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs. We also exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z)
Hierarchies of Planning and Reinforcement Learning for Robot Navigation [22.08479169489373]
In many navigation tasks, high-level (HL) task representations, like a rough floor plan, are available. Previous work has demonstrated efficient learning by hierarchical approaches consisting of path planning in the HL representation. This work proposes a novel hierarchical framework that utilizes a trainable planning policy for the HL representation.
arXiv Detail & Related papers (2021-09-23T07:18:15Z)
Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning. These provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture. We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.