Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion
- URL: http://arxiv.org/abs/2503.20102v2
- Date: Thu, 10 Apr 2025 19:53:26 GMT
- Title: Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion
- Authors: Chang Chen, Hany Hamed, Doojin Baek, Taegu Kang, Yoshua Bengio, Sungjin Ahn
- Abstract summary: This paper tackles a novel problem, extendable long-horizon planning: enabling agents to plan trajectories longer than those in the training data without compounding errors. We propose an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales.
- Score: 62.91968752955649
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper tackles a novel problem, extendable long-horizon planning: enabling agents to plan trajectories longer than those in the training data without compounding errors. To address this, we propose the Hierarchical Multiscale Diffuser (HM-Diffuser) and Progressive Trajectory Extension (PTE), an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales. Additionally, we introduce Adaptive Plan Pondering and the Recursive HM-Diffuser, which consolidate hierarchical layers into a single model to process temporal scales recursively. Experimental results demonstrate the effectiveness of our approach, advancing diffusion-based planners for scalable long-horizon planning.
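The core of PTE, as described in the abstract, is a stitching loop: trajectories whose endpoints nearly coincide are joined into longer ones, which are fed back into the pool and stitched again. Below is a minimal 1-D sketch of that loop under toy assumptions (list-valued trajectories, a simple endpoint-distance match); the actual method bridges segments with a learned diffusion model rather than naive concatenation, and the names `stitch_trajectories`, `match_tol`, and `rounds` are illustrative, not the paper's API.

```python
import random

def stitch_trajectories(pool, match_tol=0.5, rounds=3, seed=0):
    """Toy Progressive Trajectory Extension: repeatedly stitch pairs of
    trajectories whose endpoints nearly coincide, adding the longer
    results back to the pool each round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        extended = []
        for traj in pool:
            # candidate partners start (approximately) where this one ends
            partners = [p for p in pool
                        if p is not traj and abs(traj[-1] - p[0]) <= match_tol]
            if partners:
                partner = rng.choice(partners)
                # drop the duplicated junction state when joining
                extended.append(traj + partner[1:])
        pool = pool + extended
    return pool

# Three short 1-D segments that can chain into one long trajectory.
short = [[0.0, 0.5, 1.0], [1.0, 1.5, 2.0], [2.0, 2.5, 3.0]]
pool = stitch_trajectories(short, rounds=2)
longest = max(pool, key=len)  # full chain from 0.0 to 3.0
```

After two rounds the pool contains the full stitched chain, which is what a planner would then train on to reach horizons longer than any single training segment.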
Related papers
- Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and inverse dynamics model.
By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data.
On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
arXiv Detail & Related papers (2025-04-23T17:53:34Z)
- Variable Time-Step MPC for Agile Multi-Rotor UAV Interception of Dynamic Targets [6.0967385124149756]
Agile planning using existing non-linear model predictive control methods is limited by the number of planning steps, which becomes increasingly demanding as the horizon grows. In this paper, we propose to address these limitations by introducing variable time steps and coupling them with the prediction horizon length. A simplified point-mass motion primitive is used to leverage the differential flatness of quadrotor dynamics and the generation of feasible trajectories in the flat output space.
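The variable time-step idea in the blurb above can be illustrated with a schedule in which the step size grows along the prediction horizon, so a fixed number of planning steps covers a much longer horizon. This is only a sketch of the scheduling principle, assuming a simple geometric growth rule; the paper couples the steps to the horizon inside an NMPC solver, and `variable_timesteps`, `dt0`, and `growth` are hypothetical names.

```python
def variable_timesteps(n_steps, dt0=0.05, growth=1.2):
    """Geometrically growing time-step schedule: early steps stay fine
    for agility, later steps stretch so a fixed number of planning
    steps covers a long horizon."""
    dts = [dt0 * growth ** k for k in range(n_steps)]
    return dts, sum(dts)

# Three steps with doubling step size cover a 0.7 s horizon.
dts, horizon = variable_timesteps(3, dt0=0.1, growth=2.0)
```

With uniform 0.1 s steps, the same horizon would need seven steps instead of three, which is the computational saving the variable schedule buys.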
arXiv Detail & Related papers (2025-03-18T11:59:24Z)
- Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning [17.989467671223043]
We construct an efficient multi-stage HRL-based multi-robot task planner for hyper scale MRTP in RMFS. To ensure optimality, the planner is designed with a centralized architecture, but this also brings challenges in scaling up and generalization. Our planner can successfully scale up to hyper scale MRTP instances in RMFS with up to 200 robots and 1000 retrieval racks on unlearned maps.
arXiv Detail & Related papers (2024-12-27T09:07:11Z)
- SCoTT: Wireless-Aware Path Planning with Vision Language Models and Strategic Chains-of-Thought [78.53885607559958]
A novel approach using vision language models (VLMs) is proposed for enabling path planning in complex wireless-aware environments. To this end, insights from a digital twin with real-world wireless ray tracing data are explored. Results show that SCoTT achieves average path gains very close to those of DP-WA* while consistently yielding shorter path lengths.
arXiv Detail & Related papers (2024-11-27T10:45:49Z)
- Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning [17.760679318994384]
We present a novel hierarchical transformer-based approach leveraging a learned quantizer of the space.
This quantization enables the training of a simpler zone-conditioned low-level policy and simplifies planning.
Our proposed approach achieves state-of-the-art results in complex long-distance navigation environments.
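The quantization step described above turns continuous positions into discrete zones so that high-level planning becomes a search over zone sequences. As a rough illustration, the sketch below substitutes a fixed grid for QPHIL's learned quantizer; `zone_id` and `zone_path` are illustrative names, and the real low-level policy is learned rather than implied here.

```python
def zone_id(pos, cell=1.0):
    """Map a continuous 2-D position to a discrete zone index
    (a fixed-grid stand-in for a learned quantizer)."""
    return (int(pos[0] // cell), int(pos[1] // cell))

def zone_path(waypoints, cell=1.0):
    """High-level plan: the sequence of distinct zones a trajectory
    visits; a zone-conditioned low-level policy would then steer the
    agent from each zone to the next."""
    path = []
    for p in waypoints:
        z = zone_id(p, cell)
        if not path or path[-1] != z:
            path.append(z)
    return path

path = zone_path([(0.2, 0.3), (0.8, 0.4), (1.3, 0.5), (1.7, 1.2)])
```

Collapsing many nearby states into one zone is what simplifies planning: the high-level problem shrinks from a continuous space to a short discrete sequence.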
arXiv Detail & Related papers (2024-11-12T12:49:41Z)
- Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks [12.239868705130178]
We propose a data-driven hierarchical framework that generates and updates plans based on instructions specified in linear temporal logic (LTL).
Our method decomposes temporal tasks into a chain of options with hierarchical reinforcement learning from offline non-expert datasets.
We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion-generated options.
arXiv Detail & Related papers (2024-10-03T11:10:37Z)
- Planning Transformer: Long-Horizon Offline Reinforcement Learning with Planning Tokens [1.8416014644193066]
We introduce Planning Tokens, which contain high-level, long time-scale information about the agent's future.
We demonstrate that Planning Tokens improve the interpretability of the model's policy through plan visualisations and attention maps.
arXiv Detail & Related papers (2024-09-14T19:30:53Z)
- Simple Hierarchical Planning with Diffusion [54.48129192534653]
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.
We introduce the Hierarchical Diffuser, a fast yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
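The "jumpy" strategy described above can be made concrete with a two-level toy planner: the high level emits sparse subgoals every few states, and the low level fills in the steps between them. This is a minimal sketch over integer states, assuming the function name `jumpy_plan` and a fixed `stride`; the actual method runs a diffusion model at each level rather than the deterministic fill shown here.

```python
def jumpy_plan(start, goal, stride):
    """Two-level plan over integer states: the high level emits sparse
    "jumpy" subgoals every `stride` states (large receptive field at
    low cost); the low level densifies each jump into unit steps."""
    subgoals = list(range(start, goal, stride)) + [goal]
    plan = []
    for a, b in zip(subgoals, subgoals[1:]):
        plan.extend(range(a, b))
    plan.append(goal)
    return subgoals, plan

# High level plans 4 subgoals instead of 11 states.
subgoals, plan = jumpy_plan(0, 10, stride=4)
```

The high-level sequence is `stride` times shorter than the full plan, which is exactly where the larger receptive field at lower computational cost comes from.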
arXiv Detail & Related papers (2024-01-05T05:28:40Z)
- On the Long Range Abilities of Transformers [69.3021852589771]
We demonstrate that minimal modifications to the transformer architecture can significantly enhance performance on the Long Range Arena benchmark.
We identify two key principles for long-range tasks: (i) incorporating an inductive bias towards smoothness, and (ii) locality.
As we show, integrating these ideas into the attention mechanism improves results with a negligible amount of additional computation and without any additional trainable parameters.
arXiv Detail & Related papers (2023-11-28T09:21:48Z)
- Model-Based Reinforcement Learning via Latent-Space Collocation [110.04005442935828]
We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions.
We adapt the idea of collocation, which has shown good results on long-horizon tasks in optimal control literature, to the image-based setting by utilizing learned latent state space models.
arXiv Detail & Related papers (2021-06-24T17:59:18Z)
- Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
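The Haar decomposition underlying this model splits a trajectory into a coarse half-resolution trajectory plus detail coefficients, and the flow then generates coarse-to-fine over these scales. The sketch below shows only the deterministic transform on a 1-D toy trajectory, using the unnormalized average/half-difference variant so that reconstruction is exact; it is not the flow itself, and `haar_split`/`haar_merge` are illustrative names.

```python
def haar_split(x):
    """One Haar level: pairwise averages (coarse trajectory) and
    pairwise half-differences (fine detail)."""
    assert len(x) % 2 == 0
    coarse = [(a + b) / 2 for a, b in zip(x[::2], x[1::2])]
    detail = [(a - b) / 2 for a, b in zip(x[::2], x[1::2])]
    return coarse, detail

def haar_merge(coarse, detail):
    """Exact inverse of haar_split: rebuild the fine trajectory."""
    out = []
    for c, d in zip(coarse, detail):
        out.extend([c + d, c - d])
    return out

traj = [1.0, 3.0, 2.0, 4.0]
coarse, detail = haar_split(traj)
```

Applying `haar_split` recursively to the coarse output yields the multi-scale pyramid over which the block autoregressive flow factorizes the trajectory distribution.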
arXiv Detail & Related papers (2020-09-21T13:57:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.