Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning
- URL: http://arxiv.org/abs/2506.00867v1
- Date: Sun, 01 Jun 2025 07:16:39 GMT
- Title: Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning
- Authors: Kyowoon Lee, Jaesik Choi
- Abstract summary: Local Manifold Approximation and Projection (LoMAP) is a training-free method that projects the guided sample onto a low-rank subspace approximated from offline datasets. We show that LoMAP can be incorporated into the hierarchical diffusion planner, providing further performance enhancements.
- Score: 23.945423041112036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in diffusion-based generative modeling have demonstrated significant promise in tackling long-horizon, sparse-reward tasks by leveraging offline datasets. While these approaches have achieved promising results, their reliability remains inconsistent due to the inherent stochastic risk of producing infeasible trajectories, limiting their applicability in safety-critical settings. We identify that the primary cause of these failures is inaccurate guidance during the sampling procedure, and demonstrate the existence of manifold deviation by deriving a lower bound on the guidance gap. To address this challenge, we propose Local Manifold Approximation and Projection (LoMAP), a training-free method that projects the guided sample onto a low-rank subspace approximated from offline datasets, preventing infeasible trajectory generation. We validate our approach on standard offline reinforcement learning benchmarks that involve challenging long-horizon planning. Furthermore, we show that, as a standalone module, LoMAP can be incorporated into the hierarchical diffusion planner, providing further performance enhancements.
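The projection step described in the abstract, pulling a guided sample back toward a low-rank subspace estimated from nearby offline trajectories, can be sketched roughly as follows. This is an illustrative approximation using PCA over nearest neighbors; the function name, neighbor count, and rank are choices made for the example, not the paper's exact procedure.

```python
import numpy as np

def local_lowrank_project(sample, dataset, k_neighbors=32, rank=8):
    """Project a guided sample onto a low-rank affine subspace estimated
    from its nearest neighbors in the offline dataset (illustrative
    sketch, not the authors' exact algorithm).

    sample:  (d,) flattened trajectory produced by guided sampling
    dataset: (n, d) flattened offline trajectories
    """
    # 1. Find the k offline trajectories closest to the sample.
    dists = np.linalg.norm(dataset - sample, axis=1)
    neighbors = dataset[np.argsort(dists)[:k_neighbors]]

    # 2. Estimate a local low-rank basis via PCA (SVD of centered neighbors).
    mean = neighbors.mean(axis=0)
    _, _, vt = np.linalg.svd(neighbors - mean, full_matrices=False)
    basis = vt[:rank]  # (rank, d) orthonormal rows

    # 3. Project the sample onto the affine subspace mean + span(basis).
    coords = basis @ (sample - mean)
    return mean + basis.T @ coords
```

In a diffusion planner, a projection like this would replace the raw guided sample at each (or selected) denoising steps, keeping intermediate trajectories close to the data manifold without any additional training.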
Related papers
- Prior-Guided Diffusion Planning for Offline Reinforcement Learning [4.760537994346813]
Prior Guidance (PG) is a novel guided sampling framework that replaces the standard Gaussian prior of a behavior-cloned diffusion model. PG directly generates high-value trajectories without costly reward optimization of the diffusion model itself. We present an efficient training strategy that applies behavior regularization in latent space, and empirically demonstrate that PG outperforms state-of-the-art diffusion policies and planners across diverse long-horizon offline RL benchmarks.
arXiv Detail & Related papers (2025-05-16T05:39:02Z) - Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and inverse dynamics model. By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data. On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
arXiv Detail & Related papers (2025-04-23T17:53:34Z) - Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion [62.91968752955649]
This paper tackles a novel problem, extendable long-horizon planning: enabling agents to plan trajectories longer than those in the training data without compounding errors. We propose an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales.
arXiv Detail & Related papers (2025-03-25T22:52:46Z) - Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
arXiv Detail & Related papers (2024-08-18T14:25:44Z) - Validity Learning on Failures: Mitigating the Distribution Shift in Autonomous Vehicle Planning [2.3558144417896583]
The planning problem constitutes a fundamental aspect of the autonomous driving framework.
We propose Validity Learning on Failures, VL(on failure), as a remedy to the distribution shift between training data and failure cases.
We show that VL(on failure) outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2024-06-03T17:25:18Z) - Simple Hierarchical Planning with Diffusion [54.48129192534653]
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.
We introduce the Hierarchical Diffuser, a fast yet surprisingly effective planning method that combines the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
arXiv Detail & Related papers (2024-01-05T05:28:40Z) - Refining Diffusion Planner for Reliable Behavior Synthesis by Automatic Detection of Infeasible Plans [25.326624139426514]
Diffusion-based planning has shown promising results in long-horizon, sparse-reward tasks.
However, due to their nature as generative models, diffusion models are not guaranteed to generate feasible plans.
We propose a novel approach that refines unreliable plans generated by diffusion models by providing corrective guidance to error-prone plans.
arXiv Detail & Related papers (2023-10-30T10:35:42Z) - SafeDiffuser: Safe Planning with Diffusion Probabilistic Models [97.80042457099718]
Diffusion model-based approaches have shown promise in data-driven planning, but there are no safety guarantees.
We propose a new method, called SafeDiffuser, to ensure diffusion probabilistic models satisfy specifications.
We test our method on a series of safe planning tasks, including maze path generation, legged robot locomotion, and 3D space manipulation.
arXiv Detail & Related papers (2023-05-31T19:38:12Z) - Distilling Model Failures as Directions in Latent Space [87.30726685335098]
We present a scalable method for automatically distilling a model's failure modes.
We harness linear classifiers to identify consistent error patterns, and induce a natural representation of these failure modes as directions within the feature space.
We demonstrate that this framework allows us to discover and automatically caption challenging subpopulations within the training dataset, and intervene to improve the model's performance on these subpopulations.
arXiv Detail & Related papers (2022-06-29T16:35:24Z)
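The idea summarized in the preceding entry, recovering failure modes as directions in feature space by fitting linear classifiers, can be sketched as follows. This is a minimal illustration; the function name and the plain gradient-descent logistic fit are assumptions for the example, not that paper's implementation.

```python
import numpy as np

def failure_direction(embeddings, correct, steps=500, lr=0.1):
    """Fit a linear classifier separating correctly vs. incorrectly
    handled examples in embedding space; its unit weight vector serves
    as a 'failure direction' (illustrative sketch of the general idea).

    embeddings: (n, d) feature-space embeddings of training examples
    correct:    (n,) boolean, True where the model was right
    """
    # Center the features, then fit logistic regression by gradient descent.
    X = embeddings - embeddings.mean(axis=0)
    y = correct.astype(float)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))          # predicted P(correct)
        w -= lr * (X.T @ (p - y)) / len(y)          # logistic-loss gradient
    return w / np.linalg.norm(w)                    # unit direction
```

Scoring examples by `embeddings @ direction` then surfaces a hard subpopulation: the lowest-scoring examples lie farthest along the failure direction and are candidates for captioning or targeted intervention.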
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.