State-Covering Trajectory Stitching for Diffusion Planners
- URL: http://arxiv.org/abs/2506.00895v2
- Date: Fri, 06 Jun 2025 09:08:55 GMT
- Title: State-Covering Trajectory Stitching for Diffusion Planners
- Authors: Kyowoon Lee, Jaesik Choi,
- Abstract summary: State-Covering Trajectory Stitching (SCoTS) is a reward-free trajectory augmentation method that stitches together short trajectory segments.<n>We demonstrate that SCoTS significantly improves the performance and generalization capabilities of diffusion planners on offline goal-conditioned benchmarks.
- Score: 23.945423041112036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion-based generative models are emerging as powerful tools for long-horizon planning in reinforcement learning (RL), particularly with offline datasets. However, their performance is fundamentally limited by the quality and diversity of training data. This often restricts their generalization to tasks outside their training distribution or longer planning horizons. To overcome this challenge, we propose State-Covering Trajectory Stitching (SCoTS), a novel reward-free trajectory augmentation method that incrementally stitches together short trajectory segments, systematically generating diverse and extended trajectories. SCoTS first learns a temporal distance-preserving latent representation that captures the underlying temporal structure of the environment, then iteratively stitches trajectory segments guided by directional exploration and novelty to effectively cover and expand this latent space. We demonstrate that SCoTS significantly improves the performance and generalization capabilities of diffusion planners on offline goal-conditioned benchmarks requiring stitching and long-horizon reasoning. Furthermore, augmented trajectories generated by SCoTS significantly improve the performance of widely used offline goal-conditioned RL algorithms across diverse environments.
Related papers
- Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning [5.274804664403783]
Strict Subgoal Execution (SSE) is a graph-based hierarchical RL framework that enforces single-step subgoal reachability.<n>We show that SSE consistently outperforms existing goal-conditioned RL and hierarchical RL approaches in both efficiency and success rate.
arXiv Detail & Related papers (2025-06-26T06:35:42Z) - Prior-Guided Diffusion Planning for Offline Reinforcement Learning [4.760537994346813]
Prior Guidance (PG) is a novel guided sampling framework that replaces the standard Gaussian prior-of-cloned diffusion model.<n>PG directly generates high-value trajectories without costly reward optimization of the diffusion model itself.<n>We present an efficient training strategy that applies behavior regularization in latent space, and empirically demonstrate that PG outperforms state-the-art diffusion policies and planners across diverse long-horizon offline RL benchmarks.
arXiv Detail & Related papers (2025-05-16T05:39:02Z) - Extendable Long-Horizon Planning via Hierarchical Multiscale Diffusion [62.91968752955649]
This paper tackles a novel problem, extendable long-horizon planning-enabling agents to plan trajectories longer than those in training data without compounding errors.<n>We propose an augmentation method that iteratively generates longer trajectories by stitching shorter ones.<n>HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales.
arXiv Detail & Related papers (2025-03-25T22:52:46Z) - Exploring Representation-Aligned Latent Space for Better Generation [86.45670422239317]
We introduce ReaLS, which integrates semantic priors to improve generation performance.<n>We show that fundamental DiT and SiT trained on ReaLS can achieve a 15% improvement in FID metric.<n>The enhanced semantic latent space enables more perceptual downstream tasks, such as segmentation and depth estimation.
arXiv Detail & Related papers (2025-02-01T07:42:12Z) - SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins [78.53885607559958]
We propose SCoTT, a wireless-aware path planning framework.<n>We show that SCoTT achieves path gains within 2% of DP-WA* while consistently generating shorter trajectories.<n>We also show the practical viability of our approach by deploying SCoTT as a ROS node within Gazebo simulations.
arXiv Detail & Related papers (2024-11-27T10:45:49Z) - Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks [12.239868705130178]
We propose a data-driven hierarchical framework that generates and updates plans based on instruction specified by linear temporal logic (LTL)
Our method decomposes temporal tasks into chain of options with hierarchical reinforcement learning from offline non-expert datasets.
We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion generated options.
arXiv Detail & Related papers (2024-10-03T11:10:37Z) - Stitching Sub-Trajectories with Conditional Diffusion Model for
Goal-Conditioned Offline RL [18.31263353823447]
We propose a model-based offline Goal-Conditioned Reinforcement Learning (Offline GCRL) method to acquire diverse goal-oriented skills.
In this paper, we use the diffusion model that generates future plans conditioned on the target goal and value, with the target value estimated from the goal-relabeled offline dataset.
We report state-of-the-art performance in the standard benchmark set of GCRL tasks, and demonstrate the capability to successfully stitch the segments of suboptimal trajectories in the offline data to generate high-quality plans.
arXiv Detail & Related papers (2024-02-11T15:23:13Z) - Simple Hierarchical Planning with Diffusion [54.48129192534653]
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.
We introduce the Hierarchical diffuser, a fast, yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
arXiv Detail & Related papers (2024-01-05T05:28:40Z) - Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z) - Successor Feature Landmarks for Long-Horizon Goal-Conditioned
Reinforcement Learning [54.378444600773875]
We introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments.
SFL drives exploration by estimating state-novelty and enables high-level planning by abstracting the state-space as a non-parametric landmark-based graph.
We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces.
arXiv Detail & Related papers (2021-11-18T18:36:05Z) - Model-Based Reinforcement Learning via Latent-Space Collocation [110.04005442935828]
We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions.
We adapt the idea of collocation, which has shown good results on long-horizon tasks in optimal control literature, to the image-based setting by utilizing learned latent state space models.
arXiv Detail & Related papers (2021-06-24T17:59:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.