Simple Hierarchical Planning with Diffusion
- URL: http://arxiv.org/abs/2401.02644v1
- Date: Fri, 5 Jan 2024 05:28:40 GMT
- Title: Simple Hierarchical Planning with Diffusion
- Authors: Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn
- Abstract summary: Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.
We introduce the Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
- Score: 54.48129192534653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based generative methods have proven effective in modeling
trajectories with offline datasets. However, they often face computational
challenges and can falter in generalization, especially in capturing temporal
abstractions for long-horizon tasks. To overcome this, we introduce the
Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning
method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows
it to have a larger receptive field but at a lower computational cost -- a
crucial factor for diffusion-based planning methods, as we have empirically
verified. Additionally, the jumpy sub-goals guide our low-level planner,
facilitating a fine-tuning stage and further improving our approach's
effectiveness. We conducted empirical evaluations on standard offline
reinforcement learning benchmarks, demonstrating our method's superior
performance and efficiency in terms of training and planning speed compared to
the non-hierarchical Diffuser as well as other hierarchical planning methods.
Moreover, we explore our model's generalization capability, particularly how
it improves performance on compositional out-of-distribution tasks.
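
As a rough illustration of the two-level scheme described in the abstract, the sketch below shows the control flow of "jumpy" planning: a high-level pass denoises a short sequence of sub-goals spaced K steps apart, and a low-level pass refines each pair of consecutive sub-goals into a dense K-step segment. The functions denoise_high and denoise_low are hypothetical toy placeholders assumed here for illustration, not the paper's trained models; a real implementation would use learned noise predictors and a proper diffusion schedule.

    import numpy as np

    STATE_DIM = 4    # toy state dimensionality
    K = 8            # jump size: the high level plans every K-th state
    H = 4            # number of sub-goals in the high-level ("jumpy") plan
    N_STEPS = 20     # toy stand-in for a reverse-diffusion schedule

    rng = np.random.default_rng(0)

    def denoise_high(subgoals):
        # Hypothetical stand-in for one high-level denoising step: nudges
        # the sub-goal sequence toward a smooth path. A real model would be
        # a trained noise predictor over jumpy trajectories.
        smoothed = 0.5 * (np.roll(subgoals, 1, axis=0) + np.roll(subgoals, -1, axis=0))
        return subgoals + 0.1 * (smoothed - subgoals)

    def denoise_low(segment):
        # Hypothetical stand-in for the low-level denoiser: pulls a K-step
        # segment toward the straight line between its sub-goal endpoints.
        line = np.linspace(segment[0], segment[-1], len(segment))
        return segment + 0.1 * (line - segment)

    def plan_hierarchical(start, goal):
        # High level: denoise a short jumpy sequence of H sub-goals.
        subgoals = rng.normal(size=(H, STATE_DIM))
        for _ in range(N_STEPS):
            subgoals[0], subgoals[-1] = start, goal   # condition on endpoints
            subgoals = denoise_high(subgoals)
        subgoals[0], subgoals[-1] = start, goal

        # Low level: refine each consecutive sub-goal pair into K dense steps.
        pieces = []
        for a, b in zip(subgoals[:-1], subgoals[1:]):
            seg = rng.normal(size=(K + 1, STATE_DIM))
            for _ in range(N_STEPS):
                seg[0], seg[-1] = a, b                # condition on sub-goals
                seg = denoise_low(seg)
            seg[0], seg[-1] = a, b
            pieces.append(seg[:-1])                   # drop duplicate endpoint
        pieces.append(subgoals[-1:])
        return np.concatenate(pieces)                 # (H-1)*K + 1 dense states

    plan = plan_hierarchical(np.zeros(STATE_DIM), np.ones(STATE_DIM))
    print(plan.shape)   # (25, 4): a dense trajectory recovered from 4 sub-goals

The structural point matches the abstract's claim: the high-level pass operates on only H states instead of (H-1)*K + 1, so the planner's effective receptive field grows with K while the high-level computational cost does not.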
Related papers
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning [36.50275602760051]
We introduce DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning.
It is an efficient hierarchical approach that leverages direct preference optimization to learn a higher-level policy and reinforcement learning to learn a lower-level policy.
It enjoys improved computational efficiency due to its use of direct preference optimization instead of standard preference-based approaches.
arXiv Detail & Related papers (2024-06-16T10:49:41Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structured generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - Efficient Planning with Latent Diffusion [18.678459478837976]
Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning.
Latent action spaces offer a more flexible paradigm, capturing only the actions within the support of the behavior policy.
This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models.
arXiv Detail & Related papers (2023-09-30T08:50:49Z) - Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework.
It automatically decomposes the given task into smaller, more manageable subtasks.
This enables sparse decision-making and focused computation on the relevant parts of the environment.
arXiv Detail & Related papers (2023-09-30T02:25:18Z) - Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-01-30T15:04:39Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories (a minimal sketch of this denoising loop appears after this list).
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - FAPE: a Constraint-based Planner for Generative and Hierarchical Temporal Planning [2.771897351607068]
We propose a temporal planner, called FAPE, which supports many of the expressive temporal features of the ANML modeling language without losing efficiency.
FAPE's representation coherently integrates flexible timelines with hierarchical refinement methods that can provide efficient control knowledge.
arXiv Detail & Related papers (2020-10-25T13:46:34Z)
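
For the "Planning with Diffusion for Flexible Behavior Synthesis" entry above, here is a minimal sketch of what "planning by iteratively denoising trajectories" can look like in code. The denoiser below is a hypothetical placeholder assumed for illustration (a Diffuser-style model would be a trained temporal network predicting noise), so this only conveys the control flow: start from noise, repeatedly denoise a full state-action trajectory while clamping the observed state, then execute the first action.

    import numpy as np

    HORIZON, STATE_DIM, ACT_DIM = 16, 4, 2
    N_STEPS = 30     # toy stand-in for a reverse-diffusion schedule
    rng = np.random.default_rng(1)

    def denoiser(traj):
        # Hypothetical stand-in for one reverse-diffusion step over a
        # full state-action trajectory; a real model would be a trained
        # noise predictor.
        smoothed = 0.5 * (np.roll(traj, 1, axis=0) + np.roll(traj, -1, axis=0))
        return traj + 0.1 * (smoothed - traj)

    def plan_by_denoising(current_state):
        # Sample a plan by iteratively denoising a noisy trajectory.
        traj = rng.normal(size=(HORIZON, STATE_DIM + ACT_DIM))
        for _ in range(N_STEPS):
            traj[0, :STATE_DIM] = current_state   # clamp the observed state
            traj = denoiser(traj)
        return traj[0, STATE_DIM:]                # execute only the first action

    action = plan_by_denoising(np.zeros(STATE_DIM))
    print(action.shape)   # (2,)

In a receding-horizon setting, this whole denoising pass would simply be rerun from each new state after the environment steps.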