Simple Hierarchical Planning with Diffusion
- URL: http://arxiv.org/abs/2401.02644v1
- Date: Fri, 5 Jan 2024 05:28:40 GMT
- Title: Simple Hierarchical Planning with Diffusion
- Authors: Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn
- Abstract summary: Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.
We introduce the Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost.
- Score: 54.48129192534653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based generative methods have proven effective in modeling
trajectories with offline datasets. However, they often face computational
challenges and can falter in generalization, especially in capturing temporal
abstractions for long-horizon tasks. To overcome this, we introduce the
Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning
method combining the advantages of hierarchical and diffusion-based planning.
Our model adopts a "jumpy" planning strategy at the higher level, which allows
it to have a larger receptive field but at a lower computational cost -- a
crucial factor for diffusion-based planning methods, as we have empirically
verified. Additionally, the jumpy sub-goals guide our low-level planner,
facilitating a fine-tuning stage and further improving our approach's
effectiveness. We conducted empirical evaluations on standard offline
reinforcement learning benchmarks, demonstrating our method's superior
performance and efficiency in terms of training and planning speed compared to
the non-hierarchical Diffuser as well as other hierarchical planning methods.
Moreover, we explore our model's generalization capability, particularly how
it improves performance on compositional out-of-distribution tasks.
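
As a rough illustration of the two-level scheme described in the abstract, the sketch below shows the control flow of "jumpy" planning: a high-level pass denoises a short sequence of sub-goals spaced K steps apart, and a low-level pass refines each pair of consecutive sub-goals into a dense K-step segment. The functions denoise_high and denoise_low are hypothetical toy placeholders assumed here for illustration, not the paper's trained models; a real implementation would use learned noise predictors and a proper diffusion schedule.

    import numpy as np

    STATE_DIM = 4    # toy state dimensionality
    K = 8            # jump size: the high level plans every K-th state
    H = 4            # number of sub-goals in the high-level ("jumpy") plan
    N_STEPS = 20     # toy stand-in for a reverse-diffusion schedule

    rng = np.random.default_rng(0)

    def denoise_high(subgoals):
        # Hypothetical stand-in for one high-level denoising step: nudges
        # the sub-goal sequence toward a smooth path. A real model would be
        # a trained noise predictor over jumpy trajectories.
        smoothed = 0.5 * (np.roll(subgoals, 1, axis=0) + np.roll(subgoals, -1, axis=0))
        return subgoals + 0.1 * (smoothed - subgoals)

    def denoise_low(segment):
        # Hypothetical stand-in for the low-level denoiser: pulls a K-step
        # segment toward the straight line between its sub-goal endpoints.
        line = np.linspace(segment[0], segment[-1], len(segment))
        return segment + 0.1 * (line - segment)

    def plan_hierarchical(start, goal):
        # High level: denoise a short jumpy sequence of H sub-goals.
        subgoals = rng.normal(size=(H, STATE_DIM))
        for _ in range(N_STEPS):
            subgoals[0], subgoals[-1] = start, goal   # condition on endpoints
            subgoals = denoise_high(subgoals)
        subgoals[0], subgoals[-1] = start, goal

        # Low level: refine each consecutive sub-goal pair into K dense steps.
        pieces = []
        for a, b in zip(subgoals[:-1], subgoals[1:]):
            seg = rng.normal(size=(K + 1, STATE_DIM))
            for _ in range(N_STEPS):
                seg[0], seg[-1] = a, b                # condition on sub-goals
                seg = denoise_low(seg)
            seg[0], seg[-1] = a, b
            pieces.append(seg[:-1])                   # drop duplicate endpoint
        pieces.append(subgoals[-1:])
        return np.concatenate(pieces)                 # (H-1)*K + 1 dense states

    plan = plan_hierarchical(np.zeros(STATE_DIM), np.ones(STATE_DIM))
    print(plan.shape)   # (25, 4): a dense trajectory recovered from 4 sub-goals

The structural point matches the abstract's claim: the high-level pass operates on only H states instead of (H-1)*K + 1, so the planner's effective receptive field grows with K while the high-level computational cost does not.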
Related papers
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z) - DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning [36.50275602760051]
We introduce DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning.
It is an efficient hierarchical approach that leverages direct preference optimization to learn a higher-level policy and reinforcement learning to learn a lower-level policy.
It enjoys improved computational efficiency due to its use of direct preference optimization instead of standard preference-based approaches.
arXiv Detail & Related papers (2024-06-16T10:49:41Z) - Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structured generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z) - Efficient Planning with Latent Diffusion [18.678459478837976]
Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning.
Latent action spaces offer a more flexible paradigm, capturing only the actions within the support of the behavior policy.
This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models.
arXiv Detail & Related papers (2023-09-30T08:50:49Z) - Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework.
It automatically decomposes the given task into smaller, more manageable subtasks.
This enables sparse decision-making and focused computation on the relevant parts of the environment.
arXiv Detail & Related papers (2023-09-30T02:25:18Z) - Hierarchical Imitation Learning with Vector Quantized Models [77.67190661002691]
We propose to use reinforcement learning to identify subgoals in expert trajectories.
We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning.
In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-01-30T15:04:39Z) - Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories (a minimal sketch of this denoising loop appears after this list).
arXiv Detail & Related papers (2022-05-20T07:02:03Z) - FAPE: a Constraint-based Planner for Generative and Hierarchical Temporal Planning [2.771897351607068]
We propose a temporal planner, called FAPE, which supports many of the expressive temporal features of the ANML modeling language without losing efficiency.
FAPE's representation coherently integrates flexible timelines with hierarchical refinement methods that can provide efficient control knowledge.
arXiv Detail & Related papers (2020-10-25T13:46:34Z)
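
For the "Planning with Diffusion for Flexible Behavior Synthesis" entry above, here is a minimal sketch of what "planning by iteratively denoising trajectories" can look like in code. The denoiser below is a hypothetical placeholder assumed for illustration (a Diffuser-style model would be a trained temporal network predicting noise), so this only conveys the control flow: start from noise, repeatedly denoise a full state-action trajectory while clamping the observed state, then execute the first action.

    import numpy as np

    HORIZON, STATE_DIM, ACT_DIM = 16, 4, 2
    N_STEPS = 30     # toy stand-in for a reverse-diffusion schedule
    rng = np.random.default_rng(1)

    def denoiser(traj):
        # Hypothetical stand-in for one reverse-diffusion step over a
        # full state-action trajectory; a real model would be a trained
        # noise predictor.
        smoothed = 0.5 * (np.roll(traj, 1, axis=0) + np.roll(traj, -1, axis=0))
        return traj + 0.1 * (smoothed - traj)

    def plan_by_denoising(current_state):
        # Sample a plan by iteratively denoising a noisy trajectory.
        traj = rng.normal(size=(HORIZON, STATE_DIM + ACT_DIM))
        for _ in range(N_STEPS):
            traj[0, :STATE_DIM] = current_state   # clamp the observed state
            traj = denoiser(traj)
        return traj[0, STATE_DIM:]                # execute only the first action

    action = plan_by_denoising(np.zeros(STATE_DIM))
    print(action.shape)   # (2,)

In a receding-horizon setting, this whole denoising pass would simply be rerun from each new state after the environment steps.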