Diffusion Models as Optimizers for Efficient Planning in Offline RL
- URL: http://arxiv.org/abs/2407.16142v1
- Date: Tue, 23 Jul 2024 03:00:01 GMT
- Title: Diffusion Models as Optimizers for Efficient Planning in Offline RL
- Authors: Renming Huang, Yunqiang Pei, Guoqing Wang, Yangming Zhang, Yang Yang, Peng Wang, Hengtao Shen,
- Abstract summary: Diffusion models have shown strong competitiveness in offline reinforcement learning tasks.
We propose a faster autoregressive model to handle the generation of feasible trajectories.
This allows us to achieve more efficient planning without sacrificing capability.
- Score: 47.0835433289033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have shown strong competitiveness in offline reinforcement learning tasks by formulating decision-making as sequential generation. However, the practicality of these methods is limited due to the lengthy inference processes they require. In this paper, we address this problem by decomposing the sampling process of diffusion models into two decoupled subprocesses: 1) generating a feasible trajectory, which is a time-consuming process, and 2) optimizing the trajectory. With this decomposition approach, we are able to partially separate efficiency and quality factors, enabling us to simultaneously gain efficiency advantages and ensure quality assurance. We propose the Trajectory Diffuser, which utilizes a faster autoregressive model to handle the generation of feasible trajectories while retaining the trajectory optimization process of diffusion models. This allows us to achieve more efficient planning without sacrificing capability. To evaluate the effectiveness and efficiency of the Trajectory Diffuser, we conduct experiments on the D4RL benchmarks. The results demonstrate that our method achieves $\it 3$-$\it 10 \times$ faster inference speed compared to previous sequence modeling methods, while also outperforming them in terms of overall performance. https://github.com/RenMing-Huang/TrajectoryDiffuser Keywords: Reinforcement Learning and Efficient Planning and Diffusion Model
Related papers
- DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization [22.546989373687655]
We propose a novel pruning method that derives an efficient diffusion model via a more intelligent and differentiable pruner.
Our approach achieves 4.4 x speedup for SD-1.5 without any loss of accuracy, significantly outperforming the previous state-of-the-art methods.
arXiv Detail & Related papers (2024-10-22T12:18:24Z) - Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation [49.49868273653921]
Diffusion models are promising for joint trajectory prediction and controllable generation in autonomous driving.
We introduce Optimal Gaussian Diffusion (OGD) and Estimated Clean Manifold (ECM) Guidance.
Our methodology streamlines the generative process, enabling practical applications with reduced computational overhead.
arXiv Detail & Related papers (2024-08-01T17:59:59Z) - Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis [82.72941975704374]
Non-autoregressive Transformers (NATs) have been recognized for their rapid generation.
We re-evaluate the full potential of NATs by revisiting the design of their training and inference strategies.
We propose to go beyond existing methods by directly solving the optimal strategies in an automatic framework.
arXiv Detail & Related papers (2024-06-08T13:52:20Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI)
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion)
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following [21.81411085058986]
Reward-gradient guided denoising generates trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model.
We propose DiffusionES, a method that combines gradient-free optimization with trajectory denoising.
We show that DiffusionES achieves state-of-the-art performance on nuPlan, an established closed-loop planning benchmark for autonomous driving.
arXiv Detail & Related papers (2024-02-09T17:18:33Z) - A-SDM: Accelerating Stable Diffusion through Redundancy Removal and
Performance Optimization [54.113083217869516]
In this work, we first explore the computational redundancy part of the network.
We then prune the redundancy blocks of the model and maintain the network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z) - The Missing U for Efficient Diffusion Models [3.712196074875643]
Diffusion Probabilistic Models yield record-breaking performance in tasks such as image synthesis, video generation, and molecule design.
Despite their capabilities, their efficiency, especially in the reverse process, remains a challenge due to slow convergence rates and high computational costs.
We introduce an approach that leverages continuous dynamical systems to design a novel denoising network for diffusion models.
arXiv Detail & Related papers (2023-10-31T00:12:14Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with
Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.