Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation
- URL: http://arxiv.org/abs/2506.07822v1
- Date: Mon, 09 Jun 2025 14:48:19 GMT
- Title: Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation
- Authors: Xintong Duan, Yutong He, Fahim Tajwar, Ruslan Salakhutdinov, J. Zico Kolter, Jeff Schneider
- Abstract summary: We propose a novel approach to consistency distillation for offline reinforcement learning. Our method enables single-step generation while maintaining higher performance and simpler training.
- Score: 88.4955839930215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although diffusion models have achieved strong results in decision-making tasks, their slow inference speed remains a key limitation. While the consistency model offers a potential solution, its applications to decision-making often struggle with suboptimal demonstrations or rely on complex concurrent training of multiple networks. In this work, we propose a novel approach to consistency distillation for offline reinforcement learning that directly incorporates reward optimization into the distillation process. Our method enables single-step generation while maintaining higher performance and simpler training. Empirical evaluations on the Gym MuJoCo benchmarks and long-horizon planning demonstrate that our approach can achieve an 8.7% improvement over the previous state-of-the-art while offering up to 142x speedup over diffusion counterparts in inference time.
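As a rough illustration of the idea, the sketch below folds a reward term into a standard consistency-distillation objective; the names (`student`, `teacher`, `reward_model`) and the weight `alpha` are placeholders, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def reward_aware_distill_loss(student, teacher, reward_model,
                              x0, t, t_next, alpha=0.1):
    """One step of reward-aware consistency distillation (sketch only).

    student/teacher map (noisy trajectory, time) -> denoised trajectory;
    reward_model scores a denoised trajectory. `alpha` trades off the
    consistency and reward terms and is an assumed hyperparameter.
    """
    noise = torch.randn_like(x0)
    x_t = x0 + t.view(-1, 1, 1) * noise        # noise clean trajectories to level t

    with torch.no_grad():
        x_t_next = teacher(x_t, t, t_next)     # one teacher ODE step t -> t_next
        target = student(x_t_next, t_next)     # in practice, an EMA copy of the student

    pred = student(x_t, t)                     # single-step prediction from x_t

    consistency = F.mse_loss(pred, target)     # enforce self-consistency along the ODE
    reward = reward_model(pred).mean()         # bias samples toward high reward
    return consistency - alpha * reward
```

Minimizing the consistency term enforces single-step self-consistency along the sampling ODE, while the reward term biases the distilled one-step sampler toward high-return trajectories.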
Related papers
- CHORDS: Diffusion Sampling Accelerator with Multi-core Hierarchical ODE Solvers [72.23291099555459]
Diffusion-based generative models have become dominant generators of high-fidelity images and videos but remain limited by their computationally expensive inference procedures. This paper explores a general, training-free, and model-agnostic acceleration strategy via multi-core parallelism. CHORDS significantly accelerates sampling across diverse large-scale image and video diffusion models, yielding up to a 2.1x speedup with four cores (a 50% improvement over baselines) and a 2.9x speedup with eight cores, all without quality degradation.
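CHORDS' exact solver hierarchy is not reproduced here, but a generic Parareal-style sketch shows how a cheap coarse solver can seed trajectory segments that expensive fine solvers then refine independently, one per core:

```python
def parareal_sample(coarse_step, fine_step, x_T, ts, sweeps=2):
    """Parareal-style multi-core sampling sketch (not CHORDS itself).

    A cheap coarse solver seeds all segment boundaries sequentially; the
    expensive fine solves of each segment are then independent of one
    another and could be dispatched to separate cores.
    """
    n = len(ts) - 1
    x = [x_T]
    for i in range(n):                         # sequential but cheap
        x.append(coarse_step(x[i], ts[i], ts[i + 1]))

    for _ in range(sweeps):
        # Embarrassingly parallel across cores in a real implementation.
        fine = [fine_step(x[i], ts[i], ts[i + 1]) for i in range(n)]
        new_x = [x_T]
        for i in range(n):
            # Parareal corrector: coarse prediction plus fine-minus-coarse
            # correction from the previous sweep.
            new_x.append(coarse_step(new_x[i], ts[i], ts[i + 1])
                         + fine[i] - coarse_step(x[i], ts[i], ts[i + 1]))
        x = new_x
    return x[-1]
```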
arXiv Detail & Related papers (2025-07-21T05:48:47Z)
- Enhanced DACER Algorithm with High Diffusion Efficiency [26.268226121403515]
We introduce a temporal weighting mechanism that enables the model to efficiently eliminate large-scale noise in the early stages. We show that the DACER2 algorithm achieves state-of-the-art performance in most MuJoCo control tasks with only five diffusion steps.
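A minimal sketch of what such a temporal weighting could look like (the schedule below is illustrative, not DACER2's):

```python
import torch

def temporally_weighted_loss(per_step_loss, t, T=1000, tau=0.2):
    """Illustrative temporal weighting (not DACER2's exact schedule):
    up-weight high-noise (large-t) steps so large-scale noise is removed
    early; down-weight late fine-detail steps.

    per_step_loss: per-sample denoising losses at timesteps t, shape (B,).
    """
    w = torch.exp(-(1.0 - t.float() / T) / tau)  # w ~= 1 at t = T, small at t = 0
    return (w * per_step_loss).mean()
```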
arXiv Detail & Related papers (2025-05-29T13:21:58Z)
- VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL [28.95582264086289]
VAlue-based Reinforced Diffusion (VARD) is a novel approach that first learns a value function predicting the expected reward from intermediate states. Our method maintains proximity to the pretrained model while enabling effective and stable training via backpropagation.
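A hedged sketch of the two training signals this suggests; the names, the stand-in denoising step, and the regularization weight `beta` are assumptions rather than VARD's exact objective:

```python
import torch
import torch.nn.functional as F

def vard_style_losses(value_net, policy, pretrained, x_t, t, final_reward, beta=0.01):
    """Two losses in the spirit of value-based RL fine-tuning.

    (1) Regress a value net to predict the final reward from an
        intermediate denoising state, giving dense per-step feedback.
    (2) Update the diffusion policy to increase that predicted value
        while staying close to the pretrained denoiser.
    """
    # (1) Dense value supervision at intermediate states.
    value_loss = F.mse_loss(value_net(x_t, t), final_reward)

    # (2) One (stand-in) denoising step with the current policy; the value
    # gradient flows back through the policy's prediction.
    eps = policy(x_t, t)
    x_prev = x_t - eps                        # placeholder for a real DDPM/DDIM step
    with torch.no_grad():
        eps_ref = pretrained(x_t, t)          # pretrained model as anchor
    policy_loss = -value_net(x_prev, t - 1).mean() + beta * F.mse_loss(eps, eps_ref)
    return value_loss, policy_loss
```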
arXiv Detail & Related papers (2025-05-21T17:44:37Z)
- Habitizing Diffusion Planning for Efficient and Effective Decision Making [41.128266491447334]
We introduce Habi, a framework that transforms powerful but slow diffusion planning models into fast decision-making models. Even using a laptop CPU, the habitized model can achieve an average 800+ Hz decision-making frequency.
arXiv Detail & Related papers (2025-02-10T12:40:32Z)
- SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance [12.973835034100428]
This paper presents SNOOPI, a novel framework designed to enhance the guidance in one-step diffusion models during both training and inference. First, by varying the guidance scale of both teacher models, we broaden their output distributions, resulting in a more robust VSD loss that enables SB to perform effectively across diverse backbones while maintaining competitive performance. Second, we propose a training-free method called Negative-Away Steer Attention (NASA), which integrates negative prompts into one-step diffusion models via cross-attention to suppress undesired elements in generated images.
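For the NASA component, a minimal sketch of steering cross-attention away from a negative prompt might look like the following (the exact placement in the network and the scale `alpha` are assumptions):

```python
import torch

def nasa_cross_attention(q, k_pos, v_pos, k_neg, v_neg, alpha=0.5):
    """Hedged sketch of Negative-Away Steer Attention: attend to the
    positive and negative prompts separately, then steer the layer output
    away from the negative branch (exact form and `alpha` are assumptions).
    """
    d = q.shape[-1]

    def attend(k, v):
        w = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
        return w @ v

    return attend(k_pos, v_pos) - alpha * attend(k_neg, v_neg)
```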
arXiv Detail & Related papers (2024-12-03T18:56:32Z)
- Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator [81.81748032199813]
Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR). We propose a new One-Step Diffusion model with a larger-scale Discriminator for SR. Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space.
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- Diffusion Models as Optimizers for Efficient Planning in Offline RL [47.0835433289033]
Diffusion models have shown strong competitiveness in offline reinforcement learning tasks.
We propose a faster autoregressive model to handle the generation of feasible trajectories.
This allows us to achieve more efficient planning without sacrificing capability.
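As a generic illustration (the paper's model specifics are not reproduced), an autoregressive planner emits one state per cheap forward pass instead of iteratively denoising the whole trajectory:

```python
import torch

@torch.no_grad()
def autoregressive_plan(model, s0, horizon):
    """Generic autoregressive trajectory sketch: each next state is
    predicted from the prefix, so planning costs `horizon` light forward
    passes rather than many full-trajectory denoising iterations.
    """
    traj = [s0]                               # s0: (batch, state_dim)
    for _ in range(horizon):
        prefix = torch.stack(traj, dim=1)     # (batch, len, state_dim)
        traj.append(model(prefix))            # model returns the next state
    return torch.stack(traj, dim=1)
```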
arXiv Detail & Related papers (2024-07-23T03:00:01Z)
- A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training [53.93563224892207]
We introduce a novel speed-up method for diffusion model training, called SpeeD, which is based on a closer look at time steps. As a plug-and-play and architecture-agnostic approach, SpeeD consistently achieves a 3x acceleration across various diffusion architectures, datasets, and tasks.
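One way such a "closer look" can translate into code is non-uniform time-step sampling; the split below is illustrative, not SpeeD's published schedule:

```python
import torch

def asymmetric_timesteps(batch_size, T=1000, cut=0.7, p_late=0.1):
    """Illustrative non-uniform time-step sampling (not SpeeD's published
    schedule): concentrate supervision on the informative earlier noise
    levels and rarely sample the near-converged tail past cut * T.
    """
    late = torch.rand(batch_size) < p_late
    t_main = torch.randint(0, int(cut * T), (batch_size,))
    t_late = torch.randint(int(cut * T), T, (batch_size,))
    return torch.where(late, t_late, t_main)
```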
arXiv Detail & Related papers (2024-05-27T17:51:36Z)
- One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores.
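The training recipe this describes is simple to sketch: pre-sample (noise, image) pairs from the teacher once, then regress a one-step generator on them fully offline (plain MSE stands in for the paper's objective; the GET/DEQ architecture itself is not reproduced):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def collect_pairs(teacher_sampler, n, shape):
    """Sample (noise, image) pairs from the teacher once, offline."""
    noise = torch.randn(n, *shape)
    return noise, teacher_sampler(noise)      # full multi-step sampling

def pair_distill_loss(generator, noise, image):
    """Regress a one-step generator on the stored pairs."""
    return F.mse_loss(generator(noise), image)
```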
arXiv Detail & Related papers (2023-12-12T07:28:40Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
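A hedged sketch of the bootstrapping idea, with assumed signatures: the student's regression target at a lower noise level is built from its own output at a higher level pushed through one teacher step, so no real data is ever sampled:

```python
import torch
import torch.nn.functional as F

def boot_style_loss(student, student_ema, teacher_step, z, t, t_next):
    """Hedged sketch of bootstrapped, data-free distillation (assumed
    signatures, not BOOT's exact objective): the target at the lower
    noise level t_next comes from the student's own frozen output at t
    pushed through one teacher step, so no training data is needed.
    """
    with torch.no_grad():
        x_t = student_ema(z, t)               # student's current estimate at t
        target = teacher_step(x_t, t, t_next) # one teacher ODE step t -> t_next
    return F.mse_loss(student(z, t_next), target)
```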
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.