Related papers: OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

URL: http://arxiv.org/abs/2508.16212v2
Date: Mon, 25 Aug 2025 03:07:02 GMT
Title: OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
Authors: Huanpeng Chu, Wei Wu, Guanyu Fen, Yutao Zhang,
Abstract summary: DiffusionTransformers-stemming from a large number of sampling steps and complex per-step computations-presents significant challenges for real-time deployment.<n>We introduce OmniCache, a training-free acceleration method that exploits the global redundancy inherent in the denoising process.
Score: 5.2258248597807535
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models have emerged as a powerful paradigm for generative tasks such as image synthesis and video generation, with Transformer architectures further enhancing performance. However, the high computational cost of diffusion Transformers-stemming from a large number of sampling steps and complex per-step computations-presents significant challenges for real-time deployment. In this paper, we introduce OmniCache, a training-free acceleration method that exploits the global redundancy inherent in the denoising process. Unlike existing methods that determine caching strategies based on inter-step similarities and tend to prioritize reusing later sampling steps, our approach originates from the sampling perspective of DIT models. We systematically analyze the model's sampling trajectories and strategically distribute cache reuse across the entire sampling process. This global perspective enables more effective utilization of cached computations throughout the diffusion trajectory, rather than concentrating reuse within limited segments of the sampling procedure. In addition, during cache reuse, we dynamically estimate the corresponding noise and filter it out to reduce its impact on the sampling direction. Extensive experiments demonstrate that our approach accelerates the sampling process while maintaining competitive generative quality, offering a promising and practical solution for efficient deployment of diffusion-based generative models.

Related papers

Learning To Sample From Diffusion Models Via Inverse Reinforcement Learning [43.678382510171986]
Diffusion models generate samples through an iterative denoising process, guided by a neural network.<n>We introduce an inverse reinforcement learning framework for learning sampling strategies without retraining the denoiser.<n>We provide experimental evidence that this approach can improve the quality of samples generated by pretrained diffusion models.
arXiv Detail & Related papers (2026-02-09T14:10:44Z)
LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution [52.627063566555194]
We introduce LSGQuant, a layer-sensitivity guided quantizing approach for one-step diffusion-based real-world VSR.<n>Our method incorporates a Dynamic Range Adaptive Quantizer (DRAQ) to fit video token activations.<n>Our method has nearly performance to origin model with full-precision and significantly exceeds existing quantization techniques.
arXiv Detail & Related papers (2026-02-03T06:53:19Z)
Quantizing Diffusion Models from a Sampling-Aware Perspective [43.95032520555463]
We propose a sampling-aware quantization strategy, wherein a Mixed-Order Trajectory Alignment technique is devised.<n>Experiments on sparse-step fast sampling across multiple datasets demonstrate that our approach preserves the rapid convergence characteristics of high-speed samplers.
arXiv Detail & Related papers (2025-05-04T20:50:44Z)
Arbitrary-steps Image Super-resolution via Diffusion Inversion [68.78628844966019]
This study presents a new image super-resolution (SR) technique based on diffusion inversion, aiming at harnessing the rich image priors encapsulated in large pre-trained diffusion models to improve SR performance.<n>We design a Partial noise Prediction strategy to construct an intermediate state of the diffusion model, which serves as the starting sampling point.<n>Once trained, this noise predictor can be used to initialize the sampling process partially along the diffusion trajectory, generating the desirable high-resolution result.
arXiv Detail & Related papers (2024-12-12T07:24:13Z)
Diffusing Differentiable Representations [60.72992910766525]
We introduce a novel, training-free method for sampling differentiable representations (diffreps) using pretrained diffusion models.<n>We identify an implicit constraint on the samples induced by the diffrep and demonstrate that addressing this constraint significantly improves the consistency and detail of the generated objects.
arXiv Detail & Related papers (2024-12-09T20:42:58Z)
Adaptive Non-uniform Timestep Sampling for Accelerating Diffusion Model Training [6.694752081172194]
As data distributions grow more complex, training diffusion models to convergence becomes increasingly intensive.<n>We introduce a non-uniform timestep sampling method that prioritizes these more critical timesteps.<n>Our method shows robust performance across various datasets, scheduling strategies, and diffusion architectures.
arXiv Detail & Related papers (2024-11-15T07:12:18Z)
Sequential Posterior Sampling with Diffusion Models [15.028061496012924]
We propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis. We demonstrate the effectiveness of our approach on a real-world dataset of high frame rate cardiac ultrasound images. Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
arXiv Detail & Related papers (2024-09-09T07:55:59Z)
Diffusion Spectral Representation for Reinforcement Learning [17.701625371409644]
We propose to leverage the flexibility of diffusion models for reinforcement learning from a representation learning perspective. By exploiting the connection between diffusion models and energy-based models, we develop Diffusion Spectral Representation (Diff-SR) We show how Diff-SR facilitates efficient policy optimization and practical algorithms while explicitly bypassing the difficulty and inference cost of sampling from the diffusion model.
arXiv Detail & Related papers (2024-06-23T14:24:14Z)
Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching [56.286064975443026]
We make an interesting and somehow surprising observation: the computation of a large proportion of layers in the diffusion transformer, through a caching mechanism, can be readily removed even without updating the model parameters. We introduce a novel scheme, named Learningto-Cache (L2C), that learns to conduct caching in a dynamic manner for diffusion transformers. Experimental results show that L2C largely outperforms samplers such as DDIM and DPM-r, alongside prior cache-based methods at the same inference speed.
arXiv Detail & Related papers (2024-06-03T18:49:57Z)
Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function.<n>We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods.<n>Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z)
Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time [49.598085130313514]
We propose discrete non-Markov diffusion models (DNDM), which naturally induce the predetermined transition time set.<n>This enables a training-free sampling algorithm that significantly reduces the number of function evaluations.<n>We study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes.
arXiv Detail & Related papers (2023-12-14T18:14:11Z)
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments. Our method takes inspiration from the theory developed for generative flow networks (GFlowNets)
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models. Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method. We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.