SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching
- URL: http://arxiv.org/abs/2602.24208v1
- Date: Fri, 27 Feb 2026 17:36:09 GMT
- Title: SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching
- Authors: Yasaman Haghighi, Alexandre Alahi
- Abstract summary: Caching reduces computation by reusing previously computed model outputs across timesteps. We propose Sensitivity-Aware Caching (SenCache), a dynamic caching policy that adaptively selects caching timesteps on a per-sample basis. SenCache achieves better visual quality than existing caching methods under similar computational budgets.
- Score: 75.02865981328509
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models achieve state-of-the-art video generation quality, but their inference remains expensive due to the large number of sequential denoising steps. This has motivated a growing line of research on accelerating diffusion inference. Among training-free acceleration methods, caching reduces computation by reusing previously computed model outputs across timesteps. Existing caching methods rely on heuristic criteria to choose cache/reuse timesteps and require extensive tuning. We address this limitation with a principled sensitivity-aware caching framework. Specifically, we formalize the caching error through an analysis of the model output sensitivity to perturbations in the denoising inputs, i.e., the noisy latent and the timestep, and show that this sensitivity is a key predictor of caching error. Based on this analysis, we propose Sensitivity-Aware Caching (SenCache), a dynamic caching policy that adaptively selects caching timesteps on a per-sample basis. Our framework provides a theoretical basis for adaptive caching, explains why prior empirical heuristics can be partially effective, and extends them to a dynamic, sample-specific approach. Experiments on Wan 2.1, CogVideoX, and LTX-Video show that SenCache achieves better visual quality than existing caching methods under similar computational budgets.
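The caching idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a toy scalar sampler that assumes a first-order error estimate of the form `sens_x * |x - x_cached| + sens_t * |t - t_cached|` and reuses the cached output whenever that estimate stays within a tolerance. The names `run_with_cache`, `sens_x`, `sens_t`, and `tol`, the Euler update, and the stand-in denoiser are all hypothetical choices made for illustration.

```python
def run_with_cache(denoise, x0, timesteps, sens_x, sens_t, tol):
    """Toy Euler sampler that reuses a cached model output when a
    first-order sensitivity estimate predicts a small caching error.

    sens_x, sens_t: assumed (e.g. precomputed) sensitivities of the model
    output to the noisy input and to the timestep. Hypothetical inputs.
    Returns the final sample and the number of real model calls.
    """
    x = x0
    cached_out = cached_x = cached_t = None
    calls = 0
    for t, t_next in zip(timesteps[:-1], timesteps[1:]):
        if cached_out is None:
            predicted_err = float("inf")  # nothing cached yet
        else:
            # First-order estimate of how far the output has drifted
            # since the last real model evaluation.
            predicted_err = (sens_x * abs(x - cached_x)
                             + sens_t * abs(t - cached_t))
        if predicted_err <= tol:
            out = cached_out              # reuse: skip the model call
        else:
            out = denoise(x, t)           # refresh the cache
            cached_out, cached_x, cached_t = out, x, t
            calls += 1
        x = x + (t_next - t) * out        # Euler step (illustrative)
    return x, calls


def toy_denoiser(x, t):
    """Cheap stand-in for an expensive network (assumption)."""
    return -x


timesteps = [1.0 - i / 10 for i in range(11)]  # 1.0 down to 0.0, 10 steps
_, calls_full = run_with_cache(toy_denoiser, 1.0, timesteps, 1.0, 1.0, tol=0.0)
_, calls_mid = run_with_cache(toy_denoiser, 1.0, timesteps, 1.0, 1.0, tol=0.25)
_, calls_min = run_with_cache(toy_denoiser, 1.0, timesteps, 1.0, 1.0, tol=1e9)
```

Because the reuse decision depends on the current sample `x`, different inputs can trigger cache refreshes at different timesteps, which is the sense in which such a policy is per-sample rather than a fixed schedule. Raising `tol` trades accuracy for fewer model calls.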
Related papers
- PreciseCache: Precise Feature Caching for Efficient and High-fidelity Video Generation [35.47114707080758]
High computational costs and slow inference hinder the practical application of video generation models. We propose PreciseCache, a plug-and-play framework that precisely detects and skips truly redundant computations.
arXiv Detail & Related papers (2026-03-01T08:08:49Z)
- DiCache: Let Diffusion Model Determine Its Own Cache [62.954717254728166]
DiCache is a training-free adaptive caching strategy for accelerating diffusion models at runtime. Online Probe Profiling Scheme leverages a shallow-layer online probe to obtain an on-the-fly indicator for the caching error in real time. Dynamic Cache Trajectory Alignment approximates the deep-layer feature output from multi-step historical caches.
arXiv Detail & Related papers (2025-08-24T13:30:00Z)
- MixCache: Mixture-of-Cache for Video Diffusion Transformer Acceleration [15.22288174114487]
Caching is a widely adopted optimization method in DiT models. We propose MixCache, a training-free caching-based framework for efficient video DiT inference.
arXiv Detail & Related papers (2025-08-18T07:49:33Z)
- PromptTea: Let Prompts Tell TeaCache the Optimal Threshold [1.0665410339553834]
A common acceleration strategy involves reusing model outputs via caching mechanisms at fixed intervals. We propose Prompt-Complexity-Aware (PCA) caching, a method that automatically adjusts reuse thresholds based on scene complexity estimated directly from the input prompt.
arXiv Detail & Related papers (2025-07-09T10:53:05Z)
- Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching [57.7533917467934]
EasyCache is a training-free acceleration framework for video diffusion models. We conduct comprehensive studies on various large-scale video generation models, including OpenSora, Wan2.1, and HunyuanVideo. Our method achieves leading acceleration performance, reducing inference time by up to 2.1-3.3× compared to the original baselines.
arXiv Detail & Related papers (2025-07-03T17:59:54Z)
- MagCache: Fast Video Generation with Magnitude-Aware Cache [91.2771453279713]
We introduce a novel and robust discovery: a unified magnitude law observed across different models and prompts. We introduce a Magnitude-aware Cache (MagCache) that adaptively skips unimportant timesteps using an error modeling mechanism and adaptive caching strategy. Experimental results show that MagCache achieves 2.10x-2.68x speedups on Open-Sora, CogVideoX, Wan 2.1, and HunyuanVideo, while preserving superior visual fidelity.
arXiv Detail & Related papers (2025-06-10T17:59:02Z)
- Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model [55.64316746098431]
Timestep Embedding Aware Cache (TeaCache) is a training-free caching approach that estimates and leverages the fluctuating differences among model outputs across timesteps. TeaCache achieves up to 4.41x acceleration over Open-Sora-Plan with negligible degradation of visual quality.
arXiv Detail & Related papers (2024-11-28T12:50:05Z)
- FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality [58.80996741843102]
FasterCache is a training-free strategy designed to accelerate the inference of video diffusion models with high-quality generation. We show that FasterCache can significantly accelerate video generation while keeping video quality comparable to the baseline.
arXiv Detail & Related papers (2024-10-25T07:24:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.