HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching
- URL: http://arxiv.org/abs/2508.16984v1
- Date: Sat, 23 Aug 2025 10:35:16 GMT
- Title: HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching
- Authors: Liang Feng, Shikang Zheng, Jiacheng Liu, Yuqi Lin, Qinming Zhou, Peiliang Cai, Xinyu Wang, Junjie Chen, Chang Zou, Yue Ma, Linfeng Zhang
- Abstract summary: HiCache is a training-free acceleration framework that improves feature prediction. We introduce a dual-scaling mechanism that ensures numerical stability while preserving predictive accuracy.
- Score: 19.107716099809707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have achieved remarkable success in content generation but suffer from prohibitive computational costs due to iterative sampling. While recent feature caching methods attempt to accelerate inference through temporal extrapolation, they still suffer from severe quality loss because they fail to model the complex dynamics of feature evolution. To solve this problem, this paper presents HiCache, a training-free acceleration framework that fundamentally improves feature prediction by aligning mathematical tools with empirical properties. Our key insight is that feature derivative approximations in Diffusion Transformers exhibit multivariate Gaussian characteristics, motivating the use of Hermite polynomials, the potentially theoretically optimal basis for Gaussian-correlated processes. We further introduce a dual-scaling mechanism that ensures numerical stability while preserving predictive accuracy. Extensive experiments demonstrate HiCache's superiority: it achieves a 6.24x speedup on FLUX.1-dev while exceeding baseline quality, and maintains strong performance across text-to-image, video generation, and super-resolution tasks. Core implementation is provided in the appendix, with complete code to be released upon acceptance.
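To make the idea in the abstract concrete, the following is a minimal sketch of forecasting a cached feature from finite differences, once with a monomial (Taylor-style) basis and once with a scaled Hermite basis. It is not the paper's implementation: the function names, the parameter `sigma`, and the specific form `sigma**i * H_i(sigma * k)` are assumptions that represent only one plausible reading of "dual scaling" (contracting the polynomial argument and damping the coefficient). The only standard ingredient is the recurrence for physicists' Hermite polynomials.

```python
import numpy as np


def hermite_phys(n, x):
    """Physicists' Hermite polynomial H_n(x), computed with the standard
    recurrence H_0 = 1, H_1 = 2x, H_{k+1} = 2x*H_k - 2k*H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def finite_differences(history, order):
    """Backward differences of cached features (oldest first, newest last).
    Returns [F, dF, d^2F, ...] evaluated at the newest cached step."""
    level = [np.asarray(f, dtype=float) for f in history]
    diffs = [level[-1]]
    for _ in range(order):
        level = [b - a for a, b in zip(level[:-1], level[1:])]
        diffs.append(level[-1])
    return diffs


def forecast(history, k, order=2, sigma=0.5, basis="hermite"):
    """Predict the feature k cache intervals ahead of the newest entry.

    basis="taylor":  uses k**i as the i-th basis value.
    basis="hermite": uses sigma**i * H_i(sigma * k); this scaling is an
                     assumption made for illustration, not the paper's formula.
    Needs at least order + 1 cached features in `history`.
    """
    if len(history) < order + 1:
        raise ValueError("need at least order + 1 cached features")
    diffs = finite_differences(history, order)
    pred = diffs[0].copy()
    factorial = 1.0
    for i in range(1, order + 1):
        factorial *= i
        if basis == "taylor":
            phi = float(k) ** i
        else:
            phi = sigma ** i * hermite_phys(i, sigma * k)
        pred = pred + diffs[i] / factorial * phi
    return pred
```

In this sketch, `history` holds features from the most recent fully computed steps, and `forecast(history, k)` stands in for the skipped computation at a later step. With `sigma = 2 ** -0.5` the first-order Hermite term coincides with the Taylor term, so the two bases differ only from second order onward.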
Related papers
- SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching [75.02865981328509]
Caching reduces computation by reusing previously computed model outputs across timesteps. We propose Sensitivity-Aware Caching (SenCache), a dynamic caching policy that adaptively selects caching timesteps on a per-sample basis. SenCache achieves better visual quality than existing caching methods under similar computational budgets.
arXiv Detail & Related papers (2026-02-27T17:36:09Z)
- A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation [15.689880312464004]
Diffusion Models have become a cornerstone of modern generative AI for their exceptional generation quality and controllability. Diffusion Caching offers a training-free, architecture-agnostic, and efficient inference paradigm. By enabling feature-level cross-step reuse and inter-layer scheduling, it reduces computation without modifying model parameters.
arXiv Detail & Related papers (2025-10-22T16:46:05Z)
- Predictive Feature Caching for Training-free Acceleration of Molecular Geometry Generation [67.20779609022108]
Flow matching models generate high-fidelity molecular geometries but incur significant computational costs during inference. This work discusses a training-free caching strategy that accelerates molecular geometry generation. Experiments on the GEOM-Drugs dataset demonstrate that caching achieves a twofold reduction in wall-clock inference time.
arXiv Detail & Related papers (2025-10-06T09:49:14Z)
- SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching [17.724549528455317]
Diffusion models have revolutionized high-fidelity image and video synthesis, yet their computational demands remain prohibitive for real-time applications. We present SpeCa, a novel 'Forecast-then-verify' acceleration framework that effectively addresses both limitations. Our approach implements a parameter-free verification mechanism that efficiently evaluates prediction reliability, enabling real-time decisions to accept or reject each prediction.
arXiv Detail & Related papers (2025-09-15T06:46:22Z)
- DiCache: Let Diffusion Model Determine Its Own Cache [63.73224201922458]
We present DiCache, a training-free adaptive caching strategy for accelerating diffusion models at runtime. The Online Probe Profiling Scheme leverages a shallow-layer online probe to obtain a stable prior for the caching error in real time. Dynamic Cache Trajectory Alignment combines multi-step caches based on the shallow-layer probe feature trajectory to better approximate the current feature.
arXiv Detail & Related papers (2025-08-24T13:30:00Z)
- Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models [57.49136894315871]
The new paradigm of test-time scaling has yielded remarkable breakthroughs in reasoning models and generative vision models. We propose one solution to the problem of integrating test-time scaling knowledge into a model during post-training. We replace reward-guided test-time noise optimization in diffusion models with a Noise Hypernetwork that modulates initial input noise.
arXiv Detail & Related papers (2025-08-13T17:33:37Z)
- AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse [19.13826316844611]
Diffusion models have demonstrated remarkable success in generative tasks, yet their iterative denoising process results in slow inference. We provide a theoretical understanding by analyzing the denoising process through the second-order Adams-Bashforth method. Instead of directly reusing cached results, we propose a novel caching-based acceleration approach for diffusion models.
arXiv Detail & Related papers (2025-04-13T08:29:58Z)
- Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models [41.11005178050448]
ProfilingDiT is a novel adaptive caching strategy that explicitly disentangles foreground and background-focused blocks. Our framework achieves significant acceleration while maintaining visual fidelity across comprehensive quality metrics.
arXiv Detail & Related papers (2025-04-04T03:30:15Z)
- Implicit Neural Differential Model for Spatiotemporal Dynamics [5.1854032131971195]
We introduce Im-PiNDiff, a novel implicit physics-integrated neural differentiable solver for stable spatiotemporal dynamics. Inspired by deep equilibrium models, Im-PiNDiff advances the state using implicit fixed-point layers, enabling robust long-term simulation. Im-PiNDiff achieves superior predictive performance, enhanced numerical stability, and substantial reductions in memory and cost.
arXiv Detail & Related papers (2025-04-03T04:07:18Z)
- Towards Scalable and Deep Graph Neural Networks via Noise Masking [59.058558158296265]
Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. Scaling them to large graphs is challenging due to the high computational and storage costs. We present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works.
arXiv Detail & Related papers (2024-12-19T07:48:14Z)
- Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising. We introduce a novel quantization framework that includes three strategies. This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.