Diffusion Model Quantization: A Review
- URL: http://arxiv.org/abs/2505.05215v1
- Date: Thu, 08 May 2025 13:09:34 GMT
- Title: Diffusion Model Quantization: A Review
- Authors: Qian Zeng, Chenggong Hu, Mingli Song, Jie Song
- Abstract summary: Recent success of large text-to-image models has underscored the exceptional performance of diffusion models in generative tasks. Diffusion model quantization has emerged as a pivotal technique for both compression and acceleration.
- Score: 36.22019054372206
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The recent success of large text-to-image models has empirically underscored the exceptional performance of diffusion models in generative tasks. To facilitate their efficient deployment on resource-constrained edge devices, model quantization has emerged as a pivotal technique for both compression and acceleration. This survey offers a thorough review of the latest advancements in diffusion model quantization, encapsulating and analyzing the current state of the art in this rapidly advancing domain. First, we provide an overview of the key challenges encountered in the quantization of diffusion models, including those based on U-Net architectures and Diffusion Transformers (DiT). We then present a comprehensive taxonomy of prevalent quantization techniques, engaging in an in-depth discussion of their underlying principles. Subsequently, we perform a meticulous analysis of representative diffusion model quantization schemes from both qualitative and quantitative perspectives. From a quantitative standpoint, we rigorously benchmark a variety of methods using widely recognized datasets, delivering an extensive evaluation of the most recent and impactful research in the field. From a qualitative standpoint, we categorize and synthesize the effects of quantization errors, elucidating these impacts through both visual analysis and trajectory examination. In conclusion, we outline prospective avenues for future research, proposing novel directions for the quantization of generative models in practical applications. The list of related papers, corresponding code, pre-trained models, and comparison results is publicly available at the survey project homepage: https://github.com/TaylorJocelyn/Diffusion-Model-Quantization.
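As background for the quantization techniques this survey covers, the sketch below shows per-tensor symmetric uniform quantization, the basic primitive that post-training and quantization-aware methods refine. It is a minimal illustration assuming PyTorch; the function names, the 8-bit setting, and the max-based per-tensor scale are illustrative choices, not a method from the survey.

```python
# Minimal sketch of per-tensor symmetric uniform quantization.
# Illustrative only: real diffusion-model quantizers use calibrated,
# per-channel, or timestep-aware scales rather than a naive max.
import torch

def quantize_tensor(w: torch.Tensor, num_bits: int = 8):
    """Map a float tensor to signed integers with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 127 for 8-bit
    scale = w.abs().max().clamp(min=1e-8) / qmax    # one scale per tensor
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q.to(torch.int8), scale

def dequantize_tensor(q: torch.Tensor, scale: torch.Tensor):
    """Recover an approximate float tensor for simulated quantized inference."""
    return q.float() * scale

# Example: quantize one weight tensor and measure the round-trip error.
w = torch.randn(256, 256)
q, s = quantize_tensor(w, num_bits=8)
w_hat = dequantize_tensor(q, s)
print(f"mean abs quantization error: {(w - w_hat).abs().mean():.6f}")
```

How the scale is chosen, and how the resulting error interacts with the iterative denoising trajectory, is precisely where the methods surveyed here differ.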
Related papers
- Latent Guidance in Diffusion Models for Perceptual Evaluations [33.915594693285556]
Latent diffusion models implicitly exhibit perceptually consistent local regions within the data manifold. We propose Perceptual Manifold Guidance (PMG), an algorithm that utilizes pretrained latent diffusion models and perceptual quality features. Our method achieves state-of-the-art performance, underscoring the superior generalization capabilities of diffusion models for NR-IQA tasks.
arXiv Detail & Related papers (2025-05-31T00:41:59Z) - Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation [7.193483612237862]
In this study, we analyze the impact of quantization on model merging through the lens of error barriers. We propose a novel post-training quantization method, HDRQ (Hessian and distant regularizing quantization), designed to account for model merging in multi-target domain adaptation. Our approach ensures that quantization incurs minimal deviation from the source pre-trained model while flattening the loss surface to facilitate smooth model merging.
arXiv Detail & Related papers (2025-05-29T17:00:56Z) - Accelerated Diffusion Models via Speculative Sampling [89.43940130493233]
Speculative sampling is a popular technique for accelerating inference in Large Language Models. We extend speculative sampling to diffusion models, which generate samples via continuous, vector-valued Markov chains. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model. (A hedged sketch of the core accept/reject step appears after this list.)
arXiv Detail & Related papers (2025-01-09T16:50:16Z) - Alignment of Diffusion Models: Fundamentals, Challenges, and Future [28.64041196069495]
Diffusion models have emerged as the leading paradigm in generative modeling, excelling in various applications.
Despite their success, these models often misalign with human intentions, generating outputs that may not match text prompts or possess desired properties.
Inspired by the success of alignment in tuning large language models, recent studies have investigated aligning diffusion models with human expectations and preferences.
arXiv Detail & Related papers (2024-09-11T13:21:32Z) - Diffusion Models in Low-Level Vision: A Survey [82.77962165415153]
Diffusion model-based solutions have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models. We summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios.
arXiv Detail & Related papers (2024-06-17T01:49:27Z) - An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite this significant empirical success, the theory of diffusion models remains very limited.
This paper provides a well-rounded theoretical exposition to stimulate forward-looking theories and methods for diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z) - A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
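To make the speculative-sampling entry above concrete ("Accelerated Diffusion Models via Speculative Sampling"), here is a minimal sketch of one accept/reject step for a sampler with Gaussian transitions. The draft/target denoisers, the scalar noise level, and the rejection fallback are simplifying assumptions for illustration; they are not the paper's exact drafting strategies.

```python
# Hedged sketch of one speculative accept/reject step for a diffusion
# sampler with Gaussian transition kernels. All names are hypothetical.
import numpy as np

def gaussian_logpdf(x, mean, sigma):
    # Log-density of an isotropic Gaussian N(mean, sigma^2 I).
    d = x.size
    return (-0.5 * np.sum(((x - mean) / sigma) ** 2)
            - d * np.log(sigma * np.sqrt(2 * np.pi)))

def speculative_step(x_t, draft_mean, target_mean, sigma, rng):
    """Propose x_{t-1} with the cheap draft; verify with the target."""
    mu_draft = draft_mean(x_t)                 # cheap draft prediction
    proposal = mu_draft + sigma * rng.standard_normal(x_t.shape)
    mu_target = target_mean(x_t)               # expensive target call
    # Accept with prob min(1, p_target(proposal) / p_draft(proposal)).
    log_ratio = (gaussian_logpdf(proposal, mu_target, sigma)
                 - gaussian_logpdf(proposal, mu_draft, sigma))
    if np.log(rng.uniform()) < min(0.0, log_ratio):
        return proposal, True                  # draft proposal accepted
    # On rejection, fall back to sampling from the target kernel directly
    # (exact speculative sampling resamples from a residual distribution;
    # this fallback is a simplification for illustration).
    return mu_target + sigma * rng.standard_normal(x_t.shape), False

# Toy usage with hypothetical one-line "denoisers".
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
draft = lambda z: 0.9 * z    # cheap, approximate denoiser
target = lambda z: 0.95 * z  # full-precision denoiser
x_next, accepted = speculative_step(x, draft, target, 0.1, rng)
print("accepted draft proposal:", accepted)
```

Note that this one-step version only shows the acceptance rule; the speedup in practice comes from letting the draft propose several steps ahead and verifying them with fewer sequential calls to the expensive target model, mirroring the LLM setting.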