Q-Diffusion: Quantizing Diffusion Models
- URL: http://arxiv.org/abs/2302.04304v3
- Date: Thu, 8 Jun 2023 09:21:05 GMT
- Title: Q-Diffusion: Quantizing Diffusion Models
- Authors: Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel
Kang, Shanghang Zhang, Kurt Keutzer
- Abstract summary: Post-training quantization (PTQ) is considered a go-to compression method for other tasks, but it does not work out-of-the-box on diffusion models.
We propose a novel PTQ method specifically tailored to the unique multi-timestep pipeline and model architecture of diffusion models.
We show that our proposed method can quantize full-precision unconditional diffusion models to 4-bit weights while maintaining comparable performance.
- Score: 52.978047249670276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have achieved great success in image synthesis through
iterative noise estimation using deep neural networks. However, the slow
inference, high memory consumption, and computation intensity of the noise
estimation model hinder the efficient adoption of diffusion models. Although
post-training quantization (PTQ) is considered a go-to compression method for
other tasks, it does not work out-of-the-box on diffusion models. We propose a
novel PTQ method specifically tailored towards the unique multi-timestep
pipeline and model architecture of the diffusion models, which compresses the
noise estimation network to accelerate the generation process. We identify the
key difficulty of diffusion model quantization as the changing output
distributions of noise estimation networks over multiple time steps and the
bimodal activation distribution of the shortcut layers within the noise
estimation network. We tackle these challenges with timestep-aware calibration
and split shortcut quantization in this work. Experimental results show that
our proposed method is able to quantize full-precision unconditional diffusion
models into 4-bit while maintaining comparable performance (small FID change of
at most 2.34 compared to >100 for traditional PTQ) in a training-free manner.
Our approach can also be applied to text-guided image generation, where, for the
first time, we can run Stable Diffusion with 4-bit weights at high generation
quality.
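The two techniques named in the abstract, timestep-aware calibration and split shortcut quantization, can be illustrated with a rough sketch. This is not the authors' implementation (the paper and its code release give the real details); the function names, the simple uniform symmetric quantizer, and the per-timestep sampling scheme below are illustrative assumptions:

```python
import numpy as np

def quantize_uniform(x, n_bits=4):
    """Uniform symmetric fake-quantization of a tensor to n_bits."""
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 7 for signed 4-bit
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                             # dequantized values

def split_shortcut_quantize(deep_feat, shortcut_feat, n_bits=4):
    """Quantize the two inputs of a UNet concatenation shortcut separately.
    Their value ranges differ sharply, so quantizing the concatenated
    tensor with one shared scale wastes most of the quantization grid --
    the 'bimodal activation distribution' issue the abstract identifies."""
    return np.concatenate(
        [quantize_uniform(deep_feat, n_bits),
         quantize_uniform(shortcut_feat, n_bits)],
        axis=-1,
    )

def timestep_aware_calibration_set(activations_by_t, n_samples_per_t=2, seed=0):
    """Build a calibration set by sampling activations uniformly across
    all denoising timesteps, rather than from a single timestep, so the
    quantizer sees the full range of output distributions over time."""
    rng = np.random.default_rng(seed)
    calib = []
    for t, acts in activations_by_t.items():
        idx = rng.choice(len(acts), size=min(n_samples_per_t, len(acts)),
                         replace=False)
        calib.extend(acts[i] for i in idx)
    return calib
```

Splitting the shortcut means each half gets its own scale, so a small-magnitude deep feature and a large-magnitude skip feature are each resolved with the full 4-bit grid instead of sharing one.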
Related papers
- TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models [49.65286242048452]
We propose a novel method dubbed Timestep-Channel Adaptive Quantization for Diffusion Models (TCAQ-DM)
The proposed method substantially outperforms the state-of-the-art approaches in most cases.
arXiv Detail & Related papers (2024-12-21T16:57:54Z)
- MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models [37.061975191553]
This paper presents MPQ-DM, a Mixed-Precision Quantization method for Diffusion Models.
To mitigate the quantization error caused by weight channels with severe outliers, we propose an Outlier-Driven Mixed Quantization technique.
To robustly learn representations across time steps, we construct a Time-Smoothed Relation Distillation scheme.
arXiv Detail & Related papers (2024-12-16T08:31:55Z)
- Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion [9.8078769718432]
We propose an efficient quantization framework for Stable Diffusion models.
Our approach features a Serial-to-Parallel calibration pipeline that addresses the consistency of both the calibration and inference processes.
Under W4A8 quantization settings, our approach enhances both distribution similarity and visual similarity by 45%-60%.
arXiv Detail & Related papers (2024-12-09T17:00:20Z)
- Timestep-Aware Correction for Quantized Diffusion Models [28.265582848911574]
We propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error.
By applying the proposed method to low-precision diffusion models, substantial enhancement of output quality can be achieved with only negligible overhead.
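A per-timestep additive correction is one simple form such a dynamic scheme could take. The sketch below is an assumption for illustration, not the paper's actual algorithm: it estimates a mean correction per timestep from calibration data and adds it back at inference:

```python
import numpy as np

def estimate_corrections(fp_outputs_by_t, q_outputs_by_t):
    """Per-timestep additive correction: the mean difference between
    full-precision and quantized network outputs on calibration samples."""
    return {t: (fp_outputs_by_t[t] - q_outputs_by_t[t]).mean(axis=0)
            for t in fp_outputs_by_t}

def corrected_output(q_output, t, corrections):
    """Apply the precomputed correction for timestep t at inference."""
    return q_output + corrections[t]
```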
arXiv Detail & Related papers (2024-07-04T13:22:31Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [52.157939524815866]
In this paper, we empirically unravel three properties in quantized diffusion models that compromise the efficacy of current methods.
We identify two critical types of quantized layers: those holding vital temporal information and those sensitive to reduced bit-width.
Our method is evaluated over three high-resolution image generation tasks and achieves state-of-the-art performance under various bit-width settings.
arXiv Detail & Related papers (2024-02-06T03:39:44Z)
- Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (Progressive and Relaxing) for text-to-image diffusion models.
We are the first to achieve quantization for Stable Diffusion XL while maintaining performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z)
- PTQD: Accurate Post-Training Quantization for Diffusion Models [22.567863065523902]
Post-training quantization of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training.
Applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples.
We propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process.
arXiv Detail & Related papers (2023-05-18T02:28:42Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.