Q-Diffusion: Quantizing Diffusion Models
- URL: http://arxiv.org/abs/2302.04304v3
- Date: Thu, 8 Jun 2023 09:21:05 GMT
- Title: Q-Diffusion: Quantizing Diffusion Models
- Authors: Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel
Kang, Shanghang Zhang, Kurt Keutzer
- Abstract summary: Post-training quantization (PTQ) is considered a go-to compression method for other tasks, but it does not work out-of-the-box on diffusion models.
We propose a novel PTQ method specifically tailored to the unique multi-timestep pipeline and model architecture of diffusion models.
We show that our proposed method is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance.
- Score: 52.978047249670276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have achieved great success in image synthesis through
iterative noise estimation using deep neural networks. However, the slow
inference, high memory consumption, and computational intensity of the noise
estimation model hinder the efficient adoption of diffusion models. Although
post-training quantization (PTQ) is considered a go-to compression method for
other tasks, it does not work out-of-the-box on diffusion models. We propose a
novel PTQ method specifically tailored towards the unique multi-timestep
pipeline and model architecture of the diffusion models, which compresses the
noise estimation network to accelerate the generation process. We identify the
key difficulty of diffusion model quantization as the changing output
distributions of noise estimation networks over multiple time steps and the
bimodal activation distribution of the shortcut layers within the noise
estimation network. We tackle these challenges with timestep-aware calibration
and split shortcut quantization in this work. Experimental results show that
our proposed method is able to quantize full-precision unconditional diffusion
models into 4-bit while maintaining comparable performance (small FID change of
at most 2.34 compared to >100 for traditional PTQ) in a training-free manner.
Our approach can also be applied to text-guided image generation, where we can
run Stable Diffusion in 4-bit weights with high generation quality for the
first time.
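
The abstract names the paper's two techniques, timestep-aware calibration and split shortcut quantization, without implementation detail. The sketch below illustrates both under stated assumptions: `fake_quantize` is a generic min-max uniform quantizer and `run_fp_sampler` is a hypothetical helper exposing the full-precision sampler's intermediate latents; neither is the authors' actual code.

```python
import torch

def fake_quantize(x, n_bits=4):
    """Generic min-max uniform quantizer (simulated quantization).
    Illustrative only; the paper's quantizer and calibration
    objective may differ."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-x.min() / scale)
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale  # dequantized ("fake-quantized") tensor

def timestep_aware_calibration_set(run_fp_sampler, num_steps, n_per_step):
    """Timestep-aware calibration (sketch): draw calibration inputs
    uniformly across ALL denoising timesteps, since the noise-estimation
    network's activation distributions shift over the trajectory.
    `run_fp_sampler(t, n)` is an assumed helper returning n intermediate
    latents at timestep t from the full-precision sampler."""
    return [(run_fp_sampler(t, n_per_step), t) for t in range(num_steps)]

def split_shortcut_quantize(deep_feat, skip_feat):
    """Split shortcut quantization (sketch): a UNet shortcut layer consumes
    the channel-wise concatenation of a deep feature and a skip feature
    whose ranges differ sharply, giving a bimodal activation distribution
    that one shared scale handles poorly. Quantize each half with its own
    scale, then concatenate."""
    return torch.cat([fake_quantize(deep_feat), fake_quantize(skip_feat)], dim=1)
```

The design point this sketch tries to capture, per the abstract, is twofold: calibration data must span all timesteps because the network's output distribution drifts across steps, and the two concatenated halves feeding a shortcut layer need independent quantization scales because their joint distribution is bimodal.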
Related papers
- Timestep-Aware Correction for Quantized Diffusion Models [28.265582848911574]
We propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error.
Applying the proposed method to low-precision diffusion models substantially enhances output quality with only negligible overhead.
arXiv Detail & Related papers (2024-07-04T13:22:31Z)
- 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution [83.09117439860607]
Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment.
Low-bit quantization is notorious for degrading the accuracy of SR models compared to their full-precision (FP) counterparts.
We present a dual-stage low-bit post-training quantization (PTQ) method for image super-resolution, namely 2DQuant, which achieves efficient and accurate SR under low-bit quantization.
arXiv Detail & Related papers (2024-06-10T06:06:11Z)
- Lossy Image Compression with Foundation Diffusion Models [10.407650300093923]
In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent.
Our approach allows us to perform less than 10% of the full diffusion generative process and requires no architectural changes to the diffusion model.
arXiv Detail & Related papers (2024-04-12T16:23:42Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [52.157939524815866]
In this paper, we empirically unravel three properties in quantized diffusion models that compromise the efficacy of current methods.
We identify two critical types of quantized layers: those holding vital temporal information and those sensitive to reduced bit-width.
Our method is evaluated over three high-resolution image generation tasks and achieves state-of-the-art performance under various bit-width settings.
arXiv Detail & Related papers (2024-02-06T03:39:44Z)
- Memory-Efficient Fine-Tuning for Quantized Diffusion Model [12.875837358532422]
We introduce TuneQDM, a memory-efficient fine-tuning method for quantized diffusion models.
Our method consistently outperforms the baseline in both single-/multi-subject generations.
arXiv Detail & Related papers (2024-01-09T03:42:08Z)
- Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (Progressive and Relaxing) for text-to-image diffusion models.
We are the first to achieve quantization for Stable Diffusion XL while maintaining performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z)
- PTQD: Accurate Post-Training Quantization for Diffusion Models [22.567863065523902]
Post-training quantization of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training.
Applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples.
We propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process.
arXiv Detail & Related papers (2023-05-18T02:28:42Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)