PTQD: Accurate Post-Training Quantization for Diffusion Models
- URL: http://arxiv.org/abs/2305.10657v4
- Date: Wed, 1 Nov 2023 08:40:41 GMT
- Title: PTQD: Accurate Post-Training Quantization for Diffusion Models
- Authors: Yefei He, Luping Liu, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang
- Abstract summary: Post-training quantization of diffusion models can significantly reduce the model size and accelerate the sampling process without re-training.
Applying existing PTQ methods directly to low-bit diffusion models can significantly impair the quality of generated samples.
We propose a unified formulation for the quantization noise and diffusion perturbed noise in the quantized denoising process.
- Score: 22.567863065523902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have recently dominated image synthesis tasks. However, the
iterative denoising process is computationally expensive at inference time,
making diffusion models less practical for low-latency and scalable real-world
applications. Post-training quantization (PTQ) of diffusion models can
significantly reduce the model size and accelerate the sampling process without
re-training. Nonetheless, applying existing PTQ methods directly to low-bit
diffusion models can significantly impair the quality of generated samples.
Specifically, for each denoising step, quantization noise leads to deviations
in the estimated mean and mismatches with the predetermined variance schedule.
As the sampling process proceeds, the quantization noise may accumulate,
resulting in a low signal-to-noise ratio (SNR) during the later denoising
steps. To address these challenges, we propose a unified formulation for the
quantization noise and diffusion perturbed noise in the quantized denoising
process. Specifically, we first disentangle the quantization noise into its
correlated and residual uncorrelated parts with respect to its full-precision
counterpart. The correlated part can be easily corrected by estimating the
correlation coefficient. For the uncorrelated part, we subtract the bias from
the quantized results to correct the mean deviation and calibrate the denoising
variance schedule to absorb the excess variance resulting from quantization.
Moreover, we introduce a mixed-precision scheme for selecting the optimal
bitwidth for each denoising step. Extensive experiments demonstrate that our
method outperforms previous post-training quantized diffusion models, with only
a 0.06 increase in FID score compared to full-precision LDM-4 on ImageNet
256x256, while saving 19.9x bit operations. Code is available at
https://github.com/ziplab/PTQD.
Related papers
- Timestep-Aware Correction for Quantized Diffusion Models [28.265582848911574]
We propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error.
By applying the proposed method to low-precision diffusion models, substantial improvements in output quality can be achieved with only negligible overhead.
arXiv Detail & Related papers (2024-07-04T13:22:31Z) - QNCD: Quantization Noise Correction for Diffusion Models [15.189069680672239]
Diffusion models have revolutionized image synthesis, setting new benchmarks in quality and creativity.
Post-training quantization presents a solution to accelerate sampling, albeit at the expense of sample quality.
We introduce a unified Quantization Noise Correction Scheme (QNCD) aimed at diminishing quantization noise throughout the sampling process.
arXiv Detail & Related papers (2024-03-28T04:24:56Z) - Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z) - Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (Progressive and Relaxing) for text-to-image diffusion models.
We are the first to achieve quantization for Stable Diffusion XL while maintaining its performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z) - Towards Accurate Post-training Quantization for Diffusion Models [73.19871905102545]
We propose an accurate data-free post-training quantization framework for diffusion models (ADP-DM) for efficient image generation.
Our method outperforms state-of-the-art post-training quantization of diffusion models by a sizable margin with similar computational cost.
arXiv Detail & Related papers (2023-05-30T04:00:35Z) - Parallel Sampling of Diffusion Models [76.3124029406809]
Diffusion models are powerful generative models but suffer from slow sampling.
We present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel.
arXiv Detail & Related papers (2023-05-25T17:59:42Z) - Q-Diffusion: Quantizing Diffusion Models [52.978047249670276]
Post-training quantization (PTQ) is considered a go-to compression method for other tasks.
We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture.
We show that our proposed method is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance.
arXiv Detail & Related papers (2023-02-08T19:38:59Z) - Error-aware Quantization through Noise Tempering [43.049102196902844]
Quantization-aware training (QAT) optimizes model parameters with respect to the end task while simulating quantization error.
In this work, we incorporate exponentially decaying quantization-error-aware noise together with a learnable scale of task loss gradient to approximate the effect of a quantization operator.
Our method obtains state-of-the-art top-1 classification accuracy for uniform (non-mixed-precision) quantization, outperforming previous methods by 0.5-1.2% absolute.
arXiv Detail & Related papers (2022-12-11T20:37:50Z) - Denoising Diffusion Implicit Models [117.03720513930335]
We present denoising diffusion implicit models (DDIMs), a class of iterative implicit probabilistic models with the same training procedure as DDPMs.
DDIMs can produce high-quality samples $10\times$ to $50\times$ faster in terms of wall-clock time compared to DDPMs.
arXiv Detail & Related papers (2020-10-06T06:15:51Z)