Towards Accurate Post-training Quantization for Diffusion Models
- URL: http://arxiv.org/abs/2305.18723v4
- Date: Tue, 30 Apr 2024 08:53:45 GMT
- Title: Towards Accurate Post-training Quantization for Diffusion Models
- Authors: Changyuan Wang, Ziwei Wang, Xiuwei Xu, Yansong Tang, Jie Zhou, Jiwen Lu
- Abstract summary: We propose an accurate data-free post-training quantization framework for diffusion models (ADP-DM) for efficient image generation.
Our method outperforms state-of-the-art post-training quantization of diffusion models by a sizable margin at similar computational cost.
- Score: 73.19871905102545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an accurate data-free post-training quantization framework for diffusion models (ADP-DM) for efficient image generation. Conventional data-free quantization methods learn shared quantization functions for tensor discretization regardless of the generation timestep, even though the activation distribution differs significantly across timesteps. Moreover, calibration images are acquired at random timesteps, which fail to provide sufficient information for learning generalizable quantization functions. Both issues cause sizable quantization errors and obvious degradation of image generation quality. In contrast, we design group-wise quantization functions for activation discretization at different timesteps and sample the optimal timesteps for informative calibration image generation, so that our quantized diffusion model reduces discretization errors with negligible computational overhead. Specifically, we partition the timesteps according to the importance weights of the quantization functions in different groups, which are optimized by a differentiable search algorithm. We also select the optimal timestep for calibration image generation by the structural risk minimization principle, in order to enhance the generalization ability of the quantized diffusion model in deployment. Extensive experimental results show that our method outperforms state-of-the-art post-training quantization of diffusion models by a sizable margin at similar computational cost.
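As a concrete illustration of the abstract's group-wise scheme, here is a minimal NumPy sketch that quantizes activations with one quantizer per timestep group. The fixed group boundaries, 8-bit setting, min-max calibration, and toy calibration data are all illustrative assumptions; ADP-DM instead learns the partition via differentiable search and picks calibration timesteps by structural risk minimization.

```python
import numpy as np

def uniform_quantize(x, scale, zero_point, n_bits=8):
    """Uniform affine quantize-dequantize ("fake quantization")."""
    q = np.clip(np.round(x / scale) + zero_point, 0, 2 ** n_bits - 1)
    return (q - zero_point) * scale

def calibrate(activations, n_bits=8):
    """Fit one (scale, zero_point) pair to a group's calibration activations."""
    lo, hi = float(activations.min()), float(activations.max())
    scale = (hi - lo) / (2 ** n_bits - 1)
    return scale, round(-lo / scale)

# Illustrative fixed partition of 1000 denoising timesteps into 4 groups;
# the paper instead learns the partition via differentiable search.
GROUP_BOUNDS = [0, 250, 500, 750, 1000]

def group_of(t):
    return next(g for g in range(len(GROUP_BOUNDS) - 1)
                if GROUP_BOUNDS[g] <= t < GROUP_BOUNDS[g + 1])

# One quantizer per timestep group instead of a single shared one.
rng = np.random.default_rng(0)
calib_acts = {g: rng.normal(scale=1.0 + g, size=4096) for g in range(4)}  # toy data
quantizers = {g: calibrate(calib_acts[g]) for g in range(4)}

def quantize_activation(x, t):
    scale, zero_point = quantizers[group_of(t)]
    return uniform_quantize(x, scale, zero_point)
```

The point of the grouping is that a scale fitted to one timestep range no longer has to cover the very different activation ranges of the others.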
Related papers
- Timestep-Aware Correction for Quantized Diffusion Models [28.265582848911574]
We propose a timestep-aware correction method for quantized diffusion models, which dynamically corrects the quantization error.
By applying the proposed method to low-precision diffusion models, substantial improvements in output quality can be achieved with only negligible overhead.
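A minimal sketch of the general idea, assuming a simple additive correction: precompute, from calibration runs, the mean gap between full-precision and quantized outputs at each timestep, then add it back during sampling. The paper's actual correction may take a richer form.

```python
import numpy as np

def estimate_corrections(fp_out_by_t, q_out_by_t):
    """Per-timestep additive correction: mean (full-precision - quantized)
    output over calibration samples, keyed by timestep."""
    return {t: (fp_out_by_t[t] - q_out_by_t[t]).mean(axis=0) for t in fp_out_by_t}

def corrected_output(q_out, t, corrections):
    # One precomputed bias per timestep: negligible runtime overhead.
    return q_out + corrections[t]
```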
arXiv Detail & Related papers (2024-07-04T13:22:31Z)
- TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models [40.5153344875351]
We introduce TMPQ-DM, which jointly optimizes timestep reduction and quantization precision to achieve a superior performance-efficiency trade-off.
For timestep reduction, we devise a non-uniform grouping scheme tailored to the non-uniform nature of the denoising process.
In terms of quantization, we adopt a fine-grained layer-wise approach to allocate varying bit-widths to different layers based on their respective contributions to the final generative performance.
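A minimal sketch of the two ingredients, under stated assumptions: the quadratic timestep spacing and the greedy 4/8-bit split below are stand-ins for the schedules TMPQ-DM searches jointly.

```python
import numpy as np

def nonuniform_timestep_groups(T=1000, n_groups=8):
    """Non-uniform grouping: finer groups near t=0, where late denoising
    steps refine fine detail. Quadratic spacing is an assumed stand-in."""
    edges = (np.linspace(0.0, 1.0, n_groups + 1) ** 2 * T).astype(int)
    return list(zip(edges[:-1], edges[1:]))

def allocate_bits(sensitivity, avg_bits=6):
    """Greedy 4/8-bit split under an average-bit budget: the layers that
    contribute most to generative quality get the higher precision."""
    order = sorted(sensitivity, key=sensitivity.get, reverse=True)
    n_high = int(len(order) * (avg_bits - 4) / 4)  # layers that can get 8 bits
    return {name: (8 if i < n_high else 4) for i, name in enumerate(order)}
```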
arXiv Detail & Related papers (2024-04-15T07:51:40Z)
- QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [52.157939524815866]
In this paper, we empirically unravel three properties in quantized diffusion models that compromise the efficacy of current methods.
We identify two critical types of quantized layers: those holding vital temporal information and those sensitive to reduced bit-width.
Our method is evaluated over three high-resolution image generation tasks and achieves state-of-the-art performance under various bit-width settings.
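As a hedged illustration of finding bit-width-sensitive layers, the toy ranking below scores each layer by the activation error a low-bit quantizer inflicts on it; QuEST additionally singles out layers holding vital temporal information, which this simplification ignores.

```python
import numpy as np

def quant_error(acts, n_bits=4):
    """Toy sensitivity proxy: MSE inflicted on a layer's activations by
    min-max uniform quantization at the given bit-width."""
    lo, hi = float(acts.min()), float(acts.max())
    step = max((hi - lo) / (2 ** n_bits - 1), 1e-12)
    q = np.round((acts - lo) / step) * step + lo
    return float(np.mean((acts - q) ** 2))

def select_layers_to_finetune(acts_by_layer, n_bits=4, k=3):
    """Pick the k layers most damaged by low-bit quantization."""
    errs = {name: quant_error(a, n_bits) for name, a in acts_by_layer.items()}
    return sorted(errs, key=errs.get, reverse=True)[:k]
```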
arXiv Detail & Related papers (2024-02-06T03:39:44Z)
- Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (progressive calibration and activation relaxing) for text-to-image diffusion models.
We are the first to achieve quantization for Stable Diffusion XL while maintaining performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z)
- Temporal Dynamic Quantization for Diffusion Models [18.184163233551292]
We introduce a novel quantization method that dynamically adjusts the quantization interval based on time step information.
Unlike conventional dynamic quantization techniques, our approach has no computational overhead during inference.
Our experiments demonstrate substantial improvements in output quality with the quantized diffusion model across various datasets.
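A minimal sketch of how dynamic intervals can be overhead-free, assuming all per-timestep intervals are fixed offline so inference reduces to a table lookup; deriving them from calibration min/max is an assumption, since the paper learns them from timestep information.

```python
import numpy as np

def build_interval_table(calib_acts_by_t, n_bits=8):
    """Fix one quantization interval per timestep offline, from calibration
    activations, so nothing is recomputed at inference."""
    table = {}
    for t, acts in calib_acts_by_t.items():
        lo, hi = float(acts.min()), float(acts.max())
        table[t] = (lo, (hi - lo) / (2 ** n_bits - 1))
    return table

def quantize_at_t(x, t, table, n_bits=8):
    # "Dynamic" per-timestep quantization reduced to a table lookup.
    lo, step = table[t]
    q = np.clip(np.round((x - lo) / step), 0, 2 ** n_bits - 1)
    return lo + q * step
```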
arXiv Detail & Related papers (2023-06-04T09:49:43Z)
- Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective.
This paper presents a regularized vector quantization framework that mitigates the above issues effectively by applying regularization from two perspectives.
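For reference, a minimal sketch of the vector-quantization pieces involved: deterministic nearest-codeword assignment (the collapse-prone baseline) plus one illustrative usage regularizer. The KL-to-uniform term below is an assumption standing in for the paper's two-perspective regularization.

```python
import numpy as np

def vq_encode(z, codebook):
    """Deterministic nearest-codeword assignment (the collapse-prone baseline).
    z: (N, D) latents; codebook: (K, D). Returns (N,) codeword indices."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    return d.argmin(axis=1)

def usage_regularizer(codes, K):
    """KL divergence from empirical codeword usage to uniform; penalizing it
    pushes the model to use the whole codebook."""
    p = np.bincount(codes, minlength=K) / len(codes)
    return float(np.sum(p * np.log(p * K + 1e-9)))
```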
arXiv Detail & Related papers (2023-03-11T15:20:54Z)
- Q-Diffusion: Quantizing Diffusion Models [52.978047249670276]
Post-training quantization (PTQ) is considered a go-to compression method for other tasks.
We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture of diffusion models.
We show that our proposed method is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance.
arXiv Detail & Related papers (2023-02-08T19:38:59Z)
- Moment evolution equations and moment matching for stochastic image EPDiff [68.97335984455059]
Models of image deformation allow study of time-continuous effects transforming images by deforming the image domain.
Applications include medical image analysis with both population trends and random subject specific variation.
We use moment approximations of the corresponding Itô diffusion to construct estimators for statistical inference on the parameters of the full model.
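As generic background rather than the paper's exact system: for an Itô diffusion $dX_t = f(X_t)\,dt + g(X_t)\,dW_t$, Itô's formula yields evolution equations for the moments, e.g.

$$\frac{d}{dt}\mathbb{E}[X_t] = \mathbb{E}[f(X_t)], \qquad \frac{d}{dt}\mathbb{E}[X_t X_t^\top] = \mathbb{E}\big[f(X_t)X_t^\top + X_t f(X_t)^\top + g(X_t)g(X_t)^\top\big],$$

and moment matching estimates parameters by fitting these (approximated) moments to their empirical counterparts.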
arXiv Detail & Related papers (2021-10-07T11:08:11Z)