Memory-Efficient Fine-Tuning for Quantized Diffusion Model
- URL: http://arxiv.org/abs/2401.04339v2
- Date: Thu, 18 Jul 2024 11:38:17 GMT
- Title: Memory-Efficient Fine-Tuning for Quantized Diffusion Model
- Authors: Hyogon Ryu, Seohyun Lim, Hyunjung Shim,
- Abstract summary: We introduce TuneQDM, a memory-efficient fine-tuning method for quantized diffusion models.
Our method consistently outperforms the baseline in both single-/multi-subject generations.
- Score: 12.875837358532422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of billion-parameter diffusion models such as Stable Diffusion XL, Imagen, and DALL-E 3 has significantly propelled the domain of generative AI. However, their large-scale architecture presents challenges in fine-tuning and deployment due to high resource demands and slow inference speed. This paper explores the relatively unexplored yet promising realm of fine-tuning quantized diffusion models. Our analysis revealed that the baseline neglects the distinct patterns in model weights and the different roles throughout time steps when finetuning the diffusion model. To address these limitations, we introduce a novel memory-efficient fine-tuning method specifically designed for quantized diffusion models, dubbed TuneQDM. Our approach introduces quantization scales as separable functions to consider inter-channel weight patterns. Then, it optimizes these scales in a timestep-specific manner for effective reflection of the role of each time step. TuneQDM achieves performance on par with its full-precision counterpart while simultaneously offering significant memory efficiency. Experimental results demonstrate that our method consistently outperforms the baseline in both single-/multi-subject generations, exhibiting high subject fidelity and prompt fidelity comparable to the full precision model.
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Distillation of Discrete Diffusion through Dimensional Correlations [21.078500510691747]
"Mixture" models in discrete diffusion are capable of treating dimensional correlations while remaining scalable.
We empirically demonstrate that our proposed method for discrete diffusions work in practice, by distilling a continuous-time discrete diffusion model pretrained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2024-10-11T10:53:03Z) - Timestep-Aware Correction for Quantized Diffusion Models [28.265582848911574]
We propose a timestep-aware correction method for quantized diffusion model, which dynamically corrects the quantization error.
By leveraging the proposed method in low-precision diffusion models, substantial enhancement of output quality could be achieved with only negligible overhead.
arXiv Detail & Related papers (2024-07-04T13:22:31Z) - Provable Statistical Rates for Consistency Diffusion Models [87.28777947976573]
Despite the state-of-the-art performance, diffusion models are known for their slow sample generation due to the extensive number of steps involved.
This paper contributes towards the first statistical theory for consistency models, formulating their training as a distribution discrepancy minimization problem.
arXiv Detail & Related papers (2024-06-23T20:34:18Z) - Improving Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architectures [12.703947839247693]
Diffusion models, emerging as powerful deep generative tools, excel in various applications.
However, their remarkable generative performance is hindered by slow training and sampling.
This is due to the necessity of tracking extensive forward and reverse diffusion trajectories.
We present a multi-stage framework inspired by our empirical findings to tackle these challenges.
arXiv Detail & Related papers (2023-12-14T17:48:09Z) - Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - Q-Diffusion: Quantizing Diffusion Models [52.978047249670276]
Post-training quantization (PTQ) is considered a go-to compression method for other tasks.
We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture.
We show that our proposed method is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance.
arXiv Detail & Related papers (2023-02-08T19:38:59Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.