BiDM: Pushing the Limit of Quantization for Diffusion Models
- URL: http://arxiv.org/abs/2412.05926v1
- Date: Sun, 08 Dec 2024 12:45:21 GMT
- Title: BiDM: Pushing the Limit of Quantization for Diffusion Models
- Authors: Xingyu Zheng, Xianglong Liu, Yichen Bian, Xudong Ma, Yulun Zhang, Jiakai Wang, Jinyang Guo, Haotong Qin
- Abstract summary: This paper proposes a novel method, namely BiDM, for fully binarizing weights and activations of DMs, pushing quantization to the 1-bit limit.
As the first work to fully binarize DMs, the W1A1 BiDM on the LDM-4 model for LSUN-Bedrooms 256$\times$256 achieves a remarkable FID of 22.74.
- Abstract: Diffusion models (DMs) have been significantly developed and widely used in various applications due to their excellent generative qualities. However, the expensive computation and massive parameters of DMs hinder their practical use in resource-constrained scenarios. As an effective compression approach, quantization allows DMs to achieve storage savings and inference acceleration by reducing bit-width while maintaining generation performance. However, as the most extreme form of quantization, 1-bit binarization causes the generation performance of DMs to degrade severely or even collapse. This paper proposes a novel method, namely BiDM, for fully binarizing the weights and activations of DMs, pushing quantization to the 1-bit limit. From a temporal perspective, we introduce the Timestep-friendly Binary Structure (TBS), which uses learnable activation binarizers and cross-timestep feature connections to handle the highly timestep-correlated activation features of DMs. From a spatial perspective, we propose Space Patched Distillation (SPD) to address the difficulty of matching binary features during distillation, focusing on the spatial locality of image generation tasks and noise estimation networks. As the first work to fully binarize DMs, the W1A1 BiDM on the LDM-4 model for LSUN-Bedrooms 256$\times$256 achieves a remarkable FID of 22.74, significantly outperforming the current state-of-the-art general binarization methods, whose best FID of 59.44 comes with invalid generative samples, and achieves up to 28.0$\times$ storage and 52.7$\times$ OPs savings. The code is available at https://github.com/Xingyu-Zheng/BiDM .
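For readers unfamiliar with W1A1 quantization, below is a minimal PyTorch sketch of a 1-bit convolution with a learnable activation binarizer, in the spirit of (but not identical to) BiDM's Timestep-friendly Binary Structure. The learnable scale/shift parameters and the straight-through estimator are standard binarization ingredients assumed for illustration, not the authors' exact design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BinarizeSTE(torch.autograd.Function):
        """sign() in the forward pass; straight-through gradient in the backward."""
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return torch.sign(x)

        @staticmethod
        def backward(ctx, grad_out):
            (x,) = ctx.saved_tensors
            # Zero gradients outside [-1, 1], the usual STE clipping heuristic.
            return grad_out * (x.abs() <= 1).to(grad_out.dtype)

    class BinaryConv2d(nn.Module):
        def __init__(self, in_ch, out_ch, k=3, padding=1):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
            # Learnable per-channel scale and shift for the activation binarizer.
            self.alpha = nn.Parameter(torch.ones(1, in_ch, 1, 1))
            self.beta = nn.Parameter(torch.zeros(1, in_ch, 1, 1))
            self.padding = padding

        def forward(self, x):
            # Binarize activations around a learnable threshold.
            xb = BinarizeSTE.apply((x - self.beta) / self.alpha)
            # Binarize weights; keep a per-output-channel FP scale (XNOR-Net style).
            w_scale = self.weight.abs().mean(dim=(1, 2, 3), keepdim=True)
            wb = BinarizeSTE.apply(self.weight) * w_scale
            return F.conv2d(xb, wb, padding=self.padding)

BiDM's cross-timestep feature connections and Space Patched Distillation would sit on top of such layers during training and are omitted here.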
Related papers
- BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution
We propose BiMaCoSR, which combines binarization and one-step distillation to obtain extreme compression and acceleration.
BiMaCoSR achieves a 23.8x compression ratio and a 27.4x speedup ratio compared to the FP counterpart.
arXiv Detail & Related papers (2025-02-01T06:34:55Z)
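The summary does not spell out the compression mechanism; a common pattern for binarized layers of this kind is a 1-bit main branch plus a small full-precision low-rank side branch. The sketch below illustrates that pattern under that assumption; the rank and module names are illustrative, not from the paper.

    import torch
    import torch.nn as nn

    class BinaryLinearLowRank(nn.Module):
        def __init__(self, in_f, out_f, r=8):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.02)
            # Low-rank FP branch: r*(in_f+out_f) parameters instead of in_f*out_f.
            self.down = nn.Linear(in_f, r, bias=False)
            self.up = nn.Linear(r, out_f, bias=False)

        def forward(self, x):
            w_scale = self.weight.abs().mean(dim=1, keepdim=True)
            wb = torch.sign(self.weight) * w_scale       # 1-bit main branch
            return x @ wb.t() + self.up(self.down(x))    # plus low-rank correction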
- MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
This paper presents MPQ-DM, a Mixed-Precision Quantization method for Diffusion Models.
To mitigate the quantization error caused by weight channels with severe outliers, we propose an Outlier-Driven Mixed Quantization technique.
To robustly learn representations across time steps, we construct a Time-Smoothed Relation Distillation scheme.
arXiv Detail & Related papers (2024-12-16T08:31:55Z)
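As a hedged illustration of outlier-driven mixed precision, the sketch below assigns higher bit-widths to weight channels with heavy-tailed distributions; the kurtosis criterion and the fixed bit budget are assumptions for illustration, not MPQ-DM's exact rule.

    import torch

    def allocate_bits(weight: torch.Tensor, low=2, high=4, frac_high=0.25):
        """weight: (out_channels, ...) -> per-channel bit-widths (LongTensor)."""
        w = weight.flatten(1)
        mu = w.mean(dim=1, keepdim=True)
        std = w.std(dim=1, keepdim=True) + 1e-8
        kurtosis = (((w - mu) / std) ** 4).mean(dim=1)   # outlier-severity proxy
        bits = torch.full((w.shape[0],), low, dtype=torch.long)
        k = max(1, int(frac_high * w.shape[0]))
        bits[kurtosis.topk(k).indices] = high            # promote the worst channels
        return bits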
- ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction
Autoregressive models (ARMs) and diffusion models (DMs) represent two leading paradigms in generative modeling.
We introduce Autoregressive Coherent multimodal generation with Diffusion Correction (ACDC).
ACDC combines the strengths of both ARMs and DMs at the inference stage without the need for additional fine-tuning.
arXiv Detail & Related papers (2024-10-07T03:22:51Z)
- DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
Diffusion models (DMs) have demonstrated exceptional generative capabilities across various areas.
The most common way to accelerate DMs involves reducing the number of denoising steps during generation.
We propose a novel method that transfers the capability of large pretrained DMs to faster architectures.
arXiv Detail & Related papers (2024-09-05T14:12:22Z)
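The generic idea behind data-free distillation of a denoising network is to train the student to match a frozen teacher's noise prediction on synthetic inputs, so no real training images are needed. The sketch below shows only that generic pattern; DKDM's actual optimization and data-flow design are more involved.

    import torch
    import torch.nn.functional as F

    def distill_step(student, teacher, optimizer, batch=8, shape=(3, 64, 64), T=1000):
        x_t = torch.randn(batch, *shape)          # synthetic noisy input, no real data
        t = torch.randint(0, T, (batch,))
        with torch.no_grad():
            target = teacher(x_t, t)              # frozen teacher's noise estimate
        loss = F.mse_loss(student(x_t, t), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()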
- Binarized Diffusion Model for Image Super-Resolution
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z)
- BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
This paper proposes a novel weight binarization approach for DMs, namely BinaryDM, pushing binarized DMs to be accurate and efficient.
From the representation perspective, we present an Evolvable-Basis Binarizer (EBB) to enable a smooth evolution of DMs from full-precision to accurately binarized.
Experiments demonstrate that BinaryDM achieves significant accuracy and efficiency gains compared to SOTA quantization methods of DMs under ultra-low bit-widths.
arXiv Detail & Related papers (2024-04-08T16:46:25Z)
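A rough intuition for multi-basis weight binarization (EBB evolves the binarizer during training; that schedule is omitted here) is to approximate a weight tensor with two scaled binary bases, as in this illustrative sketch:

    import torch

    def two_basis_binarize(w: torch.Tensor):
        s1 = w.abs().mean()
        b1 = torch.sign(w)            # first binary basis
        r = w - s1 * b1               # residual after the first basis
        s2 = r.abs().mean()
        b2 = torch.sign(r)            # second binary basis
        return s1 * b1 + s2 * b2      # two binary tensors plus two FP scalars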
- PAMS: Quantized Super-Resolution via Parameterized Max Scale
Deep convolutional neural networks (DCNNs) have shown dominant performance in the task of super-resolution (SR).
We propose a new quantization scheme termed PArameterized Max Scale (PAMS), which applies the trainable truncated parameter to explore the upper bound of the quantization range adaptively.
Experiments demonstrate that the proposed PAMS scheme can well compress and accelerate the existing SR models such as EDSR and RDN.
arXiv Detail & Related papers (2020-11-09T06:16:05Z)
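From the abstract, the core of PAMS is a trainable upper bound on the quantization range. A minimal sketch of such a quantizer, with parameter names assumed for illustration, looks like this:

    import torch
    import torch.nn as nn

    class PAMSQuant(nn.Module):
        def __init__(self, bits=4, init_alpha=6.0):
            super().__init__()
            self.alpha = nn.Parameter(torch.tensor(init_alpha))  # learnable max scale
            self.levels = 2 ** (bits - 1) - 1

        def forward(self, x):
            a = self.alpha.abs()
            xc = torch.min(torch.max(x, -a), a)        # truncate to [-alpha, alpha]
            step = a / self.levels
            xq = torch.round(xc / step) * step         # uniform quantization
            return xc + (xq - xc).detach()             # STE: grad flows via the clamp

Because the clamp is differentiable in alpha, the truncation bound is learned jointly with the network weights, which is what lets the quantization range adapt per layer.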