TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
- URL: http://arxiv.org/abs/2311.16503v3
- Date: Mon, 11 Mar 2024 10:40:40 GMT
- Title: TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
- Authors: Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu
- Abstract summary: Diffusion models heavily depend on the time-step $t$ to achieve satisfactory multi-round denoising.
We propose a Temporal Feature Maintenance Quantization (TFMQ) framework building upon a Temporal Information Block.
Based on this block design, we devise temporal information aware reconstruction (TIAR) and finite set calibration (FSC) to align with the full-precision temporal features.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Diffusion model, a prevalent framework for image generation, encounters
significant challenges in terms of broad applicability due to its extended
inference times and substantial memory requirements. Efficient Post-training
Quantization (PTQ) is pivotal for addressing these issues in traditional
models. Unlike traditional models, diffusion models depend heavily on the
time-step $t$ to achieve satisfactory multi-round denoising. Usually, $t$ from
the finite set $\{1, \ldots, T\}$ is encoded into a temporal feature by a few
modules that are entirely independent of the sampling data. However, existing
PTQ methods do not optimize these modules separately. They adopt inappropriate
reconstruction targets and complex calibration methods, resulting in severe
disturbance of the temporal feature and denoising trajectory, as well as low
compression efficiency. To address these issues, we propose a Temporal Feature
Maintenance Quantization (TFMQ) framework built upon a Temporal Information
Block that depends only on the time-step $t$ and not on the sampling data.
Based on this block design, we devise temporal information aware reconstruction
(TIAR) and finite set calibration (FSC) to align with the full-precision
temporal features in a limited time. Equipped with this framework, we can
preserve most of the temporal information and ensure end-to-end generation
quality. Extensive experiments on various datasets and diffusion models
demonstrate our state-of-the-art results. Remarkably, our
quantization approach, for the first time, achieves model performance nearly on
par with the full-precision model under 4-bit weight quantization.
Additionally, our method incurs almost no extra computational cost and
accelerates quantization time by $2.0 \times$ on LSUN-Bedrooms $256 \times 256$
compared to previous works. Our code is publicly available at
https://github.com/ModelTC/TFMQ-DM.
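
To make the abstract's point concrete: in a standard DDPM-style UNet, the time-step is mapped to a temporal feature by a sinusoidal encoding followed by a small MLP, and this path never sees the sample being denoised. Below is a minimal PyTorch sketch of such modules; the names and dimensions are illustrative assumptions, not taken from the TFMQ-DM codebase.

```python
# Minimal sketch of a data-independent time-embedding path (standard DDPM
# UNet design). Module names and sizes are illustrative assumptions.
import math
import torch
import torch.nn as nn

def sinusoidal_embedding(t: torch.Tensor, dim: int) -> torch.Tensor:
    """DDPM-style sinusoidal encoding of integer time-steps."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

class TimeEmbedding(nn.Module):
    """Maps t in {1, ..., T} to a temporal feature.

    The forward pass takes only t, never the noisy sample x_t, which is
    why these modules can be treated separately during quantization.
    """
    def __init__(self, dim: int = 128, hidden: int = 512):
        super().__init__()
        self.dim = dim
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.SiLU(),
            nn.Linear(hidden, hidden),
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        return self.mlp(sinusoidal_embedding(t, self.dim))
```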
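Because these modules only ever see inputs from the finite set $\{1, \ldots, T\}$, their quantization parameters can in principle be calibrated by enumerating every time-step exactly once, with no image data at all. The sketch below illustrates that general idea for a uniform asymmetric quantizer; it is one plausible reading of what finite set calibration exploits, not the paper's exact FSC procedure.

```python
# Hedged sketch: calibrate a uniform asymmetric quantizer for the temporal
# features by exhaustively enumerating the finite input set {1, ..., T}.
# This illustrates the general idea only; TFMQ-DM's actual FSC may differ.
import torch

@torch.no_grad()
def calibrate_time_embedding(embed, T: int, num_bits: int = 8):
    """Return (scale, zero_point) covering all T possible temporal features."""
    t_all = torch.arange(1, T + 1)      # the entire finite input set
    feats = embed(t_all)                # full-precision temporal features
    lo, hi = feats.min(), feats.max()   # exact range: inputs are exhaustive
    qmax = 2 ** num_bits - 1
    scale = (hi - lo).clamp(min=1e-8) / qmax
    zero_point = torch.round(-lo / scale)
    return scale, zero_point

# Example with the illustrative TimeEmbedding above and a DDPM-like T:
# scale, zp = calibrate_time_embedding(TimeEmbedding(), T=1000)
```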
Related papers
- Retrieval-Augmented Diffusion Models for Time Series Forecasting [19.251274915003265]
We propose a Retrieval-Augmented Time series Diffusion model (RATD).
RATD consists of two parts: an embedding-based retrieval process and a reference-guided diffusion model.
Our approach allows leveraging meaningful samples within the database to aid in sampling, thus maximizing the utilization of datasets.
arXiv Detail & Related papers (2024-10-24T13:14:39Z)
- MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters [6.733646592789575]
Long-term Time Series Forecasting (LTSF) involves predicting long-term values by analyzing a large amount of historical time-series data to identify patterns and trends.
Transformer-based models offer high forecasting accuracy, but they are often too compute-intensive to be deployed on devices with hardware constraints.
We propose MixLinear, an ultra-lightweight time series forecasting model specifically designed for resource-constrained devices.
arXiv Detail & Related papers (2024-10-02T23:04:57Z)
- Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for multi-round denoising.
We introduce a novel quantization framework that includes three strategies.
This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
- TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models [40.5153344875351]
We introduce TMPQ-DM, which jointly optimizes timestep reduction and quantization to achieve a superior performance-efficiency trade-off.
For timestep reduction, we devise a non-uniform grouping scheme tailored to the non-uniform nature of the denoising process.
In terms of quantization, we adopt a fine-grained layer-wise approach to allocate varying bit-widths to different layers based on their respective contributions to the final generative performance.
arXiv Detail & Related papers (2024-04-15T07:51:40Z)
- One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS elevates image fidelity and reconciles the discrepancy between training and inference, while preserving the original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model class, "Denoising Diffusion Probabilistic Models" (DDPMs), for chirographic data.
Our model, named "ChiroDiff", is non-autoregressive; it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Q-Diffusion: Quantizing Diffusion Models [52.978047249670276]
Post-training quantization (PTQ) is considered a go-to compression method for other tasks.
We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture.
We show that our proposed method is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance.
arXiv Detail & Related papers (2023-02-08T19:38:59Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)