Fast Sampling Through The Reuse Of Attention Maps In Diffusion Models
- URL: http://arxiv.org/abs/2401.01008v2
- Date: Fri, 24 May 2024 16:23:38 GMT
- Title: Fast Sampling Through The Reuse Of Attention Maps In Diffusion Models
- Authors: Rosco Hunter, Ćukasz Dudziak, Mohamed S. Abdelfattah, Abhinav Mehrotra, Sourav Bhattacharya, Hongkai Wen,
- Abstract summary: Text-to-image diffusion models have demonstrated unprecedented capabilities for flexible and realistic image synthesis.
These models rely on a time-consuming sampling procedure, which has motivated attempts to reduce their latency.
Our approach seeks to reduce latency directly, without any retraining, fine-tuning, or knowledge distillation.
We empirically compare these reuse strategies with few-step sampling procedures of comparable latency, finding that reuse generates images that are closer to those produced by the original high-latency diffusion model.
- Score: 11.257468339231362
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Text-to-image diffusion models have demonstrated unprecedented capabilities for flexible and realistic image synthesis. Nevertheless, these models rely on a time-consuming sampling procedure, which has motivated attempts to reduce their latency. When improving efficiency, researchers often use the original diffusion model to train an additional network designed specifically for fast image generation. In contrast, our approach seeks to reduce latency directly, without any retraining, fine-tuning, or knowledge distillation. In particular, we find the repeated calculation of attention maps to be costly yet redundant, and instead suggest reusing them during sampling. Our specific reuse strategies are based on ODE theory, which implies that the later a map is reused, the smaller the distortion in the final image. We empirically compare these reuse strategies with few-step sampling procedures of comparable latency, finding that reuse generates images that are closer to those produced by the original high-latency diffusion model.
Related papers
- Lossy Image Compression with Foundation Diffusion Models [10.407650300093923]
In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent.
Our approach allows us to perform less than 10% of the full diffusion generative process and requires no architectural changes to the backbone.
arXiv Detail & Related papers (2024-04-12T16:23:42Z) - ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z) - Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration.
Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration.
Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z) - Deceptive-NeRF/3DGS: Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction [60.52716381465063]
We introduce Deceptive-NeRF/3DGS to enhance sparse-view reconstruction with only a limited set of input images.
Specifically, we propose a deceptive diffusion model turning noisy images rendered from few-view reconstructions into high-quality pseudo-observations.
Our system progressively incorporates diffusion-generated pseudo-observations into the training image sets, ultimately densifying the sparse input observations by 5 to 10 times.
arXiv Detail & Related papers (2023-05-24T14:00:32Z) - ReDi: Efficient Learning-Free Diffusion Inference via Trajectory
Retrieval [68.7008281316644]
ReDi is a learning-free Retrieval-based Diffusion sampling framework.
We show that ReDi improves the model inference efficiency by 2x speedup.
arXiv Detail & Related papers (2023-02-05T03:01:28Z) - Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models
for Inverse Problems through Stochastic Contraction [31.61199061999173]
Diffusion models have a critical downside - they are inherently slow to sample from, needing few thousand steps of iteration to generate images from pure Gaussian noise.
We show that starting from Gaussian noise is unnecessary. Instead, starting from a single forward diffusion with better initialization significantly reduces the number of sampling steps in the reverse conditional diffusion.
New sampling strategy, dubbed ComeCloser-DiffuseFaster (CCDF), also reveals a new insight on how the existing feedforward neural network approaches for inverse problems can be synergistically combined with the diffusion models.
arXiv Detail & Related papers (2021-12-09T04:28:41Z) - Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z) - Inverting Adversarially Robust Networks for Image Synthesis [37.927552662984034]
We propose the use of robust representations as a perceptual primitive for feature inversion models.
We empirically show that adopting robust representations as an image prior significantly improves the reconstruction accuracy of CNN-based feature inversion models.
Following these findings, we propose an encoding-decoding network based on robust representations and show its advantages for applications such as anomaly detection, style transfer and image denoising.
arXiv Detail & Related papers (2021-06-13T05:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.