Related papers: Fast Sampling Through The Reuse Of Attention Maps In Diffusion Models

URL: http://arxiv.org/abs/2401.01008v2
Date: Fri, 24 May 2024 16:23:38 GMT
Title: Fast Sampling Through The Reuse Of Attention Maps In Diffusion Models
Authors: Rosco Hunter, Łukasz Dudziak, Mohamed S. Abdelfattah, Abhinav Mehrotra, Sourav Bhattacharya, Hongkai Wen
Abstract summary: Text-to-image diffusion models have demonstrated unprecedented capabilities for flexible and realistic image synthesis. These models rely on a time-consuming sampling procedure, which has motivated attempts to reduce their latency. Our approach seeks to reduce latency directly by reusing attention maps during sampling, without any retraining, fine-tuning, or knowledge distillation. We empirically compare these reuse strategies with few-step sampling procedures of comparable latency, finding that reuse generates images that are closer to those produced by the original high-latency diffusion model.
Score: 11.257468339231362
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Text-to-image diffusion models have demonstrated unprecedented capabilities for flexible and realistic image synthesis. Nevertheless, these models rely on a time-consuming sampling procedure, which has motivated attempts to reduce their latency. When improving efficiency, researchers often use the original diffusion model to train an additional network designed specifically for fast image generation. In contrast, our approach seeks to reduce latency directly, without any retraining, fine-tuning, or knowledge distillation. In particular, we find the repeated calculation of attention maps to be costly yet redundant, and instead suggest reusing them during sampling. Our specific reuse strategies are based on ODE theory, which implies that the later a map is reused, the smaller the distortion in the final image. We empirically compare these reuse strategies with few-step sampling procedures of comparable latency, finding that reuse generates images that are closer to those produced by the original high-latency diffusion model.
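The reuse idea in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the names `ReusableAttention`, `toy_sample`, and `reuse_from`, the update rule, and the schedule (recompute the map in the earlier steps, reuse the cached map from step `reuse_from` onward) are all assumptions made for illustration. It only shows the mechanism of skipping the softmax(QK^T) computation when a cached attention map is available.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(q, k):
    """Row-wise softmax of the scaled dot products q @ k^T / sqrt(d)."""
    d = len(q[0])
    return [
        softmax([sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k])
        for qi in q
    ]


def apply_map(attn, v):
    """Weighted sum of the value rows: attn @ v."""
    cols = range(len(v[0]))
    return [[sum(w * vj[c] for w, vj in zip(row, v)) for c in cols] for row in attn]


class ReusableAttention:
    """Hypothetical attention layer that can reuse its last attention map."""

    def __init__(self):
        self.cached = None
        self.maps_computed = 0

    def __call__(self, q, k, v, reuse=False):
        # Recompute the map unless reuse is requested and a cached map exists.
        if not reuse or self.cached is None:
            self.cached = attention_map(q, k)
            self.maps_computed += 1
        return apply_map(self.cached, v)


def toy_sample(num_steps, reuse_from):
    """Toy loop (assumed schedule): fresh maps before reuse_from, cached after."""
    layer = ReusableAttention()
    x = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # stand-in for latent tokens
    for step in range(num_steps):
        out = layer(x, x, x, reuse=(step >= reuse_from))
        # Stand-in update; a real sampler would apply a denoising step here.
        x = [[0.9 * a + 0.1 * b for a, b in zip(xr, orow)] for xr, orow in zip(x, out)]
    return x, layer.maps_computed
```

In this sketch, `toy_sample(10, 4)` computes the attention map for the first four steps only and reuses the last one for the remaining six, so the saving grows as `reuse_from` shrinks. How that choice affects the final image is the question the paper studies and is not modeled here.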

Related papers

PWD: Prior-Guided and Wavelet-Enhanced Diffusion Model for Limited-Angle CT [6.532073662427578]
We propose a prior information embedding and wavelet feature fusion fast sampling diffusion model for LACT reconstruction. The PWD enables efficient sampling while preserving reconstruction fidelity in LACT. Using only 50 sampling steps, PWD achieves at least 1.7 dB improvement in PSNR and 10% gain in SSIM.
arXiv Detail & Related papers (2025-06-30T08:28:32Z)
Time Step Generating: A Universal Synthesized Deepfake Image Detector [0.4488895231267077]
We propose a universal synthetic image detector, Time Step Generating (TSG). TSG does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms. We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
arXiv Detail & Related papers (2024-11-17T09:39:50Z)
Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models. We propose an algorithm for fast constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization [97.35427957922714]
We present an algorithm named pairwise sample optimization (PSO), which enables the direct fine-tuning of an arbitrary timestep-distilled diffusion model. PSO introduces additional reference images sampled from the current timestep-distilled model, and increases the relative likelihood margin between the training images and reference images. We show that PSO can directly adapt distilled models to human-preferred generation with both offline and online-generated pairwise preference image data.
arXiv Detail & Related papers (2024-10-04T07:05:16Z)
Sequential Posterior Sampling with Diffusion Models [15.028061496012924]
We propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis. We demonstrate the effectiveness of our approach on a real-world dataset of high frame rate cardiac ultrasound images. Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
arXiv Detail & Related papers (2024-09-09T07:55:59Z)
One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts. Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation. We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
Lossy Image Compression with Foundation Diffusion Models [10.407650300093923]
In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent. Our approach allows us to perform less than 10% of the full diffusion generative process and requires no architectural changes to the diffusion model.
arXiv Detail & Related papers (2024-04-12T16:23:42Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet they are time-consuming, consume excessive computational resources, and produce unstable restorations. We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval [68.7008281316644]
ReDi is a learning-free Retrieval-based Diffusion sampling framework. We show that ReDi speeds up model inference by 2x.
arXiv Detail & Related papers (2023-02-05T03:01:28Z)
Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction [31.61199061999173]
Diffusion models have a critical downside: they are inherently slow to sample from, needing a few thousand steps of iteration to generate images from pure Gaussian noise. We show that starting from Gaussian noise is unnecessary. Instead, starting from a single forward diffusion with better initialization significantly reduces the number of sampling steps in the reverse conditional diffusion. The new sampling strategy, dubbed ComeCloser-DiffuseFaster (CCDF), also reveals a new insight on how the existing feedforward neural network approaches for inverse problems can be synergistically combined with the diffusion models.
arXiv Detail & Related papers (2021-12-09T04:28:41Z)
Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models. Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.