Clockwork Diffusion: Efficient Generation With Model-Step Distillation
- URL: http://arxiv.org/abs/2312.08128v2
- Date: Tue, 20 Feb 2024 14:50:23 GMT
- Title: Clockwork Diffusion: Efficient Generation With Model-Step Distillation
- Authors: Amirhossein Habibian, Amir Ghodrati, Noor Fathima, Guillaume Sautiere,
Risheek Garrepalli, Fatih Porikli, Jens Petersen
- Abstract summary: Clockwork Diffusion is a method that periodically reuses computation from preceding denoising steps to approximate low-res feature maps at one or more subsequent steps.
For both text-to-image generation and image editing, we demonstrate that Clockwork leads to comparable or improved perceptual scores with drastically reduced computational complexity.
- Score: 42.01130983628078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work aims to improve the efficiency of text-to-image diffusion models.
While diffusion models use computationally expensive UNet-based denoising
operations in every generation step, we identify that not all operations are
equally relevant for the final output quality. In particular, we observe that
UNet layers operating on high-res feature maps are relatively sensitive to
small perturbations. In contrast, low-res feature maps influence the semantic
layout of the final image and can often be perturbed with no noticeable change
in the output. Based on this observation, we propose Clockwork Diffusion, a
method that periodically reuses computation from preceding denoising steps to
approximate low-res feature maps at one or more subsequent steps. For multiple
baselines, and for both text-to-image generation and image editing, we
demonstrate that Clockwork leads to comparable or improved perceptual scores
with drastically reduced computational complexity. As an example, for Stable
Diffusion v1.5 with 8 DPM++ steps we save 32% of FLOPs with negligible FID and
CLIP change.
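The core idea of the abstract (recompute the cheap, perturbation-tolerant low-res branch only periodically and reuse the cached result in between) can be sketched with a toy denoising loop. This is a hypothetical illustration, not the paper's implementation: `lowres_path`, `highres_path`, and the `clock` period are stand-ins for the UNet's internal branches and the reuse schedule.

```python
# Hypothetical sketch of Clockwork-style computation reuse in a denoising
# loop. The real method caches UNet low-resolution feature maps; here the
# "branches" are placeholder scalar transforms on a list-valued latent.

def lowres_path(latent):
    # Expensive low-resolution branch (placeholder transform).
    return [0.5 * v for v in latent]

def highres_path(latent, lowres_feats):
    # Cheaper high-resolution branch that consumes the low-res features.
    return [v - 0.1 * f for v, f in zip(latent, lowres_feats)]

def denoise(latent, steps=8, clock=2):
    """Run `steps` denoising iterations, recomputing the low-res branch
    only every `clock` steps and reusing the cached output otherwise."""
    cached = None
    for t in range(steps):
        if t % clock == 0 or cached is None:
            cached = lowres_path(latent)  # full computation this step
        # otherwise: reuse `cached` from a preceding step (the approximation)
        latent = highres_path(latent, cached)
    return latent

result = denoise([1.0, 2.0, 3.0])
```

With `clock=2`, the low-res branch runs on only half of the steps, which is where the FLOP savings come from; `clock=1` recovers the standard loop with no reuse.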
Related papers
- WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce a model-driven WiNet that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across various scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z)
- LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion [23.729378821117123]
Denoising Diffusion Probabilistic Model (DDPM) holds promise for low-light image enhancement in medical field.
DDPMs are computationally demanding and slow, limiting their practical medical applications.
We propose a lightweight DDPM, dubbed LighTDiff, to capture global structural information using low-resolution images.
arXiv Detail & Related papers (2024-05-17T05:31:19Z)
- Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference [95.42299246592756]
We study the UNet encoder and empirically analyze the encoder features.
We find that encoder features change minimally, whereas the decoder features exhibit substantial variations across different time-steps.
We validate our approach on other tasks: text-to-video, personalized generation and reference-guided generation.
arXiv Detail & Related papers (2023-12-15T08:46:43Z)
- Cache Me if You Can: Accelerating Diffusion Models through Block Caching [67.54820800003375]
A large image-to-image network has to be applied many times to iteratively refine an image from random noise.
We investigate the behavior of the layers within the network and find that 1) the layers' output changes smoothly over time, 2) the layers show distinct patterns of change, and 3) the change from step to step is often very small.
We propose a technique to automatically determine caching schedules based on each block's changes over timesteps.
arXiv Detail & Related papers (2023-12-06T00:51:38Z)
- Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise [15.702941058218196]
This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes.
The first contribution involves re-parameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise.
The second contribution is to directly estimate both the image ($\mathbf{x}_0$) and noise ($\boldsymbol{\epsilon}$) using our network.
arXiv Detail & Related papers (2023-10-26T05:43:07Z)
- Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner [84.97253871387028]
A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed.
We propose a timestep aligner that helps find a more accurate integral direction for a particular interval at the minimum cost.
Experiments show that our plug-in design can be trained efficiently and boost the inference performance of various state-of-the-art acceleration methods.
arXiv Detail & Related papers (2023-10-14T02:19:07Z)
- Gradpaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z)
- Effective Real Image Editing with Accelerated Iterative Diffusion Inversion [6.335245465042035]
It is still challenging to edit and manipulate natural images with modern generative models.
Existing approaches that have tackled the problem of inversion stability often incur significant trade-offs in computational efficiency.
We propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity.
arXiv Detail & Related papers (2023-09-10T01:23:05Z)