SpeedUpNet: A Plug-and-Play Hyper-Network for Accelerating Text-to-Image
Diffusion Models
- URL: http://arxiv.org/abs/2312.08887v3
- Date: Wed, 20 Dec 2023 08:02:09 GMT
- Title: SpeedUpNet: A Plug-and-Play Hyper-Network for Accelerating Text-to-Image
Diffusion Models
- Authors: Weilong Chai, DanDan Zheng, Jiajiong Cao, Zhiquan Chen, Changbao Wang,
Chenguang Ma
- Abstract summary: We propose a novel Stable-Diffusion (SD) acceleration module called SpeedUpNet (SUN).
SUN can be directly plugged into various fine-tuned SD models without extra training.
SUN significantly reduces the number of inference steps to just 4 steps and eliminates the need for classifier-free guidance.
- Score: 4.484567783139048
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Text-to-image diffusion models (SD) exhibit significant advancements while
requiring extensive computational resources. Though many acceleration methods
have been proposed, they suffer from generation quality degradation or extra
training cost when generalizing to new fine-tuned models. To address these
limitations, we propose a novel and universal Stable-Diffusion (SD)
acceleration module called SpeedUpNet (SUN). SUN can be directly plugged into
various fine-tuned SD models without extra training. This technique utilizes
cross-attention layers to learn the relative offsets in the generated image
results between negative and positive prompts, achieving classifier-free
guidance distillation with negative prompts controllable, and introduces a
Multi-Step Consistency (MSC) loss to ensure a harmonious balance between
reducing inference steps and maintaining consistency in the generated output.
Consequently, SUN significantly reduces the number of inference steps to just 4
steps and eliminates the need for classifier-free guidance. It leads to an
overall speedup of more than 10 times for SD models compared to the
state-of-the-art 25-step DPM-solver++, and offers two extra advantages: (1)
classifier-free guidance distillation with controllable negative prompts and
(2) seamless integration into various fine-tuned Stable-Diffusion models
without training. The effectiveness of SUN has been verified through
extensive experimentation. Project Page:
https://williechai.github.io/speedup-plugin-for-stable-diffusions.github.io
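The speedup figure in the abstract can be sanity-checked with a rough count of denoiser evaluations. The sketch below is not taken from the paper: it assumes that one denoiser (UNet) forward pass dominates the per-step cost, that classifier-free guidance needs two passes per step (one for the negative or unconditional prompt, one for the positive prompt), and that the overhead of the SUN module, the text encoder, and the image decoder can be ignored. The function names are hypothetical.

```python
# Rough cost model (assumption): cost is proportional to the number of
# denoiser forward passes; all other overheads are ignored.

def denoiser_evals(steps: int, uses_cfg: bool) -> int:
    """Count denoiser forward passes needed to generate one image."""
    passes_per_step = 2 if uses_cfg else 1  # CFG evaluates two prompts per step
    return steps * passes_per_step


def cfg_combine(eps_neg: float, eps_pos: float, guidance_scale: float) -> float:
    """Standard classifier-free guidance: start from the negative-prompt
    prediction and move toward the positive-prompt prediction."""
    return eps_neg + guidance_scale * (eps_pos - eps_neg)


baseline = denoiser_evals(steps=25, uses_cfg=True)  # 25-step solver with CFG
sun = denoiser_evals(steps=4, uses_cfg=False)       # 4 steps, guidance distilled
speedup = baseline / sun

print(baseline, sun, speedup)  # 50 4 12.5
```

Under these assumptions the estimate is 12.5x, which is in line with the "more than 10 times" stated in the abstract; the real figure depends on the overheads ignored here.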
Related papers
- FORA: Fast-Forward Caching in Diffusion Transformer Acceleration [39.51519525071639]
Diffusion transformers (DiT) have become the de facto choice for generating high-quality images and videos.
Fast-FORward CAching (FORA) is designed to accelerate DiT by exploiting the repetitive nature of the diffusion process.
arXiv Detail & Related papers (2024-07-01T16:14:37Z)
- Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis [20.2271205957037]
Hyper-SD is a novel framework that amalgamates the advantages of ODE Trajectory Preservation and Reformulation.
We introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments.
We incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process.
arXiv Detail & Related papers (2024-04-21T15:16:05Z)
- Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation [24.236841051249243]
Distillation methods aim to shift the model from many-shot to single-step inference.
We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD.
In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models.
arXiv Detail & Related papers (2024-03-18T17:51:43Z)
- T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with
Trajectory Stitching [143.72720563387082]
Trajectory Stitching (T-Stitch) is a simple yet efficient technique to improve the sampling efficiency with little or no generation degradation.
Our key insight is that different diffusion models learn similar encodings under the same training data distribution.
Our method can also be used as a drop-in technique to accelerate the popular pretrained stable diffusion (SD) models.
arXiv Detail & Related papers (2024-02-21T23:08:54Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and
Performance Optimization [54.113083217869516]
In this work, we first explore the computationally redundant parts of the network.
We then prune the redundant blocks of the model while maintaining the network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models [59.90959789767886]
We show that optimizing consistency training loss minimizes the Wasserstein distance between target and generated distributions.
By incorporating a discriminator into the consistency training framework, our method achieves improved FID scores on the CIFAR10, ImageNet 64$\times$64, and LSUN Cat 256$\times$256 datasets.
arXiv Detail & Related papers (2023-11-23T16:49:06Z)
- SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results.
But their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z)
- Towards More Accurate Diffusion Model Acceleration with A Timestep
Aligner [84.97253871387028]
A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed.
We propose a timestep aligner that helps find a more accurate integral direction for a particular interval at the minimum cost.
Experiments show that our plug-in design can be trained efficiently and boost the inference performance of various state-of-the-art acceleration methods.
arXiv Detail & Related papers (2023-10-14T02:19:07Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with
Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- Wavelet Diffusion Models are fast and scalable Image Generators [3.222802562733787]
Diffusion models are a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances.
The recent DiffusionGAN method significantly decreases the models' running time by reducing the number of sampling steps from thousands to several, but its speed still lags far behind that of its GAN counterparts.
This paper aims to reduce the speed gap by proposing a novel wavelet-based diffusion scheme.
We extract low-and-high frequency components from both image and feature levels via wavelet decomposition and adaptively handle these components for faster processing while maintaining good generation quality.
arXiv Detail & Related papers (2022-11-29T12:25:25Z)