Optimal Linear Subspace Search: Learning to Construct Fast and
High-Quality Schedulers for Diffusion Models
- URL: http://arxiv.org/abs/2305.14677v2
- Date: Fri, 11 Aug 2023 03:11:41 GMT
- Title: Optimal Linear Subspace Search: Learning to Construct Fast and
High-Quality Schedulers for Diffusion Models
- Authors: Zhongjie Duan, Chengyu Wang, Cen Chen, Jun Huang and Weining Qian
- Abstract summary: The key issue currently limiting the application of diffusion models is their extremely slow generation process.
We propose a novel method called Optimal Linear Subspace Search (OLSS).
OLSS is able to generate high-quality images with a very small number of steps.
- Score: 18.026820439151404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, diffusion models have become the most popular and powerful
methods in the field of image synthesis, even rivaling human artists in
artistic creativity. However, the key issue currently limiting the
application of diffusion models is their extremely slow generation process.
Although several methods have been proposed to speed up generation, there
remains a trade-off between efficiency and quality. In this paper, we first
provide a detailed theoretical and empirical analysis of the generation
process of scheduler-based diffusion models. We transform the problem of
designing schedulers into the determination of several parameters, and
further transform the accelerated generation process into an expansion
process of the linear subspace. Based on these analyses, we propose a
novel method
called Optimal Linear Subspace Search (OLSS), which accelerates the generation
process by searching for the optimal approximation process of the complete
generation process in the linear subspaces spanned by latent variables. OLSS is
able to generate high-quality images with a very small number of steps. To
demonstrate the effectiveness of our method, we conduct extensive comparative
experiments on open-source diffusion models. Experimental results show that
with a given number of steps, OLSS can significantly improve the quality of
generated images. Using an NVIDIA A100 GPU, we make it possible to generate
a high-quality image with Stable Diffusion in under one second, without any
other optimization techniques.
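To make the idea concrete, here is a minimal sketch of the mechanism as we read the abstract: the next latent on a short trajectory is approximated by a linear combination of already-computed latents, with coefficients fit against a reference latent from the full, slow generation process. The least-squares formulation and all names below are our illustration, not the authors' released code.

```python
import numpy as np

def fit_step_coefficients(history, ref_latent):
    """Least-squares fit of one scheduler step (illustrative sketch).

    history    : (k, d) latents already computed on the short trajectory;
                 their span is the "linear subspace".
    ref_latent : (d,) target latent taken from the full, slow reference
                 trajectory that the fast scheduler should approximate.
    Returns (k,) coefficients of the best approximation in that span.
    """
    # Solve min_w || history.T @ w - ref_latent ||^2
    w, *_ = np.linalg.lstsq(history.T, ref_latent, rcond=None)
    return w

# Toy check: a target that lies inside the span is recovered exactly.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 16))             # three latents spanning a subspace
target = 0.2 * H[0] - 0.5 * H[1] + H[2]
print(np.round(fit_step_coefficients(H, target), 3))  # [ 0.2 -0.5  1. ]
```

Under this reading, per-step coefficients like these play the role of the "several parameters" mentioned above, searched so that the short trajectory best approximates the complete one.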
Related papers
- Saliency Guided Optimization of Diffusion Latents [9.237421522280819]
A key challenge in text-to-image generation is optimizing the outputs of a model so that they better align with human intentions or prompts.
Existing optimization methods overlook the fact that, when viewing an image, the human visual system naturally prioritizes salient areas and often neglects less or non-salient regions.
We propose Saliency Guided Optimization Of Latents (SGOOL) to address this alignment challenge effectively and efficiently.
arXiv Detail & Related papers (2024-10-14T08:12:42Z)
- Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization [97.35427957922714]
We present an algorithm named pairwise sample optimization (PSO), which enables the direct fine-tuning of an arbitrary timestep-distilled diffusion model.
PSO introduces additional reference images sampled from the current timestep-distilled model and increases the relative likelihood margin between the training images and the reference images.
We show that PSO can directly adapt distilled models to human-preferred generation with both offline and online-generated pairwise preference image data.
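As a rough illustration of a pairwise likelihood-margin objective (DPO-style; the exact PSO loss and how likelihoods are estimated for distilled samplers are in the paper, and the names below are ours):

```python
import torch
import torch.nn.functional as F

def pairwise_margin_loss(logp_train, logp_ref, beta=0.1):
    """Illustrative pairwise objective: raise the model's likelihood of
    preferred training images relative to reference images sampled from
    the current distilled model, via a logistic margin.
    """
    return -F.logsigmoid(beta * (logp_train - logp_ref)).mean()

# Toy usage with made-up per-sample log-likelihoods.
logp_train = torch.tensor([-3.1, -2.7])  # preferred (training) images
logp_ref = torch.tensor([-2.9, -3.4])    # model's own generations
print(pairwise_margin_loss(logp_train, logp_ref))
```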
arXiv Detail & Related papers (2024-10-04T07:05:16Z)
- Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR leverages an overall U-shaped architecture and adopts a uniform isotropic design for all transformer blocks.
We analyze the limitations of the widely used AdaLN and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
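A generic sketch of output distillation with a hypothetical time-dependent weight (TAD-SR's actual time-aware design differs; the linear weight and all names below are our assumptions):

```python
import torch
import torch.nn.functional as F

def time_weighted_distill_loss(student_out, teacher_out, t, total_t=1000):
    """Match a one-step student to a multi-step teacher, weighting the
    per-sample loss by timestep. The linear weight is a placeholder for
    whatever time-aware weighting the method actually uses.
    """
    w = 1.0 - t.float() / total_t                  # (B,) per-sample weight
    per_sample = F.mse_loss(student_out, teacher_out,
                            reduction="none").mean(dim=(1, 2, 3))
    return (w * per_sample).mean()

# Toy usage: batch of 2 RGB 8x8 "images" at timesteps 100 and 900.
s_out = torch.randn(2, 3, 8, 8)
t_out = torch.randn(2, 3, 8, 8)
print(time_weighted_distill_loss(s_out, t_out, torch.tensor([100, 900])))
```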
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis [22.02829139522153]
We propose an efficient time step sampling method based on an image spectral analysis of the diffusion process.
Instead of the traditional uniform distribution-based time step sampling, we introduce a Beta distribution-like sampling technique.
Our hypothesis is that certain steps exhibit significant changes in image content, while others contribute minimally.
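A minimal sketch of the sampling idea, assuming inverse-CDF placement of steps under a Beta density (the shape parameters and rounding details here are ours, not the paper's):

```python
import numpy as np
from scipy.stats import beta as beta_dist

def beta_timesteps(num_steps, total_steps=1000, a=0.5, b=0.5):
    """Place the few sampling steps by the quantiles of a Beta(a, b)
    density instead of a uniform grid. With a, b < 1 the density piles
    mass at both ends of [0, 1], i.e. the early and late parts of the
    trajectory where image content changes most (per the hypothesis
    above); a uniform grid is recovered with a = b = 1.
    """
    q = (np.arange(num_steps) + 0.5) / num_steps   # evenly spaced quantiles
    pos = beta_dist.ppf(q, a, b)                   # inverse CDF
    # Rounding may merge neighbors, so fewer unique steps can remain.
    steps = np.unique(np.round(pos * (total_steps - 1)).astype(int))
    return steps[::-1]                             # samplers run high t -> low t

print(beta_timesteps(10))  # dense near t=999 and t=0, sparse mid-trajectory
```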
arXiv Detail & Related papers (2024-07-16T20:53:06Z)
- OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera-orientation-conditioned framework for efficient and multi-view consistent 3D generation from textual prompts.
Our strategy emphasizes an explicit camera-orientation-conditioned feature in the pre-training of a 2D text-to-image diffusion module.
Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also optimizes significantly faster than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
- AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration [57.846038404893626]
We propose to search for the optimal time step sequence and compressed model architecture in a unified framework, achieving effective image generation for diffusion models without any further training.
Experimental results show that our method achieves excellent performance using only a few time steps, e.g., a 17.86 FID score on ImageNet 64 $\times$ 64 with only four steps, compared to 138.66 with DDIM.
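The search component can be pictured with a simple training-free loop; AutoDiffusion itself uses an evolutionary search over both timesteps and architectures, so the plain random search and the `score_fn` below are only a stand-in:

```python
import random

def search_timestep_sequence(num_steps, total_steps, score_fn,
                             trials=200, seed=0):
    """Training-free search sketch: propose candidate timestep
    sequences and keep the one that a quality proxy (e.g. FID on a
    small batch of generations) scores best. Lower score is better.
    """
    rng = random.Random(seed)
    best_seq, best_score = None, float("inf")
    for _ in range(trials):
        seq = sorted(rng.sample(range(total_steps), num_steps), reverse=True)
        score = score_fn(seq)
        if score < best_score:
            best_seq, best_score = seq, score
    return best_seq, best_score

# Toy usage: a fake proxy that rewards sequences reaching low timesteps.
print(search_timestep_sequence(4, 1000, score_fn=lambda s: min(s)))
```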
arXiv Detail & Related papers (2023-09-19T08:57:24Z)
- Diffusion Sampling with Momentum for Mitigating Divergence Artifacts [10.181486597424486]
We investigate the potential causes of divergence artifacts and suggest that the small stability regions of numerical methods could be the principal cause.
The first of the two proposed techniques incorporates Heavy Ball (HB) momentum, a well-known technique for improving optimization, into existing diffusion numerical methods to expand their stability regions.
The second technique, called Generalized Heavy Ball (GHVB), constructs a new high-order method that offers a variable trade-off between accuracy and artifact suppression.
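A sketch of the first technique under our reading: replace the raw update direction in a first-order sampler step with a Heavy Ball moving average (the Euler-style update and the constants are illustrative; GHVB generalizes this to higher-order methods):

```python
import torch

def hb_sampler_step(x, eps_pred, velocity, step_size, beta=0.8):
    """One Euler-style denoising update with Heavy Ball (HB) momentum.
    Averaging the update direction enlarges the stability region of the
    underlying numerical method, suppressing divergence artifacts.
    """
    velocity = beta * velocity + (1.0 - beta) * eps_pred  # HB average
    x_next = x - step_size * velocity                     # damped step
    return x_next, velocity

# Toy usage: momentum state starts at zero and is threaded through steps.
x = torch.randn(1, 4, 8, 8)
v = torch.zeros_like(x)
x, v = hb_sampler_step(x, eps_pred=torch.randn_like(x), velocity=v,
                       step_size=0.1)
```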
arXiv Detail & Related papers (2023-07-20T14:37:30Z)
- Nested Diffusion Processes for Anytime Image Generation [38.84966342097197]
We propose an anytime diffusion-based method that can generate viable images when stopped at arbitrary times before completion.
In experiments on ImageNet and Stable Diffusion-based text-to-image generation, we show, both qualitatively and quantitatively, that our method's intermediate generation quality greatly exceeds that of the original diffusion model.
arXiv Detail & Related papers (2023-05-30T14:28:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.