Elucidating the Design Space of Diffusion-Based Generative Models
- URL: http://arxiv.org/abs/2206.00364v1
- Date: Wed, 1 Jun 2022 10:03:24 GMT
- Title: Elucidating the Design Space of Diffusion-Based Generative Models
- Authors: Tero Karras, Miika Aittala, Timo Aila, Samuli Laine
- Abstract summary: We present a design space that clearly separates the concrete design choices.
This lets us identify several changes to both the sampling and training processes, as well as to the preconditioning of the score networks.
Our improvements yield a new state-of-the-art FID of 1.79 for CIFAR-10 in a class-conditional setting and 1.97 in an unconditional setting.
- Score: 37.643953493556765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We argue that the theory and practice of diffusion-based generative models
are currently unnecessarily convoluted and seek to remedy the situation by
presenting a design space that clearly separates the concrete design choices.
This lets us identify several changes to both the sampling and training
processes, as well as preconditioning of the score networks. Together, our
improvements yield a new state-of-the-art FID of 1.79 for CIFAR-10 in a
class-conditional setting and 1.97 in an unconditional setting, with much
faster sampling (35 network evaluations per image) than prior designs. To
further demonstrate their modular nature, we show that our design changes
dramatically improve both the efficiency and quality obtainable with
pre-trained score networks from previous work, including improving the FID of
an existing ImageNet-64 model from 2.07 to near-SOTA 1.55.
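As a concrete illustration of the design choices the abstract refers to (time-step discretization, second-order Heun sampling, and denoiser preconditioning), here is a minimal PyTorch sketch using the default hyperparameters reported in the paper (sigma_min=0.002, sigma_max=80, rho=7, sigma_data=0.5). `raw_net` stands in for any trained backbone F(x, c_noise); this is an illustrative sketch, not the authors' reference implementation.

```python
# Minimal sketch of the EDM-style deterministic sampler: the paper's
# noise-level discretization, denoiser preconditioning, and Heun's
# second-order method. `raw_net` is a placeholder for a trained backbone.
import torch

def edm_sigma_steps(n=18, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels sigma_0 > ... > sigma_{n-1}, plus a final 0."""
    i = torch.arange(n, dtype=torch.float64)
    steps = (sigma_max ** (1 / rho)
             + i / (n - 1) * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    return torch.cat([steps, torch.zeros(1, dtype=torch.float64)])

def precondition(raw_net, x, sigma, sigma_data=0.5):
    """Wrap a raw network F into the denoiser D(x; sigma) used by the sampler."""
    c_skip = sigma_data ** 2 / (sigma ** 2 + sigma_data ** 2)
    c_out = sigma * sigma_data / (sigma ** 2 + sigma_data ** 2) ** 0.5
    c_in = 1 / (sigma ** 2 + sigma_data ** 2) ** 0.5
    c_noise = torch.log(torch.as_tensor(sigma)) / 4
    return c_skip * x + c_out * raw_net(c_in * x, c_noise)

@torch.no_grad()
def heun_sampler(raw_net, shape, n=18):
    sigmas = edm_sigma_steps(n)
    x = torch.randn(shape, dtype=torch.float64) * sigmas[0]  # start at sigma_max
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - precondition(raw_net, x, s_cur)) / s_cur     # dx/dsigma of the probability-flow ODE
        x_next = x + (s_next - s_cur) * d                     # Euler step
        if s_next > 0:                                        # Heun (2nd-order) correction
            d2 = (x_next - precondition(raw_net, x_next, s_next)) / s_next
            x_next = x + (s_next - s_cur) * 0.5 * (d + d2)
        x = x_next
    return x
```

With n=18 noise levels, the second-order correction is skipped only on the final step to sigma=0, giving 2*18-1 = 35 network evaluations per image, matching the figure quoted in the abstract.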
Related papers
- Stable Consistency Tuning: Understanding and Improving Consistency Models [40.2712218203989]
Diffusion models achieve superior generation quality but suffer from slow generation speed due to the iterative nature of denoising.
Consistency models, a new generative family, achieve competitive performance with significantly faster sampling.
We propose a novel framework for understanding consistency models by modeling the denoising process of the diffusion model as a Markov Decision Process (MDP) and framing consistency model training as value estimation through Temporal Difference (TD) learning (a rough code sketch of this training setup appears after this list).
arXiv Detail & Related papers (2024-10-24T17:55:52Z) - Rethinking Iterative Stereo Matching from Diffusion Bridge Model Perspective [0.0]
We propose a novel training approach that incorporates diffusion models into the iterative optimization process.
Our model ranks first on the Scene Flow dataset, achieving over a 7% improvement compared to competing methods.
arXiv Detail & Related papers (2024-04-13T17:31:11Z) - Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization.
We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons.
Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z) - Improving Diffusion-Based Generative Models via Approximated Optimal Transport [41.25847212384836]
We introduce the Approximated Optimal Transport technique, a novel training scheme for diffusion-based generative models.
We achieve superior image quality with fewer sampling steps by employing AOT in training.
arXiv Detail & Related papers (2024-03-08T05:43:00Z) - Analyzing and Improving the Training Dynamics of Diffusion Models [36.37845647984578]
We identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture.
We find that systematic application of this philosophy (preserving activation, weight, and update magnitudes on expectation) eliminates the observed drifts and imbalances, resulting in considerably better networks at equal computational complexity.
arXiv Detail & Related papers (2023-12-05T11:55:47Z) - Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim).
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module, resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z) - ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders [104.05133094625137]
We propose a fully convolutional masked autoencoder framework and a new Global Response Normalization layer.
This co-design of self-supervised learning techniques and architectural improvements results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets (a hedged sketch of the GRN layer appears after this list).
arXiv Detail & Related papers (2023-01-02T18:59:31Z) - Improved Consistency Regularization for GANs [102.17007700413326]
We propose several modifications to the consistency regularization procedure designed to improve its performance.
For unconditional image synthesis on CIFAR-10 and CelebA, our modifications yield the best known FID scores on various GAN architectures.
On ImageNet-2012, we apply our technique to the original BigGAN model and improve the FID from 6.66 to 5.38, which is the best score at that model size.
arXiv Detail & Related papers (2020-02-11T22:53:21Z)
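For the Stable Consistency Tuning entry above: a hedged sketch of plain consistency training with an EMA ("target") network, the setup that the entry reinterprets as TD-style value estimation over a denoising MDP. `f_theta`, `f_ema`, and `ode_step` are hypothetical placeholders, and the noise parameterization (x_t = x_0 + t * noise) is one common convention, not the authors' exact scheme.

```python
# Hedged sketch of consistency-model training with an EMA target network.
# The online model at the noisier time is pulled toward the frozen target
# model evaluated one ODE-solver step closer to the data, a bootstrapped
# update loosely analogous to a TD value-estimation step.
import torch
import torch.nn.functional as F

def consistency_loss(f_theta, f_ema, ode_step, x0, t_next, t_cur):
    """One update: t_next > t_cur are adjacent noise levels on the schedule."""
    noise = torch.randn_like(x0)
    x_next = x0 + t_next * noise                 # point on the noisy trajectory (assumed convention)
    with torch.no_grad():
        x_cur = ode_step(x_next, t_next, t_cur)  # one step of a probability-flow ODE solver (placeholder)
        target = f_ema(x_cur, t_cur)             # bootstrapped "value" target from the EMA network
    pred = f_theta(x_next, t_next)
    return F.mse_loss(pred, target)
```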
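For the ConvNeXt V2 entry above: a short sketch of the Global Response Normalization layer following the formulation in that paper (global L2 aggregation over spatial positions, divisive normalization across channels, learned per-channel calibration plus a residual connection). The zero initialization of gamma and beta matches the paper; the eps placement is an implementation detail assumed here.

```python
# Sketch of Global Response Normalization (GRN) for channels-last features.
import torch
import torch.nn as nn

class GRN(nn.Module):
    def __init__(self, channels, eps=1e-6):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, channels))  # zero-init per the paper
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, channels))
        self.eps = eps

    def forward(self, x):  # x: (N, H, W, C)
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)     # global L2 aggregation per channel
        nx = gx / (gx.mean(dim=-1, keepdim=True) + self.eps)  # divisive normalization across channels
        return self.gamma * (x * nx) + self.beta + x          # calibration + residual
```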