Diffusion Models in Vision: A Survey
- URL: http://arxiv.org/abs/2209.04747v5
- Date: Sat, 1 Apr 2023 14:27:33 GMT
- Title: Diffusion Models in Vision: A Survey
- Authors: Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah
- Abstract summary: A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage.
Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens.
- Score: 80.82832715884597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion models represent a recent emerging topic in computer
vision, demonstrating remarkable results in the area of generative modeling. A
diffusion model is a deep generative model that is based on two stages, a
forward diffusion stage and a reverse diffusion stage. In the forward diffusion
stage, the input data is gradually perturbed over several steps by adding
Gaussian noise. In the reverse stage, a model is tasked with recovering the
original input data by learning to gradually reverse the diffusion process,
step by step. Diffusion models are widely appreciated for the quality and
diversity of the generated samples, despite their known computational burdens,
i.e., low speeds due to the high number of steps involved during sampling. In
this survey, we provide a comprehensive review of articles on denoising
diffusion models applied in vision, comprising both theoretical and practical
contributions in the field. First, we identify and present three generic
diffusion modeling frameworks, which are based on denoising diffusion
probabilistic models, noise conditioned score networks, and stochastic
differential equations. We further discuss the relations between diffusion
models and other deep generative models, including variational auto-encoders,
generative adversarial networks, energy-based models, autoregressive models and
normalizing flows. Then, we introduce a multi-perspective categorization of
diffusion models applied in computer vision. Finally, we illustrate the current
limitations of diffusion models and envision some interesting directions for
future research.
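The forward diffusion stage described in the abstract can be illustrated with a minimal sketch. It assumes the closed-form noising rule commonly used with denoising diffusion probabilistic models, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, together with a linear noise schedule. The schedule endpoints, the step count of 1000, the four-value toy input, and all function names are illustrative assumptions, not values taken from the survey. The reverse stage is not shown, since it requires a trained denoising network.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t from x_0 in one shot: sqrt(a) * x0 + sqrt(1 - a) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)                 # fixed seed for repeatability
x0 = [1.0, -1.0, 0.5, 0.0]             # toy "image" of four pixel values
x_early = forward_diffuse(x0, 9, alpha_bars, rng)    # lightly perturbed
x_late = forward_diffuse(x0, 999, alpha_bars, rng)   # close to pure noise
```

Under these assumptions, alpha_bar shrinks from nearly 1 at the first step to nearly 0 at the last, so early samples stay close to the input while late samples are dominated by Gaussian noise. This is the gradual perturbation that the reverse stage learns to undo.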
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z)
- Diffusion Models in Low-Level Vision: A Survey [82.77962165415153]
Diffusion model-based solutions have become widely acclaimed for their ability to produce samples of superior quality and diversity.
We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models.
We summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios.
arXiv Detail & Related papers (2024-06-17T01:49:27Z)
- An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite the significant empirical success, the theory of diffusion models is very limited.
This paper provides a well-rounded theoretical exposure for stimulating forward-looking theories and methods of diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z)
- The Emergence of Reproducibility and Generalizability in Diffusion Models [10.188731323681575]
Given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs.
We show that diffusion models are learning distinct distributions affected by the training data size.
This valuable property generalizes to many variants of diffusion models, including those for conditional use, solving inverse problems, and model fine-tuning.
arXiv Detail & Related papers (2023-10-08T19:02:46Z)
- Diffusion Models for Medical Image Analysis: A Comprehensive Survey [7.272308924113656]
Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems.
Diffusion models are widely appreciated for their strong mode coverage and quality of the generated samples.
This survey intends to provide a comprehensive overview of diffusion models in the discipline of medical image analysis.
arXiv Detail & Related papers (2022-11-14T23:50:52Z)
- Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
- Diffusion Models: A Comprehensive Survey of Methods and Applications [10.557289965753437]
Diffusion models are a class of deep generative models that have shown impressive results on various tasks with dense theoretical founding.
Recent studies have shown great enthusiasm for improving the performance of diffusion models.
arXiv Detail & Related papers (2022-09-02T02:59:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.