Generative AI in Vision: A Survey on Models, Metrics and Applications
- URL: http://arxiv.org/abs/2402.16369v1
- Date: Mon, 26 Feb 2024 07:47:12 GMT
- Title: Generative AI in Vision: A Survey on Models, Metrics and Applications
- Authors: Gaurav Raut and Apoorv Singh
- Abstract summary: Generative AI models have revolutionized various fields by enabling the creation of realistic and diverse data samples.
Among these models, diffusion models have emerged as a powerful approach for generating high-quality images, text, and audio.
This survey paper provides a comprehensive overview of generative AI diffusion and legacy models, focusing on their underlying techniques, applications across different domains, and their challenges.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative AI models have revolutionized various fields by enabling the
creation of realistic and diverse data samples. Among these models, diffusion
models have emerged as a powerful approach for generating high-quality images,
text, and audio. This survey paper provides a comprehensive overview of
generative AI diffusion and legacy models, focusing on their underlying
techniques, applications across different domains, and their challenges. We
delve into the theoretical foundations of diffusion models, including concepts
such as denoising diffusion probabilistic models (DDPM) and score-based
generative modeling. Furthermore, we explore the diverse applications of these
models in text-to-image, image inpainting, and image super-resolution, along
with others, showcasing their potential in creative tasks and data
augmentation. By synthesizing existing research and highlighting critical
advancements in this field, this survey aims to provide researchers and
practitioners with a comprehensive understanding of generative AI diffusion and
legacy models and inspire future innovations in this exciting area of
artificial intelligence.
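As a rough illustration of the DDPM concept named in the abstract, the sketch below samples from the closed-form forward (noising) process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is a minimal sketch and is not taken from the survey: the function names, the linear schedule, and the values 1e-4, 0.02, and 1000 steps are illustrative assumptions.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed, illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; t is a 0-based step index."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + noise * e for x, e in zip(x0, eps)]
    return xt, eps

# Hypothetical toy "image" of four pixel values, noised at the last step.
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, 0.0]
xt, eps = forward_noise(x0, 999, alpha_bars, rng)
```

Under this schedule alpha_bar shrinks toward zero, so the final x_t is close to pure Gaussian noise. A denoising network would then be trained to predict eps from x_t and t.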
Related papers
- A Comprehensive Survey on Diffusion Models and Their Applications [0.4218593777811082]
Diffusion Models are probabilistic models that create realistic samples by simulating the diffusion process.
These models have gained popularity in domains such as image processing, speech synthesis, and natural language processing.
This review aims to facilitate a deeper understanding and broader adoption of Diffusion Models.
arXiv Detail & Related papers (2024-07-01T17:10:29Z)
- Diffusion Models in Low-Level Vision: A Survey [82.77962165415153]
Diffusion model-based solutions have become widely acclaimed for their ability to produce samples of superior quality and diversity.
We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models.
We summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios.
arXiv Detail & Related papers (2024-06-17T01:49:27Z)
- An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite this significant empirical success, the theory of diffusion models is very limited.
This paper provides a well-rounded theoretical exposure for stimulating forward-looking theories and methods of diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z)
- State of the Art on Diffusion Models for Visual Computing [191.6168813012954]
This report introduces the basic mathematical concepts of diffusion models, as well as the implementation details and design choices of the popular Stable Diffusion model.
We also give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing.
We discuss available datasets, metrics, open challenges, and social implications.
arXiv Detail & Related papers (2023-10-11T05:32:29Z)
- Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods for image restoration (IR).
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for future research on diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z)
- Interpretable ODE-style Generative Diffusion Model via Force Field Construction [0.0]
This paper aims to identify various physical models that are suitable for constructing ODE-style generative diffusion models accurately from a mathematical perspective.
We perform a case study where we use the theoretical model identified by our method to develop a range of new diffusion model methods.
arXiv Detail & Related papers (2023-03-14T16:58:11Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
- Diffusion Models: A Comprehensive Survey of Methods and Applications [10.557289965753437]
Diffusion models are a class of deep generative models that have shown impressive results on various tasks and rest on dense theoretical foundations.
Recent studies have shown great enthusiasm for improving the performance of diffusion models.
arXiv Detail & Related papers (2022-09-02T02:59:10Z)
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.