A Survey of Diffusion Based Image Generation Models: Issues and Their
Solutions
- URL: http://arxiv.org/abs/2308.13142v1
- Date: Fri, 25 Aug 2023 02:35:54 GMT
- Title: A Survey of Diffusion Based Image Generation Models: Issues and Their
Solutions
- Authors: Tianyi Zhang, Zheng Wang, Jing Huang, Mohiuddin Muhammad Tasnim, Wei
Shi
- Abstract summary: Open-source stable diffusion models have enabled the academic community to extensively analyze the performance of image generation models.
This survey aims to examine the existing issues and the current solutions pertaining to image generation models.
- Score: 14.767446226216494
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Recently, there has been significant progress in the development of large
models. Following the success of ChatGPT, numerous language models have been
introduced, demonstrating remarkable performance. Similar advancements have
also been observed in image generation models, such as Google's Imagen model,
OpenAI's DALL-E 2, and stable diffusion models, which have exhibited impressive
capabilities in generating images. However, similar to large language models,
these models still encounter unresolved challenges. Fortunately, the
availability of open-source stable diffusion models and their underlying
mathematical principles has enabled the academic community to extensively
analyze the performance of current image generation models and make
improvements based on this stable diffusion framework. This survey aims to
examine the existing issues and the current solutions pertaining to image
generation models.
Related papers
- DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple and effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z) - YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences.
We analyze how these choices affect both the efficiency of the training process and the quality of the generated images.
We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z) - Generative AI in Vision: A Survey on Models, Metrics and Applications [0.0]
Generative AI models have revolutionized various fields by enabling the creation of realistic and diverse data samples.
Among these models, diffusion models have emerged as a powerful approach for generating high-quality images, text, and audio.
This survey paper provides a comprehensive overview of generative AI diffusion and legacy models, focusing on their underlying techniques, applications across different domains, and their challenges.
arXiv Detail & Related papers (2024-02-26T07:47:12Z) - Large-scale Reinforcement Learning for Diffusion Models [30.164571425479824]
Text-to-image diffusion models are susceptible to implicit biases that arise from web-scale text-image training pairs.
We present an effective scalable algorithm to improve diffusion models using Reinforcement Learning (RL)
We show how our approach substantially outperforms existing methods for aligning diffusion models with human preferences.
arXiv Detail & Related papers (2024-01-20T08:10:43Z) - Conditional Image Generation with Pretrained Generative Model [1.4685355149711303]
diffusion models have gained popularity for their ability to generate higher-quality images in comparison to GAN models.
These models require a huge amount of data, computational resources, and meticulous tuning for successful training.
We propose methods to leverage pre-trained unconditional diffusion models with additional guidance for the purpose of conditional image generative.
arXiv Detail & Related papers (2023-12-20T18:27:53Z) - A Survey on Video Diffusion Models [107.4734254941333]
The recent wave of AI-generated content (AIGC) has witnessed substantial success in computer vision.
Due to their impressive generative capabilities, diffusion models are gradually superseding methods based on GANs and auto-regressive Transformers.
This paper presents a comprehensive review of video diffusion models in the AIGC era.
arXiv Detail & Related papers (2023-10-16T17:59:28Z) - RenAIssance: A Survey into AI Text-to-Image Generation in the Era of
Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to the usage of models that could process text input and generate high fidelity images based on text descriptions.
Diffusion models are one prominent type of generative model used for the generation of images through the systematic introduction of noises with repeating steps.
In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z) - Diffusion Models for Image Restoration and Enhancement -- A
Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z) - Diffusion Models: A Comprehensive Survey of Methods and Applications [10.557289965753437]
Diffusion models are a class of deep generative models that have shown impressive results on various tasks with dense theoretical founding.
Recent studies have shown great enthusiasm on improving the performance of diffusion model.
arXiv Detail & Related papers (2022-09-02T02:59:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.