Diffusion idea exploration for art generation
- URL: http://arxiv.org/abs/2307.04978v1
- Date: Tue, 11 Jul 2023 02:35:26 GMT
- Title: Diffusion idea exploration for art generation
- Authors: Nikhil Verma
- Abstract summary: Diffusion models have recently outperformed other generative models in image generation tasks that use cross-modal data as guiding information. Initial experiments on this novel image generation task demonstrated promising qualitative results.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Cross-modal learning tasks have picked up pace in recent times. With a
plethora of applications in diverse areas, generating novel content from
multiple modalities of data remains a challenging problem. To address this,
various generative modelling techniques have been proposed for specific tasks.
Novel and creative image generation is an important industrial application
that could serve as an arm for novel content generation. Previously proposed
techniques used Generative Adversarial Networks (GANs), autoregressive models,
and Variational Autoencoders (VAEs) to accomplish similar tasks. These
approaches are limited in their ability to produce images guided by either
text instructions or rough sketches, which lowers the overall quality of the
generated images. We used state-of-the-art diffusion models to generate
creative art, primarily leveraging text with additional support from rough
sketches. Diffusion starts with a pattern of random dots and gradually
converts that pattern into an image using the guiding information fed into
the model. Diffusion models have recently outperformed other generative
models in image generation tasks that use cross-modal data as guiding
information. Initial experiments on this novel image generation task
demonstrated promising qualitative results.
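The reverse process the abstract describes (a pattern of random dots gradually converted into an image) can be sketched as a DDPM-style sampling loop. This is an illustrative sketch, not the paper's implementation: `denoise_fn` stands in for a trained noise-prediction network, which in a guided setting would also be conditioned on the text or sketch input.

```python
import numpy as np

def ddpm_sample(denoise_fn, shape, betas, rng):
    """Reverse diffusion: start from pure noise and iteratively denoise.

    denoise_fn(x, t) -> predicted noise at step t (a trained model in
    practice; guidance such as text or sketches would condition it).
    betas is the forward-process noise schedule.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # the initial "pattern of random dots"
    for t in reversed(range(len(betas))):
        eps_hat = denoise_fn(x, t)
        # Remove the predicted noise component for this step.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps_hat) / np.sqrt(alphas[t])
        # Re-inject a small amount of noise on all but the final step.
        if t > 0:
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Toy usage with a dummy denoiser (a real model replaces the lambda):
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)
img = ddpm_sample(lambda x, t: np.zeros_like(x), (4, 4), betas, rng)
```

The schedule length and beta range above are conventional DDPM defaults, not values taken from the paper.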
Related papers
- Many-to-many Image Generation with Auto-regressive Diffusion Models [59.5041405824704]
This paper introduces a domain-general framework for many-to-many image generation, capable of producing interrelated image series from a given set of images.
We present MIS, a novel large-scale multi-image dataset, containing 12M synthetic multi-image samples, each with 25 interconnected images.
We learn M2M, an autoregressive model for many-to-many generation, where each image is modeled within a diffusion framework.
arXiv Detail & Related papers (2024-04-03T23:20:40Z)
- Instruct-Imagen: Image Generation with Multi-modal Instruction [90.04481955523514]
instruct-imagen is a model that tackles heterogeneous image generation tasks and generalizes across unseen tasks.
We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.
Human evaluation on various image generation datasets reveals that instruct-imagen matches or surpasses prior task-specific models in-domain.
arXiv Detail & Related papers (2024-01-03T19:31:58Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to models that process text input and generate high-fidelity images based on text descriptions.
Diffusion models are one prominent type of generative model that produces images by systematically introducing noise over repeated steps.
In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
- StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation [103.88928334431786]
We present a novel method for generating high-quality, stylized 3D avatars.
We use pre-trained image-text diffusion models for data generation and a Generative Adversarial Network (GAN)-based 3D generation network for training.
Our approach demonstrates superior performance over current state-of-the-art methods in terms of visual quality and diversity of the produced avatars.
arXiv Detail & Related papers (2023-05-30T13:09:21Z)
- Textile Pattern Generation Using Diffusion Models [0.0]
This study presents a fine-tuned diffusion model specifically trained for textile pattern generation by text guidance.
The proposed fine-tuned diffusion model outperforms the baseline models in terms of pattern quality and efficiency in textile pattern generation by text guidance.
arXiv Detail & Related papers (2023-04-02T12:12:24Z)
- Investigating GANsformer: A Replication Study of a State-of-the-Art Image Generation Model [0.0]
We reproduce and evaluate a novel variation of the original GAN network, the GANformer.
Due to resources and time limitations, we had to constrain the network's training times, dataset types, and sizes.
arXiv Detail & Related papers (2023-03-15T12:51:16Z)
- Text-to-image Diffusion Models in Generative AI: A Survey [75.32882187215394]
We present a review of state-of-the-art methods on text-conditioned image synthesis, i.e., text-to-image.
We discuss applications beyond text-to-image generation: text-guided creative generation and text-guided image editing.
arXiv Detail & Related papers (2023-03-14T13:49:54Z)
- Implementing and Experimenting with Diffusion Models for Text-to-Image Generation [0.0]
Two models, DALL-E 2 and Imagen, have demonstrated that highly photorealistic images could be generated from a simple textual description of an image.
Text-to-image models require exceptionally large amounts of computational resources to train, as well as huge datasets collected from the internet.
This thesis contributes by reviewing the different approaches and techniques used by these models, and then by proposing our own implementation of a text-to-image model.
arXiv Detail & Related papers (2022-09-22T12:03:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.