Words as Art Materials: Generating Paintings with Sequential GANs
- URL: http://arxiv.org/abs/2007.04383v1
- Date: Wed, 8 Jul 2020 19:17:14 GMT
- Title: Words as Art Materials: Generating Paintings with Sequential GANs
- Authors: Azmi Can \"Ozgen, Haz{\i}m Kemal Ekenel
- Abstract summary: We investigate the generation of artistic images on a large-variance dataset.
This dataset includes images with variations, for example, in shape, color, and content.
We propose a sequential Generative Adversarial Network model.
- Score: 8.249180979158815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Converting text descriptions into images using Generative Adversarial
Networks has become a popular research area. Visually appealing images have
been generated successfully in recent years. Inspired by these studies, we
investigated the generation of artistic images on a large-variance dataset.
This dataset includes images with variations, for example, in shape, color, and
content. These variations in images provide originality, which is an important
factor in artistic essence. One major characteristic of our work is that we
used keywords as image descriptions, instead of sentences. As the network
architecture, we proposed a sequential Generative Adversarial Network model.
The first stage of this sequential model processes the word vectors and creates
a base image, whereas the next stages focus on creating high-resolution
artistic-style images without working on word vectors. To deal with the
unstable nature of GANs, we applied a combination of techniques, including
Wasserstein loss, spectral normalization, and minibatch discrimination.
Ultimately, we were able to generate painting images in a variety of styles. We
evaluated our results using the Fréchet Inception Distance score and conducted
a user study with 186 participants.
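To make the two-stage design concrete, here is a minimal PyTorch sketch of a sequential generator in the spirit of the abstract: the first stage conditions on pooled keyword vectors plus noise and emits a low-resolution base image, and a later stage upsamples it without seeing the word vectors again. All layer widths, embedding sizes, and resolutions here are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class Stage1Generator(nn.Module):
    """Sketch of the first stage: conditions on keyword vectors plus noise
    and produces a low-resolution base image. Dimensions are illustrative."""
    def __init__(self, embed_dim=300, noise_dim=100):
        super().__init__()
        self.fc = nn.Linear(embed_dim + noise_dim, 256 * 4 * 4)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),  # 32x32 base image
        )

    def forward(self, word_vecs, z):
        # word_vecs: (N, num_keywords, embed_dim); pool keywords by averaging.
        cond = torch.cat([word_vecs.mean(dim=1), z], dim=1)
        h = self.fc(cond).view(-1, 256, 4, 4)
        return self.up(h)

class Stage2Generator(nn.Module):
    """Sketch of a later stage: refines and upsamples the base image
    without access to the word vectors."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),  # 128x128 output
        )

    def forward(self, base_img):
        return self.net(base_img)

# Usage: keyword vectors -> base image -> refined high-resolution image.
words = torch.randn(8, 5, 300)       # 8 samples, 5 keyword vectors each
z = torch.randn(8, 100)
base = Stage1Generator()(words, z)   # (8, 3, 32, 32)
final = Stage2Generator()(base)      # (8, 3, 128, 128)
```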
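The three stabilization techniques the abstract names are standard GAN-training tools and fit together as sketched below: spectral normalization bounds each critic layer's Lipschitz constant (which the Wasserstein objective assumes), while minibatch discrimination gives the critic batch-level statistics that penalize mode collapse. The critic architecture and the minibatch-discrimination dimensions are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class MinibatchDiscrimination(nn.Module):
    """Minibatch discrimination (Salimans et al., 2016): appends per-sample
    similarity statistics so the critic can detect mode collapse."""
    def __init__(self, in_features, out_features, kernel_dim):
        super().__init__()
        self.T = nn.Parameter(0.1 * torch.randn(in_features, out_features * kernel_dim))
        self.out_features, self.kernel_dim = out_features, kernel_dim

    def forward(self, x):                        # x: (N, in_features)
        m = (x @ self.T).view(-1, self.out_features, self.kernel_dim)
        diff = m.unsqueeze(0) - m.unsqueeze(1)   # pairwise: (N, N, out, kernel)
        c = torch.exp(-diff.abs().sum(dim=3))    # L1 similarity kernel
        o = c.sum(dim=1) - 1                     # drop each sample's self-term
        return torch.cat([x, o], dim=1)          # (N, in_features + out_features)

class Critic(nn.Module):
    """WGAN critic with spectral normalization on every weight layer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            spectral_norm(nn.Conv2d(3, 64, 4, 2, 1)), nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(64, 128, 4, 2, 1)), nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(128, 256, 4, 2, 1)), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mbd = MinibatchDiscrimination(256, 64, kernel_dim=8)
        self.score = spectral_norm(nn.Linear(256 + 64, 1))

    def forward(self, img):
        return self.score(self.mbd(self.features(img)))

def critic_loss(real_scores, fake_scores):
    # Wasserstein objective: the critic maximizes E[D(real)] - E[D(fake)],
    # so we minimize the negation.
    return fake_scores.mean() - real_scores.mean()

def generator_loss(fake_scores):
    return -fake_scores.mean()
```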
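For reference, the Fréchet Inception Distance used in the evaluation has a standard closed form. It measures the distance between Gaussians fitted to Inception-v3 activations of real and generated images; lower is better:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
```

where (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) are the activation mean and covariance of the real and generated image sets, respectively.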
Related papers
- Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics [3.9717825324709413]
Style has been primarily considered in terms of artistic elements such as colors, brushstrokes, and lighting.
In this study, we propose a zero-shot scheme for image variation with coordinated semantics.
arXiv Detail & Related papers (2024-10-24T08:34:57Z) - Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is to augment the training set with synthetic images generated by text-to-image (T2I) models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z) - Leveraging Open-Vocabulary Diffusion to Camouflaged Instance
Segmentation [59.78520153338878]
Text-to-image diffusion techniques have shown exceptional capability of producing high-quality images from text descriptions.
We propose a method built upon a state-of-the-art diffusion model, empowered by an open vocabulary, to learn multi-scale textual-visual features for camouflaged object representations.
arXiv Detail & Related papers (2023-12-29T07:59:07Z) - Semantic Draw Engineering for Text-to-Image Creation [2.615648035076649]
We propose a method that utilizes artificial intelligence models for thematic creativity.
The method involves converting all visual elements into quantifiable data structures before creating images.
We evaluate the effectiveness of this approach in terms of semantic accuracy, image efficiency, and computational efficiency.
arXiv Detail & Related papers (2023-12-23T05:35:15Z) - Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z) - Diffusion idea exploration for art generation [0.10152838128195467]
Diffusion models have recently outperformed other generative models in image generation tasks using cross-modal data as guiding information.
The initial experiments for this task of novel image generation demonstrated promising qualitative results.
arXiv Detail & Related papers (2023-07-11T02:35:26Z) - Unsupervised Compositional Concepts Discovery with Text-to-Image
Generative Models [80.75258849913574]
In this paper, we consider the inverse problem -- given a collection of different images, can we discover the generative concepts that represent each image?
We present an unsupervised approach to discover generative concepts from a collection of images, disentangling different art styles in paintings, objects, and lighting from kitchen scenes, and discovering image classes given ImageNet images.
arXiv Detail & Related papers (2023-06-08T17:02:15Z) - Dual Pyramid Generative Adversarial Networks for Semantic Image
Synthesis [94.76988562653845]
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
Current state-of-the-art approaches, however, still struggle to generate realistic objects in images at various scales.
We propose a Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the conditioning of spatially-adaptive normalization blocks at all scales jointly.
arXiv Detail & Related papers (2022-10-08T18:45:44Z) - IR-GAN: Image Manipulation with Linguistic Instruction by Increment
Reasoning [110.7118381246156]
The Increment Reasoning Generative Adversarial Network (IR-GAN) aims to reason about the consistency between the visual increment in images and the semantic increment in instructions.
First, we introduce word-level and instruction-level instruction encoders to learn the user's intention from history-correlated instructions as the semantic increment.
Second, we embed the representation of the semantic increment into that of the source image to generate the target image, where the source image plays the role of a referring auxiliary.
arXiv Detail & Related papers (2022-04-02T07:48:39Z) - Generating Compositional Color Representations from Text [3.141061579698638]
Motivated by the fact that a significant fraction of user queries on an image search engine follow an (attribute, object) structure, we propose a generative adversarial network that generates color profiles for such bigrams.
We design our pipeline to learn composition: the ability to combine seen attributes and objects into unseen pairs.
arXiv Detail & Related papers (2021-09-22T01:37:13Z) - Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.