Diverse Single Image Generation with Controllable Global Structure through Self-Attention
- URL: http://arxiv.org/abs/2102.04780v1
- Date: Tue, 9 Feb 2021 11:52:48 GMT
- Title: Diverse Single Image Generation with Controllable Global Structure through Self-Attention
- Authors: Sutharsan Mahendren, Chamira Edussooriya, Ranga Rodrigo
- Abstract summary: We show how to generate images that require global context using generative adversarial networks.
Our results are visually better than the state of the art, particularly in generating images that require global context.
The diversity of our image generation, measured using the average standard deviation of pixels, is also better.
- Score: 1.2522889958051286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image generation from a single image using generative adversarial networks is
quite interesting due to the realism of the generated images. However, recent
approaches still need improvement in realistic and diverse image generation
when the global context of the image is important, such as in face, animal, and
architectural image generation. This is mainly because they use relatively few
convolutional layers, which mainly capture patch statistics and therefore fail
to capture global statistics well. We address this problem by using attention
blocks at selected scales and by feeding a randomly Gaussian-blurred image to
the discriminator during training. Our results are visually better than the
state of the art, particularly in generating images that require global
context. The diversity of our image generation, measured using the average
standard deviation of pixels, is also better.
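The abstract names three concrete ingredients: self-attention blocks at selected generator scales, a randomly Gaussian-blurred image fed to the discriminator, and diversity measured as the average standard deviation of pixels. The following is a minimal PyTorch sketch of each; the module and function names, and the blur sigma range, are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention block, inserted at selected generator scales
    so every spatial position can attend to every other position (global context)."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        v = self.value(x).flatten(2)                   # (b, c, hw)
        attn = F.softmax(q @ k, dim=-1)                # (b, hw, hw) attention map
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x

def blur_for_discriminator(img, max_sigma=2.0):
    """Randomly Gaussian-blur an image batch before the discriminator sees it.
    The sigma range is an assumption; the paper only says 'random Gaussian blurred'."""
    sigma = float(torch.empty(1).uniform_(0.1, max_sigma))
    k = 2 * int(3 * sigma) + 1  # odd kernel size covering roughly 3 sigma
    return TF.gaussian_blur(img, kernel_size=k, sigma=sigma)

def pixel_diversity(samples):
    """Diversity metric: per-pixel standard deviation across N generated
    samples (N, C, H, W), averaged over all pixels."""
    return samples.std(dim=0).mean().item()
```

In a full model, a block like SelfAttention2d would presumably sit between convolutional stages of the multi-scale generator, with the blur applied to both real and generated images during discriminator training.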
Related papers
- RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
For more realistic and diverse image generation, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes gradients semantic-aware, so that plausible images can be synthesized.
We show that our method also extends to text-to-image generation by building on image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
- Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis [94.76988562653845]
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
Current state-of-the-art approaches, however, still struggle to generate realistic objects in images at various scales.
We propose a Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the conditioning of spatially-adaptive normalization blocks at all scales jointly.
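DP-GAN builds on spatially-adaptive normalization, in which per-pixel scale and shift maps are predicted from the semantic label map. The sketch below shows a generic SPADE-style block in PyTorch to make that conditioning mechanism concrete; it illustrates the mechanism only, and the layer sizes are assumptions, not DP-GAN's implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class SPADEBlock(nn.Module):
    """Spatially-adaptive normalization: scale (gamma) and shift (beta) maps
    are predicted per pixel from the resized semantic label map."""
    def __init__(self, channels, label_channels, hidden=128):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)  # parameter-free normalization
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, 3, padding=1), nn.ReLU()
        )
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, x, segmap):
        # Resize the label map to the feature resolution at this scale.
        segmap = F.interpolate(segmap, size=x.shape[2:], mode="nearest")
        h = self.shared(segmap)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)
```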
arXiv Detail & Related papers (2022-10-08T18:45:44Z)
- Reinforcing Generated Images via Meta-learning for One-Shot Fine-Grained Visual Recognition [36.02360322125622]
We propose a meta-learning framework to combine generated images with original images, so that the resulting "hybrid" training images improve one-shot learning.
Our experiments demonstrate consistent improvement over baselines on one-shot fine-grained image classification benchmarks.
arXiv Detail & Related papers (2022-04-22T13:11:05Z)
- Progressively Unfreezing Perceptual GAN [28.330940021951438]
Generative adversarial networks (GANs) are widely used in image generation tasks, yet the generated images usually lack texture details.
We propose a general framework, called Progressively Unfreezing Perceptual GAN (PUPGAN), which can generate images with fine texture details.
arXiv Detail & Related papers (2020-06-18T03:12:41Z)
- Enhanced Residual Networks for Context-based Image Outpainting [0.0]
Deep models struggle to understand context and to extrapolate from retained information.
Current models use generative adversarial networks to generate results which lack localized image feature consistency and appear fake.
We propose two methods to improve this issue: the use of a local and global discriminator, and the addition of residual blocks within the encoding section of the network.
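The local-plus-global discriminator idea can be made concrete with a small sketch: one branch scores the full image for overall realism, the other scores the outpainted region for local feature consistency. The layer widths below are illustrative assumptions, not the paper's architecture.

```python
import torch.nn as nn

class LocalGlobalDiscriminator(nn.Module):
    """Global branch scores the full image; local branch scores a crop
    around the outpainted region for localized feature consistency."""
    def __init__(self, in_ch=3):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(128, 1, 4, stride=1, padding=1),  # patch-wise real/fake logits
            )
        self.global_branch = branch()
        self.local_branch = branch()

    def forward(self, full_img, local_patch):
        return self.global_branch(full_img), self.local_branch(local_patch)
```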
arXiv Detail & Related papers (2020-05-14T05:14:26Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
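A minimal sketch of such a U-Net discriminator follows: the encoder bottleneck produces the usual image-level real/fake decision, while a decoder with a skip connection produces per-pixel logits as feedback to the generator. Depths and widths here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class UNetDiscriminator(nn.Module):
    """Encoder head yields a global real/fake score; decoder head yields
    per-pixel feedback at the input resolution."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2))
        self.global_head = nn.Conv2d(128, 1, 1)  # image-level logits at the bottleneck
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.LeakyReLU(0.2))
        self.pixel_head = nn.ConvTranspose2d(64 + 64, 1, 4, 2, 1)  # per-pixel logits

    def forward(self, x):
        e1 = self.enc1(x)                                      # (B, 64, H/2, W/2)
        e2 = self.enc2(e1)                                     # (B, 128, H/4, W/4)
        global_logits = self.global_head(e2).mean(dim=(2, 3))  # (B, 1)
        d1 = self.dec1(e2)                                     # (B, 64, H/2, W/2)
        pixel_logits = self.pixel_head(torch.cat([d1, e1], dim=1))  # (B, 1, H, W)
        return global_logits, pixel_logits
```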
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
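One way to read "dense combinations of dilated convolutions" is as parallel branches with increasing dilation rates whose outputs are fused, widening the receptive field without extra depth. The block below is a hedged sketch of that reading, not the paper's exact module.

```python
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    """Parallel dilated 3x3 convolutions enlarge the receptive field;
    a 1x1 convolution fuses the branches, with a residual connection."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r) for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)
        self.act = nn.ReLU()

    def forward(self, x):
        feats = [b(x) for b in self.branches]  # each branch preserves spatial size
        return self.act(self.fuse(torch.cat(feats, dim=1))) + x
```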
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
- Supervised and Unsupervised Learning of Parameterized Color Enhancement [112.88623543850224]
We tackle the problem of color enhancement as an image translation task using both supervised and unsupervised learning.
We achieve state-of-the-art results compared to both supervised (paired data) and unsupervised (unpaired data) image enhancement methods on the MIT-Adobe FiveK benchmark.
We show the generalization capability of our method, by applying it on photos from the early 20th century and to dark video frames.
arXiv Detail & Related papers (2019-12-30T13:57:06Z)