DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning
- URL: http://arxiv.org/abs/2006.08694v1
- Date: Mon, 15 Jun 2020 19:06:07 GMT
- Title: DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning
- Authors: Gulcin Baykal, Gozde Unal
- Abstract summary: We argue that one of the crucial points to improve the GAN performance is to be able to provide the model with a capability to learn the spatial structure in data.
We introduce a deshuffling task that solves a puzzle of randomly shuffled image tiles, which in turn helps the DeshuffleGAN learn to increase its expressive capacity for spatial structure and realistic appearance.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have triggered increased
interest in the problem of image generation due to their improved output
image quality and their versatility as a basis for new methods. Numerous
GAN-based works attempt to improve generation through architectural and
loss-based extensions. We argue that a crucial step toward improving GAN
performance, in terms of realism and fidelity to the original data
distribution, is to equip the model with the capability to learn the
spatial structure in the data. To that end, we propose the DeshuffleGAN,
which enhances the learning of both the discriminator and the generator
via a self-supervision approach. Specifically, we introduce a deshuffling
task that solves a puzzle of randomly shuffled image tiles, which in turn
helps the DeshuffleGAN increase its expressive capacity for spatial
structure and realistic appearance. We provide experimental evidence of
the resulting improvement in generated images over baseline methods,
observed consistently across two different datasets.
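To make the pretext task concrete, below is a minimal sketch of jigsaw-style tile shuffling in NumPy. It illustrates the general idea rather than the authors' code: the 3x3 grid, the divisibility assumption, and the use of the raw permutation as the prediction target are all assumptions.

```python
import numpy as np

def shuffle_tiles(image, grid=3, rng=None):
    """Split an (H, W, C) image into grid x grid tiles and shuffle them.

    Returns the shuffled image and the permutation that produced it;
    the permutation is the target label for the deshuffling task.
    Assumes H and W are divisible by grid.
    """
    rng = rng or np.random.default_rng()
    th, tw = image.shape[0] // grid, image.shape[1] // grid
    tiles = [image[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
             for i in range(grid) for j in range(grid)]
    perm = rng.permutation(len(tiles))
    rows = [np.concatenate([tiles[perm[i * grid + j]]
                            for j in range(grid)], axis=1)
            for i in range(grid)]
    return np.concatenate(rows, axis=0), perm
```

Following the abstract, the discriminator would carry an auxiliary deshuffler head trained to predict the permutation for shuffled real and generated images, giving both networks a learning signal tied to spatial structure.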
Related papers
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation [69.72194342962615]
We introduce and address a novel research direction: can the process of distilling GANs from diffusion models be made significantly more efficient?
First, we construct a base GAN model with generalized features, adaptable to different concepts through fine-tuning, eliminating the need for training from scratch.
Second, we identify crucial layers within the base GAN model and employ Low-Rank Adaptation (LoRA) with a simple yet effective rank search process, rather than fine-tuning the entire base model.
Third, we investigate the minimal amount of data necessary for fine-tuning, further reducing the overall training time.
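The LoRA step above can be sketched with a generic wrapper that adds a trainable low-rank update to a frozen linear layer. This is a hypothetical PyTorch illustration of the technique, not the E$^{2}$GAN code; the rank and alpha defaults are placeholders.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # base weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)       # update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))
```

Only the down/up projections are trained, which is what makes fine-tuning a handful of crucial layers cheap relative to retraining the whole base model.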
arXiv Detail & Related papers (2024-01-11T18:59:14Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- A Simple and Effective Baseline for Attentional Generative Adversarial Networks [8.63558211869045]
Generating high-quality images by guiding a generative model with a text description is an innovative and challenging task.
In recent years, attention-based methods such as AttnGAN, SD-GAN, and Stack-GAN++ have been proposed to guide GAN training.
We apply a popular, simple, and effective idea to remove redundant structure and improve the backbone network of AttnGAN.
Our improvements significantly reduce the model size and improve training efficiency while keeping the model's performance unchanged.
arXiv Detail & Related papers (2023-06-26T13:55:57Z)
- TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.
We propose TcGAN, a novel structure-preserved method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z)
- Local Magnification for Data and Feature Augmentation [53.04028225837681]
We propose an easy-to-implement and model-free data augmentation method called Local Magnification (LOMA).
LOMA generates additional training data by randomly magnifying a local area of the image.
Experiments show that our proposed LOMA, though straightforward, can be combined with standard data augmentation to significantly improve the performance on image classification and object detection.
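As a rough illustration only (LOMA's actual transform and coordinate mapping may differ), the effect described here can be approximated by enlarging a random region and pasting its center back in place:

```python
import numpy as np
from PIL import Image

def local_magnify(img: Image.Image, scale: float = 1.5, rng=None):
    """Crudely mimic local magnification: zoom a random region and
    paste the central window of the zoomed result back in place."""
    rng = rng or np.random.default_rng()
    w, h = img.size
    cw, ch = w // 3, h // 3                      # region to magnify
    x0 = int(rng.integers(0, w - cw))
    y0 = int(rng.integers(0, h - ch))
    region = img.crop((x0, y0, x0 + cw, y0 + ch))
    zoomed = region.resize((int(cw * scale), int(ch * scale)))
    zx, zy = (zoomed.width - cw) // 2, (zoomed.height - ch) // 2
    out = img.copy()
    out.paste(zoomed.crop((zx, zy, zx + cw, zy + ch)), (x0, y0))
    return out
```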
arXiv Detail & Related papers (2022-11-15T02:51:59Z)
- Dynamically Grown Generative Adversarial Networks [111.43128389995341]
We propose a method to dynamically grow a GAN during training, optimizing the network architecture and its parameters together in an automated fashion.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
arXiv Detail & Related papers (2021-06-16T01:25:51Z)
- Evolving GAN Formulations for Higher Quality Image Synthesis [15.861807854144228]
Generative Adversarial Networks (GANs) have extended deep learning to complex generation and translation tasks.
GANs are notoriously difficult to train: Mode collapse and other instabilities in the training process often degrade the quality of the generated results.
This paper presents a new technique called TaylorGAN for improving GANs by discovering customized loss functions for each of its two networks.
arXiv Detail & Related papers (2021-02-17T05:11:21Z)
- Efficient texture-aware multi-GAN for image inpainting [5.33024001730262]
Recent inpainting methods based on Generative Adversarial Networks (GANs) show remarkable improvements.
We propose a multi-GAN architecture improving both the performance and rendering efficiency.
arXiv Detail & Related papers (2020-09-30T14:58:03Z)
- InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning [39.316605441868944]
Generative Adversarial Networks (GANs) are fundamental to many generative modelling applications.
We propose a principled framework to simultaneously mitigate two fundamental issues in GANs: catastrophic forgetting of the discriminator and mode collapse of the generator.
Our approach significantly stabilizes GAN training and improves GAN performance for image synthesis across five datasets.
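The contrastive component named in the title is typically an InfoNCE-style objective; the following generic PyTorch version is an assumption about the family of losses involved, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss: each anchor must pick out its own positive
    among all positives in the batch (other rows act as negatives)."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    logits = a @ p.t() / temperature         # (N, N) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```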
arXiv Detail & Related papers (2020-07-09T06:56:11Z)
- High-Fidelity Synthesis with Disentangled Representation [60.19657080953252]
We propose an Information-Distillation Generative Adversarial Network (ID-GAN) for disentanglement learning and high-fidelity synthesis.
Our method learns a disentangled representation using VAE-based models, and distills the learned representation, together with an additional nuisance variable, to a separate GAN-based generator for high-fidelity synthesis.
Despite its simplicity, we show that the proposed method is highly effective, achieving image generation quality comparable to state-of-the-art methods while using the disentangled representation.
arXiv Detail & Related papers (2020-01-13T14:39:40Z)