A Simple and Effective Baseline for Attentional Generative Adversarial
Networks
- URL: http://arxiv.org/abs/2306.14708v2
- Date: Thu, 6 Jul 2023 14:07:35 GMT
- Title: A Simple and Effective Baseline for Attentional Generative Adversarial
Networks
- Authors: Mingyu Jin, Chong Zhang, Qinkai Yu, Haochen Xue, Xiaobo Jin, Xi Yang
- Abstract summary: Synthesising high-quality images by guiding a generative model with a text description is an innovative and challenging task.
In recent years, several models have been proposed for this task: AttnGAN, which uses an attention mechanism to guide GAN training, SD-GAN, and Stack-GAN++.
Following the popular "simple and effective" idea, we remove redundant structure and improve the backbone network of AttnGAN.
Our changes significantly reduce model size and improve training efficiency while keeping the model's performance unchanged.
- Score: 8.63558211869045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Synthesising high-quality images by guiding a generative model
with a text description is an innovative and challenging task. In recent years,
several models have been proposed for this task: AttnGAN, which uses an
attention mechanism to guide GAN training; SD-GAN, which adopts a
self-distillation technique to improve the performance of the generator and the
quality of image generation; and Stack-GAN++, which gradually improves the
details and quality of the image by stacking multiple generators and
discriminators. However, these improvements to GANs all carry some redundancy,
which hurts generation performance and adds complexity. Following the popular
"simple and effective" idea, we (1) remove redundant structure and improve the
backbone network of AttnGAN, and (2) integrate and reconstruct the multiple
losses of DAMSM. Our changes significantly reduce model size and improve
training efficiency while keeping the model's performance unchanged. We call
the resulting model SEAttnGAN. Code is available at
https://github.com/jmyissb/SEAttnGAN.
Related papers
- E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation [69.72194342962615]
We introduce and address a novel research direction: can the process of distilling GANs from diffusion models be made significantly more efficient?
First, we construct a base GAN model with generalized features, adaptable to different concepts through fine-tuning, eliminating the need for training from scratch.
Second, we identify crucial layers within the base GAN model and employ Low-Rank Adaptation (LoRA) with a simple yet effective rank search process, rather than fine-tuning the entire base model.
Third, we investigate the minimal amount of data necessary for fine-tuning, further reducing the overall training time.
arXiv Detail & Related papers (2024-01-11T18:59:14Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- GIU-GANs: Global Information Utilization for Generative Adversarial
Networks [3.3945834638760948]
In this paper, we propose a new GAN called Involution Generative Adversarial Networks (GIU-GANs).
GIU-GANs leverages a brand new module called the Global Information Utilization (GIU) module, which integrates Squeeze-and-Excitation Networks (SENet) and involution.
Batch Normalization (BN) inevitably ignores the representation differences among noise sampled by the generator, and thus degrades the generated image quality.
arXiv Detail & Related papers (2022-01-25T17:17:15Z)
- A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
- Improved Transformer for High-Resolution GANs [69.42469272015481]
We introduce two key ingredients to Transformer to address this challenge.
We show in the experiments that the proposed HiT achieves state-of-the-art FID scores of 31.87 and 2.95 on unconditional ImageNet $128 \times 128$ and FFHQ $256 \times 256$, respectively.
arXiv Detail & Related papers (2021-06-14T17:39:49Z)
- Evolving GAN Formulations for Higher Quality Image Synthesis [15.861807854144228]
Generative Adversarial Networks (GANs) have extended deep learning to complex generation and translation tasks.
GANs are notoriously difficult to train: Mode collapse and other instabilities in the training process often degrade the quality of the generated results.
This paper presents a new technique called TaylorGAN for improving GANs by discovering customized loss functions for each of its two networks.
arXiv Detail & Related papers (2021-02-17T05:11:21Z)
- InfoMax-GAN: Improved Adversarial Image Generation via Information
Maximization and Contrastive Learning [39.316605441868944]
Generative Adversarial Networks (GANs) are fundamental to many generative modelling applications.
We propose a principled framework to simultaneously mitigate two fundamental issues in GANs: catastrophic forgetting of the discriminator and mode collapse of the generator.
Our approach significantly stabilizes GAN training and improves GAN performance for image synthesis across five datasets.
arXiv Detail & Related papers (2020-07-09T06:56:11Z)
- DeshuffleGAN: A Self-Supervised GAN to Improve Structure Learning [0.0]
We argue that one of the crucial points to improve the GAN performance is to be able to provide the model with a capability to learn the spatial structure in data.
We introduce a deshuffling task that solves a puzzle of randomly shuffled image tiles, which in turn helps the DeshuffleGAN learn to increase its expressive capacity for spatial structure and realistic appearance.
arXiv Detail & Related papers (2020-06-15T19:06:07Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNN).
This paper provides a new insight into the conventional SISR algorithm and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
- High-Fidelity Synthesis with Disentangled Representation [60.19657080953252]
We propose an Information-Distillation Generative Adversarial Network (ID-GAN) for disentanglement learning and high-fidelity synthesis.
Our method learns disentangled representation using VAE-based models, and distills the learned representation with an additional nuisance variable to the separate GAN-based generator for high-fidelity synthesis.
Despite the simplicity, we show that the proposed method is highly effective, achieving comparable image generation quality to the state-of-the-art methods using the disentangled representation.
arXiv Detail & Related papers (2020-01-13T14:39:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.