Dual Contrastive Loss and Attention for GANs
- URL: http://arxiv.org/abs/2103.16748v1
- Date: Wed, 31 Mar 2021 01:10:26 GMT
- Title: Dual Contrastive Loss and Attention for GANs
- Authors: Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry Davis, Mario Fritz
- Abstract summary: We propose a novel dual contrastive loss and show that, with this loss, the discriminator learns more generalized and distinguishable representations to incentivize generation.
We find attention to be still an important module for successful image generation even though it was not used in the recent state-of-the-art models.
By combining the strengths of these remedies, we improve the compelling state-of-the-art Fréchet Inception Distance (FID) by at least 17.5% on several benchmark datasets.
- Score: 82.713118646294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) produce impressive results on
unconditional image generation when powered with large-scale image datasets.
Yet generated images are still easy to spot especially on datasets with high
variance (e.g. bedroom, church). In this paper, we propose various improvements
to further push the boundaries in image generation. Specifically, we propose a
novel dual contrastive loss and show that, with this loss, the discriminator learns
more generalized and distinguishable representations to incentivize generation.
In addition, we revisit attention and extensively experiment with different
attention blocks in the generator. We find attention to be still an important
module for successful image generation even though it was not used in the
recent state-of-the-art models. Lastly, we study different attention
architectures in the discriminator, and propose a reference attention
mechanism. By combining the strengths of these remedies, we improve the
compelling state-of-the-art Fréchet Inception Distance (FID) by at least
17.5% on several benchmark datasets. We obtain even more significant
improvements on compositional synthetic scenes (up to 47.5% in FID).
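
The abstract names a dual contrastive loss but does not state its form. As a rough illustration only, the sketch below assumes an InfoNCE-style formulation: each real logit is contrasted against a batch of fake logits, and, dually, each negated fake logit is contrasted against a batch of negated real logits. The function names (`logsumexp`, `contrast_one_vs_many`, `dual_contrastive_d_loss`) and the choice to average the two terms over the batch are assumptions for this example, not details confirmed by the text above.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def contrast_one_vs_many(anchor_logit, other_logits):
    # -log( exp(anchor) / (exp(anchor) + sum(exp(others))) )
    return logsumexp([anchor_logit] + list(other_logits)) - anchor_logit

def dual_contrastive_d_loss(real_logits, fake_logits):
    # Hypothetical discriminator loss built from raw (pre-sigmoid) logits.
    # Term 1: each real logit should stand out against all fake logits.
    real_term = sum(
        contrast_one_vs_many(r, fake_logits) for r in real_logits
    ) / len(real_logits)
    # Term 2 (the "dual" case): with signs flipped, each fake logit
    # should stand out against all real logits.
    neg_real = [-r for r in real_logits]
    fake_term = sum(
        contrast_one_vs_many(-f, neg_real) for f in fake_logits
    ) / len(fake_logits)
    return real_term + fake_term

if __name__ == "__main__":
    # A discriminator that separates real from fake gets a lower loss
    # than one that confuses them.
    print(dual_contrastive_d_loss([5.0, 5.0], [-5.0, -5.0]))
    print(dual_contrastive_d_loss([-5.0, -5.0], [5.0, 5.0]))
```

Under this assumed formulation the loss depends only on differences between real and fake logits, so it compares samples against each other rather than scoring each one independently.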
Related papers
- Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors [62.63467652611788]
We introduce SEMI-TRUTHS, featuring 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images.
Each augmented image is accompanied by metadata for standardized and targeted evaluation of detector robustness.
Our findings suggest that state-of-the-art detectors exhibit varying sensitivities to the types and degrees of perturbations, data distributions, and augmentation methods used.
arXiv Detail & Related papers (2024-11-12T01:17:27Z)
- DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention [12.36906630199689]
We construct a DA-HFNet forged image dataset guided by text or image-assisted GAN and Diffusion model.
Our goal is to utilize a hierarchical progressive network to capture forged artifacts at different scales for detection and localization.
arXiv Detail & Related papers (2024-06-03T16:13:33Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- FA-GAN: Feature-Aware GAN for Text to Image Synthesis [7.0168039268464]
We propose a Generative Adversarial Network (GAN) to synthesize a high-quality image by integrating two techniques.
First, we design a self-supervised discriminator with an auxiliary decoder so that the discriminator can extract better representation.
Secondly, we introduce a feature-aware loss to provide the generator more direct supervision by employing the feature representation from the self-supervised discriminator.
arXiv Detail & Related papers (2021-09-02T13:05:36Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- UMLE: Unsupervised Multi-discriminator Network for Low Light Enhancement [8.887169648516844]
Low-light scenarios will have serious implications for vision-based applications.
We propose a real-time unsupervised generative adversarial network (GAN) containing multiple discriminators.
Experiments indicate that our method is superior to the state-of-the-art methods in qualitative and quantitative evaluations.
arXiv Detail & Related papers (2020-12-24T09:48:56Z)
- You Only Need Adversarial Supervision for Semantic Image Synthesis [84.83711654797342]
We propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results.
We show that images synthesized by our model are more diverse and follow the color and texture of real images more closely.
arXiv Detail & Related papers (2020-12-08T23:00:48Z)
- Robust Data Hiding Using Inverse Gradient Attention [82.73143630466629]
In the data hiding task, each pixel of cover images should be treated differently since they have divergent tolerabilities.
We propose a novel deep data hiding scheme with Inverse Gradient Attention (IGA), combining the ideas of adversarial learning and attention mechanisms.
Empirically, extensive experiments show that the proposed model outperforms the state-of-the-art methods on two prevalent datasets.
arXiv Detail & Related papers (2020-11-21T19:08:23Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.