A U-Net Based Discriminator for Generative Adversarial Networks
- URL: http://arxiv.org/abs/2002.12655v2
- Date: Fri, 19 Mar 2021 23:22:06 GMT
- Title: A U-Net Based Discriminator for Generative Adversarial Networks
- Authors: Edgar Schönfeld, Bernt Schiele, Anna Khoreva
- Abstract summary: We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art on standard distribution and image-quality metrics.
- Score: 86.67102929147592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Among the major remaining challenges for generative adversarial networks
(GANs) is the capacity to synthesize globally and locally coherent images with
object shapes and textures indistinguishable from real images. To target this
issue we propose an alternative U-Net based discriminator architecture,
borrowing insights from the segmentation literature. The proposed U-Net based
architecture provides detailed per-pixel feedback to the generator while
maintaining the global coherence of synthesized images, by supplying global,
image-level feedback as well. Empowered by the per-pixel response of the
discriminator, we further propose a per-pixel consistency regularization
technique based on the CutMix data augmentation, encouraging the U-Net
discriminator to focus more on semantic and structural changes between real
and fake images. This improves the U-Net discriminator training, further
enhancing the quality of generated samples. The novel discriminator improves
over the state of the art on standard distribution and image-quality metrics,
enabling the generator to synthesize images with varying structure, appearance
and levels of detail while maintaining global and local realism. Compared to
the BigGAN baseline, we achieve an average improvement of 2.7 FID points
across FFHQ, CelebA, and the newly introduced COCO-Animals dataset. The code
is available at https://github.com/boschresearch/unetgan.
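To make the two core ideas concrete, here is a minimal PyTorch sketch of a U-Net style discriminator that returns both a global (image-level) logit and per-pixel logits. The class name, layer widths, and depth are illustrative assumptions, not the authors' exact configuration (the paper builds on BigGAN; the released code at the link above is authoritative).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UNetDiscriminator(nn.Module):
    """Encoder-decoder discriminator: the encoder yields one global
    real/fake logit, the decoder yields a real/fake logit per pixel."""
    def __init__(self, ch=64):
        super().__init__()
        # Encoder: progressively downsample (input H, W divisible by 8).
        self.enc1 = nn.Conv2d(3, ch, 4, stride=2, padding=1)           # H -> H/2
        self.enc2 = nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1)      # H/2 -> H/4
        self.enc3 = nn.Conv2d(ch * 2, ch * 4, 4, stride=2, padding=1)  # H/4 -> H/8
        self.global_head = nn.Linear(ch * 4, 1)
        # Decoder: upsample back, with U-Net skip connections.
        self.dec3 = nn.ConvTranspose2d(ch * 4, ch * 2, 4, stride=2, padding=1)
        self.dec2 = nn.ConvTranspose2d(ch * 4, ch, 4, stride=2, padding=1)
        self.dec1 = nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1)
        self.pixel_head = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, x):
        e1 = F.leaky_relu(self.enc1(x), 0.2)
        e2 = F.leaky_relu(self.enc2(e1), 0.2)
        e3 = F.leaky_relu(self.enc3(e2), 0.2)
        g = self.global_head(e3.mean(dim=(2, 3)))            # global logit
        d3 = F.leaky_relu(self.dec3(e3), 0.2)
        d2 = F.leaky_relu(self.dec2(torch.cat([d3, e2], 1)), 0.2)
        d1 = F.leaky_relu(self.dec1(torch.cat([d2, e1], 1)), 0.2)
        p = self.pixel_head(d1)          # per-pixel logits, same H x W as input
        return g, p
```

The CutMix-based consistency regularization can then be sketched as follows: mix real and fake images under a rectangular mask and penalize the decoder when its per-pixel output on the mixed image differs from the same mix of its outputs on the unmixed images. The box sampling below is a simplification (one fixed-size box per batch rather than per-sample random boxes).

```python
def cutmix(real, fake):
    """Paste a random rectangle of `fake` into `real`; return the mixed
    batch and the mask (1 = pixel from real, 0 = pixel from fake)."""
    b, _, h, w = real.shape
    bh, bw = h // 2, w // 2                       # fixed box size for brevity
    y = torch.randint(0, h - bh + 1, (1,)).item()
    x = torch.randint(0, w - bw + 1, (1,)).item()
    mask = torch.ones(b, 1, h, w, device=real.device)
    mask[:, :, y:y + bh, x:x + bw] = 0.0
    mixed = mask * real + (1.0 - mask) * fake
    return mixed, mask

def consistency_loss(D, real, fake):
    """|| D_pix(mix(real, fake)) - mix(D_pix(real), D_pix(fake)) ||^2"""
    mixed, mask = cutmix(real, fake)
    _, p_mixed = D(mixed)
    with torch.no_grad():
        _, p_real = D(real)
        _, p_fake = D(fake)
    target = mask * p_real + (1.0 - mask) * p_fake
    return F.mse_loss(p_mixed, target)
```

During discriminator training these pieces would sit alongside the usual adversarial losses: a global loss on `g`, a per-pixel loss on `p` (with the CutMix mask serving as the per-pixel real/fake label for mixed images), and the consistency term above; the relative weighting is a tuning choice.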
Related papers
- United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images [21.76732661032257]
We propose a novel United Domain Cognition Network (UDCNet) to jointly exploit global and local information in the frequency and spatial domains.
Experimental results demonstrate the superiority of the proposed UDCNet over 24 state-of-the-art models.
arXiv Detail & Related papers (2024-11-11T04:12:27Z)
- Spectral Normalization and Dual Contrastive Regularization for Image-to-Image Translation [9.029227024451506]
We propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization.
We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results show that our method achieves state-of-the-art performance on multiple tasks.
arXiv Detail & Related papers (2023-04-22T05:22:24Z)
- Efficient and Explicit Modelling of Image Hierarchies for Image Restoration [120.35246456398738]
We propose a mechanism to efficiently and explicitly model image hierarchies at the global, regional, and local ranges for image restoration.
Inspired by this, we propose anchored stripe self-attention, which achieves a good balance between the space and time complexity of self-attention.
We then propose a new network architecture, dubbed GRL, to explicitly model image hierarchies at the Global, Regional, and Local ranges.
arXiv Detail & Related papers (2023-03-01T18:59:29Z)
- TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.
We propose TcGAN, a novel structure-preserving method with an individual vision transformer, to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z)
- High-Quality Pluralistic Image Completion via Code Shared VQGAN [51.7805154545948]
We present a novel framework for pluralistic image completion that achieves both high quality and diversity at much faster inference speed.
Our framework learns semantically rich discrete codes efficiently and robustly, resulting in much better image reconstruction quality.
arXiv Detail & Related papers (2022-04-05T01:47:35Z)
- Global and Local Alignment Networks for Unpaired Image-to-Image Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Because existing methods pay little attention to content changes, semantic information from source images degrades during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z)
- Global and Local Texture Randomization for Synthetic-to-Real Semantic Segmentation [40.556020857447535]
We propose two simple yet effective texture randomization mechanisms: Global Texture Randomization (GTR) and Local Texture Randomization (LTR).
GTR randomizes the texture of source images into diverse texture styles.
LTR generates diverse local regions for partially stylizing the source images.
arXiv Detail & Related papers (2021-08-05T05:14:49Z)
- Low Light Image Enhancement via Global and Local Context Modeling [164.85287246243956]
We introduce a context-aware deep network for low-light image enhancement.
First, it features a global context module that models spatial correlations to find complementary cues over the full spatial domain.
Second, it introduces a dense residual block that captures local context with a relatively large receptive field.
arXiv Detail & Related papers (2021-01-04T09:40:54Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.