Improving GAN Equilibrium by Raising Spatial Awareness
- URL: http://arxiv.org/abs/2112.00718v1
- Date: Wed, 1 Dec 2021 18:55:51 GMT
- Title: Improving GAN Equilibrium by Raising Spatial Awareness
- Authors: Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou
- Abstract summary: Generative Adversarial Networks (GANs) are built upon the adversarial training between a generator (G) and a discriminator (D).
In practice, it is difficult to reach an equilibrium between the two in GAN training; instead, D almost always surpasses G.
We propose to align the spatial awareness of G with the attention map induced from D.
- Score: 80.71970464638585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of Generative Adversarial Networks (GANs) is largely built upon
the adversarial training between a generator (G) and a discriminator (D). They
are expected to reach a certain equilibrium where D cannot distinguish the
generated images from the real ones. However, in practice it is difficult to
achieve such an equilibrium in GAN training; instead, D almost always surpasses
G. We attribute this phenomenon to the information asymmetry between D and G.
Specifically, we observe that D learns its own visual attention when
determining whether an image is real or fake, but G has no explicit clue on
which regions to focus on for a particular synthesis. To alleviate the issue of
D dominating the competition in GANs, we aim to raise the spatial awareness of
G. Randomly sampled multi-level heatmaps are encoded into the intermediate
layers of G as an inductive bias. Thus G can purposefully improve the synthesis
of certain image regions. We further propose to align the spatial awareness of
G with the attention map induced from D. In this way, we effectively lessen
the information gap between D and G. Extensive results show that our method
pushes the two-player game in GANs closer to the equilibrium, leading to a
better synthesis performance. As a byproduct, the introduced spatial awareness
facilitates interactive editing over the output synthesis. Demo video and more
results are at https://genforce.github.io/eqgan/.
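The two ideas in the abstract, sampling multi-level heatmaps for G and aligning them with an attention map from D, can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian shape of the heatmaps, the resolutions, the `sigma` value, and the mean-squared alignment loss are all assumptions made for illustration, and every function name below is hypothetical. How the attention map is extracted from D is not covered here; it is simply passed in as a 2D list.

```python
import math
import random


def sample_center(rng):
    """Draw a random heatmap center in normalized [0, 1] x [0, 1] coordinates."""
    return rng.random(), rng.random()


def gaussian_heatmap(center, size, sigma):
    """Render a size x size heatmap peaked at the given normalized center.

    Assumption: a Gaussian bump whose width `sigma` is a fraction of the side.
    """
    cx, cy = center[0] * (size - 1), center[1] * (size - 1)
    s = sigma * size
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * s * s))
            for x in range(size)
        ]
        for y in range(size)
    ]


def multi_level_heatmaps(center, sizes=(4, 8, 16), sigma=0.25):
    """Render the same center at several resolutions, one per generator level."""
    return {size: gaussian_heatmap(center, size, sigma) for size in sizes}


def alignment_loss(g_heatmap, d_attention):
    """Mean squared difference between two equally sized 2D maps.

    Assumption: MSE stands in for whatever alignment objective the paper uses.
    """
    n = len(g_heatmap) * len(g_heatmap[0])
    return sum(
        (g - d) ** 2
        for g_row, d_row in zip(g_heatmap, d_attention)
        for g, d in zip(g_row, d_row)
    ) / n


rng = random.Random(0)           # fixed seed so the sketch is reproducible
center = sample_center(rng)      # one random focus location
levels = multi_level_heatmaps(center)
```

In a real generator, each entry of `levels` would be fed into the intermediate layer of matching resolution, and `alignment_loss` would compare G's heatmap against an attention map computed from D on the generated image.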
Related papers
- GeoGuide: Geometric guidance of diffusion models [8.34616719984217]
GeoGuide is a guidance model based on tracing the distance of the diffusion model's trajectory from the data manifold.
It surpasses the probabilistic approach ADM-G with respect to both the FID scores and the quality of the generated images.
arXiv Detail & Related papers (2024-07-17T07:56:27Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- GLeaD: Improving GANs with A Generator-Leading Task [44.14659523033865]
A generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D).
We propose a new paradigm for adversarial training, which makes G assign a task to D as well.
arXiv Detail & Related papers (2022-12-07T16:25:19Z)
- DGL-GAN: Discriminator Guided Learning for GAN Compression [57.6150859067392]
Generative Adversarial Networks (GANs) with high computation costs have achieved remarkable results in synthesizing high-resolution images from random noise.
We propose a novel yet simple Discriminator Guided Learning approach for compressing vanilla GAN, dubbed DGL-GAN.
arXiv Detail & Related papers (2021-12-13T09:24:45Z)
- Positional Encoding as Spatial Inductive Bias in GANs [97.6622154941448]
SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field.
In this work, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.
We propose a new multi-scale training strategy and demonstrate its effectiveness in the state-of-the-art unconditional generator StyleGAN2.
arXiv Detail & Related papers (2020-12-09T18:27:16Z)
- Interpreting Galaxy Deblender GAN from the Discriminator's Perspective [50.12901802952574]
This research focuses on behaviors of one of the network's major components, the Discriminator, which plays a vital role but is often overlooked.
We demonstrate that our method clearly reveals attention areas of the Discriminator when differentiating generated galaxy images from ground truth images.
arXiv Detail & Related papers (2020-01-17T04:05:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.