Positional Encoding as Spatial Inductive Bias in GANs
- URL: http://arxiv.org/abs/2012.05217v1
- Date: Wed, 9 Dec 2020 18:27:16 GMT
- Title: Positional Encoding as Spatial Inductive Bias in GANs
- Authors: Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy
- Abstract summary: SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field.
In this work, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.
We propose a new multi-scale training strategy and demonstrate its effectiveness in the state-of-the-art unconditional generator StyleGAN2.
- Score: 97.6622154941448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: SinGAN shows impressive capability in learning internal patch distribution
despite its limited effective receptive field. We are interested in knowing how
such a translation-invariant convolutional generator could capture the global
structure with just a spatially i.i.d. input. In this work, taking SinGAN and
StyleGAN2 as examples, we show that such capability, to a large extent, is
brought by the implicit positional encoding when using zero padding in the
generators. Such positional encoding is indispensable for generating images
with high fidelity. The same phenomenon is observed in other generative
architectures such as DCGAN and PGGAN. We further show that zero padding leads
to an unbalanced spatial bias with a vague relation between locations. To offer
a better spatial inductive bias, we investigate alternative positional
encodings and analyze their effects. Building on a more flexible explicit
positional encoding, we propose a new multi-scale training strategy and
demonstrate its effectiveness in the state-of-the-art unconditional generator
StyleGAN2. Moreover, the explicit spatial inductive bias substantially improves
SinGAN for more versatile image manipulation.
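To make the padding claim concrete, the following is a minimal PyTorch sketch (our illustration, not the paper's code): a small conv stack fed a spatially constant input yields position-dependent activations near the borders when zero padding is used, while circular padding keeps the output spatially uniform.

```python
# Minimal sketch: zero padding leaks absolute position, circular does not.
import torch
import torch.nn as nn

def conv_stack(padding_mode: str) -> nn.Sequential:
    # Three 3x3 convs that differ only in their padding mode.
    layers = []
    for _ in range(3):
        layers += [nn.Conv2d(8, 8, 3, padding=1, padding_mode=padding_mode),
                   nn.ReLU()]
    return nn.Sequential(*layers)

torch.manual_seed(0)
x = torch.ones(1, 8, 16, 16)  # spatially constant input: carries no position info

for mode in ["zeros", "circular"]:
    y = conv_stack(mode)(x)
    # Deviation from the spatial mean is nonzero only where the network
    # can tell locations apart.
    dev = (y - y.mean(dim=(2, 3), keepdim=True)).abs().mean(dim=(0, 1))
    print(mode, "border:", dev[0, 0].item(), "center:", dev[8, 8].item())
```

With circular padding the deviation vanishes everywhere, so the generator has no positional cue to anchor global structure to; zero padding supplies exactly such a cue at the borders.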
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization that regularizes the inverted code to stay in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
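As a rough illustration of the two ingredients (the names `G`, `E`, and the weight `lam` are our placeholders, not the paper's notation): a domain-guided encoder supplies the starting code, and a domain-regularized objective keeps the optimized code consistent with what the encoder predicts for the reconstruction.

```python
# Hedged sketch of encoder-initialized, domain-regularized GAN inversion.
import torch

def invert(G, E, target, steps=200, lr=0.01, lam=2.0):
    z = E(target).detach().requires_grad_(True)  # encoder gives the start point
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        rec = G(z)
        # Reconstruction term plus a regularizer that keeps the code in the
        # region the encoder maps real images to.
        loss = (rec - target).pow(2).mean() + lam * (E(rec) - z).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```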
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
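The core gradient-matching loop behind such attacks can be sketched as follows; this is the generic GAN-prior formulation rather than GIFD's intermediate-feature search, and `G`, `model`, `criterion`, `shared_grads`, and `label` are assumed inputs.

```python
# Hedged sketch: optimize a latent so the victim model's gradients on G(z)
# match the gradients shared during federated training.
import torch

def gan_gradient_inversion(G, model, criterion, shared_grads, label,
                           steps=500, lr=0.05):
    z = torch.randn(1, 512, requires_grad=True)  # assumed latent size
    opt = torch.optim.Adam([z], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        loss = criterion(model(G(z)), label)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Distance between the dummy gradients and the leaked ones.
        gm = sum((g - s).pow(2).sum() for g, s in zip(grads, shared_grads))
        opt.zero_grad()
        gm.backward()
        opt.step()
    return G(z).detach()
```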
arXiv Detail & Related papers (2023-08-09T04:34:21Z) - Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
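A minimal sketch of the conditioning mechanism, with sizes and fusion-by-concatenation as our illustrative assumptions:

```python
# Hedged sketch: sample a random Gaussian heatmap and feed it to an
# intermediate generator layer as a spatial inductive bias.
import torch

def gaussian_heatmap(h, w, sigma=0.15):
    cy, cx = torch.rand(2)                 # random center in [0, 1)
    ys = torch.linspace(0, 1, h).view(h, 1)
    xs = torch.linspace(0, 1, w).view(1, w)
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

feat = torch.randn(1, 64, 32, 32)          # intermediate generator features
hm = gaussian_heatmap(32, 32).view(1, 1, 32, 32)
feat = torch.cat([feat, hm], dim=1)        # 65 channels; the next conv must
                                           # be built to accept the extra one
```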
arXiv Detail & Related papers (2023-01-20T07:36:29Z) - High-fidelity GAN Inversion with Padding Space [38.9258619444968]
Inverting a Generative Adversarial Network (GAN) facilitates a wide range of image editing tasks using pre-trained generators.
Existing methods typically employ the latent space of GANs as the inversion space, yet find that it recovers spatial details insufficiently.
We propose to involve the padding space of the generator to complement the latent space with spatial information.
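A toy re-implementation of the idea (not the paper's code) replaces the fixed zero frame around each feature map with learnable coefficients that inversion can optimize together with the latent:

```python
# Hedged sketch of a "padding space": a learnable frame around the features
# replaces zero padding; the interior is overwritten by the real features.
import torch
import torch.nn as nn

class LearnablePadConv(nn.Module):
    def __init__(self, c_in, c_out, size):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3)  # no built-in padding
        self.pad = nn.Parameter(torch.zeros(1, c_in, size + 2, size + 2))

    def forward(self, x):
        framed = self.pad.expand(x.size(0), -1, -1, -1).clone()
        framed[:, :, 1:-1, 1:-1] = x            # only the border values remain
        return self.conv(framed)

layer = LearnablePadConv(8, 8, 16)
print(layer(torch.randn(2, 8, 16, 16)).shape)   # torch.Size([2, 8, 16, 16])
```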
arXiv Detail & Related papers (2022-03-21T16:32:12Z) - Toward Spatially Unbiased Generative Models [19.269719158344508]
Recent image generation models show remarkable generation performance.
However, they mirror the strong location preference of their training datasets, which we call spatial bias.
We argue that the generators rely on their implicit positional encoding to render spatial content.
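One way to remove the dependence on absolute position, sketched here under our own assumptions about frequencies and shifting, is to supply an explicit sinusoidal grid and randomly roll it during training so that no location is privileged:

```python
# Hedged sketch: explicit 2D sinusoidal positional maps, randomly shifted.
import torch

def sinusoidal_grid(h, w, channels=8):
    ys = torch.arange(h).float().view(h, 1).expand(h, w)
    xs = torch.arange(w).float().view(1, w).expand(h, w)
    maps = []
    for i in range(channels // 4):
        f = 1.0 / (10 ** i)                      # assumed frequency schedule
        maps += [torch.sin(f * ys), torch.cos(f * ys),
                 torch.sin(f * xs), torch.cos(f * xs)]
    return torch.stack(maps).unsqueeze(0)        # (1, channels, h, w)

pe = sinusoidal_grid(16, 16)
dy, dx = torch.randint(0, 16, (2,)).tolist()
pe = torch.roll(pe, shifts=(dy, dx), dims=(2, 3))  # random shift each step
```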
arXiv Detail & Related papers (2021-08-03T04:13:03Z) - Low-Rank Subspaces in GANs [101.48350547067628]
This work introduces low-rank subspaces that enable more precise control of GAN generation.
LowRankGAN is able to find the low-dimensional representation of the attribute manifold.
Experiments on state-of-the-art GAN models (including StyleGAN2 and BigGAN) trained on various datasets demonstrate the effectiveness of our LowRankGAN.
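The gist can be reproduced on a toy generator (everything below, including the region mask, is our illustration): restrict the Jacobian to an image region and use its top right-singular vectors as latent directions that mostly edit that region.

```python
# Hedged sketch: low-rank latent directions from a region-masked Jacobian.
import torch

g = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh(),
                        torch.nn.Linear(64, 256))  # toy "generator"
z = torch.randn(16)
mask = torch.zeros(256)
mask[:64] = 1.0                                    # region of interest

J = torch.autograd.functional.jacobian(lambda v: g(v) * mask, z)  # (256, 16)
U, S, Vh = torch.linalg.svd(J, full_matrices=False)
direction = Vh[0]                                  # top singular direction
z_edit = z + 3.0 * direction                       # edit mostly hits the region
```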
arXiv Detail & Related papers (2021-06-08T16:16:32Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
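A compact sketch of such a block, with channel counts and dilation rates as our assumptions, fuses parallel dilated branches to widen the receptive field without extra depth:

```python
# Hedged sketch of a multi-dilation residual block for inpainting backbones.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, c, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(c, c, 3, padding=r, dilation=r) for r in rates)
        self.fuse = nn.Conv2d(c * len(rates), c, 1)  # 1x1 fusion conv

    def forward(self, x):
        return x + self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

print(DilatedBlock(32)(torch.randn(1, 32, 64, 64)).shape)  # (1, 32, 64, 64)
```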
arXiv Detail & Related papers (2020-02-07T03:45:25Z) - Optimizing Generative Adversarial Networks for Image Super Resolution via Latent Space Regularization [4.529132742139768]
Generative Adversarial Networks (GANs) try to learn the distribution of real images on the manifold in order to generate samples that look real.
In this paper, we probe for ways to alleviate the attendant problems for supervised GANs via latent space regularization.
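One plausible form such a regularizer could take, with `G`, `D`, `encode`, and all weights being our assumptions rather than the paper's formulation, is a latent-distance penalty added to the usual pixel and adversarial terms:

```python
# Hedged sketch: latent-space regularization in a supervised SR GAN loss.
import torch

def sr_generator_loss(G, D, encode, lr_img, hr_img, w_adv=1e-3, w_lat=1e-2):
    sr = G(lr_img)
    pixel = (sr - hr_img).abs().mean()                    # supervised L1 term
    adv = -torch.log(torch.sigmoid(D(sr)) + 1e-8).mean()  # non-saturating GAN
    latent = (encode(sr) - encode(hr_img)).pow(2).mean()  # latent regularizer
    return pixel + w_adv * adv + w_lat * latent
```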
arXiv Detail & Related papers (2020-01-22T16:27:20Z)