AE-StyleGAN: Improved Training of Style-Based Auto-Encoders
- URL: http://arxiv.org/abs/2110.08718v1
- Date: Sun, 17 Oct 2021 04:25:51 GMT
- Title: AE-StyleGAN: Improved Training of Style-Based Auto-Encoders
- Authors: Ligong Han, Sri Harsha Musunuri, Martin Renqiang Min, Ruijiang Gao, Yu
Tian, Dimitris Metaxas
- Abstract summary: StyleGANs have shown impressive results on data generation and manipulation in recent years.
In this paper, we focus on style-based generators and ask a scientific question: Does forcing such a generator to reconstruct real data lead to a more disentangled latent space and make the inversion from image to latent space easier?
We describe a new methodology to train a style-based autoencoder in which the encoder and generator are optimized end-to-end.
- Score: 21.51697087024866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: StyleGANs have shown impressive results on data generation and manipulation
in recent years, thanks to their disentangled style latent space. Much effort
has gone into inverting a pretrained generator, where an encoder is trained ad
hoc after the generator in a two-stage fashion. In this paper, we focus on
style-based generators and ask a scientific question: Does forcing such a
generator to reconstruct real data lead to a more disentangled latent space and
make the inversion from image to latent space easier? We describe a new
methodology to train a style-based autoencoder where the encoder and generator
are optimized end-to-end. We show that our proposed model consistently
outperforms baselines in terms of image inversion and generation quality.
Supplementary material, code, and pretrained models are available on the
project website.
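The abstract does not spell out the training procedure, but the core idea of optimizing the encoder and the style-based generator end-to-end on real-image reconstruction can be pictured with a minimal sketch. The toy Encoder/Generator/Discriminator modules, the image size, and the loss terms below are assumptions for illustration, not the paper's actual architecture or objective.

```python
# Minimal sketch of end-to-end style-based autoencoder training (illustrative only).
# The Encoder maps an image to a style code w; the Generator maps w back to an image.
# Both are updated together with a reconstruction loss plus an adversarial loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

W_DIM, IMG = 64, 32  # toy style-code size and image resolution (assumptions)

class Encoder(nn.Module):          # stand-in for a style-based encoder
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * IMG * IMG, W_DIM))
    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):        # stand-in for a StyleGAN-like generator
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(W_DIM, 3 * IMG * IMG)
    def forward(self, w):
        return self.net(w).view(-1, 3, IMG, IMG)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * IMG * IMG, 1))
    def forward(self, x):
        return self.net(x)

E, G, D = Encoder(), Generator(), Discriminator()
opt_eg = torch.optim.Adam(list(E.parameters()) + list(G.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(8, 3, IMG, IMG)  # dummy batch standing in for real images

# Discriminator step: distinguish real images from reconstructions.
fake = G(E(real)).detach()
d_loss = F.softplus(-D(real)).mean() + F.softplus(D(fake)).mean()
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Encoder + generator step: reconstruct real data and fool the discriminator,
# with both networks optimized jointly rather than in two stages.
rec = G(E(real))
eg_loss = F.l1_loss(rec, real) + F.softplus(-D(rec)).mean()
opt_eg.zero_grad(); eg_loss.backward(); opt_eg.step()
```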
Related papers
- StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation [73.54398908446906] (2023-08-31)
We introduce a novel motion generator design that uses a learning-based inversion network for GAN.
Our method supports style transfer with simple fine-tuning when the encoder is paired with a pretrained StyleGAN generator.
- Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114] (2023-07-17)
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
- Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization [73.52943587514386] (2023-05-19)
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm.
We propose a novel two-stage framework: (1) Dynamic-Quantization VAE (DQ-VAE), which encodes image regions into variable-length codes based on their information densities for accurate representation.
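The summary above only says that regions receive variable-length codes according to their information densities; as a rough illustration of that kind of allocation (not the DQ-VAE algorithm itself), one could score regions with a simple entropy proxy and grant denser regions longer codes. The grid size, entropy measure, and thresholds below are all assumptions.

```python
# Illustrative only: assign longer codes to image regions with higher information
# density, scored here by a pixel-intensity entropy proxy (not the DQ-VAE criterion).
import numpy as np

def region_entropy(patch: np.ndarray, bins: int = 16) -> float:
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def allocate_code_lengths(image: np.ndarray, grid: int = 4,
                          lengths=(1, 2, 4)) -> np.ndarray:
    """Split the image into grid x grid regions and map low/medium/high entropy
    to short/medium/long codes (the thresholds are arbitrary assumptions)."""
    h, w = image.shape
    rh, rw = h // grid, w // grid
    scores = np.array([[region_entropy(image[i*rh:(i+1)*rh, j*rw:(j+1)*rw])
                        for j in range(grid)] for i in range(grid)])
    lo, hi = np.quantile(scores, [0.33, 0.66])
    return np.where(scores <= lo, lengths[0],
                    np.where(scores <= hi, lengths[1], lengths[2]))

img = np.random.rand(64, 64)          # dummy grayscale image in [0, 1]
print(allocate_code_lengths(img))     # per-region code lengths
```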
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084] (2023-02-22)
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator by applying a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
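A generic version of per-image generator tuning, in the spirit of the blurb above, is sketched below: freeze an inverted latent code and take a few gradient steps on a copy of the generator's weights so that it reproduces one target image. The toy generator and plain MSE loss are assumptions; the paper's gradient adjusting networks are not reproduced here.

```python
# Illustrative per-image generator tuning (not the paper's exact method):
# keep a latent code fixed and optimize a copy of the generator's weights
# so that it reproduces a single target image.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

generator = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3 * 32 * 32))
target = torch.rand(1, 3 * 32 * 32)   # the real image to match (flattened, dummy data)
w_pivot = torch.randn(1, 64)          # inverted latent code, kept fixed

tuned = copy.deepcopy(generator)      # per-image copy; the original weights are preserved
opt = torch.optim.Adam(tuned.parameters(), lr=1e-3)

for step in range(200):               # short, image-specific optimization
    recon = tuned(w_pivot)
    loss = F.mse_loss(recon, target)  # perceptual/regularization terms would be added in practice
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final reconstruction error: {loss.item():.4f}")
```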
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732] (2023-01-20)
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
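As a rough picture of the heatmap idea, the snippet below samples a 2D Gaussian heatmap and injects it into an intermediate feature map via concatenation and a 1x1 convolution; the injection point, heatmap parameters, and fusion scheme are assumptions for illustration rather than the paper's design.

```python
# Illustrative only: encode a spatial Gaussian heatmap into an intermediate
# feature map so that generated content can be steered by the heatmap's location.
import torch
import torch.nn as nn

def gaussian_heatmap(size: int, cx: float, cy: float, sigma: float) -> torch.Tensor:
    ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
    return torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

feat = torch.randn(1, 128, 16, 16)                 # intermediate generator features (dummy)
heat = gaussian_heatmap(16, cx=4.0, cy=10.0, sigma=2.0).view(1, 1, 16, 16)

# Fuse the heatmap with the features; moving the Gaussian center would move
# the corresponding content in the output.
fuse = nn.Conv2d(128 + 1, 128, kernel_size=1)
steered = fuse(torch.cat([feat, heat], dim=1))
print(steered.shape)                               # torch.Size([1, 128, 16, 16])
```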
- Talking Head from Speech Audio using a Pre-trained Image Generator [5.659018934205065] (2022-09-09)
We propose a novel method for generating high-resolution videos of talking-heads from speech audio and a single 'identity' image.
We model each frame as a point in the latent space of StyleGAN so that a video corresponds to a trajectory through the latent space.
We train a recurrent neural network to map from speech utterances to displacements in the latent space of the image generator.
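The mechanism described here, a recurrent network that turns a speech-feature sequence into per-frame displacements added to an identity latent, can be sketched as follows; the feature dimensions, the GRU, and the latent size are assumptions, and the real system decodes the resulting latent trajectory with a pretrained StyleGAN generator.

```python
# Illustrative only: map a sequence of audio features to per-frame displacements
# in a StyleGAN-like latent space, yielding one latent per video frame.
import torch
import torch.nn as nn

AUDIO_DIM, W_DIM, T = 80, 512, 25           # mel features, latent size, frames (assumed)

class Audio2Latent(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(AUDIO_DIM, 256, batch_first=True)
        self.head = nn.Linear(256, W_DIM)   # predicts a displacement per frame
    def forward(self, audio_feats, w_identity):
        h, _ = self.rnn(audio_feats)        # (B, T, 256)
        delta = self.head(h)                # (B, T, W_DIM)
        return w_identity.unsqueeze(1) + delta  # latent trajectory around the identity code

model = Audio2Latent()
audio = torch.randn(2, T, AUDIO_DIM)        # dummy speech features
w_id = torch.randn(2, W_DIM)                # latent of the single identity image
traj = model(audio, w_id)
print(traj.shape)                           # torch.Size([2, 25, 512]) -> one latent per frame
```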
- Feature-Style Encoder for Style-Based GAN Inversion [1.9116784879310027] (2022-02-04)
We propose a novel architecture for GAN inversion, which we call Feature-Style encoder.
Our model achieves accurate inversion of real images from the latent space of a pre-trained style-based GAN model.
Thanks to its encoder structure, the model allows fast and accurate image editing.
- Autoencoding Video Latents for Adversarial Video Generation [0.0] (2022-01-18)
AVLAE is a two-stream latent autoencoder in which the video distribution is learned by adversarial training.
We demonstrate that our approach learns to disentangle motion and appearance codes even without the explicit structural composition in the generator.
- InvGAN: Invertible GANs [88.58338626299837] (2021-12-08)
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
- Toward Spatially Unbiased Generative Models [19.269719158344508] (2021-08-03)
Recent image generation models show remarkable generation performance.
However, they mirror strong location preference in datasets, which we call spatial bias.
We argue that the generators rely on their implicit positional encoding to render spatial content.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.