Latent Transformations via NeuralODEs for GAN-based Image Editing
- URL: http://arxiv.org/abs/2111.14825v1
- Date: Mon, 29 Nov 2021 18:59:54 GMT
- Title: Latent Transformations via NeuralODEs for GAN-based Image Editing
- Authors: Valentin Khrulkov, Leyla Mirvakhabova, Ivan Oseledets, Artem Babenko
- Abstract summary: We show that nonlinear latent code manipulations realized as flows of a trainable Neural ODE are beneficial for many practical non-face image domains.
In particular, we investigate a large number of datasets with known attributes and demonstrate that certain attribute manipulations are challenging to obtain with linear shifts only.
- Score: 25.272389610447856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in high-fidelity semantic image editing heavily rely on the
presumably disentangled latent spaces of the state-of-the-art generative
models, such as StyleGAN. Specifically, recent works show that it is possible
to achieve decent controllability of attributes in face images via linear
shifts along latent directions. Several recent methods address the
discovery of such directions, implicitly assuming that the state-of-the-art
GANs learn the latent spaces with inherently linearly separable attribute
distributions and semantic vector arithmetic properties.
In our work, we show that nonlinear latent code manipulations realized as
flows of a trainable Neural ODE are beneficial for many practical non-face
image domains with more complex non-textured factors of variation. In
particular, we investigate a large number of datasets with known attributes and
demonstrate that certain attribute manipulations are challenging to obtain with
linear shifts only.
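The contrast drawn in the abstract, a linear shift versus a flow of a Neural ODE, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the velocity network, its random (untrained) weights, the explicit Euler integrator, and all names such as `linear_shift`, `make_velocity_field`, and `ode_flow` are assumptions made for illustration only. The abstract does not specify the architecture, the solver, or the training objective.

```python
import numpy as np


def linear_shift(z, direction, alpha):
    """Linear edit: every latent code moves by the same vector alpha * direction."""
    return z + alpha * direction


def make_velocity_field(dim, hidden, seed=0):
    """Hypothetical two-layer MLP v(z). Weights are random here; a real
    model would learn them so that the flow changes a target attribute."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=dim ** -0.5, size=(dim, hidden))
    w2 = rng.normal(scale=hidden ** -0.5, size=(hidden, dim))

    def velocity(z):
        return np.tanh(z @ w1) @ w2

    return velocity


def ode_flow(z, velocity, t_end, n_steps=100):
    """Approximate the flow of dz/dt = velocity(z) from t=0 to t=t_end
    with explicit Euler steps (an assumed, deliberately simple solver)."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        z = z + dt * velocity(z)
    return z


rng = np.random.default_rng(1)
z0 = rng.normal(size=(4, 8))          # batch of 4 latent codes, dimension 8
direction = np.ones(8) / np.sqrt(8)   # assumed unit-norm edit direction
velocity = make_velocity_field(dim=8, hidden=16)

z_linear = linear_shift(z0, direction, alpha=2.0)
z_flow = ode_flow(z0, velocity, t_end=2.0)
```

In this sketch the linear edit displaces every code by the same vector, while the flow's displacement depends on where each code starts, which is the extra flexibility a nonlinear manipulation offers. The integration time `t_end` plays the role that the shift magnitude `alpha` plays in the linear case.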
Related papers
- SC2GAN: Rethinking Entanglement by Self-correcting Correlated GAN Space [16.040942072859075]
In Generative Adversarial Networks, following editing directions for one attribute can result in entangled changes to other attributes.
We propose a novel framework, SC$^2$GAN, that achieves disentanglement by re-projecting low-density latent code samples in the original latent space.
arXiv Detail & Related papers (2023-10-10T14:42:32Z)
- Hierarchical Semantic Regularization of Latent Spaces in StyleGANs [53.98170188547775]
We propose a Hierarchical Semantic Regularizer (HSR) which aligns the hierarchical representations learnt by the generator to corresponding powerful features learnt by pretrained networks on large amounts of data.
HSR is shown to not only improve generator representations but also the linearity and smoothness of the latent style spaces, leading to the generation of more natural-looking style-edited images.
arXiv Detail & Related papers (2022-08-07T16:23:33Z)
- High-resolution Face Swapping via Latent Semantics Disentanglement [50.23624681222619]
We present a novel high-resolution face swapping method using the inherent prior knowledge of a pre-trained GAN model.
We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator.
We extend our method to video face swapping by enforcing two spatio-temporal constraints on the latent space and the image space.
arXiv Detail & Related papers (2022-03-30T00:33:08Z)
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Because samples have complex non-linear characteristics, the objective of those activation functions is to project samples from their original feature space to a linearly separable feature space.
This phenomenon ignites our interest in exploring whether all features need to be transformed by all non-linear functions in current typical NNs.
arXiv Detail & Related papers (2022-03-22T13:09:17Z)
- WarpedGANSpace: Finding non-linear RBF paths in GAN latent space [44.7091944340362]
This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs.
We learn non-linear warpings on the latent space, each one parametrized by a set of RBF-based latent space warping functions.
We show that linear paths can be derived as a special case of our method, and show experimentally that non-linear paths in the latent space lead to steeper, more disentangled and interpretable changes in the image space.
arXiv Detail & Related papers (2021-09-27T21:29:35Z)
- Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z)
- LARGE: Latent-Based Regression through GAN Semantics [42.50535188836529]
We propose a novel method for solving regression tasks using few-shot or weak supervision.
We show that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-22T17:55:35Z)
- Robust Training Using Natural Transformation [19.455666609149567]
We present NaTra, an adversarial training scheme to improve robustness of image classification algorithms.
We target attributes of the input images that are independent of the class identification, and manipulate those attributes to mimic real-world natural transformations.
We demonstrate the efficacy of our scheme by utilizing the disentangled latent representations derived from well-trained GANs.
arXiv Detail & Related papers (2021-05-10T01:56:03Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)