WarpedGANSpace: Finding non-linear RBF paths in GAN latent space
- URL: http://arxiv.org/abs/2109.13357v1
- Date: Mon, 27 Sep 2021 21:29:35 GMT
- Title: WarpedGANSpace: Finding non-linear RBF paths in GAN latent space
- Authors: Christos Tzelepis, Georgios Tzimiropoulos, and Ioannis Patras
- Abstract summary: This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs.
We learn non-linear warpings on the latent space, each one parametrized by a set of RBF-based latent space warping functions.
We show that linear paths can be derived as a special case of our method, and show experimentally that non-linear paths in the latent space lead to steeper, more disentangled and interpretable changes in the image space.
- Score: 44.7091944340362
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This work addresses the problem of discovering, in an unsupervised manner,
interpretable paths in the latent space of pretrained GANs, so as to provide an
intuitive and easy way of controlling the underlying generative factors. In
doing so, it addresses some of the limitations of the state-of-the-art works,
namely, a) that they discover directions that are independent of the latent
code, i.e., paths that are linear, and b) that their evaluation relies either
on visual inspection or on laborious human labeling. More specifically, we
propose to learn non-linear warpings on the latent space, each one parametrized
by a set of RBF-based latent space warping functions, and where each warping
gives rise to a family of non-linear paths via the gradient of the function.
Building on the work of Voynov and Babenko, which discovers linear paths, we
optimize the trainable parameters of the set of RBFs so that images
generated by codes along different paths are easily distinguishable by a
discriminator network. This leads to easily distinguishable image
transformations, such as pose and facial expressions in facial images. We show
that linear paths can be derived as a special case of our method, and show
experimentally that non-linear paths in the latent space lead to steeper, more
disentangled and interpretable changes in the image space than in
state-of-the-art methods, both qualitatively and quantitatively. We make the code and the
pretrained models publicly available at:
https://github.com/chi0tzp/WarpedGANSpace.
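The idea of a warping function whose gradient defines a family of non-linear paths can be illustrated with a small sketch. It is not the authors' implementation (see the repository above for that). It assumes a Gaussian RBF form, f(z) = sum_i alpha_i * exp(-gamma_i * ||z - s_i||^2), with support vectors s_i, weights alpha_i, and scales gamma_i; the abstract only says "RBF-based", so this form, the fixed step size, and all function and variable names (`rbf_warp`, `rbf_warp_grad`, `trace_path`) are illustrative assumptions. The toy parameters are hand-picked in 2-D, not learned.

```python
import math


def rbf_warp(z, supports, alphas, gammas):
    """Assumed warping: f(z) = sum_i alpha_i * exp(-gamma_i * ||z - s_i||^2)."""
    total = 0.0
    for s, a, g in zip(supports, alphas, gammas):
        sq_dist = sum((zj - sj) ** 2 for zj, sj in zip(z, s))
        total += a * math.exp(-g * sq_dist)
    return total


def rbf_warp_grad(z, supports, alphas, gammas):
    """Analytic gradient of rbf_warp with respect to z."""
    grad = [0.0] * len(z)
    for s, a, g in zip(supports, alphas, gammas):
        sq_dist = sum((zj - sj) ** 2 for zj, sj in zip(z, s))
        coeff = -2.0 * a * g * math.exp(-g * sq_dist)
        for j in range(len(z)):
            grad[j] += coeff * (z[j] - s[j])
    return grad


def trace_path(z0, supports, alphas, gammas, step=0.1, n_steps=10):
    """Follow the unit-normalized gradient of the warping, starting at z0."""
    z = list(z0)
    path = [list(z)]
    for _ in range(n_steps):
        g = rbf_warp_grad(z, supports, alphas, gammas)
        norm = math.sqrt(sum(c * c for c in g))
        if norm == 0.0:  # stationary point: no direction to follow
            break
        z = [zj + step * gj / norm for zj, gj in zip(z, g)]
        path.append(list(z))
    return path


# Toy 2-D warping: two RBFs with opposite-sign weights (illustrative values).
supports = [[1.0, 0.0], [-1.0, 0.0]]
alphas = [1.0, -1.0]
gammas = [0.5, 0.5]

path = trace_path([0.0, 1.0], supports, alphas, gammas, step=0.1, n_steps=10)
for point in path:
    print([round(c, 3) for c in point])
```

Because the gradient is re-evaluated at every point, the step direction depends on the current latent code and the traced path bends. A linear method would instead add the same fixed direction at every step, regardless of where the code sits.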
Related papers
- TraDiffusion: Trajectory-Based Training-Free Image Generation [85.39724878576584]
We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
arXiv Detail & Related papers (2024-08-19T07:01:43Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences [45.06326873752593]
We find non-linear interpretable paths in the latent space of pre-trained GANs in a model-agnostic manner.
By defining an objective that discovers paths that generate changes along the desired paths in the vision-language embedding space, we provide an intuitive way of controlling the underlying generative factors.
arXiv Detail & Related papers (2022-06-05T06:13:42Z)
- Rayleigh EigenDirections (REDs): GAN latent space traversals for multidimensional features [20.11085769303415]
We present a method for finding paths in a deep generative model's latent space.
We can manipulate multidimensional features of an image such as facial identity and pixels within a region.
Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
arXiv Detail & Related papers (2022-01-25T16:11:33Z)
- Latent Transformations via NeuralODEs for GAN-based Image Editing [25.272389610447856]
We show that nonlinear latent code manipulations realized as flows of a trainable Neural ODE are beneficial for many practical non-face image domains.
In particular, we investigate a large number of datasets with known attributes and demonstrate that certain attribute manipulations are challenging to obtain with linear shifts only.
arXiv Detail & Related papers (2021-11-29T18:59:54Z)
- Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation [64.92152574895111]
We propose a simple Orthogonal Jacobian Regularization (OroJaR) to encourage deep generative models to learn disentangled representations.
Our method is effective in disentangled and controllable image generation, and performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2021-08-17T15:01:46Z)
- LARGE: Latent-Based Regression through GAN Semantics [42.50535188836529]
We propose a novel method for solving regression tasks using few-shot or weak supervision.
We show that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-22T17:55:35Z)
- Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs [7.443321740418409]
We propose a method to find local-geometry-aware traversal directions on the intermediate latent space of Generative Adversarial Networks (GANs).
Motivated by the intrinsic sparsity of the latent space, the basis is discovered by solving the low-rank approximation problem of the differential of the partial network.
arXiv Detail & Related papers (2021-06-13T10:29:42Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.