Exploring Gradient-based Multi-directional Controls in GANs
- URL: http://arxiv.org/abs/2209.00698v1
- Date: Thu, 1 Sep 2022 19:10:26 GMT
- Title: Exploring Gradient-based Multi-directional Controls in GANs
- Authors: Zikun Chen, Ruowei Jiang, Brendan Duke, Han Zhao, Parham Aarabi
- Abstract summary: We propose a novel approach that discovers nonlinear controls, which enables multi-directional manipulation as well as effective disentanglement.
Our approach is able to gain fine-grained controls over a diverse set of bi-directional and multi-directional attributes, and we showcase its ability to achieve disentanglement significantly better than state-of-the-art methods.
- Score: 19.950198707910587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have been widely applied in modeling
diverse image distributions. However, despite their impressive applications, the
structure of the latent space in GANs largely remains a black box, leaving
controllable generation an open problem, especially when spurious
correlations between different semantic attributes exist in the image
distributions. To address this problem, previous methods typically learn linear
directions or individual channels that control semantic attributes in the image
space. However, they often suffer from imperfect disentanglement, or are unable
to obtain multi-directional controls. In this work, in light of the above
challenges, we propose a novel approach that discovers nonlinear controls,
which enables multi-directional manipulation as well as effective
disentanglement, based on gradient information in the learned GAN latent space.
More specifically, we first learn interpolation directions by following the
gradients from classification networks trained separately on the attributes,
and then navigate the latent space by exclusively controlling channels
activated for the target attribute in the learned directions. Empirically, with
small training data, our approach is able to gain fine-grained controls over a
diverse set of bi-directional and multi-directional attributes, and we showcase
its ability to achieve disentanglement significantly better than
state-of-the-art methods both qualitatively and quantitatively.
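
The two steps described in the abstract (follow classifier gradients, then move only the channels that matter for the target attribute) can be illustrated with a toy sketch. This is not the authors' implementation: the classifier `attribute_logit`, the top-k rule in `top_k_mask`, and all parameter values are hypothetical stand-ins chosen so the example runs without a trained GAN or classifier.

```python
import math

def attribute_logit(z, w):
    """Hypothetical stand-in for a trained attribute classifier:
    logit(z) = sum_i w_i * tanh(z_i)."""
    return sum(wi * math.tanh(zi) for zi, wi in zip(z, w))

def logit_gradient(z, w):
    """Analytic gradient of attribute_logit with respect to the latent code z."""
    return [wi * (1.0 - math.tanh(zi) ** 2) for zi, wi in zip(z, w)]

def top_k_mask(grad, k):
    """Binary mask keeping the k channels with the largest gradient magnitude.
    The top-k rule is an assumption made for this sketch."""
    keep = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return [1.0 if i in keep else 0.0 for i in range(len(grad))]

def edit_latent(z, w, k=2, step=0.1, n_steps=10):
    """Follow the classifier gradient, updating only the selected channels.
    The gradient is recomputed at every step, so the path is nonlinear."""
    z = list(z)
    for _ in range(n_steps):
        grad = logit_gradient(z, w)
        mask = top_k_mask(grad, k)
        z = [zi + step * gi * mi for zi, gi, mi in zip(z, grad, mask)]
    return z

z0 = [0.0, 0.0, 0.0, 0.0]      # toy latent code
w = [2.0, -1.5, 0.1, 0.05]     # toy classifier weights
z1 = edit_latent(z0, w)
print(attribute_logit(z0, w), attribute_logit(z1, w))
```

In this toy setting only the two channels with the strongest gradient move, while the remaining channels stay at their initial values; that restriction is the kind of mechanism the abstract credits for limiting entangled changes to other attributes.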
Related papers
- SC2GAN: Rethinking Entanglement by Self-correcting Correlated GAN Space [16.040942072859075]
In Generative Adversarial Networks, following editing directions for one attribute can result in entangled changes to other attributes.
We propose a novel framework, SC$^2$GAN, that achieves disentanglement by re-projecting low-density latent code samples in the original latent space.
arXiv Detail & Related papers (2023-10-10T14:42:32Z)
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
- High-fidelity GAN Inversion with Padding Space [38.9258619444968]
Inverting a Generative Adversarial Network (GAN) facilitates a wide range of image editing tasks using pre-trained generators.
Existing methods typically employ the latent space of GANs as the inversion space yet observe the insufficient recovery of spatial details.
We propose to involve the padding space of the generator to complement the latent space with spatial information.
arXiv Detail & Related papers (2022-03-21T16:32:12Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- Latent Transformations via NeuralODEs for GAN-based Image Editing [25.272389610447856]
We show that nonlinear latent code manipulations realized as flows of a trainable Neural ODE are beneficial for many practical non-face image domains.
In particular, we investigate a large number of datasets with known attributes and demonstrate that certain attribute manipulations are challenging to obtain with linear shifts only.
arXiv Detail & Related papers (2021-11-29T18:59:54Z)
- Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x) u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
arXiv Detail & Related papers (2021-10-24T08:57:46Z)
- WarpedGANSpace: Finding non-linear RBF paths in GAN latent space [44.7091944340362]
This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs.
We learn non-linear warpings on the latent space, each one parametrized by a set of RBF-based latent space warping functions.
We show that linear paths can be derived as a special case of our method, and show experimentally that non-linear paths in the latent space lead to steeper, more disentangled and interpretable changes in the image space.
arXiv Detail & Related papers (2021-09-27T21:29:35Z)
- LARGE: Latent-Based Regression through GAN Semantics [42.50535188836529]
We propose a novel method for solving regression tasks using few-shot or weak supervision.
We show that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-22T17:55:35Z)
- Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning method based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems in which one variable is subject to a sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-06-20T04:28:20Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.