Towards Disentangling Latent Space for Unsupervised Semantic Face
Editing
- URL: http://arxiv.org/abs/2011.02638v2
- Date: Mon, 19 Jul 2021 01:21:52 GMT
- Title: Towards Disentangling Latent Space for Unsupervised Semantic Face
Editing
- Authors: Kanglin Liu and Gaofeng Cao and Fei Zhou and Bozhi Liu and Jiang Duan
and Guoping Qiu
- Abstract summary: Supervised attribute editing requires annotated training data which is difficult to obtain and limits the editable attributes to those with labels.
In this paper, we present a new technique termed Structure-Texture Independent Architecture with Weight Decomposition and Orthogonal Regularization (STIA-WO) to disentangle the latent space for unsupervised semantic face editing.
- Score: 21.190437168936764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial attributes in StyleGAN generated images are entangled in the latent
space which makes it very difficult to independently control a specific
attribute without affecting the others. Supervised attribute editing requires
annotated training data which is difficult to obtain and limits the editable
attributes to those with labels. Therefore, unsupervised attribute editing in
a disentangled latent space is key to performing neat and versatile semantic
face editing. In this paper, we present a new technique termed
Structure-Texture Independent Architecture with Weight Decomposition and
Orthogonal Regularization (STIA-WO) to disentangle the latent space for
unsupervised semantic face editing. By applying STIA-WO to GAN, we have
developed a StyleGAN variant termed STGAN-WO, which performs weight
decomposition by utilizing the style vector to construct a fully controllable
weight matrix that regulates image synthesis, and employs orthogonal
regularization to ensure that each entry of the style vector controls only one
independent feature matrix. To further disentangle the facial attributes,
STGAN-WO introduces a
structure-texture independent architecture which utilizes two independently and
identically distributed (i.i.d.) latent vectors to control the synthesis of the
texture and structure components in a disentangled way. Unsupervised semantic
editing is achieved by moving the latent code in the coarse layers along its
orthogonal directions to change texture related attributes or changing the
latent code in the fine layers to manipulate structure related ones. We present
experimental results showing that our new STGAN-WO achieves better attribute
editing than state-of-the-art methods.
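The weight-decomposition idea described above can be sketched numerically: a style vector scales a set of basis matrices, so that each style entry controls one independent component, while an orthogonality penalty keeps those components independent. This is an illustrative toy sketch, not the authors' implementation; all names (`decomposed_weight`, `orthogonal_penalty`) and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonal_penalty(Q):
    """Frobenius penalty ||Q^T Q - I||_F^2, driving columns of Q
    toward orthonormality (zero when columns are orthonormal)."""
    gram = Q.T @ Q
    return np.sum((gram - np.eye(Q.shape[1])) ** 2)

def decomposed_weight(style, basis):
    """Build a controllable weight matrix W = sum_i style[i] * basis[i],
    so each style entry scales exactly one feature matrix."""
    return np.einsum('i,ijk->jk', style, basis)

d, k = 8, 4                       # feature size, number of style entries
style = rng.standard_normal(k)    # style vector (e.g. from a mapping network)
basis = rng.standard_normal((k, d, d))

W = decomposed_weight(style, basis)  # W.shape == (8, 8)
```

In a training loop the `orthogonal_penalty` term would be added to the generator loss so the learned basis stays decorrelated.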
Related papers
- A Compact and Semantic Latent Space for Disentangled and Controllable
Image Editing [4.8201607588546]
We propose an auto-encoder which re-organizes the latent space of StyleGAN, so that each attribute which we wish to edit corresponds to an axis of the new latent space.
We show that our approach has greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity.
arXiv Detail & Related papers (2023-12-13T16:18:45Z) - SC2GAN: Rethinking Entanglement by Self-correcting Correlated GAN Space [16.040942072859075]
Generative networks that follow editing directions for one attribute can produce entangled changes in other attributes.
We propose a novel framework, SC$^2$GAN, which achieves disentanglement by re-projecting low-density latent code samples in the original latent space.
arXiv Detail & Related papers (2023-10-10T14:42:32Z) - Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
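A randomly sampled Gaussian heatmap of the kind described could look like the following toy sketch (illustrative only; the function name and sizes are assumptions, not the paper's code):

```python
import numpy as np

def gaussian_heatmap(size, center, sigma):
    """2-D heatmap in [0, 1] peaking at `center` with spread `sigma`,
    usable as a spatial prior injected into intermediate layers."""
    ys, xs = np.mgrid[0:size, 0:size]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
center = rng.integers(0, 64, size=2)          # random peak location
hm = gaussian_heatmap(64, center, sigma=8.0)  # hm.shape == (64, 64)
```

At inference time, moving `center` would correspond to the user dragging an object's location in the scene layout.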
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z) - Face Attribute Editing with Disentangled Latent Vectors [0.0]
We propose an image-to-image translation framework for facial attribute editing.
Inspired by the latent space factorization works of fixed pretrained GANs, we design the attribute editing by latent space factorization.
To project images to semantically organized latent spaces, we set an encoder-decoder architecture with attention-based skip connections.
arXiv Detail & Related papers (2023-01-11T18:32:13Z) - Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z) - Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can intuitively be composited by the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z) - VecGAN: Image-to-Image Translation with Interpretable Latent Directions [4.7590051176368915]
VecGAN is an image-to-image translation framework for facial attribute editing with interpretable latent directions.
VecGAN achieves significant improvements over state-of-the-arts for both local and global edits.
arXiv Detail & Related papers (2022-07-07T16:31:05Z) - CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions [65.00528970576401]
StyleGAN has enabled unprecedented semantic editing capabilities, on both synthesized and real images.
We propose two novel building blocks; one for finding interesting CLIP directions and one for labeling arbitrary directions in CLIP latent space.
We evaluate the effectiveness of the proposed method and demonstrate that extraction of disentangled labeled StyleGAN edit directions is indeed possible.
arXiv Detail & Related papers (2021-12-09T21:26:03Z) - Delta-GAN-Encoder: Encoding Semantic Changes for Explicit Image Editing,
using Few Synthetic Samples [2.348633570886661]
We propose a novel method for learning to control any desired attribute in a pre-trained GAN's latent space.
We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits.
arXiv Detail & Related papers (2021-11-16T12:42:04Z) - EigenGAN: Layer-Wise Eigen-Learning for GANs [84.33920839885619]
EigenGAN is able to unsupervisedly mine interpretable and controllable dimensions from different generator layers.
By traversing the coefficient of a specific eigen-dimension, the generator can produce samples with continuous changes corresponding to a specific semantic attribute.
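The traversal described above can be sketched with a toy stand-in for a generator layer: given an (assumed) orthonormal basis for a layer's subspace, shifting one coefficient moves the feature along a single direction. Names and shapes here are illustrative, not EigenGAN's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 4
# Orthonormal columns standing in for a layer's learned eigen-dimensions.
U, _ = np.linalg.qr(rng.standard_normal((d, k)))

def traverse(coeffs, dim, delta):
    """Shift one eigen-coefficient and return the resulting feature vector."""
    shifted = coeffs.copy()
    shifted[dim] += delta
    return U @ shifted

base = rng.standard_normal(k)
a = traverse(base, dim=0, delta=0.0)
b = traverse(base, dim=0, delta=2.0)
# The change b - a lies entirely along U[:, 0], the traversed direction.
```

Sweeping `delta` over a range would yield the continuous semantic changes the summary describes.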
arXiv Detail & Related papers (2021-04-26T11:14:37Z) - PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity energy preservation term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.