Face Attribute Editing with Disentangled Latent Vectors
- URL: http://arxiv.org/abs/2301.04628v1
- Date: Wed, 11 Jan 2023 18:32:13 GMT
- Title: Face Attribute Editing with Disentangled Latent Vectors
- Authors: Yusuf Dalva, Hamza Pehlivan, Cansu Moran, Öykü Irmak Hatipoğlu, Ayşegül Dündar
- Abstract summary: We propose an image-to-image translation framework for facial attribute editing.
Inspired by the latent space factorization works of fixed pretrained GANs, we design the attribute editing by latent space factorization.
To project images to semantically organized latent spaces, we set an encoder-decoder architecture with attention-based skip connections.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an image-to-image translation framework for facial attribute
editing with disentangled, interpretable latent directions. The facial
attribute editing task faces two challenges: editing a targeted attribute with
controllable strength, and disentangling attribute representations so that
other attributes are preserved during edits. To this end, inspired by the
latent space factorization works of fixed pretrained GANs, we design the
attribute editing by latent space factorization, and for each attribute, we
learn a linear direction that is orthogonal to the others. We train these
directions with orthogonality constraints and disentanglement losses. To
project images to semantically organized latent spaces, we set an
encoder-decoder architecture with attention-based skip connections. We
compare extensively with previous image translation algorithms and with
methods that edit via pretrained GANs. Our extensive experiments show that our
method significantly improves over the state of the art. Project page:
https://yusufdalva.github.io/vecgan
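The edit mechanism described in the abstract, a learned linear direction per attribute kept orthogonal to the others, can be sketched roughly as follows. The function names and the exact penalty form are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def edit_latent(z, directions, attr_idx, alpha):
    """Shift latent codes z along the unit direction of one attribute.

    z:          (batch, dim) latent codes from the encoder
    directions: (n_attrs, dim) learnable attribute directions
    attr_idx:   index of the attribute to edit
    alpha:      signed edit strength (controllable by the user)
    """
    d = directions[attr_idx]
    d = d / np.linalg.norm(d)        # unit direction
    return z + alpha * d             # linear, strength-controllable edit

def orthogonality_loss(directions):
    """Penalize overlap between attribute directions.

    Drives the Gram matrix of the row-normalized directions toward the
    identity, so each direction becomes orthogonal to the others
    (off-diagonal entries -> 0).
    """
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    gram = d @ d.T
    eye = np.eye(d.shape[0])
    return ((gram - eye) ** 2).sum()
```

Training would add this penalty to the usual translation losses; a set of mutually orthogonal directions yields zero penalty, while identical directions are penalized heavily.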
Related papers
- EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing [62.15822650722473]
Current video editing methods fail to edit the foreground and background simultaneously while preserving the original layout.
We introduce EVA, a zero-shot and multi-attribute video editing framework tailored for human-centric videos with complex motions.
EVA can be easily generalized to multi-object editing scenarios and achieves accurate identity mapping.
arXiv Detail & Related papers (2024-03-24T12:04:06Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- VecGAN: Image-to-Image Translation with Interpretable Latent Directions [4.7590051176368915]
VecGAN is an image-to-image translation framework for facial attribute editing with interpretable latent directions.
VecGAN achieves significant improvements over the state of the art for both local and global edits.
arXiv Detail & Related papers (2022-07-07T16:31:05Z)
- Towards Counterfactual Image Manipulation via CLIP [106.94502632502194]
Existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images.
We investigate this problem in a text-driven manner with Contrastive Language-Image Pre-training (CLIP).
We design a novel contrastive loss that exploits predefined CLIP-space directions to guide the editing toward desired directions from different perspectives.
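One common way to realize this kind of guidance is a directional loss that aligns the change in image features with a predefined direction in the shared embedding space. The sketch below is a generic illustration under that assumption, not necessarily this paper's exact contrastive formulation, and the feature extractors themselves are omitted:

```python
import numpy as np

def directional_loss(feat_src, feat_edit, text_dir):
    """1 - cosine similarity between the image edit direction
    (edited features minus source features) and a predefined text
    direction, both assumed to live in a shared embedding space."""
    delta = feat_edit - feat_src
    delta = delta / np.linalg.norm(delta)
    text_dir = text_dir / np.linalg.norm(text_dir)
    return 1.0 - float(delta @ text_dir)
```

The loss is zero when the edit moves features exactly along the target direction and grows toward 2 as the edit points the opposite way.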
arXiv Detail & Related papers (2022-07-06T17:02:25Z)
- Semantic Unfolding of StyleGAN Latent Space [0.7646713951724012]
Generative adversarial networks (GANs) have proven to be surprisingly efficient for image editing by inverting and manipulating the latent code corresponding to an input real image.
This editing property emerges from the disentangled nature of the latent space.
In this paper, we show that facial attribute disentanglement is not optimal, so facial editing that relies on linear attribute separation is flawed.
arXiv Detail & Related papers (2022-06-29T20:22:10Z)
- Each Attribute Matters: Contrastive Attention for Sentence-based Image Editing [13.321782757637303]
Sentence-based Image Editing (SIE) aims to deploy natural language to edit an image.
Existing methods can hardly produce accurate edits when the query sentence contains multiple editable attributes.
This paper proposes a novel model called Contrastive Attention Generative Adversarial Network (CA-GAN).
arXiv Detail & Related papers (2021-10-21T14:06:20Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Towards Disentangling Latent Space for Unsupervised Semantic Face Editing [21.190437168936764]
Supervised attribute editing requires annotated training data which is difficult to obtain and limits the editable attributes to those with labels.
In this paper, we present a new technique termed Structure-Texture Independent Architecture with Weight Decomposition and Orthogonal Regularization (STIA-WO) to disentangle the latent space for unsupervised semantic face editing.
arXiv Detail & Related papers (2020-11-05T03:29:24Z)
- PA-GAN: Progressive Attention Generative Adversarial Network for Facial Attribute Editing [67.94255549416548]
We propose a progressive attention GAN (PA-GAN) for facial attribute editing.
Our approach achieves correct attribute editing, with irrelevant details much better preserved compared with the state of the art.
arXiv Detail & Related papers (2020-07-12T03:04:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.