Adaptive Nonlinear Latent Transformation for Conditional Face Editing
- URL: http://arxiv.org/abs/2307.07790v1
- Date: Sat, 15 Jul 2023 12:36:50 GMT
- Title: Adaptive Nonlinear Latent Transformation for Conditional Face Editing
- Authors: Zhizhong Huang, Siteng Ma, Junping Zhang, Hongming Shan
- Abstract summary: We propose a novel adaptive nonlinear latent transformation for disentangled and conditional face editing, termed AdaTrans.
AdaTrans divides the manipulation process into several finer steps, in which the direction and size of each step are conditioned on both the facial attributes and the latent codes.
AdaTrans enables controllable face editing with the advantages of disentanglement, flexibility with non-binary attributes, and high fidelity.
- Score: 40.32385363670918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works for face editing usually manipulate the latent space of StyleGAN
via the linear semantic directions. However, they usually suffer from the
entanglement of facial attributes, need to tune the optimal editing strength,
and are limited to binary attributes with strong supervision signals. This
paper proposes a novel adaptive nonlinear latent transformation for
disentangled and conditional face editing, termed AdaTrans. Specifically, our
AdaTrans divides the manipulation process into several finer steps; i.e., the
direction and size at each step are conditioned on both the facial attributes
and the latent codes. In this way, AdaTrans describes an adaptive nonlinear
transformation trajectory to manipulate the faces into target attributes while
keeping other attributes unchanged. Then, AdaTrans leverages a predefined
density model to constrain the learned trajectory in the distribution of latent
codes by maximizing the likelihood of the transformed latent code. Moreover, we
also propose a disentangled learning strategy under a mutual information
framework to eliminate the entanglement among attributes, which can further
relax the need for labeled data. Consequently, AdaTrans enables controllable
face editing with the advantages of disentanglement, flexibility with
non-binary attributes, and high fidelity. Extensive experimental results on
various facial attributes demonstrate the qualitative and quantitative
effectiveness of the proposed AdaTrans over existing state-of-the-art methods,
especially in the most challenging scenarios with a large age gap and few
labeled examples. The source code is available at
https://github.com/Hzzone/AdaTrans.
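The multi-step manipulation described above can be sketched in a few lines. This is a minimal pure-Python illustration of the idea, not the released implementation: the names (`adatrans_edit`, `step_fn`, `toy_step`) are hypothetical, and the toy step function stands in for the learned network that would predict the direction and step size from the latent code and target attributes.

```python
def adatrans_edit(z, target_attrs, step_fn, n_steps=5):
    """Iteratively move latent code z toward the target attributes.

    At each step, step_fn predicts both a direction and a step size
    conditioned on the current latent code and the target attributes;
    composing many such conditioned steps yields a nonlinear trajectory,
    in contrast to a single linear shift.
    """
    for _ in range(n_steps):
        direction, size = step_fn(z, target_attrs)
        z = [zi + size * di for zi, di in zip(z, direction)]
    return z

def toy_step(z, attrs):
    # Toy stand-in for the learned network: pull the first coordinate
    # toward the target attribute value, leave the rest untouched.
    direction = [attrs[0] - z[0]] + [0.0] * (len(z) - 1)
    size = 0.5  # a real model would predict this adaptively
    return direction, size

# Edit a 2-D latent toward attribute value 2.0 over 8 small steps.
z_edited = adatrans_edit([0.0, 1.0], target_attrs=[2.0],
                         step_fn=toy_step, n_steps=8)
```

Note how the untouched coordinate is preserved exactly, a toy analogue of keeping other attributes unchanged while the targeted one converges toward its goal.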
Related papers
- From Prompt to Progression: Taming Video Diffusion Models for Seamless Attribute Transition [57.809291244375345]
We propose a simple yet effective method to extend existing models for smooth and consistent attribute transitions. Our approach constructs a data-specific transitional direction for each noisy latent, guiding the gradual shift from initial to final attributes frame by frame. We also present the Controlled-Attribute-Transition Benchmark (CAT-Bench), which integrates both attribute and motion dynamics.
arXiv Detail & Related papers (2025-09-24T01:58:22Z) - Self-supervised Transformation Learning for Equivariant Representations [26.207358743969277]
Unsupervised representation learning has significantly advanced various machine learning tasks.
We propose Self-supervised Transformation Learning (STL), replacing transformation labels with transformation representations derived from image pairs.
We demonstrate the approach's effectiveness across diverse classification and detection tasks, outperforming existing methods in 7 out of 11 benchmarks.
arXiv Detail & Related papers (2025-01-15T10:54:21Z) - PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE enhances the global feature representation of point cloud masked autoencoders by making them both discriminative and sensitive to transformations. We propose a novel loss that explicitly penalizes invariant collapse, enabling the network to capture richer transformation cues while preserving discriminative representations.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection [12.474515318770237]
Learning-based gaze estimation methods require large amounts of training data with accurate gaze annotations.
We present a portable network, called ReDirTrans, achieving latent-to-latent translation for redirecting gaze directions.
We also present improvements for the downstream learning-based gaze estimation task, using redirected samples as dataset augmentation.
arXiv Detail & Related papers (2023-05-19T06:13:26Z) - Expanding the Latent Space of StyleGAN for Real Face Editing [4.1715767752637145]
A surge of face editing techniques has been proposed that employ the pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first convert the input image into StyleGAN's latent variables.
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low-distortion and high-editability.
arXiv Detail & Related papers (2022-04-26T18:27:53Z) - High-resolution Face Swapping via Latent Semantics Disentanglement [50.23624681222619]
We present a novel high-resolution hallucination face swapping method using the inherent prior knowledge of a pre-trained GAN model.
We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator.
We extend our method to video face swapping by enforcing two-temporal constraints on the latent space and the image space.
arXiv Detail & Related papers (2022-03-30T00:33:08Z) - Latent Transformations via NeuralODEs for GAN-based Image Editing [25.272389610447856]
We show that nonlinear latent code manipulations realized as flows of a trainable Neural ODE are beneficial for many practical non-face image domains.
In particular, we investigate a large number of datasets with known attributes and demonstrate that certain attribute manipulations are challenging to obtain with linear shifts only.
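A nonlinear latent flow of this kind amounts to integrating a vector field over the latent code. The hedged sketch below uses plain Euler integration and a fixed toy field (exponential decay toward the origin); in the actual method the field would be a trainable Neural ODE network, and the function names here are illustrative only.

```python
def euler_flow(z, field, t_end=1.0, n_steps=100):
    """Integrate dz/dt = field(z) from t=0 to t_end with Euler steps.

    Flowing a latent code along a learned vector field like this is one
    way to realize a nonlinear (rather than linear) latent manipulation.
    """
    dt = t_end / n_steps
    for _ in range(n_steps):
        dz = field(z)
        z = [zi + dt * dzi for zi, dzi in zip(z, dz)]
    return z

# Toy field dz/dt = -z: after integrating to t=1, each coordinate
# should be close to its initial value scaled by exp(-1).
decayed = euler_flow([1.0, -2.0],
                     field=lambda z: [-zi for zi in z],
                     t_end=1.0, n_steps=1000)
```

With a learned field, the same integration loop traces an editing trajectory whose direction changes along the way, which is what makes such flows more expressive than a single linear shift.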
arXiv Detail & Related papers (2021-11-29T18:59:54Z) - FacialGAN: Style Transfer and Attribute Manipulation on Synthetic Faces [9.664892091493586]
FacialGAN is a novel framework enabling simultaneous rich style transfer and interactive facial attribute manipulation.
We show our model's capacity in producing visually compelling results in style transfer, attribute manipulation, diversity and face verification.
arXiv Detail & Related papers (2021-10-18T15:53:38Z) - Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - FaceController: Controllable Attribute Editing for Face in the Wild [74.56117807309576]
We propose a simple feed-forward network to generate high-fidelity manipulated faces.
By simply employing some existing and easy-obtainable prior information, our method can control, transfer, and edit diverse attributes of faces in the wild.
In our method, we decouple identity, expression, pose, and illumination using 3D priors; separate texture and colors by using region-wise style codes.
arXiv Detail & Related papers (2021-02-23T02:47:28Z) - GAN "Steerability" without optimization [32.63317794951011]
In GANs, "steering" directions correspond to semantically meaningful image transformations.
We show that "steering" trajectories can be computed in closed form directly from the generator's weights.
arXiv Detail & Related papers (2020-12-09T21:34:34Z) - AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection [76.7063732501752]
We provide a new identity swapping algorithm with large differences in appearance for face forgery detection.
The appearance gaps mainly arise from the large discrepancies in illuminations and skin colors.
A discriminator is introduced to distinguish the fake parts from a mix of real and fake image patches.
arXiv Detail & Related papers (2020-11-05T06:17:04Z) - PA-GAN: Progressive Attention Generative Adversarial Network for Facial Attribute Editing [67.94255549416548]
We propose a progressive attention GAN (PA-GAN) for facial attribute editing.
Our approach achieves correct attribute editing with irrelevant details much better preserved compared with the state-of-the-arts.
arXiv Detail & Related papers (2020-07-12T03:04:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.