FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion
- URL: http://arxiv.org/abs/2505.15313v1
- Date: Wed, 21 May 2025 09:43:21 GMT
- Title: FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion
- Authors: Kazuaki Mishima, Antoni Bigata Casademunt, Stavros Petridis, Maja Pantic, Kenji Suzuki
- Abstract summary: We propose a novel identity-conditional diffusion model that allows precise control over pose, expression, and emotion without compromising identity preservation. Our method surpasses existing approaches in control accuracy over pose, expression, and emotion, while also improving generative diversity under identity-only conditioning.
- Score: 31.56574795895158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human facial images encode a rich spectrum of information, encompassing both stable identity-related traits and mutable attributes such as pose, expression, and emotion. While recent advances in image generation have enabled high-quality identity-conditional face synthesis, precise control over non-identity attributes remains challenging, and disentangling identity from these mutable factors is particularly difficult. To address these limitations, we propose a novel identity-conditional diffusion model that introduces two lightweight control modules designed to independently manipulate facial pose, expression, and emotion without compromising identity preservation. These modules are embedded within the cross-attention layers of the base diffusion model, enabling precise attribute control with minimal parameter overhead. Furthermore, our tailored training strategy, which leverages cross-attention between the identity feature and each non-identity control feature, encourages identity features to remain orthogonal to control signals, enhancing controllability and diversity. Quantitative and qualitative evaluations, along with perceptual user studies, demonstrate that our method surpasses existing approaches in terms of control accuracy over pose, expression, and emotion, while also improving generative diversity under identity-only conditioning.
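No code accompanies this listing, so the following is a minimal PyTorch sketch of the two ideas the abstract names: a lightweight control projection added inside a cross-attention layer, and an orthogonality penalty between identity and control features. All module and tensor names are hypothetical, and the shared embedding dimension assumed by the penalty is our assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ControlledCrossAttention(nn.Module):
    """Hypothetical sketch: cross-attention on identity tokens plus a
    lightweight additive control module (one linear layer per attribute),
    mirroring the paper's claim of minimal parameter overhead."""

    def __init__(self, dim: int, ctx_dim: int, ctrl_dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, kdim=ctx_dim,
                                          vdim=ctx_dim, batch_first=True)
        self.ctrl_proj = nn.Linear(ctrl_dim, dim)  # the "lightweight" module

    def forward(self, x, id_ctx, ctrl):
        # x: (B, N, dim) image tokens; id_ctx: (B, M, ctx_dim) identity tokens;
        # ctrl: (B, ctrl_dim) pose/expression/emotion code.
        attn_out, _ = self.attn(x, id_ctx, id_ctx)
        return x + attn_out + self.ctrl_proj(ctrl).unsqueeze(1)

def orthogonality_loss(id_feat, ctrl_feat):
    """Squared cosine penalty pushing identity and control features toward
    orthogonality; assumes both are projected to a shared dimension."""
    id_n = F.normalize(id_feat, dim=-1)
    ctrl_n = F.normalize(ctrl_feat, dim=-1)
    return (id_n * ctrl_n).sum(dim=-1).pow(2).mean()
```

In this reading, each control module costs only a single linear layer per attribute, which is one way the "minimal parameter overhead" claim could be realized.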
Related papers
- High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning [39.09330483562798]
Face swapping aims to seamlessly transfer a source facial identity onto a target while preserving target attributes such as pose and expression. Diffusion models, known for their superior generative capabilities, have recently shown promise in advancing face-swapping quality. This paper addresses two key challenges in diffusion-based face swapping: the prioritized preservation of identity over target attributes and the inherent conflict between identity and attribute conditioning.
arXiv Detail & Related papers (2025-03-28T06:50:17Z)
- iFADIT: Invertible Face Anonymization via Disentangled Identity Transform [51.123936665445356]
Face anonymization aims to conceal the visual identity of a face to safeguard the individual's privacy. This paper proposes a novel framework named iFADIT, an acronym for Invertible Face Anonymization via Disentangled Identity Transform.
arXiv Detail & Related papers (2025-01-08T10:08:09Z)
- EmojiDiff: Advanced Facial Expression Control with High Identity Preservation in Portrait Generation [8.314556078632412]
We introduce EmojiDiff, the first end-to-end solution that enables simultaneous control of extremely detailed (RGB-level) expression and high-fidelity identity in portrait generation. For decoupled training, we introduce ID-irrelevant Data Iteration (IDI) to synthesize cross-identity expression pairs. We also present ID-enhanced Contrast Alignment (ICA) for further fine-tuning.
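The summary does not spell out ICA's loss; one plausible reading is a supervised InfoNCE-style contrastive term that aligns embeddings by identity so the expression branch carries no residual identity signal. The function name, temperature, and label format below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def id_contrastive_alignment(embeds, id_labels, temperature=0.07):
    """Supervised InfoNCE-style alignment (assumed form): same-identity
    embeddings attract, different identities repel. Assumes every anchor
    has at least one same-identity positive in the batch.
    embeds: (B, D) features; id_labels: (B,) integer identity ids."""
    z = F.normalize(embeds, dim=-1)
    sim = (z @ z.t() / temperature).exp()            # (B, B) pairwise weights
    pos_mask = id_labels.unsqueeze(0) == id_labels.unsqueeze(1)
    pos_mask.fill_diagonal_(False)                   # drop self-pairs
    pos = (sim * pos_mask).sum(dim=1)
    denom = sim.sum(dim=1) - sim.diag()              # all pairs except self
    return -(pos / denom.clamp_min(1e-8)).clamp_min(1e-8).log().mean()
```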
arXiv Detail & Related papers (2024-12-02T08:24:11Z)
- ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition [60.15830516741776]
Synthetic face recognition (SFR) aims to generate datasets that mimic the distribution of real face data.
We introduce a diffusion-fueled SFR model termed ID$^3$.
ID$^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances.
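The listing names an ID-preserving loss but not its form; a common choice, and our assumption here, is a margin-based cosine penalty against a frozen face recognizer's embedding.

```python
import torch
import torch.nn.functional as F

def id_preserving_loss(gen_images, ref_embed, face_encoder, margin=0.4):
    """Assumed form of an ID-preserving loss: generated faces, embedded by a
    frozen recognizer, must stay within a cosine margin of the reference
    identity; diversity is left to the diffusion sampler."""
    gen_embed = F.normalize(face_encoder(gen_images), dim=-1)
    ref_embed = F.normalize(ref_embed, dim=-1)
    cos = (gen_embed * ref_embed).sum(dim=-1)
    return F.relu(margin - cos).mean()  # penalize only below-margin similarity
```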
arXiv Detail & Related papers (2024-09-26T06:46:40Z)
- DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage and highly-efficient diffusion-based framework tailored for high-fidelity Facial Appearance Editing.
For high-fidelity transfer of query attributes, we adopt Space-sensitive Physical Customization (SPC), which ensures fidelity and generalization.
To preserve source attributes, we introduce Region-responsive Semantic Composition (RSC).
This module is guided to learn decoupled source-regarding features, thereby better preserving identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z)
- Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm [31.06269858216316]
We propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization.
We introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information.
We also introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams.
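The summary names a "mixed attention module with an AdaIN-mean operation" without defining it; on our reading, AdaIN-mean re-centers one stream on the other's channel means while dropping the variance rescaling of full AdaIN. A hypothetical sketch:

```python
import torch

def adain_mean(content, style):
    """Assumed reading of AdaIN-mean: re-center the content stream on the
    style stream's channel means, without full AdaIN's variance rescaling.
    content, style: (B, N, D) token features from the two streams."""
    c_mean = content.mean(dim=1, keepdim=True)  # per-channel mean over tokens
    s_mean = style.mean(dim=1, keepdim=True)
    return content - c_mean + s_mean
```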
arXiv Detail & Related papers (2024-03-18T13:39:53Z)
- Disentangle Before Anonymize: A Two-stage Framework for Attribute-preserved and Occlusion-robust De-identification [55.741525129613535]
"Disentangle Before Anonymize" is a novel two-stage Framework(DBAF)<n>This framework includes a Contrastive Identity Disentanglement (CID) module and a Key-authorized Reversible Identity Anonymization (KRIA) module.<n>Extensive experiments demonstrate that our method outperforms state-of-the-art de-identification approaches.
arXiv Detail & Related papers (2023-11-15T08:59:02Z)
- Controllable Inversion of Black-Box Face Recognition Models via Diffusion [8.620807177029892]
We tackle the task of inverting the latent space of pre-trained face recognition models without full model access.
We show that the conditional diffusion model loss naturally emerges and that we can effectively sample from the inverse distribution.
Our method is the first black-box face recognition model inversion method that offers intuitive control over the generation process.
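The summary says the conditional diffusion loss "naturally emerges"; the standard form, which we assume here, is noise prediction conditioned on the recognition embedding. The model signature and schedule handling below are illustrative.

```python
import torch
import torch.nn.functional as F

def conditional_diffusion_loss(model, x0, id_embed, alphas_cumprod):
    """Standard conditional noise-prediction objective (assumed form): the
    network denoises a face image conditioned on its recognition embedding.
    x0: (B, C, H, W) images; alphas_cumprod: (T,) noise-schedule tensor."""
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise  # forward process
    return F.mse_loss(model(x_t, t, id_embed), noise)
```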
arXiv Detail & Related papers (2023-03-23T03:02:09Z)
- FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping [62.38898610210771]
We present a new single-stage method for subject face swapping and identity transfer, named FaceDancer.
We make two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).
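AFFA's internals are not described in this summary; a gated fusion along the lines its name suggests might look like the hypothetical sketch below, where a learned mask blends attribute-carrying and identity-conditioned feature maps.

```python
import torch
import torch.nn as nn

class GatedFeatureFusion(nn.Module):
    """Loose, hypothetical sketch of an AFFA-style gate: a learned per-pixel
    mask decides where target-attribute features survive and where
    identity-conditioned features are injected."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, attr_feat, id_feat):
        m = self.gate(torch.cat([attr_feat, id_feat], dim=1))  # mask in [0, 1]
        return m * attr_feat + (1 - m) * id_feat
```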
arXiv Detail & Related papers (2022-10-19T11:31:38Z)
- Disentangling Identity and Pose for Facial Expression Recognition [54.50747989860957]
We propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation.
For the identity encoder, a well-pretrained face recognition model is used and kept frozen during training, which relaxes the need for expression-specific training data.
By comparing synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose.
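Since identity and pose are shared between the synthesized neutral and expressional images, subtracting their features is one way to isolate the expression component; the extractor and API below are assumptions for illustration.

```python
def expression_residual(feat_extractor, img_expr, img_neutral):
    """Hypothetical sketch: the synthesized neutral and expressional images
    share identity and pose, so their feature difference carries mainly the
    expression component."""
    f_expr = feat_extractor(img_expr)        # features of expressional face
    f_neutral = feat_extractor(img_neutral)  # features of neutral face
    return f_expr - f_neutral
```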
arXiv Detail & Related papers (2022-08-17T06:48:13Z)
- FICGAN: Facial Identity Controllable GAN for De-identification [34.38379234653657]
We present Facial Identity Controllable GAN (FICGAN) for generating high-quality de-identified face images with ensured privacy protection.
Based on the analysis, we develop FICGAN, an autoencoder-based conditional generative model that learns to disentangle the identity attributes from non-identity attributes on a face image.
arXiv Detail & Related papers (2021-10-02T07:09:27Z)