InstaFace: Identity-Preserving Facial Editing with Single Image Inference
- URL: http://arxiv.org/abs/2502.20577v3
- Date: Fri, 07 Mar 2025 03:06:40 GMT
- Title: InstaFace: Identity-Preserving Facial Editing with Single Image Inference
- Authors: MD Wahiduzzaman Khan, Mingshan Jia, Xiaolin Zhang, En Yu, Caifeng Shan, Kaska Musial-Gabrys,
- Abstract summary: We introduce a novel diffusion-based framework, InstaFace, to generate realistic images while preserving identity using only a single image.<n>InstaFace harnesses 3D perspectives by integrating multiple 3DMM-based conditionals without introducing additional trainable parameters.<n>Our method outperforms several state-of-the-art approaches in terms of identity preservation, photorealism, and effective control of pose, expression, and lighting.
- Score: 13.067402877443902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial appearance editing is crucial for digital avatars, AR/VR, and personalized content creation, driving realistic user experiences. However, preserving identity with generative models is challenging, especially in scenarios with limited data availability. Traditional methods often require multiple images and still struggle with unnatural face shifts, inconsistent hair alignment, or excessive smoothing effects. To overcome these challenges, we introduce a novel diffusion-based framework, InstaFace, to generate realistic images while preserving identity using only a single image. Central to InstaFace, we introduce an efficient guidance network that harnesses 3D perspectives by integrating multiple 3DMM-based conditionals without introducing additional trainable parameters. Moreover, to ensure maximum identity retention as well as preservation of background, hair, and other contextual features like accessories, we introduce a novel module that utilizes feature embeddings from a facial recognition model and a pre-trained vision-language model. Quantitative evaluations demonstrate that our method outperforms several state-of-the-art approaches in terms of identity preservation, photorealism, and effective control of pose, expression, and lighting.
Related papers
- Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion [37.847141686823264]
Identity-preserving face synthesis aims to generate synthetic face images of virtual subjects that can substitute real-world data for training face recognition models.
Prior arts strive to create images with consistent identities and diverse styles, but they face a trade-off between them.
This paper introduces MorphFace, a diffusion-based face generator.
arXiv Detail & Related papers (2025-04-01T05:22:53Z) - Towards Consistent and Controllable Image Synthesis for Face Editing [18.646961062736207]
RigFace is a novel approach to control the lighting, facial expression and head pose of a portrait photo.<n>Our model achieves comparable or even superior performance in both identity preservation and photorealism compared to existing face editing models.
arXiv Detail & Related papers (2025-02-04T16:36:07Z) - Face Anonymization Made Simple [44.24233169815565]
Current face anonymization techniques often depend on identity loss calculated by face recognition models, which can be inaccurate and unreliable.
In contrast, our approach uses diffusion models with only a reconstruction loss, eliminating the need for facial landmarks or masks.
Our model achieves state-of-the-art performance in three key areas: identity anonymization, facial preservation, and image quality.
arXiv Detail & Related papers (2024-11-01T17:45:21Z) - Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Ours is a relatively unified approach and so it is resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z) - DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage and highly-efficient diffusion-based framework tailored for high-fidelity Facial Appearance Editing.
For high-fidelity query attributes transfer, we adopt Space-sensitive Physical Customization (SPC), which ensures the fidelity and generalization ability.
In order to preserve source attributes, we introduce the Region-responsive Semantic Composition (RSC)
This module is guided to learn decoupled source-regarding features, thereby better preserving the identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z) - StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z) - PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved
Personalization [92.90392834835751]
PortraitBooth is designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation.
PortraitBooth eliminates computational overhead and mitigates identity distortion.
It incorporates emotion-aware cross-attention control for diverse facial expressions in generated images.
arXiv Detail & Related papers (2023-12-11T13:03:29Z) - When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for
Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $mathcalW_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z) - Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo
Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.