StableIdentity: Inserting Anybody into Anywhere at First Sight
- URL: http://arxiv.org/abs/2401.15975v1
- Date: Mon, 29 Jan 2024 09:06:15 GMT
- Title: StableIdentity: Inserting Anybody into Anywhere at First Sight
- Authors: Qinghe Wang, Xu Jia, Xiaomin Li, Taiqing Li, Liqian Ma, Yunzhi Zhuge,
Huchuan Lu
- Abstract summary: We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
- Score: 57.99693188913382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in large pretrained text-to-image models have shown
unprecedented capabilities for high-quality human-centric generation; however,
customizing face identity remains an intractable problem. Existing methods
cannot ensure stable identity preservation and flexible editability, even with
several images for each subject during training. In this work, we propose
StableIdentity, which allows identity-consistent recontextualization with just
one face image. More specifically, we employ a face encoder with an identity
prior to encode the input face, and then land the face representation into a
space with an editable prior, which is constructed from celeb names. By
incorporating identity prior and editability prior, the learned identity can be
injected anywhere with various contexts. In addition, we design a masked
two-phase diffusion loss to boost the pixel-level perception of the input face
and maintain the diversity of generation. Extensive experiments demonstrate our
method outperforms previous customization methods. In addition, the learned
identity can be flexibly combined with off-the-shelf modules such as
ControlNet. Notably, to the best of our knowledge, we are the first to directly inject
the identity learned from a single image into video/3D generation without
finetuning. We believe that the proposed StableIdentity is an important step to
unify image, video, and 3D customized generation models.
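The abstract names two components that can be illustrated with a minimal, hypothetical sketch. This is an assumption-laden reading, not the paper's implementation: it assumes the "landing" into the celeb-name space is an AdaIN-style renormalization of the encoded face feature to the mean/std statistics of celeb-name token embeddings, and that the masked two-phase diffusion loss switches from a plain noise-prediction loss at large timesteps to a face-mask-weighted loss at small timesteps. Function names, the threshold `t_switch`, and all shapes are invented for illustration.

```python
import numpy as np

def land_into_celeb_space(face_feat, celeb_embeds, eps=1e-6):
    """Hypothetical 'editable prior' landing: renormalize the encoded face
    feature so its mean/std match the statistics of celeb-name token
    embeddings (an AdaIN-style reading of the paper's description)."""
    mu_c, sigma_c = celeb_embeds.mean(), celeb_embeds.std()
    mu_f, sigma_f = face_feat.mean(), face_feat.std()
    return sigma_c * (face_feat - mu_f) / (sigma_f + eps) + mu_c

def masked_two_phase_loss(eps_pred, eps_true, face_mask, t, t_switch=500):
    """Hypothetical masked two-phase diffusion loss: plain noise-prediction
    MSE at large (noisy) timesteps, then a face-mask-weighted MSE at small
    timesteps to boost pixel-level perception of the input face."""
    mse = (eps_pred - eps_true) ** 2
    if t >= t_switch:  # phase 1: global reconstruction over the whole image
        return mse.mean()
    # phase 2: weight the loss toward the face region given by face_mask
    return (mse * face_mask).sum() / (face_mask.sum() + 1e-6)
```

Under this reading, a feature landed in the celeb statistics inherits the editability of celeb-name tokens (prompts like "\<identity\> as an astronaut" remain controllable), while the phase-2 mask concentrates supervision on the face pixels without constraining the background.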
Related papers
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G$^2$Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z)
- MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation [59.13765130528232]
We present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability.
Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through an additionally introduced cross-attention.
To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of our MasterWeaver with those of the original T2I model.
arXiv Detail & Related papers (2024-05-09T14:42:16Z)
- FlashFace: Human Image Personalization with High-fidelity Identity Preservation [59.76645602354481]
FlashFace allows users to easily personalize their own photos by providing one or a few reference face images and a text prompt.
Our approach is distinguished from existing human photo customization methods by higher-fidelity identity preservation and better instruction following.
arXiv Detail & Related papers (2024-03-25T17:59:57Z)
- Face2Diffusion for Fast and Editable Face Personalization [33.65484538815936]
We propose Face2Diffusion (F2D) for high-editability face personalization.
The core idea behind F2D is that removing identity-irrelevant information from the training pipeline prevents the overfitting problem.
F2D consists of the following three novel components.
arXiv Detail & Related papers (2024-03-08T06:46:01Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation [69.16517915592063]
We propose a novel face-identity encoder to learn an accurate representation of human faces.
We also propose self-augmented editability learning to enhance the editability of models.
Our methods can generate identity-preserved images under different scenes at a much faster speed.
arXiv Detail & Related papers (2023-07-01T11:01:17Z)
- A Systematical Solution for Face De-identification [6.244117712209321]
In different tasks, people have various requirements for face de-identification (De-ID).
We propose a systematical solution compatible with these De-ID operations.
Our method can flexibly de-identify the face data in various ways and the processed images have high image quality.
arXiv Detail & Related papers (2021-07-19T02:02:51Z)
- IdentityDP: Differential Private Identification Protection for Face Images [17.33916392050051]
Face de-identification, also known as face anonymization, refers to generating another image with similar appearance and the same background, while the real identity is hidden.
We propose IdentityDP, a face anonymization framework that combines a data-driven deep neural network with a differential privacy mechanism.
Our model can effectively obfuscate the identity-related information of faces, preserve significant visual similarity, and generate high-quality images.
arXiv Detail & Related papers (2021-03-02T14:26:00Z)
- VAE/WGAN-Based Image Representation Learning For Pose-Preserving Seamless Identity Replacement In Facial Images [15.855376604558977]
We present a novel variational generative adversarial network (VGAN) based on Wasserstein loss.
We show that our network can be used to perform pose-preserving identity morphing and identity-preserving pose morphing.
arXiv Detail & Related papers (2020-03-02T03:35:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.