StableIdentity: Inserting Anybody into Anywhere at First Sight
- URL: http://arxiv.org/abs/2401.15975v1
- Date: Mon, 29 Jan 2024 09:06:15 GMT
- Title: StableIdentity: Inserting Anybody into Anywhere at First Sight
- Authors: Qinghe Wang, Xu Jia, Xiaomin Li, Taiqing Li, Liqian Ma, Yunzhi Zhuge,
Huchuan Lu
- Abstract summary: We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
- Score: 57.99693188913382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in large pretrained text-to-image models have shown
unprecedented capabilities for high-quality human-centric generation; however,
customizing face identity remains an intractable problem. Existing methods
cannot ensure stable identity preservation and flexible editability, even with
several images for each subject during training. In this work, we propose
StableIdentity, which allows identity-consistent recontextualization with just
one face image. More specifically, we employ a face encoder with an identity
prior to encode the input face, and then land the face representation into a
space with an editable prior, which is constructed from celeb names. By
incorporating identity prior and editability prior, the learned identity can be
injected anywhere with various contexts. In addition, we design a masked
two-phase diffusion loss to boost the pixel-level perception of the input face
and maintain the diversity of generation. Extensive experiments demonstrate our
method outperforms previous customization methods. In addition, the learned
identity can be flexibly combined with off-the-shelf modules such as
ControlNet. Notably, to the best of our knowledge, we are the first to directly inject
the identity learned from a single image into video/3D generation without
finetuning. We believe that the proposed StableIdentity is an important step to
unify image, video, and 3D customized generation models.
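The abstract names two components that can be illustrated with a minimal, hypothetical sketch. This is an assumption-laden reading, not the paper's implementation: it assumes the "landing" into the celeb-name space is an AdaIN-style renormalization of the encoded face feature to the mean/std statistics of celeb-name token embeddings, and that the masked two-phase diffusion loss switches from a plain noise-prediction loss at large timesteps to a face-mask-weighted loss at small timesteps. Function names, the threshold `t_switch`, and all shapes are invented for illustration.

```python
import numpy as np

def land_into_celeb_space(face_feat, celeb_embeds, eps=1e-6):
    """Hypothetical 'editable prior' landing: renormalize the encoded face
    feature so its mean/std match the statistics of celeb-name token
    embeddings (an AdaIN-style reading of the paper's description)."""
    mu_c, sigma_c = celeb_embeds.mean(), celeb_embeds.std()
    mu_f, sigma_f = face_feat.mean(), face_feat.std()
    return sigma_c * (face_feat - mu_f) / (sigma_f + eps) + mu_c

def masked_two_phase_loss(eps_pred, eps_true, face_mask, t, t_switch=500):
    """Hypothetical masked two-phase diffusion loss: plain noise-prediction
    MSE at large (noisy) timesteps, then a face-mask-weighted MSE at small
    timesteps to boost pixel-level perception of the input face."""
    mse = (eps_pred - eps_true) ** 2
    if t >= t_switch:  # phase 1: global reconstruction over the whole image
        return mse.mean()
    # phase 2: weight the loss toward the face region given by face_mask
    return (mse * face_mask).sum() / (face_mask.sum() + 1e-6)
```

Under this reading, a feature landed in the celeb statistics inherits the editability of celeb-name tokens (prompts like "\<identity\> as an astronaut" remain controllable), while the phase-2 mask concentrates supervision on the face pixels without constraining the background.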
Related papers
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G$^2$Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z)
- MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation [59.13765130528232]
We present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability.
Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through an additionally introduced cross-attention.
To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of our MasterWeaver with those of the original T2I model.
arXiv Detail & Related papers (2024-05-09T14:42:16Z)
- FlashFace: Human Image Personalization with High-fidelity Identity Preservation [59.76645602354481]
FlashFace allows users to easily personalize their own photos by providing one or a few reference face images and a text prompt.
Our approach is distinguished from existing human photo customization methods by higher-fidelity identity preservation and better instruction following.
arXiv Detail & Related papers (2024-03-25T17:59:57Z)
- Face2Diffusion for Fast and Editable Face Personalization [33.65484538815936]
We propose Face2Diffusion (F2D) for high-editability face personalization.
The core idea behind F2D is that removing identity-irrelevant information from the training pipeline prevents the overfitting problem.
F2D consists of the following three novel components.
arXiv Detail & Related papers (2024-03-08T06:46:01Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation [69.16517915592063]
We propose a novel face-identity encoder to learn an accurate representation of human faces.
We also propose self-augmented editability learning to enhance the editability of models.
Our methods can generate identity-preserved images under different scenes at a much faster speed.
arXiv Detail & Related papers (2023-07-01T11:01:17Z)
- A Systematical Solution for Face De-identification [6.244117712209321]
In different tasks, people have various requirements for face de-identification (De-ID).
We propose a systematical solution compatible with these De-ID operations.
Our method can flexibly de-identify the face data in various ways and the processed images have high image quality.
arXiv Detail & Related papers (2021-07-19T02:02:51Z)
- IdentityDP: Differential Private Identification Protection for Face Images [17.33916392050051]
Face de-identification, also known as face anonymization, refers to generating another image with similar appearance and the same background, while the real identity is hidden.
We propose IdentityDP, a face anonymization framework that combines a data-driven deep neural network with a differential privacy mechanism.
Our model can effectively obfuscate the identity-related information of faces, preserve significant visual similarity, and generate high-quality images.
arXiv Detail & Related papers (2021-03-02T14:26:00Z)
- VAE/WGAN-Based Image Representation Learning For Pose-Preserving Seamless Identity Replacement In Facial Images [15.855376604558977]
We present a novel variational generative adversarial network (VGAN) based on Wasserstein loss.
We show that our network can be used to perform pose-preserving identity morphing and identity-preserving pose morphing.
arXiv Detail & Related papers (2020-03-02T03:35:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.