Face2Diffusion for Fast and Editable Face Personalization
- URL: http://arxiv.org/abs/2403.05094v1
- Date: Fri, 8 Mar 2024 06:46:01 GMT
- Title: Face2Diffusion for Fast and Editable Face Personalization
- Authors: Kaede Shiohara, Toshihiko Yamasaki
- Abstract summary: We propose Face2Diffusion (F2D) for high-editability face personalization.
The core idea behind F2D is that removing identity-irrelevant information from the training pipeline prevents the overfitting problem.
F2D consists of the following three novel components.
- Score: 33.65484538815936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face personalization aims to insert specific faces, taken from images, into
pretrained text-to-image diffusion models. However, previous methods still
struggle to preserve both identity similarity and editability because they
overfit to training samples. In this paper, we propose Face2Diffusion (F2D)
for high-editability face personalization. The core idea behind F2D is that
removing identity-irrelevant information from the training pipeline prevents
the overfitting problem and improves the editability of encoded faces. F2D
consists of the following three novel components: 1) A multi-scale identity
encoder provides well-disentangled identity features while keeping the
benefits of multi-scale information, which improves the diversity of camera
poses. 2) Expression guidance disentangles facial expressions from identities
and improves the controllability of facial expressions. 3) Class-guided
denoising regularization encourages the model to learn how faces should be
denoised, which boosts the text-alignment of backgrounds. Extensive
experiments on the FaceForensics++ dataset and diverse prompts demonstrate
that our method greatly improves the trade-off between identity- and
text-fidelity compared to previous state-of-the-art methods.
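The three components above map naturally onto code. Below is a minimal, hypothetical PyTorch sketch of how they could fit together; every module name, dimension, and the stand-in inputs are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleIdentityEncoder(nn.Module):
    """Component 1 (sketch): fuse identity features pooled at several depths
    of a frozen face-recognition backbone into conditioning tokens.
    The backbone itself is omitted; we assume pre-pooled features per scale."""

    def __init__(self, feat_dims=(128, 256, 512), token_dim=768):
        super().__init__()
        # One projection per feature scale into the conditioning-token space.
        self.proj = nn.ModuleList([nn.Linear(d, token_dim) for d in feat_dims])

    def forward(self, multi_scale_feats):
        # multi_scale_feats: list of (B, d_i) tensors, one per scale.
        tokens = [p(f) for p, f in zip(self.proj, multi_scale_feats)]
        return torch.stack(tokens, dim=1)  # (B, num_scales, token_dim)


def expression_guided_condition(id_tokens, expr_embedding, expr_proj):
    """Component 2 (sketch): append an expression embedding (e.g. from a 3DMM
    expression estimator, assumed here) as a separate token, so expression is
    a controllable channel rather than being baked into the identity."""
    expr_token = expr_proj(expr_embedding).unsqueeze(1)   # (B, 1, token_dim)
    return torch.cat([id_tokens, expr_token], dim=1)


def class_guided_denoising_loss(eps_pred_id, eps_pred_class, face_mask):
    """Component 3 (sketch): outside the face region, pull the personalized
    noise prediction toward the prediction conditioned on a generic class
    prompt (e.g. "a photo of a person"), so the model learns how faces are
    denoised instead of memorizing training backgrounds.
    face_mask: (B, 1, H, W) with 1 inside the face region."""
    background = 1.0 - face_mask
    return F.mse_loss(eps_pred_id * background,
                      eps_pred_class.detach() * background)


if __name__ == "__main__":
    B, token_dim = 2, 768
    encoder = MultiScaleIdentityEncoder()
    feats = [torch.randn(B, d) for d in (128, 256, 512)]
    id_tokens = encoder(feats)                             # (2, 3, 768)
    expr_proj = nn.Linear(64, token_dim)
    cond = expression_guided_condition(id_tokens, torch.randn(B, 64), expr_proj)
    eps_id = torch.randn(B, 4, 64, 64)
    eps_cls = torch.randn(B, 4, 64, 64)
    mask = torch.zeros(B, 1, 64, 64)
    mask[..., 16:48, 16:48] = 1.0
    loss = class_guided_denoising_loss(eps_id, eps_cls, mask)
    print(cond.shape, float(loss))
```

In this reading, the identity and expression tokens would replace or augment the prompt's placeholder token, and the class-guided loss would be added to the standard denoising objective with some weight.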
Related papers
- FlashFace: Human Image Personalization with High-fidelity Identity Preservation [59.76645602354481]
FlashFace allows users to easily personalize their own photos by providing one or a few reference face images and a text prompt.
Our approach is distinguished from existing human photo customization methods by higher-fidelity identity preservation and better instruction following.
arXiv Detail & Related papers (2024-03-25T17:59:57Z)
- Beyond Inserting: Learning Identity Embedding for Semantic-Fidelity Personalized Diffusion Generation [21.739328335601716]
This paper focuses on inserting an accurate and interactive ID embedding into the Stable Diffusion model for personalized generation.
We propose a face-wise attention loss that fits the embedding to the face region instead of entangling ID-unrelated information such as face layout and background; a minimal sketch follows below.
Our results exhibit superior ID accuracy, text-based manipulation ability, and generalization compared to previous methods.
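The face-wise attention loss lends itself to a short sketch. Below is a hypothetical PyTorch version that penalizes identity-token attention mass falling outside a detected face mask; the attention shapes and the mask source are assumptions, not the paper's code.

```python
import torch

def face_wise_attention_loss(attn_maps, face_mask, eps=1e-8):
    """attn_maps: (B, heads, H*W) cross-attention from image queries to the
    identity token. face_mask: (B, H*W), 1 inside the detected face region.
    Penalizes attention mass that leaks outside the face, so the ID embedding
    does not absorb layout or background information."""
    attn = attn_maps.mean(dim=1)                          # average over heads
    attn = attn / (attn.sum(dim=-1, keepdim=True) + eps)  # normalize to a distribution
    off_face = attn * (1.0 - face_mask)                   # mass off the face region
    return off_face.sum(dim=-1).mean()
```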
arXiv Detail & Related papers (2024-01-31T11:52:33Z)
- StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
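As a rough illustration of the adapter idea, the sketch below maps a StyleGAN2 $\mathcal{W}_+$ latent (18 x 512 for a 1024-px generator) into one token in the text-embedding space and appends it to the prompt embeddings. The MLP shape and the single-token design are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class WPlusAdapter(nn.Module):
    """Maps a StyleGAN w+ latent into the diffusion model's text-token space
    so it can condition generation alongside the prompt (illustrative only)."""

    def __init__(self, num_styles=18, style_dim=512, token_dim=768):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_styles * style_dim, 2048),
            nn.GELU(),
            nn.Linear(2048, token_dim),
        )

    def forward(self, w_plus, prompt_tokens):
        # w_plus: (B, 18, 512); prompt_tokens: (B, L, 768) from the text encoder.
        id_token = self.mlp(w_plus.flatten(1)).unsqueeze(1)  # (B, 1, 768)
        return torch.cat([prompt_tokens, id_token], dim=1)   # identity-aware prompt
```

A benefit of conditioning through $\mathcal{W}_+$ is that established StyleGAN editing directions in that space remain applicable, which matches the entry's claim about compatibility with common StyleGAN edits.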
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation [69.16517915592063]
We propose a novel face-identity encoder to learn an accurate representation of human faces.
We also propose self-augmented editability learning to enhance the editability of models.
Our method can generate identity-preserved images in different scenes at a much faster speed.
arXiv Detail & Related papers (2023-07-01T11:01:17Z)
- Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D faces from images alone, without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
- DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition [94.96686189033869]
We propose a 3D model-assisted domain-transferred face augmentation network (DotFAN).
DotFAN can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains.
Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity.
arXiv Detail & Related papers (2020-02-23T08:16:34Z)