DreamIdentity: Improved Editability for Efficient Face-identity
Preserved Image Generation
- URL: http://arxiv.org/abs/2307.00300v1
- Date: Sat, 1 Jul 2023 11:01:17 GMT
- Title: DreamIdentity: Improved Editability for Efficient Face-identity
Preserved Image Generation
- Authors: Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong
Zhang, Zhendong Mao
- Abstract summary: We propose a novel face-identity encoder to learn an accurate representation of human faces.
We also propose self-augmented editability learning to enhance the editability of models.
Our methods can generate identity-preserved images under different scenes at a much faster speed.
- Score: 69.16517915592063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While large-scale pre-trained text-to-image models can synthesize diverse and
high-quality human-centric images, an intractable problem is how to preserve
the face identity for conditioned face images. Existing methods either require
time-consuming optimization for each face-identity or learning an efficient
encoder at the cost of harming the editability of models. In this work, we
present an optimization-free method for each face identity, meanwhile keeping
the editability for text-to-image models. Specifically, we propose a novel
face-identity encoder to learn an accurate representation of human faces, which
applies multi-scale face features followed by a multi-embedding projector to
directly generate the pseudo words in the text embedding space. Besides, we
propose self-augmented editability learning to enhance the editability of
models, which is achieved by constructing paired generated face and edited face
images using celebrity names, aiming at transferring mature ability of
off-the-shelf text-to-image models in celebrity faces to unseen faces.
Extensive experiments show that our methods can generate identity-preserved
images under different scenes at a much faster speed.
Related papers
- FlashFace: Human Image Personalization with High-fidelity Identity Preservation [59.76645602354481]
FlashFace allows users to easily personalize their own photos by providing one or a few reference face images and a text prompt.
Our approach is distinguishable from existing human photo customization methods by higher-fidelity identity preservation and better instruction following.
arXiv Detail & Related papers (2024-03-25T17:59:57Z) - Face2Diffusion for Fast and Editable Face Personalization [33.65484538815936]
We propose Face2Diffusion (F2D) for high-editability face personalization.
The core idea behind F2D is that removing identity-irrelevant information from the training pipeline prevents the overfitting problem.
F2D consists of the following three novel components.
arXiv Detail & Related papers (2024-03-08T06:46:01Z) - StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z) - PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved
Personalization [92.90392834835751]
PortraitBooth is designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation.
PortraitBooth eliminates computational overhead and mitigates identity distortion.
It incorporates emotion-aware cross-attention control for diverse facial expressions in generated images.
arXiv Detail & Related papers (2023-12-11T13:03:29Z) - When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for
Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $mathcalW_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z) - Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo
Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z) - S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation [11.724779328025589]
This paper proposes a sketch-to-image generation framework called S2FGAN.
We employ two latent spaces to control the face appearance and adjust the desired attributes of the generated face.
Our method successfully outperforms state-of-the-art methods on attribute manipulation by exploiting greater control of attribute intensity.
arXiv Detail & Related papers (2020-11-30T13:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.