DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
- URL: http://arxiv.org/abs/2503.06505v3
- Date: Mon, 21 Jul 2025 07:11:27 GMT
- Title: DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
- Authors: Xirui Hu, Jiahao Wang, Hao Chen, Weizhan Zhang, Benqi Wang, Yikun Li, Haishun Nan
- Abstract summary: We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation. Our key innovations include: 1) Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the base model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training; 2) Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle facial motion and identity features, supporting flexible facial editing; and 3) a task-decoupled training paradigm that reduces data dependency, together with VariFace-10k, a curated dataset of 10k unique individuals, each represented by 35 distinct facial images.
- Score: 12.692129257068085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in text-to-image generation have driven interest in generating personalized human images that depict specific identities from reference images. Although existing methods achieve high-fidelity identity preservation, they are generally limited to single-ID scenarios and offer insufficient facial editability. We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include: 1) Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the base model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training. 2) Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle and reconfigure facial motion and identity features, supporting flexible facial editing. 3) a task-decoupled training paradigm that reduces data dependency, together with VariFace-10k, a curated dataset of 10k unique individuals, each represented by 35 distinct facial images. Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability. Our code will be released at https://github.com/ByteCat-bot/DynamicID.
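The abstract describes SAA only at a high level. One way to read query-level activation gating is as an injected ID cross-attention branch whose contribution is scaled per image token by a learned gate. The sketch below is a hedged reconstruction under that reading; every module name, shape, and the sigmoid gate are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of query-level activation gating for ID-feature injection,
# in the spirit of DynamicID's Semantic-Activated Attention (SAA).
# All names, shapes, and the gating form are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedIDAttention(nn.Module):
    def __init__(self, dim: int, id_dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k_id = nn.Linear(id_dim, dim, bias=False)
        self.to_v_id = nn.Linear(id_dim, dim, bias=False)
        # Per-query gate: decides, token by token, how much ID signal to admit.
        # Initialized near zero so the base model is minimally disturbed at first.
        self.gate = nn.Linear(dim, 1)
        nn.init.zeros_(self.gate.weight)
        nn.init.constant_(self.gate.bias, -4.0)  # sigmoid(-4) ~= 0.018

    def forward(self, hidden: torch.Tensor, id_feats: torch.Tensor) -> torch.Tensor:
        # hidden:   (B, N, dim)    latent image tokens (queries)
        # id_feats: (B, M, id_dim) identity features from a face encoder
        q = self.to_q(hidden)
        k = self.to_k_id(id_feats)
        v = self.to_v_id(id_feats)
        attn = F.scaled_dot_product_attention(q, k, v)  # (B, N, dim)
        g = torch.sigmoid(self.gate(hidden))            # (B, N, 1), per-query gate
        # Only queries whose gate opens receive ID features; the rest pass through.
        return hidden + g * attn
```

Because the gate is conditioned on each query token, different spatial regions can admit different identity features at inference, which is one plausible route to multi-ID personalization without multi-ID training samples.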
Related papers
- ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation [33.84646269805187]
ID-EA is a novel framework that guides text embeddings to align with visual identity embeddings. ID-EA substantially outperforms state-of-the-art methods in identity preservation metrics. It generates personalized portraits 15 times faster than existing approaches.
arXiv Detail & Related papers (2025-07-16T07:42:02Z) - ID-Booth: Identity-consistent Face Generation with Diffusion Models [10.042492056152232]
We present a novel generative diffusion-based framework called ID-Booth.
The framework enables identity-consistent image generation while retaining the synthesis capabilities of pretrained diffusion models.
Our method facilitates better intra-identity consistency and inter-identity separability than competing methods, while achieving higher image diversity.
arXiv Detail & Related papers (2025-04-10T02:20:18Z) - Turn That Frown Upside Down: FaceID Customization via Cross-Training Data [49.51940625552275]
CrossFaceID is the first large-scale, high-quality, and publicly available dataset designed to improve the facial modification capabilities of FaceID customization models. It consists of 40,000 text-image pairs from approximately 2,000 persons, with each person represented by around 20 images showcasing diverse facial attributes. During the training stage, a specific face of a person is used as input, and the FaceID customization model is forced to generate another image of the same person but with altered facial features. Experiments show that models fine-tuned on the CrossFaceID dataset preserve FaceID fidelity while significantly improving their facial modification capabilities.
arXiv Detail & Related papers (2025-01-26T05:27:38Z) - ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition [60.15830516741776]
Synthetic face recognition (SFR) aims to generate datasets that mimic the distribution of real face data.
We introduce a diffusion-fueled SFR model termed ID$^3$.
ID$^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances.
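The blurb names an ID-preserving loss without detail. A common formulation of such a loss pulls face-recognition embeddings of generated and reference faces together via cosine similarity; the sketch below shows that generic form, and the frozen encoder is an assumption, not necessarily ID$^3$'s exact loss.

```python
# Sketch of a generic identity-preserving loss: pull the face-recognition
# embedding of a generated face toward that of the reference face.
# This is a common formulation, not necessarily ID$^3$'s exact loss.
import torch
import torch.nn.functional as F

def id_preserving_loss(face_encoder, generated: torch.Tensor,
                       reference: torch.Tensor) -> torch.Tensor:
    # face_encoder: a frozen face-recognition network (e.g., an ArcFace-style model)
    z_gen = F.normalize(face_encoder(generated), dim=-1)
    z_ref = F.normalize(face_encoder(reference), dim=-1)
    # 1 - cosine similarity: zero when the identities match exactly.
    return (1.0 - (z_gen * z_ref).sum(dim=-1)).mean()
```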
arXiv Detail & Related papers (2024-09-26T06:46:40Z) - Disentangled Representations for Short-Term and Long-Term Person Re-Identification [33.76874948187976]
We propose a new generative adversarial network, dubbed identity shuffle GAN (IS-GAN).
It disentangles identity-related and unrelated features from person images through an identity-shuffling technique.
Experimental results validate the effectiveness of IS-GAN, showing state-of-the-art performance on standard reID benchmarks.
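As a rough illustration of identity shuffling: swap identity-related features between two images of the same person and require both reconstructions to succeed, which pushes pose and background details into the identity-unrelated branch. The encoder/decoder split below is an assumption, not IS-GAN's actual architecture.

```python
# Rough sketch of an identity-shuffling reconstruction step, loosely in the
# spirit of IS-GAN. All modules (enc_id, enc_rest, decoder) are assumptions.
import torch
import torch.nn.functional as F

def shuffle_step(enc_id, enc_rest, decoder, img_a, img_b):
    # img_a, img_b: two images of the SAME identity, shape (B, 3, H, W)
    id_a, id_b = enc_id(img_a), enc_id(img_b)          # identity-related codes
    rest_a, rest_b = enc_rest(img_a), enc_rest(img_b)  # identity-unrelated codes
    # Swap identity codes across the pair; reconstruction should be unaffected
    # if enc_id truly captures only identity.
    recon_a = decoder(id_b, rest_a)
    recon_b = decoder(id_a, rest_b)
    return F.l1_loss(recon_a, img_a) + F.l1_loss(recon_b, img_b)
```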
arXiv Detail & Related papers (2024-09-09T02:09:49Z) - InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation [0.0]
"InstantFamily" is an approach that employs a novel cross-attention mechanism and a multimodal embedding stack to achieve zero-shot multi-ID image generation.
Our method effectively preserves ID as it utilizes global and local features from a pre-trained face recognition model integrated with text conditions.
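The mechanism is not spelled out in the blurb, but a minimal form of masked attention for multi-ID generation restricts each identity's reference features to the image region where that person appears. The mask handling below is an illustrative reading, not InstantFamily's released implementation.

```python
# Minimal sketch of masked cross-attention for multi-ID generation: each
# identity's reference features are visible only to the image tokens inside
# that identity's region mask. Illustrative; not InstantFamily's code.
import torch
import torch.nn.functional as F

def masked_id_attention(q: torch.Tensor, id_feats: torch.Tensor,
                        region_masks: torch.Tensor) -> torch.Tensor:
    # q:            (B, N, D)     image-token queries
    # id_feats:     (B, P, M, D)  features for P identities, M tokens each
    # region_masks: (B, P, N)     1 where image token n belongs to person p
    B, P, M, D = id_feats.shape
    N = q.shape[1]
    kv = id_feats.reshape(B, P * M, D)  # flatten identities along the key axis
    # Additive bias: -inf blocks token/identity pairs outside the region.
    bias = torch.zeros(B, N, P, device=q.device, dtype=q.dtype)
    bias = bias.masked_fill(~region_masks.bool().permute(0, 2, 1), float("-inf"))
    bias = bias.repeat_interleave(M, dim=2)  # (B, N, P*M), matches kv layout
    # Note: this sketch assumes every image token lies in at least one region;
    # background tokens would otherwise get an all--inf row (NaN after softmax).
    return F.scaled_dot_product_attention(q, kv, kv, attn_mask=bias)
```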
arXiv Detail & Related papers (2024-04-30T10:16:21Z) - ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving [64.90148669690228]
ConsistentID is an innovative method crafted for diverse identity-preserving portrait generation under fine-grained multimodal facial prompts. We present a fine-grained portrait dataset, FGID, with over 500,000 facial images, offering greater diversity and comprehensiveness than existing public facial datasets.
arXiv Detail & Related papers (2024-04-25T17:23:43Z) - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios like AI portrait and advertising.
We present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z) - Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models [66.05234562835136]
We present MuDI, a novel framework that enables multi-subject personalization.
Our main idea is to utilize subjects segmented by a segmentation foundation model.
Experimental results show that our MuDI can produce high-quality personalized images without identity mixing.
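One way to read "utilizing segmented subjects" is as a data-construction step: cut each subject out with a segmentation model and compose the cutouts onto one canvas so the model sees spatially separated subjects. The compositing below is a hedged sketch of that reading, not MuDI's actual pipeline.

```python
# Hedged sketch of composing segmented subjects into one image, one reading
# of MuDI's "segmented subjects" idea; not the authors' actual pipeline.
import numpy as np

def compose_subjects(canvas: np.ndarray, subjects, masks, offsets):
    # canvas:   (H, W, 3) uint8 background
    # subjects: list of (h, w, 3) crops; masks: list of (h, w) {0, 1} arrays
    # offsets:  list of (top, left) paste positions, assumed non-overlapping
    out = canvas.copy()
    for img, m, (top, left) in zip(subjects, masks, offsets):
        h, w = m.shape
        region = out[top:top + h, left:left + w]
        m3 = m[..., None].astype(bool)
        # Paste only the segmented pixels, so subjects stay spatially separated
        # and their identities are never blended in the training image.
        region[...] = np.where(m3, img, region)
    return out
```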
arXiv Detail & Related papers (2024-04-05T17:45:22Z) - IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models [31.762112403595612]
IDAdapter is a tuning-free approach that enhances the diversity and identity preservation in personalized image generation from a single face image.
During the training phase, we incorporate mixed features from multiple reference images of a specific identity to enrich identity-related content details.
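The "mixed features" idea can be sketched as pooling identity embeddings from several reference images of one person before injection. Mean pooling is an illustrative assumption here; the paper may mix features differently.

```python
# Sketch of mixing identity features from multiple references of one person,
# in the spirit of IDAdapter's training-time feature mixing. Simple mean
# pooling is an illustrative assumption, not necessarily the paper's operator.
import torch

def mix_identity_features(face_encoder, references: torch.Tensor) -> torch.Tensor:
    # references: (R, 3, H, W), R reference images of the same identity
    feats = face_encoder(references)         # (R, D) per-image identity embeddings
    mixed = feats.mean(dim=0, keepdim=True)  # (1, D) averaged identity code
    # Details that vary across references (pose, lighting, expression) tend to
    # cancel out, leaving a richer, more view-invariant identity representation.
    return mixed
```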
arXiv Detail & Related papers (2024-03-20T12:13:04Z) - StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z) - PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved
Personalization [92.90392834835751]
PortraitBooth is designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation.
PortraitBooth eliminates computational overhead and mitigates identity distortion.
It incorporates emotion-aware cross-attention control for diverse facial expressions in generated images.
arXiv Detail & Related papers (2023-12-11T13:03:29Z) - X-ReID: Cross-Instance Transformer for Identity-Level Person
Re-Identification [53.047542904329866]
Cross Intra-Identity Instances module (IntraX) fuses different intra-identity instances to transfer Identity-Level knowledge.
Cross Inter-Identity Instances module (InterX) involves hard positive and hard negative instances to improve the attention response to the same identity.
arXiv Detail & Related papers (2023-02-04T03:16:18Z) - FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping [62.38898610210771]
We present a new single-stage method for subject-agnostic face swapping and identity transfer, named FaceDancer.
We make two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).
arXiv Detail & Related papers (2022-10-19T11:31:38Z) - Semantic Consistency and Identity Mapping Multi-Component Generative Adversarial Network for Person Re-Identification [39.605062525247135]
We propose a semantic consistency and identity mapping multi-component generative adversarial network (SC-IMGAN), which provides style adaptation from one to many domains.
Our proposed method outperforms state-of-the-art techniques on six challenging person Re-ID datasets.
arXiv Detail & Related papers (2021-04-28T14:12:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.