DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
- URL: http://arxiv.org/abs/2503.06505v3
- Date: Mon, 21 Jul 2025 07:11:27 GMT
- Title: DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
- Authors: Xirui Hu, Jiahao Wang, Hao Chen, Weizhan Zhang, Benqi Wang, Yikun Li, Haishun Nan
- Abstract summary: We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation. Our key innovations include: 1) Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the base model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training; 2) Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle facial motion and identity features, supporting flexible facial editing; and 3) a task-decoupled training paradigm that reduces data dependency, together with VariFace-10k, a curated dataset of 10k unique individuals, each represented by 35 distinct facial images.
- Score: 12.692129257068085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in text-to-image generation have driven interest in generating personalized human images that depict specific identities from reference images. Although existing methods achieve high-fidelity identity preservation, they are generally limited to single-ID scenarios and offer insufficient facial editability. We present DynamicID, a tuning-free framework that inherently facilitates both single-ID and multi-ID personalized generation with high fidelity and flexible facial editability. Our key innovations include: 1) Semantic-Activated Attention (SAA), which employs query-level activation gating to minimize disruption to the base model when injecting ID features and achieve multi-ID personalization without requiring multi-ID samples during training. 2) Identity-Motion Reconfigurator (IMR), which applies feature-space manipulation to effectively disentangle and reconfigure facial motion and identity features, supporting flexible facial editing. 3) a task-decoupled training paradigm that reduces data dependency, together with VariFace-10k, a curated dataset of 10k unique individuals, each represented by 35 distinct facial images. Experimental results demonstrate that DynamicID outperforms state-of-the-art methods in identity fidelity, facial editability, and multi-ID personalization capability. Our code will be released at https://github.com/ByteCat-bot/DynamicID.
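The abstract describes SAA only at a high level. One way to read query-level activation gating is as an injected ID cross-attention branch whose contribution is scaled per image token by a learned gate. The sketch below is a hedged reconstruction under that reading; every module name, shape, and the sigmoid gate are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of query-level activation gating for ID-feature injection,
# in the spirit of DynamicID's Semantic-Activated Attention (SAA).
# All names, shapes, and the gating form are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedIDAttention(nn.Module):
    def __init__(self, dim: int, id_dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k_id = nn.Linear(id_dim, dim, bias=False)
        self.to_v_id = nn.Linear(id_dim, dim, bias=False)
        # Per-query gate: decides, token by token, how much ID signal to admit.
        # Initialized near zero so the base model is minimally disturbed at first.
        self.gate = nn.Linear(dim, 1)
        nn.init.zeros_(self.gate.weight)
        nn.init.constant_(self.gate.bias, -4.0)  # sigmoid(-4) ~= 0.018

    def forward(self, hidden: torch.Tensor, id_feats: torch.Tensor) -> torch.Tensor:
        # hidden:   (B, N, dim)    latent image tokens (queries)
        # id_feats: (B, M, id_dim) identity features from a face encoder
        q = self.to_q(hidden)
        k = self.to_k_id(id_feats)
        v = self.to_v_id(id_feats)
        attn = F.scaled_dot_product_attention(q, k, v)  # (B, N, dim)
        g = torch.sigmoid(self.gate(hidden))            # (B, N, 1), per-query gate
        # Only queries whose gate opens receive ID features; the rest pass through.
        return hidden + g * attn
```

Because the gate is conditioned on each query token, different spatial regions can admit different identity features at inference, which is one plausible route to multi-ID personalization without multi-ID training samples.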
Related papers
- ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation [33.84646269805187]
ID-EA is a novel framework that guides text embeddings to align with visual identity embeddings. ID-EA substantially outperforms state-of-the-art methods in identity preservation metrics. It generates personalized portraits 15 times faster than existing approaches.
arXiv Detail & Related papers (2025-07-16T07:42:02Z) - ID-Booth: Identity-consistent Face Generation with Diffusion Models [10.042492056152232]
We present a novel generative diffusion-based framework called ID-Booth.
The framework enables identity-consistent image generation while retaining the synthesis capabilities of pretrained diffusion models.
Our method facilitates better intra-identity consistency and inter-identity separability than competing methods, while achieving higher image diversity.
arXiv Detail & Related papers (2025-04-10T02:20:18Z) - Turn That Frown Upside Down: FaceID Customization via Cross-Training Data [49.51940625552275]
CrossFaceID is the first large-scale, high-quality, and publicly available dataset designed to improve the facial modification capabilities of FaceID customization models. It consists of 40,000 text-image pairs from approximately 2,000 persons, with each person represented by around 20 images showcasing diverse facial attributes. During the training stage, a specific face of a person is used as input, and the FaceID customization model is forced to generate another image of the same person but with altered facial features. Experiments show that models fine-tuned on the CrossFaceID dataset preserve FaceID fidelity while significantly improving their facial modification capabilities.
arXiv Detail & Related papers (2025-01-26T05:27:38Z) - ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition [60.15830516741776]
Synthetic face recognition (SFR) aims to generate datasets that mimic the distribution of real face data.
We introduce a diffusion-fueled SFR model termed ID$^3$.
ID$^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances.
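The blurb names an ID-preserving loss without detail. A common formulation of such a loss pulls face-recognition embeddings of generated and reference faces together via cosine similarity; the sketch below shows that generic form, and the frozen encoder is an assumption, not necessarily ID$^3$'s exact loss.

```python
# Sketch of a generic identity-preserving loss: pull the face-recognition
# embedding of a generated face toward that of the reference face.
# This is a common formulation, not necessarily ID$^3$'s exact loss.
import torch
import torch.nn.functional as F

def id_preserving_loss(face_encoder, generated: torch.Tensor,
                       reference: torch.Tensor) -> torch.Tensor:
    # face_encoder: a frozen face-recognition network (e.g., an ArcFace-style model)
    z_gen = F.normalize(face_encoder(generated), dim=-1)
    z_ref = F.normalize(face_encoder(reference), dim=-1)
    # 1 - cosine similarity: zero when the identities match exactly.
    return (1.0 - (z_gen * z_ref).sum(dim=-1)).mean()
```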
arXiv Detail & Related papers (2024-09-26T06:46:40Z) - Disentangled Representations for Short-Term and Long-Term Person Re-Identification [33.76874948187976]
We propose a new generative adversarial network, dubbed identity shuffle GAN (IS-GAN).
It disentangles identity-related and unrelated features from person images through an identity-shuffling technique.
Experimental results validate the effectiveness of IS-GAN, showing state-of-the-art performance on standard reID benchmarks.
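As a rough illustration of identity shuffling: swap identity-related features between two images of the same person and require both reconstructions to succeed, which pushes pose and background details into the identity-unrelated branch. The encoder/decoder split below is an assumption, not IS-GAN's actual architecture.

```python
# Rough sketch of an identity-shuffling reconstruction step, loosely in the
# spirit of IS-GAN. All modules (enc_id, enc_rest, decoder) are assumptions.
import torch
import torch.nn.functional as F

def shuffle_step(enc_id, enc_rest, decoder, img_a, img_b):
    # img_a, img_b: two images of the SAME identity, shape (B, 3, H, W)
    id_a, id_b = enc_id(img_a), enc_id(img_b)          # identity-related codes
    rest_a, rest_b = enc_rest(img_a), enc_rest(img_b)  # identity-unrelated codes
    # Swap identity codes across the pair; reconstruction should be unaffected
    # if enc_id truly captures only identity.
    recon_a = decoder(id_b, rest_a)
    recon_b = decoder(id_a, rest_b)
    return F.l1_loss(recon_a, img_a) + F.l1_loss(recon_b, img_b)
```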
arXiv Detail & Related papers (2024-09-09T02:09:49Z) - InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation [0.0]
"InstantFamily" is an approach that employs a novel cross-attention mechanism and a multimodal embedding stack to achieve zero-shot multi-ID image generation.
Our method effectively preserves ID as it utilizes global and local features from a pre-trained face recognition model integrated with text conditions.
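The mechanism is not spelled out in the blurb, but a minimal form of masked attention for multi-ID generation restricts each identity's reference features to the image region where that person appears. The mask handling below is an illustrative reading, not InstantFamily's released implementation.

```python
# Minimal sketch of masked cross-attention for multi-ID generation: each
# identity's reference features are visible only to the image tokens inside
# that identity's region mask. Illustrative; not InstantFamily's code.
import torch
import torch.nn.functional as F

def masked_id_attention(q: torch.Tensor, id_feats: torch.Tensor,
                        region_masks: torch.Tensor) -> torch.Tensor:
    # q:            (B, N, D)     image-token queries
    # id_feats:     (B, P, M, D)  features for P identities, M tokens each
    # region_masks: (B, P, N)     1 where image token n belongs to person p
    B, P, M, D = id_feats.shape
    N = q.shape[1]
    kv = id_feats.reshape(B, P * M, D)  # flatten identities along the key axis
    # Additive bias: -inf blocks token/identity pairs outside the region.
    bias = torch.zeros(B, N, P, device=q.device, dtype=q.dtype)
    bias = bias.masked_fill(~region_masks.bool().permute(0, 2, 1), float("-inf"))
    bias = bias.repeat_interleave(M, dim=2)  # (B, N, P*M), matches kv layout
    # Note: this sketch assumes every image token lies in at least one region;
    # background tokens would otherwise get an all--inf row (NaN after softmax).
    return F.scaled_dot_product_attention(q, kv, kv, attn_mask=bias)
```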
arXiv Detail & Related papers (2024-04-30T10:16:21Z) - ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving [64.90148669690228]
ConsistentID is an innovative method crafted for diverse identity-preserving portrait generation under fine-grained multimodal facial prompts. We present a fine-grained portrait dataset, FGID, with over 500,000 facial images, offering greater diversity and comprehensiveness than existing public facial datasets.
arXiv Detail & Related papers (2024-04-25T17:23:43Z) - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios like AI portrait and advertising.
We present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z) - Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models [66.05234562835136]
We present MuDI, a novel framework that enables multi-subject personalization.
Our main idea is to utilize subjects segmented by a segmentation foundation model.
Experimental results show that our MuDI can produce high-quality personalized images without identity mixing.
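One way to read "utilizing segmented subjects" is as a data-construction step: cut each subject out with a segmentation model and compose the cutouts onto one canvas so the model sees spatially separated subjects. The compositing below is a hedged sketch of that reading, not MuDI's actual pipeline.

```python
# Hedged sketch of composing segmented subjects into one image, one reading
# of MuDI's "segmented subjects" idea; not the authors' actual pipeline.
import numpy as np

def compose_subjects(canvas: np.ndarray, subjects, masks, offsets):
    # canvas:   (H, W, 3) uint8 background
    # subjects: list of (h, w, 3) crops; masks: list of (h, w) {0, 1} arrays
    # offsets:  list of (top, left) paste positions, assumed non-overlapping
    out = canvas.copy()
    for img, m, (top, left) in zip(subjects, masks, offsets):
        h, w = m.shape
        region = out[top:top + h, left:left + w]
        m3 = m[..., None].astype(bool)
        # Paste only the segmented pixels, so subjects stay spatially separated
        # and their identities are never blended in the training image.
        region[...] = np.where(m3, img, region)
    return out
```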
arXiv Detail & Related papers (2024-04-05T17:45:22Z) - IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models [31.762112403595612]
IDAdapter is a tuning-free approach that enhances the diversity and identity preservation in personalized image generation from a single face image.
During the training phase, we incorporate mixed features from multiple reference images of a specific identity to enrich identity-related content details.
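The "mixed features" idea can be sketched as pooling identity embeddings from several reference images of one person before injection. Mean pooling is an illustrative assumption here; the paper may mix features differently.

```python
# Sketch of mixing identity features from multiple references of one person,
# in the spirit of IDAdapter's training-time feature mixing. Simple mean
# pooling is an illustrative assumption, not necessarily the paper's operator.
import torch

def mix_identity_features(face_encoder, references: torch.Tensor) -> torch.Tensor:
    # references: (R, 3, H, W), R reference images of the same identity
    feats = face_encoder(references)         # (R, D) per-image identity embeddings
    mixed = feats.mean(dim=0, keepdim=True)  # (1, D) averaged identity code
    # Details that vary across references (pose, lighting, expression) tend to
    # cancel out, leaving a richer, more view-invariant identity representation.
    return mixed
```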
arXiv Detail & Related papers (2024-03-20T12:13:04Z) - StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z) - PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved
Personalization [92.90392834835751]
PortraitBooth is designed for high efficiency, robust identity preservation, and expression-editable text-to-image generation.
PortraitBooth eliminates computational overhead and mitigates identity distortion.
It incorporates emotion-aware cross-attention control for diverse facial expressions in generated images.
arXiv Detail & Related papers (2023-12-11T13:03:29Z) - X-ReID: Cross-Instance Transformer for Identity-Level Person
Re-Identification [53.047542904329866]
Cross Intra-Identity Instances module (IntraX) fuses different intra-identity instances to transfer Identity-Level knowledge.
Cross Inter-Identity Instances module (InterX) involves hard positive and hard negative instances to improve the attention response to the same identity.
arXiv Detail & Related papers (2023-02-04T03:16:18Z) - FaceDancer: Pose- and Occlusion-Aware High Fidelity Face Swapping [62.38898610210771]
We present a new single-stage method for subject-agnostic face swapping and identity transfer, named FaceDancer.
We make two major contributions: Adaptive Feature Fusion Attention (AFFA) and Interpreted Feature Similarity Regularization (IFSR).
arXiv Detail & Related papers (2022-10-19T11:31:38Z) - Semantic Consistency and Identity Mapping Multi-Component Generative Adversarial Network for Person Re-Identification [39.605062525247135]
We propose a semantic consistency and identity mapping multi-component generative adversarial network (SC-IMGAN), which provides style adaptation from one to many domains.
Our proposed method outperforms state-of-the-art techniques on six challenging person Re-ID datasets.
arXiv Detail & Related papers (2021-04-28T14:12:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.