Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
- URL: http://arxiv.org/abs/2512.03964v1
- Date: Wed, 03 Dec 2025 16:57:50 GMT
- Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
- Authors: Lianyu Pang, Ji Zhou, Qiping Wang, Baoquan Zhao, Zhenguo Yang, Qing Li, Xudong Mao
- Abstract summary: We introduce UniID, a unified tuning-free framework that synergistically integrates both paradigms. Our key insight is that when merging these approaches, they should mutually reinforce only identity-relevant information. This principled design enables UniID to achieve high-fidelity face personalization with flexible text controllability.
- Score: 16.851646868288135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning-free face personalization methods have developed along two distinct paradigms: text embedding approaches that map facial features into the text embedding space, and adapter-based methods that inject features through auxiliary cross-attention layers. While both paradigms have shown promise, existing methods struggle to simultaneously achieve high identity fidelity and flexible text controllability. We introduce UniID, a unified tuning-free framework that synergistically integrates both paradigms. Our key insight is that when merging these approaches, they should mutually reinforce only identity-relevant information while preserving the original diffusion prior for non-identity attributes. We realize this through a principled training-inference strategy: during training, we employ an identity-focused learning scheme that guides both branches to capture identity features exclusively; at inference, we introduce a normalized rescaling mechanism that recovers the text controllability of the base diffusion model while enabling complementary identity signals to enhance each other. This principled design enables UniID to achieve high-fidelity face personalization with flexible text controllability. Extensive experiments against six state-of-the-art methods demonstrate that UniID achieves superior performance in both identity preservation and text controllability. Code will be available at https://github.com/lyuPang/UniID
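The abstract does not spell out the normalized rescaling mechanism, but a minimal sketch of one plausible reading is below: merge the base model's cross-attention output with the adapter branch's identity signal, then rescale the result back to the base branch's per-token norm so that non-identity attributes keep following the diffusion prior. The function name and the norm-matching formulation are assumptions, not UniID's published implementation.

```python
import torch

def normalized_rescale(base_hidden: torch.Tensor,
                       id_hidden: torch.Tensor,
                       id_scale: float = 1.0) -> torch.Tensor:
    """Hypothetical sketch of inference-time normalized rescaling.

    base_hidden: cross-attention output of the frozen base model (text branch).
    id_hidden:   identity signal injected by the adapter branch.
    The merged features are rescaled so their per-token norm matches the
    base branch -- an assumed way to preserve the diffusion prior for
    non-identity attributes while letting identity signals reinforce
    each other.
    """
    merged = base_hidden + id_scale * id_hidden
    # Match the per-token norm of the base branch (assumed formulation).
    base_norm = base_hidden.norm(dim=-1, keepdim=True)
    merged_norm = merged.norm(dim=-1, keepdim=True).clamp_min(1e-8)
    return merged * (base_norm / merged_norm)
```

Note that under this reading, setting id_scale to 0 reproduces the base model exactly, which is consistent with the abstract's claim that the mechanism recovers the text controllability of the base diffusion model at inference.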
Related papers
- Optimizing ID Consistency in Multimodal Large Models: Facial Restoration via Alignment, Entanglement, and Disentanglement [54.199726425201895]
Multimodal editing large models have demonstrated powerful editing capabilities across diverse tasks. Current facial ID preservation methods struggle to achieve consistent restoration of both facial identity and the IP of edited elements. We propose EditedID, an Alignment-Disentanglement-Entanglement framework for robust identity-specific facial restoration.
arXiv Detail & Related papers (2026-02-21T08:24:42Z)
- FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation [10.474377498273205]
We propose FlexID, a training-free framework utilizing intent-aware modulation. We introduce a Context-Aware Adaptive Gating (CAG) mechanism that dynamically modulates the weights of these streams. Experiments on IBench demonstrate that FlexID achieves a balance between identity consistency and text adherence.
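As a rough illustration of what training-free, context-aware gating between streams could look like, here is a sketch; the identity/text two-stream split, the mean pooling, and the cosine-similarity heuristic are all assumptions, not FlexID's published design.

```python
import torch
import torch.nn.functional as F

def context_aware_gate(text_ctx: torch.Tensor,
                       id_emb: torch.Tensor,
                       id_stream: torch.Tensor,
                       text_stream: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of a training-free context-aware gate.

    text_ctx:  prompt token embeddings, shape (B, L, D).
    id_emb:    a face identity embedding, shape (B, D).
    id_stream / text_stream: the two feature streams to be blended,
    shape (B, L, D). No parameters are learned, keeping the gate
    training-free.
    """
    ctx = text_ctx.mean(dim=1)                      # pooled prompt, (B, D)
    sim = F.cosine_similarity(ctx, id_emb, dim=-1)  # (B,)
    w = sim.clamp(0.0, 1.0)[:, None, None]          # gate weight in [0, 1]
    return w * id_stream + (1.0 - w) * text_stream
```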
arXiv Detail & Related papers (2026-02-07T13:59:54Z)
- BeyondFacial: Identity-Preserving Personalized Generation Beyond Facial Close-ups [22.017690133402912]
Identity-Preserving Personalized Generation has advanced film production and artistic creation, yet existing approaches overemphasize facial regions. These methods suffer from weak visual narrativity and poor semantic consistency under complex text prompts. This paper presents an identity-preserving personalization method that breaks the constraint of facial close-ups, achieving synergistic optimization of identity fidelity and scene semantics.
arXiv Detail & Related papers (2025-11-15T01:56:14Z)
- Beyond Inference Intervention: Identity-Decoupled Diffusion for Face Anonymization [55.29071072675132]
Face anonymization aims to conceal identity information while preserving non-identity attributes. We propose ID²Face, a training-centric anonymization framework. We show that ID²Face outperforms existing methods in visual quality, identity suppression, and utility preservation.
arXiv Detail & Related papers (2025-10-28T09:28:12Z)
- WithAnyone: Towards Controllable and ID Consistent Image Generation [83.55786496542062]
Identity-consistent generation has become an important focus in text-to-image research. We develop a large-scale paired dataset tailored for multi-person scenarios. We propose a novel training paradigm with a contrastive identity loss that leverages paired data to balance fidelity with diversity.
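The abstract does not give the loss itself; a standard InfoNCE-style contrastive identity loss over paired embeddings, sketched below as one plausible instantiation (not WithAnyone's exact formulation):

```python
import torch
import torch.nn.functional as F

def contrastive_identity_loss(gen_emb: torch.Tensor,
                              ref_emb: torch.Tensor,
                              temperature: float = 0.07) -> torch.Tensor:
    """Hypothetical InfoNCE-style sketch of a contrastive identity loss.

    gen_emb: identity embeddings of generated faces, shape (B, D).
    ref_emb: embeddings of the paired reference faces, shape (B, D).
    Matching pairs are pulled together; other identities in the batch
    serve as negatives and are pushed apart.
    """
    gen = F.normalize(gen_emb, dim=-1)
    ref = F.normalize(ref_emb, dim=-1)
    logits = gen @ ref.t() / temperature               # (B, B) similarities
    targets = torch.arange(gen.size(0), device=gen.device)
    return F.cross_entropy(logits, targets)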
arXiv Detail & Related papers (2025-10-16T17:59:54Z)
- ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation [33.84646269805187]
ID-EA is a novel framework that guides text embeddings to align with visual identity embeddings. ID-EA substantially outperforms state-of-the-art methods in identity preservation metrics. It generates personalized portraits 15 times faster than existing approaches.
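One simple way to "guide text embeddings to align with visual identity embeddings" is a cosine-distance objective on a learnable textual-inversion token; the sketch below is an assumed illustration of that idea, and the linear projector (including its dimensions) is hypothetical, not ID-EA's actual mapping.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical projector from a face-recognition identity space (512-d)
# to the text-embedding space (768-d); dimensions are placeholders.
proj = nn.Linear(512, 768)

def id_alignment_loss(token_emb: torch.Tensor,
                      id_emb: torch.Tensor) -> torch.Tensor:
    """Pull a learnable textual-inversion token toward the projected
    identity embedding via cosine distance (an assumed objective)."""
    target = proj(id_emb)
    return 1.0 - F.cosine_similarity(token_emb, target, dim=-1).mean()
```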
arXiv Detail & Related papers (2025-07-16T07:42:02Z)
- ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models [49.09606704563898]
Person re-identification is a crucial task in computer vision, aiming to recognize individuals across non-overlapping camera views. We propose a novel framework, ChatReID, that shifts the focus towards a text-side-dominated retrieval paradigm, enabling flexible and interactive re-identification. We introduce a hierarchical progressive tuning strategy, which endows Re-ID ability through three stages of tuning, i.e., from person attribute understanding to fine-grained image retrieval and to multi-modal task reasoning.
arXiv Detail & Related papers (2025-02-27T10:34:14Z)
- Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge [49.36669870661573]
We propose FreeCure, a framework that improves the prompt consistency of personalization models. We introduce a novel foundation-aware self-attention module, coupled with an inversion-based process to bring well-aligned attribute information to the personalization process.
arXiv Detail & Related papers (2024-11-22T15:21:38Z)
- ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios, such as AI portraits and advertising.
We present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z)
- Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm [31.06269858216316]
We propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization.
We introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information.
We also introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams.
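The abstract names an "AdaIN-mean operation" for merging the two streams without defining it; below is a hedged sketch of one mean-only AdaIN variant (re-center the identity stream to the text stream's mean, then average), purely as an illustration of the idea rather than Infinite-ID's exact operation.

```python
import torch

def adain_mean_merge(text_feats: torch.Tensor,
                     id_feats: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of an AdaIN-mean style merge of two streams.

    text_feats / id_feats: per-token features, shape (B, L, D). Unlike
    full AdaIN, only the mean statistic is matched (no std), which is
    one reading of the term "AdaIN-mean".
    """
    mu_text = text_feats.mean(dim=1, keepdim=True)
    mu_id = id_feats.mean(dim=1, keepdim=True)
    id_aligned = id_feats - mu_id + mu_text   # re-center the identity stream
    return 0.5 * (text_feats + id_aligned)    # assumed equal-weight blend
```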
arXiv Detail & Related papers (2024-03-18T13:39:53Z)
- StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.