Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
- URL: http://arxiv.org/abs/2512.03964v1
- Date: Wed, 03 Dec 2025 16:57:50 GMT
- Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
- Authors: Lianyu Pang, Ji Zhou, Qiping Wang, Baoquan Zhao, Zhenguo Yang, Qing Li, Xudong Mao
- Abstract summary: We introduce UniID, a unified tuning-free framework that synergistically integrates both paradigms. Our key insight is that when merging these approaches, they should mutually reinforce only identity-relevant information. This principled design enables UniID to achieve high-fidelity face personalization with flexible text controllability.
- Score: 16.851646868288135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning-free face personalization methods have developed along two distinct paradigms: text embedding approaches that map facial features into the text embedding space, and adapter-based methods that inject features through auxiliary cross-attention layers. While both paradigms have shown promise, existing methods struggle to simultaneously achieve high identity fidelity and flexible text controllability. We introduce UniID, a unified tuning-free framework that synergistically integrates both paradigms. Our key insight is that when merging these approaches, they should mutually reinforce only identity-relevant information while preserving the original diffusion prior for non-identity attributes. We realize this through a principled training-inference strategy: during training, we employ an identity-focused learning scheme that guides both branches to capture identity features exclusively; at inference, we introduce a normalized rescaling mechanism that recovers the text controllability of the base diffusion model while enabling complementary identity signals to enhance each other. This principled design enables UniID to achieve high-fidelity face personalization with flexible text controllability. Extensive experiments against six state-of-the-art methods demonstrate that UniID achieves superior performance in both identity preservation and text controllability. Code will be available at https://github.com/lyuPang/UniID
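The abstract does not spell out the normalized rescaling mechanism, but a minimal sketch of one plausible reading is below: merge the base model's cross-attention output with the adapter branch's identity signal, then rescale the result back to the base branch's per-token norm so that non-identity attributes keep following the diffusion prior. The function name and the norm-matching formulation are assumptions, not UniID's published implementation.

```python
import torch

def normalized_rescale(base_hidden: torch.Tensor,
                       id_hidden: torch.Tensor,
                       id_scale: float = 1.0) -> torch.Tensor:
    """Hypothetical sketch of inference-time normalized rescaling.

    base_hidden: cross-attention output of the frozen base model (text branch).
    id_hidden:   identity signal injected by the adapter branch.
    The merged features are rescaled so their per-token norm matches the
    base branch -- an assumed way to preserve the diffusion prior for
    non-identity attributes while letting identity signals reinforce
    each other.
    """
    merged = base_hidden + id_scale * id_hidden
    # Match the per-token norm of the base branch (assumed formulation).
    base_norm = base_hidden.norm(dim=-1, keepdim=True)
    merged_norm = merged.norm(dim=-1, keepdim=True).clamp_min(1e-8)
    return merged * (base_norm / merged_norm)
```

Note that under this reading, setting id_scale to 0 reproduces the base model exactly, which is consistent with the abstract's claim that the mechanism recovers the text controllability of the base diffusion model at inference.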
Related papers
- Optimizing ID Consistency in Multimodal Large Models: Facial Restoration via Alignment, Entanglement, and Disentanglement [54.199726425201895]
Multimodal editing large models have demonstrated powerful editing capabilities across diverse tasks. Current facial ID preservation methods struggle to achieve consistent restoration of both facial identity and the IP of edited elements. We propose EditedID, an Alignment-Disentanglement-Entanglement framework for robust identity-specific facial restoration.
arXiv Detail & Related papers (2026-02-21T08:24:42Z)
- FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation [10.474377498273205]
We propose FlexID, a training-free framework utilizing intent-aware modulation. We introduce a Context-Aware Adaptive Gating (CAG) mechanism that dynamically modulates the weights of these streams. Experiments on IBench demonstrate that FlexID achieves a balance between identity consistency and text adherence.
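As a rough illustration of what training-free, context-aware gating between streams could look like, here is a sketch; the identity/text two-stream split, the mean pooling, and the cosine-similarity heuristic are all assumptions, not FlexID's published design.

```python
import torch
import torch.nn.functional as F

def context_aware_gate(text_ctx: torch.Tensor,
                       id_emb: torch.Tensor,
                       id_stream: torch.Tensor,
                       text_stream: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of a training-free context-aware gate.

    text_ctx:  prompt token embeddings, shape (B, L, D).
    id_emb:    a face identity embedding, shape (B, D).
    id_stream / text_stream: the two feature streams to be blended,
    shape (B, L, D). No parameters are learned, keeping the gate
    training-free.
    """
    ctx = text_ctx.mean(dim=1)                      # pooled prompt, (B, D)
    sim = F.cosine_similarity(ctx, id_emb, dim=-1)  # (B,)
    w = sim.clamp(0.0, 1.0)[:, None, None]          # gate weight in [0, 1]
    return w * id_stream + (1.0 - w) * text_stream
```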
arXiv Detail & Related papers (2026-02-07T13:59:54Z)
- BeyondFacial: Identity-Preserving Personalized Generation Beyond Facial Close-ups [22.017690133402912]
Identity-Preserving Personalized Generation has advanced film production and artistic creation, yet existing approaches overemphasize facial regions. These methods suffer from weak visual narrativity and poor semantic consistency under complex text prompts. This paper presents an identity-preserving personalization method that breaks the constraint of facial close-ups, achieving synergistic optimization of identity fidelity and scene semantics.
arXiv Detail & Related papers (2025-11-15T01:56:14Z)
- Beyond Inference Intervention: Identity-Decoupled Diffusion for Face Anonymization [55.29071072675132]
Face anonymization aims to conceal identity information while preserving non-identity attributes. We propose ID²Face, a training-centric anonymization framework. We show that ID²Face outperforms existing methods in visual quality, identity suppression, and utility preservation.
arXiv Detail & Related papers (2025-10-28T09:28:12Z)
- WithAnyone: Towards Controllable and ID Consistent Image Generation [83.55786496542062]
Identity-consistent generation has become an important focus in text-to-image research. We develop a large-scale paired dataset tailored for multi-person scenarios. We propose a novel training paradigm with a contrastive identity loss that leverages paired data to balance fidelity with diversity.
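The abstract does not give the loss itself; a standard InfoNCE-style contrastive identity loss over paired embeddings, sketched below as one plausible instantiation (not WithAnyone's exact formulation):

```python
import torch
import torch.nn.functional as F

def contrastive_identity_loss(gen_emb: torch.Tensor,
                              ref_emb: torch.Tensor,
                              temperature: float = 0.07) -> torch.Tensor:
    """Hypothetical InfoNCE-style sketch of a contrastive identity loss.

    gen_emb: identity embeddings of generated faces, shape (B, D).
    ref_emb: embeddings of the paired reference faces, shape (B, D).
    Matching pairs are pulled together; other identities in the batch
    serve as negatives and are pushed apart.
    """
    gen = F.normalize(gen_emb, dim=-1)
    ref = F.normalize(ref_emb, dim=-1)
    logits = gen @ ref.t() / temperature               # (B, B) similarities
    targets = torch.arange(gen.size(0), device=gen.device)
    return F.cross_entropy(logits, targets)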
arXiv Detail & Related papers (2025-10-16T17:59:54Z)
- ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation [33.84646269805187]
ID-EA is a novel framework that guides text embeddings to align with visual identity embeddings. ID-EA substantially outperforms state-of-the-art methods in identity preservation metrics. It generates personalized portraits 15 times faster than existing approaches.
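One simple way to "guide text embeddings to align with visual identity embeddings" is a cosine-distance objective on a learnable textual-inversion token; the sketch below is an assumed illustration of that idea, and the linear projector (including its dimensions) is hypothetical, not ID-EA's actual mapping.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical projector from a face-recognition identity space (512-d)
# to the text-embedding space (768-d); dimensions are placeholders.
proj = nn.Linear(512, 768)

def id_alignment_loss(token_emb: torch.Tensor,
                      id_emb: torch.Tensor) -> torch.Tensor:
    """Pull a learnable textual-inversion token toward the projected
    identity embedding via cosine distance (an assumed objective)."""
    target = proj(id_emb)
    return 1.0 - F.cosine_similarity(token_emb, target, dim=-1).mean()
```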
arXiv Detail & Related papers (2025-07-16T07:42:02Z)
- ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models [49.09606704563898]
Person re-identification is a crucial task in computer vision, aiming to recognize individuals across non-overlapping camera views. We propose a novel framework, ChatReID, that shifts the focus towards a text-side-dominated retrieval paradigm, enabling flexible and interactive re-identification. We introduce a hierarchical progressive tuning strategy, which endows Re-ID ability through three stages of tuning, i.e., from person attribute understanding to fine-grained image retrieval and to multi-modal task reasoning.
arXiv Detail & Related papers (2025-02-27T10:34:14Z)
- Foundation Cures Personalization: Improving Personalized Models' Prompt Consistency via Hidden Foundation Knowledge [49.36669870661573]
We propose FreeCure, a framework that improves the prompt consistency of personalization models. We introduce a novel foundation-aware self-attention module, coupled with an inversion-based process to bring well-aligned attribute information to the personalization process.
arXiv Detail & Related papers (2024-11-22T15:21:38Z)
- ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios, such as AI portraits and advertising.
We present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z)
- Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm [31.06269858216316]
We propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization.
We introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information.
We also introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams.
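The abstract names an "AdaIN-mean operation" for merging the two streams without defining it; below is a hedged sketch of one mean-only AdaIN variant (re-center the identity stream to the text stream's mean, then average), purely as an illustration of the idea rather than Infinite-ID's exact operation.

```python
import torch

def adain_mean_merge(text_feats: torch.Tensor,
                     id_feats: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of an AdaIN-mean style merge of two streams.

    text_feats / id_feats: per-token features, shape (B, L, D). Unlike
    full AdaIN, only the mean statistic is matched (no std), which is
    one reading of the term "AdaIN-mean".
    """
    mu_text = text_feats.mean(dim=1, keepdim=True)
    mu_id = id_feats.mean(dim=1, keepdim=True)
    id_aligned = id_feats - mu_id + mu_text   # re-center the identity stream
    return 0.5 * (text_feats + id_aligned)    # assumed equal-weight blend
```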
arXiv Detail & Related papers (2024-03-18T13:39:53Z)
- StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.