UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization
- URL: http://arxiv.org/abs/2408.05939v2
- Date: Fri, 6 Sep 2024 14:44:12 GMT
- Title: UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization
- Authors: Junjie He, Yifeng Geng, Liefeng Bo,
- Abstract summary: UniPortrait is an innovative human image personalization framework that unifies single- and multi-ID customization.
UniPortrait consists of only two plug-and-play modules: an ID embedding module and an ID routing module.
- Score: 10.760799194716922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents UniPortrait, an innovative human image personalization framework that unifies single- and multi-ID customization with high face fidelity, extensive facial editability, free-form input description, and diverse layout generation. UniPortrait consists of only two plug-and-play modules: an ID embedding module and an ID routing module. The ID embedding module extracts versatile editable facial features with a decoupling strategy for each ID and embeds them into the context space of diffusion models. The ID routing module then combines and distributes these embeddings adaptively to their respective regions within the synthesized image, achieving the customization of single and multiple IDs. With a carefully designed two-stage training scheme, UniPortrait achieves superior performance in both single- and multi-ID customization. Quantitative and qualitative experiments demonstrate the advantages of our method over existing approaches as well as its good scalability, e.g., the universal compatibility with existing generative control tools. The project page is at https://aigcdesigngroup.github.io/UniPortrait-Page/ .
Related papers
- IC-Portrait: In-Context Matching for View-Consistent Personalized Portrait [51.18967854258571]
IC-Portrait is a novel framework designed to accurately encode individual identities for personalized portrait generation.
Our key insight is that pre-trained diffusion models are fast learners for in-context dense correspondence matching.
We show that IC-Portrait consistently outperforms existing state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2025-01-28T18:59:03Z) - Omni-ID: Holistic Identity Representation Designed for Generative Tasks [75.29174595706533]
Omni-ID encodes holistic information about an individual's appearance across diverse expressions.
It consolidates information from a varied number of unstructured input images into a structured representation.
It demonstrates substantial improvements over conventional representations across various generative tasks.
arXiv Detail & Related papers (2024-12-12T19:21:20Z) - CustAny: Customizing Anything from A Single Example [73.90939022698399]
We present a novel pipeline to construct a large dataset of general objects, featuring 315k text-image samples across 10k categories.
With the help of MC-IDC, we introduce Customizing Anything (CustAny), a zero-shot framework that maintains ID fidelity and supports flexible text editing for general objects.
Our contributions include a large-scale dataset, the CustAny framework and novel ID processing to advance this field.
arXiv Detail & Related papers (2024-06-17T15:26:22Z) - InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation [0.0]
"InstantFamily" is an approach that employs a novel cross-attention mechanism and a multimodal embedding stack to achieve zero-shot multi-ID image generation.
Our method effectively preserves ID as it utilizes global and local features from a pre-trained face recognition model integrated with text conditions.
arXiv Detail & Related papers (2024-04-30T10:16:21Z) - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios like AI portrait and advertising.
We present textbfID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z) - Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm [31.06269858216316]
We propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization.
We introduce an identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information.
We also introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams.
arXiv Detail & Related papers (2024-03-18T13:39:53Z) - MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object Diffusion [81.7514869897233]
We develop a training-free Multimodal-LLM agent (MuLan), as a human painter, that can progressively generate multi-object.
MuLan harnesses a large language model (LLM) to decompose a prompt to a sequence of sub-tasks, each generating only one object by stable diffusion.
MuLan also adopts a vision-language model (VLM) to provide feedback to the image generated in each sub-task and control the diffusion model to re-generate the image if it violates the original prompt.
arXiv Detail & Related papers (2024-02-20T06:14:30Z) - InstantID: Zero-shot Identity-Preserving Generation in Seconds [21.04236321562671]
We introduce InstantID, a powerful diffusion model-based solution for ID embedding.
Our plug-and-play module adeptly handles image personalization in various styles using just a single facial image.
Our work seamlessly integrates with popular pre-trained text-to-image diffusion models like SD1.5 and SDXL.
arXiv Detail & Related papers (2024-01-15T07:50:18Z) - Identity Encoder for Personalized Diffusion [57.1198884486401]
We propose an encoder-based approach for personalization.
We learn an identity encoder which can extract an identity representation from a set of reference images of a subject.
We show that our approach consistently outperforms existing fine-tuning based approach in both image generation and reconstruction.
arXiv Detail & Related papers (2023-04-14T23:32:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.