Zero-Shot Head Swapping in Real-World Scenarios
- URL: http://arxiv.org/abs/2503.00861v3
- Date: Mon, 24 Mar 2025 06:03:55 GMT
- Title: Zero-Shot Head Swapping in Real-World Scenarios
- Authors: Taewoong Kang, Sohyun Jeong, Hyojin Jang, Jaegul Choo
- Abstract summary: We propose a novel head swapping method, HID, that is robust to images including the full head and the upper body. For automatic mask generation, we introduce the IOMask, which enables seamless blending of the head and body. Our experiments demonstrate that the proposed approach achieves state-of-the-art performance in head swapping.
- Score: 30.493743596793212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With growing demand in media and social networks for personalized images, the need for advanced head-swapping techniques, which integrate an entire head from the head image with the body from the body image, has increased. However, traditional head swapping methods rely heavily on face-centered cropped data with primarily frontal-facing views, which limits their effectiveness in real-world applications. Additionally, their masking methods, designed to indicate regions requiring editing, are optimized for these types of datasets but struggle to achieve seamless blending in complex situations, such as when the original data includes features like long hair extending beyond the masked area. To overcome these limitations and enhance adaptability in diverse and complex scenarios, we propose a novel head swapping method, HID, that is robust to images including the full head and the upper body, handles views ranging from frontal to side, and automatically generates context-aware masks. For automatic mask generation, we introduce the IOMask, which enables seamless blending of the head and body, effectively addressing integration challenges. We further introduce a hair injection module to capture hair details with greater precision. Our experiments demonstrate that the proposed approach achieves state-of-the-art performance in head swapping, providing visually consistent and realistic results across a wide range of challenging conditions.
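To illustrate the kind of pipeline the abstract describes (an automatically generated, context-aware editing mask followed by masked blending of a generated head into the body image), a minimal NumPy sketch is given below. The mask construction, the helper names (`context_aware_edit_mask`, `swap_head`, `inpaint_fn`), and the blending strategy are assumptions for illustration only; they are not the paper's actual IOMask or hair injection module.

```python
import numpy as np

def context_aware_edit_mask(src_head_seg: np.ndarray,
                            tgt_head_seg: np.ndarray,
                            dilation: int = 5) -> np.ndarray:
    """Union of the source head footprint and the target image's original
    head region, slightly dilated so that stray details such as long hair
    near the boundary fall inside the region to be repainted.

    Both inputs are HxW binary masks aligned to the target image frame.
    (Hypothetical construction; the paper's IOMask is not specified here.)
    """
    union = np.logical_or(src_head_seg > 0, tgt_head_seg > 0)
    # Naive dilation via shifted unions; a real pipeline would more likely
    # use morphological ops from OpenCV or scipy.ndimage.
    padded = np.pad(union, dilation, mode="edge")
    h, w = union.shape
    dilated = np.zeros_like(union)
    for dy in range(-dilation, dilation + 1):
        for dx in range(-dilation, dilation + 1):
            dilated |= padded[dilation + dy: dilation + dy + h,
                              dilation + dx: dilation + dx + w]
    return dilated.astype(np.float32)


def swap_head(body_img: np.ndarray,
              head_img: np.ndarray,
              edit_mask: np.ndarray,
              inpaint_fn) -> np.ndarray:
    """Keep target pixels outside the mask and let a generator
    (e.g. a diffusion inpainting model, passed in as `inpaint_fn`)
    fill the masked region conditioned on the source head."""
    mask = edit_mask[..., None]  # HxWx1 so it broadcasts over RGB channels
    generated = inpaint_fn(body_img, head_img, edit_mask)
    return body_img * (1.0 - mask) + generated * mask
```

The key point this sketch tries to capture is that the editing region is derived from both images rather than a fixed face crop, so context outside a frontal face (long hair, shoulders, side views) can still be included in the repainted area.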
Related papers
- InstaFace: Identity-Preserving Facial Editing with Single Image Inference [13.067402877443902]
We introduce a novel diffusion-based framework, InstaFace, to generate realistic images while preserving identity using only a single image.
InstaFace harnesses 3D perspectives by integrating multiple 3DMM-based conditionals without introducing additional trainable parameters.
Our method outperforms several state-of-the-art approaches in terms of identity preservation, photorealism, and effective control of pose, expression, and lighting.
arXiv Detail & Related papers (2025-02-27T22:37:09Z) - GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real-time.
At the core of our method is a hierarchical representation of head models that allows us to capture the complex dynamics of facial expressions and head movements.
We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z) - Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Ours is a relatively unified approach, which makes it resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z) - What to Preserve and What to Transfer: Faithful, Identity-Preserving Diffusion-based Hairstyle Transfer [35.80645300182437]
Existing hairstyle transfer approaches rely on StyleGAN. We propose a one-stage hairstyle transfer diffusion model, HairFusion, that applies to real-world scenarios. Our method achieves state-of-the-art performance compared to the existing methods in preserving the integrity of both the transferred hairstyle and the surrounding features.
arXiv Detail & Related papers (2024-08-29T11:30:21Z) - MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing [34.31657241047574]
We propose a Hybrid Mesh-Gaussian Head Avatar (MeGA) that models different head components with more suitable representations.
MeGA generates higher-fidelity renderings for the whole head and naturally supports more downstream tasks.
Experiments on the NeRSemble dataset demonstrate the effectiveness of our designs.
arXiv Detail & Related papers (2024-04-29T18:10:12Z) - Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z) - Few-Shot Head Swapping in the Wild [79.78228139171574]
The head swapping task aims at flawlessly placing a source head onto a target body, which is of great importance to various entertainment scenarios.
It is inherently challenging due to its unique needs in head modeling and background blending.
We present the Head Swapper (HeSer), which achieves few-shot head swapping in the wild through two delicately designed modules.
arXiv Detail & Related papers (2022-04-27T17:52:51Z) - HeadGAN: One-shot Neural Head Synthesis and Editing [70.30831163311296]
HeadGAN is a system that conditions synthesis on 3D face representations adapted to the facial geometry of any reference image.
The 3D face representation enables HeadGAN to be further used as an efficient method for compression and reconstruction and a tool for expression and pose editing.
arXiv Detail & Related papers (2020-12-15T12:51:32Z) - MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing [122.82964863607938]
MichiGAN is a novel conditional image generation method for interactive portrait hair manipulation.
We provide user control over every major hair visual factor, including shape, structure, appearance, and background.
We also build an interactive portrait hair editing system that enables straightforward manipulation of hair by projecting intuitive and high-level user inputs.
arXiv Detail & Related papers (2020-10-30T17:59:10Z)