Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications
- URL: http://arxiv.org/abs/2411.00652v1
- Date: Fri, 01 Nov 2024 15:14:59 GMT
- Title: Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications
- Authors: Hah Min Lew, Sahng-Min Yoo, Hyunwoo Kang, Gyeong-Moon Park,
- Abstract summary: We introduce an industrial Head Blending pipeline for the task of seamlessly integrating an actor's head onto a target body in digital content creation.
The key challenge stems from discrepancies in head shape and hair structure, which lead to unnatural boundaries and blending artifacts.
We propose CHANGER, a novel pipeline that decouples background integration from foreground blending.
- Score: 7.479901577089033
- Abstract: We introduce an industrial Head Blending pipeline for the task of seamlessly integrating an actor's head onto a target body in digital content creation. The key challenge stems from discrepancies in head shape and hair structure, which lead to unnatural boundaries and blending artifacts. Existing methods treat foreground and background as a single task, resulting in suboptimal blending quality. To address this problem, we propose CHANGER, a novel pipeline that decouples background integration from foreground blending. By utilizing chroma keying for artifact-free background generation and introducing Head shape and long Hair augmentation ($H^2$ augmentation) to simulate a wide range of head shapes and hair styles, CHANGER improves generalization across a wide variety of real-world cases. Furthermore, our Foreground Predictive Attention Transformer (FPAT) module enhances foreground blending by predicting and focusing on key head and body regions. Quantitative and qualitative evaluations on benchmark datasets demonstrate that CHANGER outperforms state-of-the-art methods, delivering high-fidelity, industrial-grade results.
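For readers unfamiliar with the term, the following is a minimal sketch of generic chroma keying: pixels close to a key color are treated as background and replaced by a new background. It is not the CHANGER pipeline. The function names `chroma_key_mask` and `composite`, the pure-green key color, and the `tolerance` value are all assumptions chosen for illustration.

```python
import numpy as np


def chroma_key_mask(image, key_color=(0, 255, 0), tolerance=80.0):
    """Return a float mask: 1.0 for foreground pixels, 0.0 for key-colored pixels.

    `image` is an (H, W, 3) RGB array. `key_color` and `tolerance` are
    illustrative defaults, not values taken from the paper.
    """
    diff = image.astype(np.float32) - np.asarray(key_color, dtype=np.float32)
    # Euclidean distance of every pixel from the key color in RGB space.
    distance = np.linalg.norm(diff, axis=-1)
    return (distance > tolerance).astype(np.float32)


def composite(foreground, background, mask):
    """Keep the foreground where mask is 1 and the background where it is 0."""
    alpha = mask[..., None]  # add a channel axis so the mask broadcasts over RGB
    blended = (alpha * foreground.astype(np.float32)
               + (1.0 - alpha) * background.astype(np.float32))
    return blended.astype(np.uint8)
```

Calling `composite(fg, bg, chroma_key_mask(fg))` on a green-screen image `fg` and a same-sized image `bg` yields the keyed composite. A hard threshold like this gives jagged edges around hair, which is one reason production keyers use soft mattes instead.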
Related papers
- TKG-DM: Training-free Chroma Key Content Generation Diffusion Model [9.939293311550655]
We present a novel Training-Free Chroma Key Content Generation Diffusion Model (TKG-DM).
Our proposed method is the first to explore the manipulation of the color aspects in initial noise for controlled background generation.
arXiv Detail & Related papers (2024-11-23T15:07:15Z)
- Efficient Face Super-Resolution via Wavelet-based Feature Enhancement Network [27.902725520665133]
Face super-resolution aims to reconstruct a high-resolution face image from a low-resolution face image.
Previous methods typically employ an encoder-decoder structure to extract facial structural features.
We propose a wavelet-based feature enhancement network, which mitigates feature distortion by losslessly decomposing the input feature into high and low-frequency components.
arXiv Detail & Related papers (2024-07-29T08:03:33Z)
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textures and mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of high fidelity, and this explicit result supports deployment on common renderers.
arXiv Detail & Related papers (2024-05-18T11:49:09Z)
- MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing [34.31657241047574]
We propose a Hybrid Mesh-Gaussian Head Avatar (MeGA) that models different head components with more suitable representations.
MeGA generates higher-fidelity renderings for the whole head and naturally supports more downstream tasks.
Experiments on the NeRSemble dataset demonstrate the effectiveness of our designs.
arXiv Detail & Related papers (2024-04-29T18:10:12Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where authentic and manipulated images are increasingly indistinguishable.
Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- DPHMs: Diffusion Parametric Head Models for Depth-based Tracking [42.016598097736626]
We introduce Diffusion Parametric Head Models (DPHMs).
DPHMs are a generative model that enables robust volumetric head reconstruction and tracking from monocular depth sequences.
We propose a latent diffusion-based prior to regularize volumetric head reconstruction and tracking.
arXiv Detail & Related papers (2023-12-02T08:34:22Z)
- Indoor Scene Reconstruction with Fine-Grained Details Using Hybrid Representation and Normal Prior Enhancement [50.56517624931987]
The reconstruction of indoor scenes from multi-view RGB images is challenging due to the coexistence of flat and texture-less regions.
Recent methods leverage neural radiance fields aided by predicted surface normal priors to recover the scene geometry.
This work aims to reconstruct high-fidelity surfaces with fine-grained details by addressing the above limitations.
arXiv Detail & Related papers (2023-09-14T12:05:29Z)
- HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping [150.06405071177048]
We propose a semantic-mixing diffusion model for head swapping (HS-Diffusion).
We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator.
We construct a new image-based head swapping benchmark and design two tailored metrics.
arXiv Detail & Related papers (2022-12-13T10:04:01Z)
- DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)
- HiFaceGAN: Face Renovation via Collaborative Suppression and Replenishment [63.333407973913374]
"Face Renovation" (FR) is a semantic-guided generation problem.
"HiFaceGAN" is a multi-stage framework containing several nested collaborative suppression and replenishment (CSR) units.
Experiments on both synthetic and real face images have verified the superior performance of HiFaceGAN.
arXiv Detail & Related papers (2020-05-11T11:33:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.