Identity Preserving 3D Head Stylization with Multiview Score Distillation
- URL: http://arxiv.org/abs/2411.13536v1
- Date: Wed, 20 Nov 2024 18:37:58 GMT
- Title: Identity Preserving 3D Head Stylization with Multiview Score Distillation
- Authors: Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Furkan Guzelant, Aysegul Dundar
- Abstract summary: 3D head stylization transforms realistic facial features into artistic representations, enhancing user engagement across gaming and virtual reality applications.
This paper addresses the limitations of existing methods by leveraging the PanoHead model, synthesizing images from a comprehensive 360-degree perspective.
We propose a novel framework that employs negative log-likelihood distillation (LD) to enhance identity preservation and improve stylization quality.
- Score: 7.8340104876025105
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: 3D head stylization transforms realistic facial features into artistic representations, enhancing user engagement across gaming and virtual reality applications. While 3D-aware generators have made significant advancements, many 3D stylization methods primarily provide near-frontal views and struggle to preserve the unique identities of original subjects, often resulting in outputs that lack diversity and individuality. This paper addresses these challenges by leveraging the PanoHead model, synthesizing images from a comprehensive 360-degree perspective. We propose a novel framework that employs negative log-likelihood distillation (LD) to enhance identity preservation and improve stylization quality. By integrating multi-view grid score and mirror gradients within the 3D GAN architecture and introducing a score rank weighing technique, our approach achieves substantial qualitative and quantitative improvements. Our findings not only advance the state of 3D head stylization but also provide valuable insights into effective distillation processes between diffusion models and GANs, focusing on the critical issue of identity preservation. Please visit https://three-bee.github.io/head_stylization for more visuals.
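The abstract mentions combining mirror gradients across symmetric views with a score rank weighting of per-view gradients. The paper itself is not reproduced here, so the sketch below is a hypothetical illustration of those two ideas only: averaging each view's score gradient with its horizontally mirrored counterpart, then weighting views by the rank of their gradient norms. The function names, the pairing scheme, and the exact weighting formula are assumptions, not the authors' implementation.

```python
import numpy as np

def mirror_merge(grads, mirror_pairs):
    # Average each view's gradient with the horizontal flip of its mirrored
    # view, so left/right symmetric views reinforce each other.
    # grads: array of shape (num_views, C, H, W); flip is over the width axis.
    out = grads.copy()
    for i, j in mirror_pairs:
        merged = 0.5 * (grads[i] + grads[j][..., ::-1])
        out[i] = merged
        out[j] = merged[..., ::-1]
    return out

def rank_weights(grads):
    # Illustrative rank-based weighting: views whose gradients have smaller
    # norm get larger weight. The actual criterion in the paper may differ.
    norms = np.array([np.linalg.norm(g) for g in grads])
    ranks = norms.argsort().argsort()            # rank 0 = smallest norm
    w = (len(grads) - ranks).astype(float)       # smallest norm -> largest weight
    return w / w.sum()

def multiview_update(grads, mirror_pairs):
    # Combine both steps into a single weighted update direction.
    g = mirror_merge(grads, mirror_pairs)
    w = rank_weights(g)
    return sum(wi * gi for wi, gi in zip(w, g))
```

In an actual distillation loop, `grads` would be the per-view score (denoising) gradients from the diffusion model, and the returned tensor would be backpropagated into the 3D GAN's latent or generator parameters.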
Related papers
- SEGA: Drivable 3D Gaussian Head Avatar from a Single Image [15.117619290414064]
We propose SEGA, a novel approach for 3D drivable Gaussian head Avatar creation.
SEGA seamlessly combines priors derived from large-scale 2D datasets with 3D priors learned from multi-view, multi-expression, and multi-ID data.
Experiments show our method outperforms state-of-the-art approaches in generalization ability, identity preservation, and expression realism.
arXiv Detail & Related papers (2025-04-19T18:23:31Z) - SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models [80.33151028528563]
SpinMeRound is a diffusion-based approach designed to generate consistent and accurate head portraits from novel viewpoints.
By leveraging a number of input views alongside an identity embedding, our method effectively synthesizes diverse viewpoints of a subject.
arXiv Detail & Related papers (2025-04-14T21:16:20Z) - High-Quality 3D Head Reconstruction from Any Single Portrait Image [18.035517064261168]
We introduce a novel high-fidelity 3D head reconstruction method from a single portrait image, regardless of perspective, expression, or accessories.
Our method demonstrates robust performance across challenging scenarios, including side-face angles and complex accessories.
arXiv Detail & Related papers (2025-03-11T15:08:37Z) - Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance [69.9745497000557]
We introduce Arc2Avatar, the first SDS-based method utilizing a human face foundation model as guidance with just a single image as input.
Our avatars maintain a dense correspondence with a human face mesh template, allowing blendshape-based expression generation.
arXiv Detail & Related papers (2025-01-09T17:04:33Z) - Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering [16.382098950820822]
We propose Zero-to-Hero, a novel test-time approach that enhances view synthesis by manipulating attention maps.
We modify the self-attention mechanism to integrate information from the source view, reducing shape distortions.
Results demonstrate substantial improvements in fidelity and consistency, validated on a diverse set of out-of-distribution objects.
arXiv Detail & Related papers (2024-05-29T00:58:22Z) - ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling [96.87575334960258]
ID-to-3D is a method to generate identity- and text-guided 3D human heads with disentangled expressions.
Results achieve an unprecedented level of identity-consistent and high-quality texture and geometry generation.
arXiv Detail & Related papers (2024-05-26T13:36:45Z) - Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation [12.693847842218604]
We introduce a novel 3D customization method, dubbed Make-Your-3D, that can personalize high-fidelity and consistent 3D content within 5 minutes.
Our key insight is to harmonize the distributions of a multi-view diffusion model and an identity-specific 2D generative model, aligning them with the distribution of the desired 3D subject.
Our method can produce high-quality, consistent, and subject-specific 3D content with text-driven modifications that are unseen in the subject image.
arXiv Detail & Related papers (2024-03-14T17:57:04Z) - CharNeRF: 3D Character Generation from Concept Art [3.8061090528695543]
We present a novel approach to create volumetric representations of 3D characters from consistent turnaround concept art.
We train the network to make use of these priors for various 3D points through a learnable view-direction-attended multi-head self-attention layer.
Our model is able to generate high-quality 360-degree views of characters.
arXiv Detail & Related papers (2024-02-27T01:22:08Z) - UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures [80.047065473698]
We propose a novel 3D avatar generation approach termed UltrAvatar with enhanced geometry fidelity and superior-quality physically based rendering (PBR) textures free of unwanted lighting.
We demonstrate the effectiveness and robustness of the proposed method, outperforming the state-of-the-art methods by a large margin in the experiments.
arXiv Detail & Related papers (2024-01-20T01:55:17Z) - GPAvatar: Generalizable and Precise Head Avatar from Image(s) [71.555405205039]
GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
arXiv Detail & Related papers (2024-01-18T18:56:34Z) - IT3D: Improved Text-to-3D Generation with Explicit View Synthesis [71.68595192524843]
This study presents a novel strategy that leverages explicitly synthesized multi-view images to address these issues.
Our approach uses image-to-image pipelines, empowered by LDMs, to generate posed high-quality images.
For the incorporated discriminator, the synthesized multi-view images are considered real data, while the renderings of the optimized 3D models function as fake data.
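The IT3D blurb above describes a discriminator whose real/fake labels are swapped relative to a standard GAN: diffusion-synthesized multi-view images are labeled real, while renderings of the 3D model being optimized are labeled fake. A minimal sketch of that loss, assuming a standard logistic (BCE-on-logits) objective; the helper names and loss choice are illustrative, not IT3D's actual code:

```python
import numpy as np

def bce_with_logits(logits, target):
    # Numerically simple binary cross-entropy on raw discriminator logits.
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(target * np.log(p + 1e-8) + (1 - target) * np.log(1 - p + 1e-8)).mean()

def discriminator_loss(d_synth_logits, d_render_logits):
    # Synthesized multi-view images play the role of "real" data (target 1),
    # renderings of the optimized 3D model play the role of "fake" (target 0).
    return bce_with_logits(d_synth_logits, 1.0) + bce_with_logits(d_render_logits, 0.0)
```

Minimizing this loss trains the discriminator to separate the two distributions; the 3D model is then updated adversarially so its renderings approach the synthesized image distribution.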
arXiv Detail & Related papers (2023-08-22T14:39:17Z) - Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.