DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis
- URL: http://arxiv.org/abs/2503.15667v1
- Date: Wed, 19 Mar 2025 19:47:04 GMT
- Title: DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis
- Authors: Yuming Gu, Phong Tran, Yujian Zheng, Hongyi Xu, Heyuan Li, Adilbek Karmanov, Hao Li,
- Abstract summary: We introduce a novel approach that generates fully consistent 360-degree head views. By training on continuous view sequences and integrating a back reference image, our approach achieves robust, locally continuous view synthesis. Our model can be used to produce high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering.
- Score: 11.51144219543605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high-quality 360-degree views of human heads from single-view images is essential for enabling accessible immersive telepresence applications and scalable personalized content creation. While cutting-edge methods for full head generation are limited to modeling realistic human heads, the latest diffusion-based approaches for style-omniscient head synthesis can produce only frontal views and struggle with view consistency, preventing their conversion into true 3D models for rendering from arbitrary angles. We introduce a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses and hats. Our method builds on the DiffPortrait3D framework, incorporating a custom ControlNet for back-of-head detail generation and a dual appearance module to ensure global front-back consistency. By training on continuous view sequences and integrating a back reference image, our approach achieves robust, locally continuous view synthesis. Our model can be used to produce high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering, outperforming state-of-the-art methods in object synthesis and 360-degree head generation for very challenging input portraits.
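To make the dual appearance idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' released code) of how features from a front reference image and a back reference image could be injected into a denoising network via cross-attention. The toy encoder, module names, and dimensions are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a "dual appearance" conditioning block (assumptions throughout):
# front and back reference images are encoded into token sequences, concatenated,
# and injected into intermediate denoiser features via residual cross-attention.
import torch
import torch.nn as nn


class ToyImageEncoder(nn.Module):
    """Illustrative CNN mapping a reference image to a sequence of feature tokens."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=4), nn.SiLU(),
            nn.Conv2d(64, feat_dim, 4, stride=4), nn.SiLU(),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        f = self.conv(img)                   # (B, C, H/16, W/16)
        return f.flatten(2).transpose(1, 2)  # (B, num_tokens, C)


class DualAppearanceAttention(nn.Module):
    """Cross-attention from denoiser tokens to front + back reference tokens."""

    def __init__(self, feat_dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.front_enc = ToyImageEncoder(feat_dim)
        self.back_enc = ToyImageEncoder(feat_dim)
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, x: torch.Tensor, front_ref: torch.Tensor,
                back_ref: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) intermediate tokens of the denoising network.
        ref_tokens = torch.cat([self.front_enc(front_ref),
                                self.back_enc(back_ref)], dim=1)
        attended, _ = self.attn(query=self.norm(x), key=ref_tokens, value=ref_tokens)
        return x + attended                  # residual injection of appearance cues


if __name__ == "__main__":
    block = DualAppearanceAttention()
    x = torch.randn(2, 64, 256)              # fake denoiser tokens
    front = torch.randn(2, 3, 128, 128)       # front reference image
    back = torch.randn(2, 3, 128, 128)        # back reference image
    print(block(x, front, back).shape)        # torch.Size([2, 64, 256])
```

In this sketch the two reference encoders produce token sequences that are attended to jointly, which is one plausible way a dual appearance module could enforce global front-back consistency during view synthesis.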
Related papers
- SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models [80.33151028528563]
SpinMeRound is a diffusion-based approach designed to generate consistent and accurate head portraits from novel viewpoints.
By leveraging a number of input views alongside an identity embedding, our method effectively synthesizes diverse viewpoints of a subject.
arXiv Detail & Related papers (2025-04-14T21:16:20Z)
- High-Quality 3D Head Reconstruction from Any Single Portrait Image [18.035517064261168]
We introduce a novel high-fidelity 3D head reconstruction method from a single portrait image, regardless of perspective, expression, or accessories. Our method demonstrates robust performance across challenging scenarios, including side-face angles and complex accessories.
arXiv Detail & Related papers (2025-03-11T15:08:37Z)
- Synthetic Prior for Few-Shot Drivable Head Avatar Inversion [61.51887011274453]
We present SynShot, a novel method for the few-shot inversion of a drivable head avatar based on a synthetic prior. Inspired by machine learning models trained solely on synthetic data, we propose a method that learns a prior model from a large dataset of synthetic heads.
arXiv Detail & Related papers (2025-01-12T19:01:05Z)
- FaceLift: Single Image to 3D Head with View Generation and GS-LRM [54.24070918942727]
FaceLift is a feed-forward approach for rapid, high-quality, 360-degree head reconstruction from a single image. We show that FaceLift outperforms state-of-the-art methods in 3D head reconstruction, highlighting its practical applicability and robust performance on real-world images.
arXiv Detail & Related papers (2024-12-23T18:59:49Z)
- Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360° [25.86740659962933]
We build a dataset of artist-designed high-fidelity human heads and propose to create a novel parametric head model from it.
Our scheme decouples the facial motion/shape and facial appearance, which are represented by a classic parametric 3D mesh model and an attached neural texture.
Experiments show that facial motions and appearances are well disentangled in the parametric space, leading to SOTA performance in rendering and animating quality.
arXiv Detail & Related papers (2024-08-01T05:46:06Z)
- Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation [14.064983137553353]
We aim to enhance the quality and functionality of generative diffusion models for the task of creating controllable, photorealistic human avatars.
We achieve this by integrating a 3D morphable model into the state-of-the-art multi-view-consistent diffusion approach.
Our proposed framework is the first diffusion model to enable the creation of fully 3D-consistent, animatable, and photorealistic human avatars.
arXiv Detail & Related papers (2024-01-09T18:59:04Z)
- InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation [61.62346472443454]
InceptionHuman is a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities to generate photorealistic 3D humans.
InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space.
arXiv Detail & Related papers (2023-11-27T15:49:41Z)
- Single-Image 3D Human Digitization with Shape-Guided Diffusion [31.99621159464388]
NeRF and its variants typically require videos or images from different viewpoints.
We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image.
arXiv Detail & Related papers (2023-11-15T18:59:56Z)
- 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling [111.98096975078158]
We introduce a style-based generative network that synthesizes in one pass all and only the required rendering samples of a neural radiance field.
We show that this model can accurately be fit to "in-the-wild" facial images of arbitrary pose and illumination, extract the facial characteristics, and be used to re-render the face in controllable conditions.
arXiv Detail & Related papers (2022-09-15T15:28:45Z)
- Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [54.079327030892244]
Free-HeadGAN is a person-generic neural talking head synthesis system.
We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance.
arXiv Detail & Related papers (2022-08-03T16:46:08Z)
- Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis [52.720314035084215]
This work aims to use a general deep learning framework to synthesize free-viewpoint images of arbitrary human performers.
We present a simple yet powerful framework, named Generalizable Neural Performer (GNR), that learns a generalizable and robust neural body representation.
Experiments on GeneBody-1.0 and ZJU-Mocap show that our method is more robust than recent state-of-the-art generalizable methods.
arXiv Detail & Related papers (2022-04-25T17:14:22Z)