Related papers: FaceLift: Single Image to 3D Head with View Generation and GS-LRM

URL: http://arxiv.org/abs/2412.17812v1
Date: Mon, 23 Dec 2024 18:59:49 GMT
Title: FaceLift: Single Image to 3D Head with View Generation and GS-LRM
Authors: Weijie Lyu, Yi Zhou, Ming-Hsuan Yang, Zhixin Shu
Abstract summary: FaceLift is a feed-forward approach for rapid, high-quality, 360-degree head reconstruction from a single image. We show that FaceLift outperforms state-of-the-art methods in 3D head reconstruction, highlighting its practical applicability and robust performance on real-world images.
Score: 54.24070918942727
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We present FaceLift, a feed-forward approach for rapid, high-quality, 360-degree head reconstruction from a single image. Our pipeline begins by employing a multi-view latent diffusion model that generates consistent side and back views of the head from a single facial input. These generated views then serve as input to a GS-LRM reconstructor, which produces a comprehensive 3D representation using Gaussian splats. To train our system, we develop a dataset of multi-view renderings using synthetic 3D human head assets. The diffusion-based multi-view generator is trained exclusively on synthetic head images, while the GS-LRM reconstructor undergoes initial training on Objaverse followed by fine-tuning on synthetic head data. FaceLift excels at preserving identity and maintaining view consistency across views. Despite being trained solely on synthetic data, FaceLift demonstrates remarkable generalization to real-world images. Through extensive qualitative and quantitative evaluations, we show that FaceLift outperforms state-of-the-art methods in 3D head reconstruction, highlighting its practical applicability and robust performance on real-world images. In addition to single image reconstruction, FaceLift supports video inputs for 4D novel view synthesis and seamlessly integrates with 2D reanimation techniques to enable 3D facial animation. Project page: https://weijielyu.github.io/FaceLift.

Related papers

DeOcc-1-to-3: 3D De-Occlusion from a Single Image via Self-Supervised Multi-View Diffusion [50.90541069907167]
We propose DeOcc-1-to-3, an end-to-end framework for occlusion-aware multi-view generation. Our self-supervised training pipeline leverages occluded-unoccluded image pairs and pseudo-ground-truth views to teach the model structure-aware completion and view consistency.
arXiv Detail & Related papers (2025-06-26T17:58:26Z)
3D Face Reconstruction With Geometry Details From a Single Color Image Under Occluded Scenes [4.542616945567623]
3D face reconstruction technology aims to generate a face stereo model naturally and realistically. Previous deep face reconstruction approaches are typically designed to generate convincing textures. By introducing bump mapping, we successfully added mid-level details to coarse 3D faces.
arXiv Detail & Related papers (2024-12-25T15:16:02Z)
Single Image, Any Face: Generalisable 3D Face Generation [59.9369171926757]
We propose a novel model, Gen3D-Face, which generates 3D human faces with unconstrained single image input. To the best of our knowledge, this is the first attempt and benchmark for creating photorealistic 3D human face avatars from single images.
arXiv Detail & Related papers (2024-09-25T14:56:37Z)
G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation. Our novel approach empowers the face animation model to incorporate 3D information using only 2D images. In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models [79.65289816077629]
We present FitDiff, a diffusion-based 3D facial avatar generative model. Our model accurately generates relightable facial avatars, utilizing an identity embedding extracted from an "in-the-wild" 2D facial image. Being the first 3D LDM conditioned on face recognition embeddings, FitDiff reconstructs relightable human avatars that can be used as-is in common rendering engines.
arXiv Detail & Related papers (2023-12-07T17:35:49Z)
SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow. We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images. For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360$^{\circ}$ [17.355141949293852]
Existing 3D generative adversarial networks (GANs) for 3D human head synthesis are either limited to near-frontal views or hard to preserve 3D consistency in large view angles. We propose PanoHead, the first 3D-aware generative model that enables high-quality view-consistent image synthesis of full heads in $360^{\circ}$ with diverse appearance and detailed geometry.
arXiv Detail & Related papers (2023-03-23T06:54:34Z)
A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images [15.40230841242637]
We present a novel hierarchical representation network (HRN) to achieve accurate and detailed face reconstruction from a single image. Our framework can be extended to a multi-view fashion by considering detail consistency of different views. Our method outperforms the existing methods in both reconstruction accuracy and visual effects.
arXiv Detail & Related papers (2023-02-28T09:24:36Z)
High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views. Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction [31.782985891629448]
Single-image 3D human reconstruction aims to reconstruct the 3D textured surface of the human body given a single image. We propose ReFu, a coarse-to-fine approach that refines the projected backside view image and fuses the refined image to predict the final human body.
arXiv Detail & Related papers (2022-11-09T09:14:11Z)
Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [54.079327030892244]
Free-HeadGAN is a person-generic neural talking head synthesis system. We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance.
arXiv Detail & Related papers (2022-08-03T16:46:08Z)
Self-supervised High-fidelity and Re-renderable 3D Facial Reconstruction from a Single Image [19.0074836183624]
We propose a novel self-supervised learning framework for reconstructing high-quality 3D faces from single-view images in-the-wild. Our framework substantially outperforms state-of-the-art approaches in both qualitative and quantitative comparisons.
arXiv Detail & Related papers (2021-11-16T08:10:24Z)
Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images. Our model is versatile for various AR/VR and entertainment applications, such as face video retargeting and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction [76.1612334630256]
We harness the power of Generative Adversarial Networks (GANs) and Deep Convolutional Neural Networks (DCNNs) to reconstruct the facial texture and shape from single images. We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstructions and achieve, for the first time, facial texture reconstruction with high-frequency details.
arXiv Detail & Related papers (2021-05-16T16:35:44Z)
Inverting Generative Adversarial Renderer for Face Reconstruction [58.45125455811038]
In this work, we introduce a novel Generative Adversarial Renderer (GAR). Because GAR learns to model complicated real-world images instead of relying on graphics rules, it is capable of producing realistic images. Our method achieves state-of-the-art performance on multiple face reconstruction benchmarks.
arXiv Detail & Related papers (2021-05-06T04:16:06Z)
Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images [47.18219551855583]
We propose a novel unsupervised framework that can synthesize photo-realistic rotated faces. Our key insight is that rotating faces in the 3D space back and forth, and re-rendering them to the 2D plane can serve as a strong self-supervision. Our approach has superior synthesis quality as well as identity preservation over the state-of-the-art methods.
arXiv Detail & Related papers (2020-03-18T09:54:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.