SHeaP: Self-Supervised Head Geometry Predictor Learned via 2D Gaussians
- URL: http://arxiv.org/abs/2504.12292v1
- Date: Wed, 16 Apr 2025 17:55:02 GMT
- Title: SHeaP: Self-Supervised Head Geometry Predictor Learned via 2D Gaussians
- Authors: Liam Schoneveld, Zhe Chen, Davide Davoli, Jiapeng Tang, Saimon Terazawa, Ko Nishino, Matthias Nießner
- Abstract summary: 3D reconstruction of human heads from monocular images and videos underlies numerous visual applications. Previous methods have sought to learn from abundant 2D videos in a self-supervised manner. We propose SHeaP (Self-supervised Head Geometry Predictor Learned via 2D Gaussians).
- Score: 55.813327441814344
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate, real-time 3D reconstruction of human heads from monocular images and videos underlies numerous visual applications. As 3D ground truth data is hard to come by at scale, previous methods have sought to learn from abundant 2D videos in a self-supervised manner. Typically, this involves the use of differentiable mesh rendering, which is effective but faces limitations. To improve on this, we propose SHeaP (Self-supervised Head Geometry Predictor Learned via 2D Gaussians). Given a source image, we predict a 3DMM mesh and a set of Gaussians that are rigged to this mesh. We then reanimate this rigged head avatar to match a target frame, and backpropagate photometric losses to both the 3DMM and Gaussian prediction networks. We find that using Gaussians for rendering substantially improves the effectiveness of this self-supervised approach. Training solely on 2D data, our method surpasses existing self-supervised approaches in geometric evaluations on the NoW benchmark for neutral faces and a new benchmark for non-neutral expressions. Our method also produces highly expressive meshes, outperforming state-of-the-art in emotion classification.
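The pipeline described in the abstract lends itself to a compact sketch. Below is a minimal, illustrative PyTorch rendition of such a self-supervised loop: tiny stand-in MLPs replace the real 3DMM and Gaussian prediction networks, and a toy additive splatter replaces the paper's differentiable 2D Gaussian rasterizer. All names, shapes, and the crude mesh-to-Gaussian coupling are assumptions for exposition, not the authors' implementation.
```python
# Minimal sketch of a SHeaP-style self-supervised step (illustrative only).
import torch
import torch.nn as nn

H = W = 64  # toy render resolution

class MeshPredictor(nn.Module):
    """Stand-in for the 3DMM (shape/expression/pose) regressor."""
    def __init__(self, n_params=100):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * H * W, 256),
                                 nn.ReLU(), nn.Linear(256, n_params))

    def forward(self, img):
        return self.net(img)

class GaussianPredictor(nn.Module):
    """Stand-in for the network predicting Gaussians rigged to the mesh."""
    def __init__(self, n_gauss=256):
        super().__init__()
        self.n = n_gauss
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * H * W, 256),
                                 nn.ReLU(), nn.Linear(256, n_gauss * 7))

    def forward(self, img):
        g = self.net(img).view(-1, self.n, 7)
        mean = torch.sigmoid(g[..., 0:2])              # 2D mean in [0,1]^2
        scale = 0.01 + 0.1 * torch.sigmoid(g[..., 2])  # isotropic footprint
        color = torch.sigmoid(g[..., 3:6])             # RGB
        alpha = torch.sigmoid(g[..., 6])               # opacity
        return mean, scale, color, alpha

def splat(mean, scale, color, alpha):
    """Toy additive 2D Gaussian splatter; a real rasterizer would sort
    and alpha-composite the Gaussians."""
    ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                            torch.linspace(0, 1, W), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1)                           # (H, W, 2)
    d2 = ((grid[None, None] - mean[..., None, None, :]) ** 2).sum(-1)
    w = alpha[..., None, None] * torch.exp(-d2 / (2 * scale[..., None, None] ** 2))
    return (w[..., None] * color[..., None, None, :]).sum(1).clamp(0, 1)

mesh_net, gauss_net = MeshPredictor(), GaussianPredictor()
opt = torch.optim.Adam([*mesh_net.parameters(), *gauss_net.parameters()], lr=1e-4)

source = torch.rand(1, 3, H, W)  # frame providing identity/appearance
target = torch.rand(1, 3, H, W)  # frame to reanimate towards

opt.zero_grad()
params = mesh_net(target)                      # target expression/pose
mean, scale, color, alpha = gauss_net(source)  # Gaussians rigged to the mesh
# The real method deforms the Gaussians via the predicted 3DMM mesh; this
# crude offset only gestures at that coupling.
mean = mean + 0.01 * params[..., :2].unsqueeze(1)

render = splat(mean, scale, color, alpha)                   # (1, H, W, 3)
loss = (render - target.permute(0, 2, 3, 1)).abs().mean()   # photometric loss
loss.backward()   # gradients reach both the 3DMM and Gaussian networks
opt.step()
```
The one property the sketch preserves faithfully is the training signal: a single photometric loss backpropagates through the rendering into both prediction networks at once.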
Related papers
- GaussRender: Learning 3D Occupancy with Gaussian Rendering [86.89653628311565]
GaussRender is a module that improves 3D occupancy learning by enforcing projective consistency. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure.
arXiv Detail & Related papers (2025-02-07T16:07:51Z)
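As a rough illustration of GaussRender's projective-consistency idea, the sketch below projects predicted and ground-truth occupancy grids to 2D and penalizes their disagreement. The orthographic max-projection is a simplified stand-in for the paper's Gaussian-based rendering, and all shapes are assumptions.
```python
# Hedged sketch of a projective-consistency loss: penalize 3D occupancy
# whose 2D projections disagree with the projected ground truth.
import torch

def projective_consistency_loss(pred_occ, gt_occ):
    """pred_occ, gt_occ: (D, H, W) occupancy probabilities in [0, 1]."""
    loss = 0.0
    for axis in (0, 1, 2):                    # three orthographic "views"
        pred_2d = pred_occ.max(dim=axis).values
        gt_2d = gt_occ.max(dim=axis).values
        loss = loss + (pred_2d - gt_2d).abs().mean()
    return loss

pred = torch.rand(32, 32, 32, requires_grad=True)
gt = (torch.rand(32, 32, 32) > 0.9).float()
projective_consistency_loss(pred, gt).backward()  # gradients flow to pred
```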
- MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting [38.16397253003339]
Reconstructing high-fidelity 3D head avatars is crucial in various applications such as virtual reality. Recent methods based on 3D Gaussian Splatting (3DGS) significantly improve the efficiency of training and rendering. We propose a novel method named MixedGaussianAvatar for realistically and geometrically accurate head avatar reconstruction.
arXiv Detail & Related papers (2024-12-06T11:17:25Z)
- Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z)
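A hedged sketch of GAGAvatar's one-shot idea: a single forward pass maps an image to the attributes of a set of 3D Gaussians. The small CNN, the count of 4096 Gaussians, and the 14-channel layout are illustrative guesses, not the paper's architecture.
```python
# Feed-forward prediction of Gaussian parameters from one image (sketch).
import torch
import torch.nn as nn

class OneShotGaussianHead(nn.Module):
    def __init__(self, n_gauss=4096):
        super().__init__()
        self.n = n_gauss
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.decoder = nn.Linear(64, n_gauss * 14)

    def forward(self, img):
        g = self.decoder(self.encoder(img)).view(-1, self.n, 14)
        return {                                  # per-Gaussian attributes:
            "xyz":      g[..., 0:3],              # position
            "scale":    g[..., 3:6].exp(),        # anisotropic scale
            "rotation": g[..., 6:10],             # quaternion (unnormalized)
            "color":    g[..., 10:13].sigmoid(),  # RGB
            "opacity":  g[..., 13:14].sigmoid(),
        }

gaussians = OneShotGaussianHead()(torch.rand(1, 3, 128, 128))
print({k: tuple(v.shape) for k, v in gaussians.items()})
```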
- DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation [10.250715657201363]
We introduce DreamMesh4D, a novel framework that combines a mesh representation with a geometric skinning technique to generate high-quality 4D objects from a monocular video.
Our method is compatible with modern graphic pipelines, showcasing its potential in the 3D gaming and film industry.
arXiv Detail & Related papers (2024-10-09T10:41:08Z)
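The geometric skinning DreamMesh4D builds on is, in its classic linear-blend form, easy to sketch: each vertex is deformed by a weighted blend of control-point transforms. The random weights and identity transforms below are placeholders, not the paper's sparse-control setup.
```python
# Classic linear blend skinning (LBS) in a few lines.
import torch

def lbs(vertices, weights, transforms):
    """vertices: (V, 3); weights: (V, J); transforms: (J, 4, 4)."""
    v_h = torch.cat([vertices, torch.ones(len(vertices), 1)], dim=1)  # (V, 4)
    per_joint = torch.einsum("jab,vb->vja", transforms, v_h)          # (V, J, 4)
    blended = (weights.unsqueeze(-1) * per_joint).sum(dim=1)          # (V, 4)
    return blended[:, :3]

V, J = 1000, 16
verts = torch.rand(V, 3)
w = torch.softmax(torch.rand(V, J), dim=1)  # skinning weights sum to 1
T = torch.eye(4).expand(J, 4, 4).clone()    # identity transforms
assert torch.allclose(lbs(verts, w, T), verts, atol=1e-5)
```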
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on the Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
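For intuition on what a diffusion model over sets of GS ellipsoids entails, here is the standard DDPM forward (noising) step applied to a parameter set. The schedule, the 512-ellipsoid set, and the 14 parameters per ellipsoid are illustrative choices, not GSD's.
```python
# Standard DDPM forward process on a set of Gaussian-splat parameters.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t):
    """Noise a clean parameter set x0 to timestep t."""
    noise = torch.randn_like(x0)
    ab = alphas_bar[t]
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise, noise

x0 = torch.rand(512, 14)        # 512 GS ellipsoids, 14 params each
x_t, eps = q_sample(x0, t=500)  # a denoiser would be trained to predict eps
```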
- Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting [33.01987451251659]
3D Gaussian Splatting (3DGS) has emerged as a promising technique capable of real-time rendering with high-quality 3D reconstruction. Despite its potential, 3DGS encounters challenges such as needle-like artifacts, suboptimal geometries, and inaccurate normals. We introduce the effective rank as a regularization, which constrains the structure of the Gaussians.
arXiv Detail & Related papers (2024-06-17T15:51:59Z)
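The effective rank itself is a well-defined quantity: the exponentiated Shannon entropy of the normalized singular values. Applied to a Gaussian's 3x3 scale or covariance factor, a value near 1 flags the needle-like shapes the paper regularizes against; the toy matrices below are only for demonstration.
```python
# Effective rank: exp of the entropy of normalized singular values.
import torch

def effective_rank(A, eps=1e-12):
    s = torch.linalg.svdvals(A)
    p = s / (s.sum() + eps)                  # normalized singular values
    entropy = -(p * (p + eps).log()).sum()   # Shannon entropy
    return entropy.exp()

needle = torch.diag(torch.tensor([1.0, 1e-3, 1e-3]))  # nearly rank-1
sphere = torch.eye(3)                                  # isotropic
print(effective_rank(needle))  # close to 1
print(effective_rank(sphere))  # ~3.0
```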
- GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos [56.40776739573832]
We present a novel method, GVA, that facilitates the creation of vivid 3D Gaussian avatars from monocular video inputs.
Our innovation lies in addressing the intricate challenges of delivering high-fidelity human body reconstructions.
We introduce a pose refinement technique to improve hand and foot pose accuracy by aligning normal maps and silhouettes.
arXiv Detail & Related papers (2024-02-26T14:40:15Z)
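A minimal sketch of silhouette-driven pose refinement in the spirit of GVA: pose parameters are optimized so a differentiably rendered silhouette matches the observed one (the method also aligns normal maps). The soft Gaussian-blob "renderer" and the 2D pose are toy stand-ins for a real rasterizer and an articulated body pose.
```python
# Analysis-by-synthesis pose refinement against a target silhouette (toy).
import torch

H = W = 64
ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                        torch.linspace(0, 1, W), indexing="ij")

def render_silhouette(center, sigma=0.2):
    """Toy differentiable 'renderer': a soft Gaussian blob."""
    d2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

target = render_silhouette(torch.tensor([0.62, 0.40])).detach()
pose = torch.tensor([0.30, 0.70], requires_grad=True)  # initial pose guess
opt = torch.optim.Adam([pose], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    loss = ((render_silhouette(pose) - target) ** 2).mean()
    loss.backward()
    opt.step()

print(pose.detach())  # moves close to the target (0.62, 0.40)
```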
- Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [104.79392615848109]
We present Magic123, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image.
In the first stage, we optimize a neural radiance field to produce a coarse geometry.
In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture.
arXiv Detail & Related papers (2023-06-30T17:59:08Z)
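Both of Magic123's stages are driven by diffusion priors; the usual mechanism for this is score distillation sampling (SDS), sketched below. The `denoiser` is a hypothetical placeholder for a pretrained diffusion model, and the weighting is one common convention, not necessarily the paper's exact choice.
```python
# Score-distillation gradient for a rendered view (sketch).
import torch

alphas_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, 1000), dim=0)

def sds_grad(rendered, denoiser, t):
    """SDS gradient w.r.t. a rendered image: w(t) * (eps_hat - eps)."""
    eps = torch.randn_like(rendered)
    ab = alphas_bar[t]
    x_t = ab.sqrt() * rendered + (1 - ab).sqrt() * eps  # noise the render
    eps_hat = denoiser(x_t, t)                          # prior's noise estimate
    return (1 - ab) * (eps_hat - eps)                   # common weighting

# Placeholder prior; a real pipeline would use a pretrained 2D (or 3D)
# text/image-conditioned diffusion model here.
denoiser = lambda x, t: torch.randn_like(x)

rendered = torch.rand(1, 3, 64, 64, requires_grad=True)  # differentiable render
rendered.backward(gradient=sds_grad(rendered.detach(), denoiser, t=500))
# rendered.grad now carries the prior's guidance; in a Magic123-style pipeline
# it would flow back into NeRF (stage 1) or mesh (stage 2) parameters.
```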
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large differences in scale from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection that incorporates the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
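One plausible reading of a depth-to-scale idea, sketched under a simple pinhole camera: the projection scale f/z evaluated at each joint's depth yields per-joint scale variants. The function name, the numbers, and the exact formulation below are illustrative assumptions, not the paper's.
```python
# Per-joint projection scales from root depth plus per-joint depth offsets.
import torch

def per_joint_scales(focal, root_depth, depth_offsets):
    """Pinhole projection scale f/z evaluated per joint."""
    z = root_depth + depth_offsets  # (J,) absolute joint depths
    return focal / z                # (J,) pixels-per-unit scales

scales = per_joint_scales(focal=1000.0,
                          root_depth=3.0,
                          depth_offsets=torch.tensor([0.0, -0.2, 0.4]))
print(scales)  # nearer joints (negative offset) project larger
```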