Related papers: GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos

URL: http://arxiv.org/abs/2402.16607v2
Date: Tue, 19 Mar 2024 08:58:17 GMT
Title: GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos
Authors: Xinqi Liu, Chenming Wu, Jialun Liu, Xing Liu, Jinbo Wu, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang
Abstract summary: We present a novel method that facilitates the creation of vivid 3D Gaussian avatars from monocular video inputs (GVA). Our innovation lies in addressing the intricate challenges of delivering high-fidelity human body reconstructions. We introduce a pose refinement technique to improve hand and foot pose accuracy by aligning normal maps and silhouettes.
Score: 56.40776739573832
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we present a novel method that facilitates the creation of vivid 3D Gaussian avatars from monocular video inputs (GVA). Our innovation lies in addressing the intricate challenges of delivering high-fidelity human body reconstructions and aligning 3D Gaussians with human skin surfaces accurately. The key contributions of this paper are twofold. Firstly, we introduce a pose refinement technique to improve hand and foot pose accuracy by aligning normal maps and silhouettes. Precise pose is crucial for correct shape and appearance reconstruction. Secondly, we address the problems of unbalanced aggregation and initialization bias that previously diminished the quality of 3D Gaussian avatars, through a novel surface-guided re-initialization method that ensures accurate alignment of 3D Gaussian points with avatar surfaces. Experimental results demonstrate that our proposed method achieves high-fidelity and vivid 3D Gaussian avatar reconstruction. Extensive experimental analyses validate the performance qualitatively and quantitatively, demonstrating that it achieves state-of-the-art performance in photo-realistic novel view synthesis while offering fine-grained control over the human body and hand pose. Project page: https://3d-aigc.github.io/GVA/.

Related papers

3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction [62.84879632157956]
We propose a novel hybrid 2D/3D representation that jointly optimizes constrained planar (2D) Gaussians for modeling flat surfaces and freeform (3D) Gaussians for the rest of the scene. Our end-to-end approach dynamically detects and refines planar regions, improving both visual fidelity and geometric accuracy. It achieves state-of-the-art depth estimation on ScanNet++ and ScanNetv2, and excels at mesh extraction without overfitting to a specific camera model.
arXiv Detail & Related papers (2025-09-19T21:04:36Z)
FMGS-Avatar: Mesh-Guided 2D Gaussian Splatting with Foundation Model Priors for 3D Monocular Avatar Reconstruction [18.570290675633732]
We introduce Mesh-Guided 2D Gaussian Splatting, where 2D primitives are attached directly to template mesh faces with constrained position, rotation, and movement. We leverage foundation models trained on large-scale datasets, such as Sapiens, to complement the limited visual cues from monocular videos. Experimental evaluation demonstrates superior reconstruction quality compared to existing methods, with notable gains in geometric accuracy and appearance fidelity.
arXiv Detail & Related papers (2025-09-18T08:41:41Z)
AvatarBack: Back-Head Generation for Complete 3D Avatars from Front-View Images [24.909494214820324]
AvatarBack is a novel plug-and-play framework specifically designed to reconstruct complete and consistent 3D Gaussian avatars. We show that AvatarBack significantly enhances back-head reconstruction quality while preserving frontal fidelity. The reconstructed avatars maintain consistent visual realism under diverse motions and remain fully animatable.
arXiv Detail & Related papers (2025-08-28T10:15:38Z)
MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction [65.5412504339528]
MoGA is a novel method to reconstruct high-fidelity 3D Gaussian avatars from a single-view image. Our method surpasses state-of-the-art techniques and generalizes well to real-world scenarios.
arXiv Detail & Related papers (2025-07-31T14:36:24Z)
AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion [56.12859795754579]
AdaHuman is a novel framework that generates high-fidelity animatable 3D avatars from a single in-the-wild image. AdaHuman incorporates two key innovations: a pose-conditioned 3D joint diffusion model and a compositional 3DGS refinement module.
arXiv Detail & Related papers (2025-05-30T17:59:54Z)
GUAVA: Generalizable Upper Body 3D Gaussian Avatar [32.476282286315055]
3D human avatar reconstruction typically requires multi-view or monocular videos and training on individual IDs. We first introduce an expressive human model (EHM) to enhance facial expression capabilities. We propose GUAVA, the first framework for fast animatable upper-body 3D Gaussian avatar reconstruction.
arXiv Detail & Related papers (2025-05-06T09:19:16Z)
Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction. We generate the parameters of 3D Gaussians from a single image in a single forward pass. Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z)
DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion [69.67970568012599]
We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text. The core of this framework lies in Score Distillation and Hybrid 3D Gaussian Avatar representation. Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
arXiv Detail & Related papers (2024-09-25T17:59:45Z)
Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities [10.816370283498287]
We introduce the "Gaussian Deja-vu" framework, which first obtains a generalized model of the head avatar and then personalizes the result. For personalization, we propose learnable expression-aware rectification blendmaps, ensuring rapid convergence without reliance on neural networks. It outperforms state-of-the-art 3D Gaussian head avatars in photorealistic quality and reduces training time to at least a quarter of that of existing methods.
arXiv Detail & Related papers (2024-09-23T00:11:30Z)
FAGhead: Fully Animate Gaussian Head from Monocular Videos [2.9979421496374683]
FAGhead is a method that enables fully controllable human portraits from monocular videos. We explicitly use traditional 3D morphable meshes (3DMM) and optimize neutral 3D Gaussians to reconstruct complex expressions. To effectively manage the edges of avatars, we introduce alpha rendering to supervise the alpha value of each pixel.
arXiv Detail & Related papers (2024-06-27T10:40:35Z)
UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling [71.87807614875497]
We propose UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures. We collect and process a new dataset of human motion, which includes multi-view images, scanned models, parametric model registration, and corresponding texture maps. Experimental results demonstrate that our method achieves state-of-the-art synthesis of novel view and novel pose.
arXiv Detail & Related papers (2024-03-18T09:03:56Z)
Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence. ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images. Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently, even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars. We leverage the SMPL model to provide shape and pose guidance for the generation. We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)