Parametric Gaussian Human Model: Generalizable Prior for Efficient and Realistic Human Avatar Modeling
- URL: http://arxiv.org/abs/2506.06645v1
- Date: Sat, 07 Jun 2025 03:53:30 GMT
- Title: Parametric Gaussian Human Model: Generalizable Prior for Efficient and Realistic Human Avatar Modeling
- Authors: Cheng Peng, Jingxiang Sun, Yushuo Chen, Zhaoqi Su, Zhuo Su, Yebin Liu
- Abstract summary: Photorealistic and animatable human avatars are a key enabler for virtual/augmented reality, telepresence, and digital entertainment. We present the Parametric Gaussian Human Model (PGHM), a generalizable and efficient framework that integrates human priors into 3DGS. Experiments show that PGHM is significantly more efficient than optimization-from-scratch methods, requiring only approximately 20 minutes per subject to produce avatars with comparable visual quality.
- Score: 32.480049588166544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Photorealistic and animatable human avatars are a key enabler for virtual/augmented reality, telepresence, and digital entertainment. While recent advances in 3D Gaussian Splatting (3DGS) have greatly improved rendering quality and efficiency, existing methods still face fundamental challenges, including time-consuming per-subject optimization and poor generalization under sparse monocular inputs. In this work, we present the Parametric Gaussian Human Model (PGHM), a generalizable and efficient framework that integrates human priors into 3DGS for fast and high-fidelity avatar reconstruction from monocular videos. PGHM introduces two core components: (1) a UV-aligned latent identity map that compactly encodes subject-specific geometry and appearance into a learnable feature tensor; and (2) a disentangled Multi-Head U-Net that predicts Gaussian attributes by decomposing static, pose-dependent, and view-dependent components via conditioned decoders. This design enables robust rendering quality under challenging poses and viewpoints, while allowing efficient subject adaptation without requiring multi-view capture or long optimization time. Experiments show that PGHM is significantly more efficient than optimization-from-scratch methods, requiring only approximately 20 minutes per subject to produce avatars with comparable visual quality, thereby demonstrating its practical applicability for real-world monocular avatar creation.
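The abstract's decomposition idea — a UV-aligned identity feature map decoded by separate static, pose-conditioned, and view-conditioned heads whose outputs are combined into per-Gaussian attributes — can be illustrated with a toy sketch. All dimensions, the attribute layout, and the linear "heads" standing in for the paper's conditioned U-Net decoders are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen for illustration only.
UV_RES, FEAT = 8, 16       # latent identity map: UV_RES x UV_RES grid of FEAT-dim features
POSE_DIM, VIEW_DIM = 6, 3  # pose and view-direction conditioning vectors
ATTR_DIM = 11              # per-Gaussian: 3 position + 3 scale + 4 rotation + 1 opacity

def linear(x, w, b):
    return x @ w + b

# Randomly initialised linear "heads" standing in for the conditioned decoders.
w_static = rng.normal(size=(FEAT, ATTR_DIM)) * 0.1
b_static = np.zeros(ATTR_DIM)
w_pose = rng.normal(size=(FEAT + POSE_DIM, ATTR_DIM)) * 0.1
b_pose = np.zeros(ATTR_DIM)
w_view = rng.normal(size=(FEAT + VIEW_DIM, 3)) * 0.1  # view head predicts RGB only
b_view = np.zeros(3)

def decode_gaussians(identity_map, pose, view_dir):
    """Split Gaussian attributes into static, pose-dependent, and
    view-dependent parts, in the spirit of PGHM's multi-head decoder."""
    feats = identity_map.reshape(-1, FEAT)        # one feature vector per UV texel
    static = linear(feats, w_static, b_static)    # identity-only attributes
    pose_in = np.concatenate(
        [feats, np.broadcast_to(pose, (len(feats), POSE_DIM))], axis=1)
    pose_delta = linear(pose_in, w_pose, b_pose)  # pose-dependent correction
    view_in = np.concatenate(
        [feats, np.broadcast_to(view_dir, (len(feats), VIEW_DIM))], axis=1)
    rgb = linear(view_in, w_view, b_view)         # view-dependent colour
    return static + pose_delta, rgb

identity_map = rng.normal(size=(UV_RES, UV_RES, FEAT))
attrs, rgb = decode_gaussians(identity_map, np.zeros(POSE_DIM),
                              np.array([0.0, 0.0, 1.0]))
print(attrs.shape, rgb.shape)  # (64, 11) (64, 3)
```

The point of the decomposition is that only the identity map needs per-subject adaptation, while the pose- and view-conditioned heads can be trained once as a generalizable prior.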
Related papers
- PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images [23.745241278910946]
PF-LHM is a large human reconstruction model that generates high-quality 3D avatars in seconds from one or multiple casually captured pose-free images. Our method unifies single- and multi-image 3D human reconstruction, achieving high-fidelity and animatable 3D human avatars without requiring camera and human pose annotations.
arXiv Detail & Related papers (2025-06-16T17:59:56Z) - EVA: Expressive Virtual Avatars from Multi-view Videos [51.33851869426057]
We introduce Expressive Virtual Avatars (EVA), an actor-specific, fully controllable, and expressive human avatar framework. EVA achieves high-fidelity, lifelike renderings in real time while enabling independent control of facial expressions, body movements, and hand gestures. This work represents a significant advancement towards fully drivable digital human models.
arXiv Detail & Related papers (2025-05-21T11:22:52Z) - GUAVA: Generalizable Upper Body 3D Gaussian Avatar [32.476282286315055]
3D human avatar reconstruction typically requires multi-view or monocular videos and training on individual IDs. We first introduce an expressive human model (EHM) to enhance facial expression capabilities. We propose GUAVA, the first framework for fast animatable upper-body 3D Gaussian avatar reconstruction.
arXiv Detail & Related papers (2025-05-06T09:19:16Z) - SEGA: Drivable 3D Gaussian Head Avatar from a Single Image [15.117619290414064]
We propose SEGA, a novel approach for 3D drivable Gaussian head Avatar creation. SEGA seamlessly combines priors derived from large-scale 2D datasets with 3D priors learned from multi-view, multi-expression, and multi-ID data. Experiments show our method outperforms state-of-the-art approaches in generalization ability, identity preservation, and expression realism.
arXiv Detail & Related papers (2025-04-19T18:23:31Z) - SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets [72.26350984924129]
We propose a latent space generation paradigm for 3D human digitization. We transform the ill-posed low-to-high-dimensional mapping problem into a learnable distribution shift. We employ the multi-view optimization approach combined with synthetic data to construct the HGS-1M dataset.
arXiv Detail & Related papers (2025-04-09T15:38:18Z) - GPHM: Gaussian Parametric Head Model for Monocular Head Avatar Reconstruction [47.113910048252805]
High-fidelity 3D human head avatars are crucial for applications in VR/AR, digital human, and film production.
Recent advances have leveraged morphable face models to generate animated head avatars, representing varying identities and expressions.
We introduce 3D Gaussian Parametric Head Model, which employs 3D Gaussians to accurately represent the complexities of the human head.
arXiv Detail & Related papers (2024-07-21T06:03:11Z) - Expressive Gaussian Human Avatars from Monocular RGB Video [69.56388194249942]
We introduce EVA, a drivable human model that meticulously sculpts fine details based on 3D Gaussians and SMPL-X.
We highlight the critical importance of aligning the SMPL-X model with RGB frames for effective avatar learning.
We propose a context-aware adaptive density control strategy, which adaptively adjusts the gradient thresholds.
arXiv Detail & Related papers (2024-07-03T15:36:27Z) - NPGA: Neural Parametric Gaussian Avatars [46.52887358194364]
We propose a data-driven approach to create high-fidelity controllable avatars from multi-view video recordings.
We build our method around 3D Gaussian splatting for its highly efficient rendering and to inherit the topological flexibility of point clouds.
We evaluate our method on the public NeRSemble dataset, demonstrating that NPGA significantly outperforms the previous state-of-the-art avatars on the self-reenactment task by 2.6 PSNR.
arXiv Detail & Related papers (2024-05-29T17:58:09Z) - GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh [97.47701169876272]
GoMAvatar is a novel approach for real-time, memory-efficient, high-quality human modeling.
GoMAvatar matches or surpasses current monocular human modeling algorithms in rendering quality.
arXiv Detail & Related papers (2024-04-11T17:59:57Z) - Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds, while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z) - GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
We present an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.
GaussianAvatar is validated on both the public dataset and our collected dataset.
arXiv Detail & Related papers (2023-12-04T18:55:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.