Towards Efficient 3D Gaussian Human Avatar Compression: A Prior-Guided Framework
- URL: http://arxiv.org/abs/2510.10492v1
- Date: Sun, 12 Oct 2025 07:50:18 GMT
- Title: Towards Efficient 3D Gaussian Human Avatar Compression: A Prior-Guided Framework
- Authors: Shanzhi Yin, Bolin Chen, Xinju Wu, Ru-Ling Liao, Jie Chen, Shiqi Wang, Yan Ye
- Abstract summary: This paper proposes an efficient 3D avatar coding framework that enables high-quality 3D human avatar video compression at ultra-low bit rates. The proposed method significantly outperforms conventional 2D/3D codecs and existing learnable dynamic 3D Gaussian splatting compression methods.
- Score: 19.464262452201996
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes an efficient 3D avatar coding framework that leverages compact human priors and canonical-to-target transformation to enable high-quality 3D human avatar video compression at ultra-low bit rates. The framework begins by training a canonical Gaussian avatar using articulated splatting in a network-free manner, which serves as the foundation for avatar appearance modeling. Simultaneously, a human-prior template is employed to capture temporal body movements through compact parametric representations. This decomposition of appearance and temporal evolution minimizes redundancy, enabling efficient compression: the canonical avatar is shared across the sequence and needs to be compressed only once, while the temporal parameters, consisting of just 94 parameters per frame, are transmitted at a minimal bit rate. For each frame, the target human avatar is generated by deforming the canonical avatar via a Linear Blend Skinning transformation, facilitating temporally coherent video reconstruction and novel view synthesis. Experimental results demonstrate that the proposed method significantly outperforms conventional 2D/3D codecs and existing learnable dynamic 3D Gaussian splatting compression methods in terms of rate-distortion performance on mainstream multi-view human video datasets, paving the way for seamless immersive multimedia experiences in metaverse applications.
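The canonical-to-target step described in the abstract can be illustrated with a minimal sketch of Linear Blend Skinning applied to Gaussian centers, followed by a raw upper-bound estimate of the per-frame motion payload. This is not the authors' implementation: the function names (`rotation_z`, `lbs_deform`, `motion_bitrate_kbps`), the two-joint toy skeleton, and the `bits_per_param` and `fps` values are all assumptions made for illustration. Only the figure of 94 parameters per frame comes from the abstract. A full avatar pipeline would also transform each Gaussian's orientation and covariance, which this sketch omits.

```python
import math

# Toy 4x4 homogeneous transforms; a real skeleton would derive these
# from the per-frame pose parameters of the human-prior template.
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def rotation_z(theta):
    """Return a 4x4 homogeneous rotation about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def lbs_deform(points, weights, joint_transforms):
    """Linear Blend Skinning: x' = (sum_j w_j * T_j) x for each point.

    points: list of (x, y, z) canonical Gaussian centers.
    weights: per-point skinning weights, one per joint, summing to 1.
    joint_transforms: list of 4x4 matrices, one per joint.
    """
    deformed = []
    for p, w in zip(points, weights):
        # Blend the joint transforms with this point's skinning weights.
        blended = [[sum(w[j] * joint_transforms[j][r][c] for j in range(len(w)))
                    for c in range(4)] for r in range(4)]
        x = (p[0], p[1], p[2], 1.0)
        deformed.append(tuple(sum(blended[r][c] * x[c] for c in range(4))
                              for r in range(3)))
    return deformed


def motion_bitrate_kbps(params_per_frame=94, bits_per_param=32, fps=30):
    """Uncompressed motion bit rate in kbit/s (bit depth and fps are assumed)."""
    return params_per_frame * bits_per_param * fps / 1000.0


if __name__ == "__main__":
    canonical = [(1.0, 0.0, 0.0)] * 3
    skin = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    pose = [IDENTITY, rotation_z(math.pi / 2)]
    print(lbs_deform(canonical, skin, pose))
    print(motion_bitrate_kbps())
```

Under these assumed settings (32-bit values at 30 frames per second), the raw motion stream comes to about 90 kbit/s before any entropy coding. The canonical avatar is a one-time cost shared by every frame, which is why the decomposition keeps the per-frame rate small.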
Related papers
- Human Video Generation from a Single Image with 3D Pose and View Control [62.676151243249556]
We present Human Video Generation in 4D (HVG), a latent video diffusion model capable of generating high-quality multi-view, temporally coherent human videos from a single image. HVG achieves this through three key designs: (i) Articulated Pose Modulation, which captures the anatomical relationships of 3D joints via a novel dual-dimensional bone map and resolves self-occlusions across views by introducing 3D information; (ii) View and Temporal Alignment, which ensures multi-view consistency and alignment between a reference image and pose sequences for frame-to-frame stability; and (iii)
arXiv Detail & Related papers (2026-02-24T18:42:20Z)
- FastGHA: Generalized Few-Shot 3D Gaussian Head Avatars with Real-Time Animation [26.161556787983496]
FastGHA is a feed-forward method to generate high-quality Gaussian head avatars from only a few input images. Our approach directly learns a per-pixel Gaussian representation from the input images. Experiments show that our approach significantly outperforms existing methods in both rendering quality and inference efficiency.
arXiv Detail & Related papers (2026-01-20T10:49:49Z)
- Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length [57.458450695137664]
We present Live Avatar, an algorithm-system co-designed framework for efficient, high-fidelity, and infinite-length avatar generation. Live Avatar is the first to achieve practical, real-time, high-fidelity avatar generation at this scale.
arXiv Detail & Related papers (2025-12-04T11:11:24Z)
- HGC-Avatar: Hierarchical Gaussian Compression for Streamable Dynamic 3D Avatars [45.746590759473435]
HGC-Avatar is a novel Hierarchical Gaussian Compression framework for efficient transmission and high-quality rendering of dynamic avatars. We show that HGC-Avatar provides a streamable solution for rapid 3D avatar rendering, while significantly outperforming prior methods in both visual quality and compression efficiency.
arXiv Detail & Related papers (2025-10-18T12:03:26Z)
- D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos [12.24209693552492]
Free-viewpoint video (FVV) enables immersive 3D experiences, but efficient compression of dynamic 3D representations remains a major challenge. This paper presents Feedforward Compression of Dynamic Gaussian Splatting (D-FCGS), a novel feedforward framework for compressing temporally correlated Gaussian point cloud sequences. Experiments show that it matches the rate-distortion performance of optimization-based methods, achieving over 40 times compression in under 2 seconds.
arXiv Detail & Related papers (2025-07-08T10:39:32Z)
- PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images [23.745241278910946]
PF-LHM is a large human reconstruction model that generates high-quality 3D avatars in seconds from one or multiple casually captured pose-free images. Our method unifies single- and multi-image 3D human reconstruction, achieving high-fidelity and animatable 3D human avatars without requiring camera and human pose annotations.
arXiv Detail & Related papers (2025-06-16T17:59:56Z)
- Parametric Gaussian Human Model: Generalizable Prior for Efficient and Realistic Human Avatar Modeling [32.480049588166544]
Photorealistic and animatable human avatars are a key enabler for virtual/augmented reality, telepresence, and digital entertainment. We present the Parametric Gaussian Human Model (PGHM), a generalizable and efficient framework that integrates human priors into 3DGS. Experiments show that PGHM is significantly more efficient than optimization-from-scratch methods, requiring only approximately 20 minutes per subject to produce avatars with comparable visual quality.
arXiv Detail & Related papers (2025-06-07T03:53:30Z)
- SEGA: Drivable 3D Gaussian Head Avatar from a Single Image [15.117619290414064]
We propose SEGA, a novel approach for 3D drivable Gaussian head Avatar creation. SEGA seamlessly combines priors derived from large-scale 2D datasets with 3D priors learned from multi-view, multi-expression, and multi-ID data. Experiments show our method outperforms state-of-the-art approaches in generalization ability, identity preservation, and expression realism.
arXiv Detail & Related papers (2025-04-19T18:23:31Z)
- FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images [74.86864398919467]
We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images. We learn a universal prior from over a thousand clothed humans to achieve instant feedforward generation and zero-shot generalization. Our method generates more authentic reconstruction and animation than state-of-the-art methods, and can be directly generalized to inputs from casually taken phone photos.
arXiv Detail & Related papers (2025-03-24T23:20:47Z)
- Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos [64.10307207290039]
We introduce a novel framework for modeling high-fidelity, animatable 3D human avatars from motion-blurred monocular video inputs. By explicitly modeling human motion trajectories during exposure time, we jointly optimize the trajectories and 3D Gaussians to reconstruct sharp, high-quality human avatars.
arXiv Detail & Related papers (2025-01-23T02:31:57Z)
- Sequential Gaussian Avatars with Hierarchical Motion Context [7.6736633105043515]
SMPL-driven 3DGS human avatars struggle to capture fine appearance details due to the complex mapping from pose to appearance during fitting. We propose SeqAvatar, which excavates the explicit 3DGS representation to better model human avatars based on a hierarchical motion context. Our method significantly outperforms 3DGS-based approaches and renders human avatars orders of magnitude faster than the latest NeRF-based models.
arXiv Detail & Related papers (2024-11-25T04:05:19Z)
- InstantSplat: Sparse-view Gaussian Splatting in Seconds [91.77050739918037]
We introduce InstantSplat, a novel approach for addressing sparse-view 3D scene reconstruction at lightning-fast speed. InstantSplat employs a self-supervised framework that optimizes 3D scene representation and camera poses. It achieves an acceleration of over 30x in reconstruction and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D-GS.
arXiv Detail & Related papers (2024-03-29T17:29:58Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence. ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images. Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)