Related papers: Gaussian Pixel Codec Avatars: A Hybrid Representation for Efficient Rendering

URL: http://arxiv.org/abs/2512.15711v1
Date: Wed, 17 Dec 2025 18:58:50 GMT
Title: Gaussian Pixel Codec Avatars: A Hybrid Representation for Efficient Rendering
Authors: Divam Gupta, Anuj Pahuja, Nemanja Bartolovic, Tomas Simon, Forrest Iandola, Giljoo Nam,
Abstract summary: GPiCA utilizes a unique hybrid representation that combines a triangle mesh and anisotropic 3D Gaussians. We train neural networks to decode a facial expression code into three components: a 3D face mesh, an RGBA texture, and a set of 3D Gaussians. Our results demonstrate that GPiCA achieves the realism of purely Gaussian-based avatars while matching the rendering performance of mesh-based avatars.
Score: 11.508015004156391
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present Gaussian Pixel Codec Avatars (GPiCA), photorealistic head avatars that can be generated from multi-view images and efficiently rendered on mobile devices. GPiCA utilizes a unique hybrid representation that combines a triangle mesh and anisotropic 3D Gaussians. This combination maximizes memory and rendering efficiency while maintaining a photorealistic appearance. The triangle mesh is highly efficient in representing surface areas like facial skin, while the 3D Gaussians effectively handle non-surface areas such as hair and beard. To this end, we develop a unified differentiable rendering pipeline that treats the mesh as a semi-transparent layer within the volumetric rendering paradigm of 3D Gaussian Splatting. We train neural networks to decode a facial expression code into three components: a 3D face mesh, an RGBA texture, and a set of 3D Gaussians. These components are rendered simultaneously in a unified rendering engine. The networks are trained using multi-view image supervision. Our results demonstrate that GPiCA achieves the realism of purely Gaussian-based avatars while matching the rendering performance of mesh-based avatars.
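The idea of treating the mesh as a semi-transparent layer inside a volumetric renderer can be illustrated with ordinary front-to-back alpha compositing. The sketch below is not the GPiCA renderer, whose details are not given in this abstract. It assumes that, for a single pixel, the rasterized mesh contributes one fragment whose colour and opacity come from the RGBA texture, and that each overlapping Gaussian contributes one fragment as well. The names `Fragment` and `composite_pixel` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Color = Tuple[float, float, float]


@dataclass
class Fragment:
    """One contribution to a pixel: a splatted Gaussian or the mesh layer."""
    depth: float   # distance from the camera (smaller is closer)
    color: Color   # RGB in [0, 1]
    alpha: float   # opacity in [0, 1]


def composite_pixel(fragments: Iterable[Fragment],
                    background: Color = (0.0, 0.0, 0.0)) -> Tuple[Color, float]:
    """Front-to-back alpha compositing of depth-sorted fragments.

    Returns the blended colour and the remaining transmittance.
    """
    transmittance = 1.0
    out = [0.0, 0.0, 0.0]
    for frag in sorted(fragments, key=lambda f: f.depth):
        weight = transmittance * frag.alpha
        for c in range(3):
            out[c] += weight * frag.color[c]
        transmittance *= 1.0 - frag.alpha
    # Whatever light is not absorbed shows the background.
    for c in range(3):
        out[c] += transmittance * background[c]
    return (out[0], out[1], out[2]), transmittance


# Assumed toy pixel: a hair Gaussian in front of an opaque skin texel,
# with another Gaussian hidden behind the mesh.
hair = Fragment(depth=1.0, color=(1.0, 0.0, 0.0), alpha=0.5)
mesh = Fragment(depth=2.0, color=(0.0, 1.0, 0.0), alpha=1.0)
hidden = Fragment(depth=3.0, color=(0.0, 0.0, 1.0), alpha=0.5)
pixel, remaining = composite_pixel([hidden, mesh, hair])
```

Because the mesh fragment goes through the same blending rule as the Gaussians, an opaque texel simply blocks everything behind it, while a texel with alpha below 1 lets Gaussians behind the surface show through.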

Related papers

SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing [26.400512371742096]
Surface-Volumetric Gaussian Head Avatar (SVG-Head) is a novel hybrid representation that explicitly models the geometry with 3D Gaussians bound on a FLAME mesh. To model the correspondence between 3D world and texture space, we provide a mesh-aware Gaussian UV mapping method. Experiments on the NeRSemble dataset show that SVG-Head not only generates high-fidelity rendering results, but also is the first method to obtain explicit texture images for Gaussian head avatars.
arXiv Detail & Related papers (2025-08-13T08:27:55Z)
GaussRender: Learning 3D Occupancy with Gaussian Rendering [86.89653628311565]
GaussRender is a module that improves 3D occupancy learning by enforcing projective consistency. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure.
arXiv Detail & Related papers (2025-02-07T16:07:51Z)
Hybrid Explicit Representation for Ultra-Realistic Head Avatars [55.829497543262214]
We introduce a novel approach to creating ultra-realistic head avatars and rendering them in real-time. A UV-mapped 3D mesh is utilized to capture sharp and rich textures on smooth surfaces, while 3D Gaussian Splatting is employed to represent complex geometric structures. Experiments show that our modeled results exceed those of state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-18T04:01:26Z)
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting [26.849406891462557]
We present SplattingAvatar, a hybrid 3D representation of human avatars with Gaussian Splatting embedded on a triangle mesh. SplattingAvatar renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.
arXiv Detail & Related papers (2024-03-08T06:28:09Z)
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning [60.33970027554299]
Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations. In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions. Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts.
arXiv Detail & Related papers (2023-12-18T18:59:12Z)
ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering [62.81677824868519]
We propose an animatable Gaussian splatting approach for photorealistic rendering of dynamic humans in real-time. We parameterize the clothed human as animatable 3D Gaussians, which can be efficiently splatted into image space to generate the final rendering. We benchmark ASH with competing methods on pose-controllable avatars, demonstrating that our method outperforms existing real-time methods by a large margin and shows comparable or even better results than offline methods.
arXiv Detail & Related papers (2023-12-10T17:07:37Z)
Gaussian Grouping: Segment and Edit Anything in 3D Scenes [65.49196142146292]
We propose Gaussian Grouping, which extends Gaussian Splatting to jointly reconstruct and segment anything in open-world 3D scenes. Compared to the implicit NeRF representation, we show that the grouped 3D Gaussians can reconstruct, segment and edit anything in 3D with high visual quality, fine granularity and efficiency.
arXiv Detail & Related papers (2023-12-01T17:09:31Z)
Gaussian Shell Maps for Efficient 3D Human Generation [96.25056237689988]
3D generative adversarial networks (GANs) have demonstrated state-of-the-art (SOTA) quality and diversity for generated assets. Current 3D GAN architectures, however, rely on volume representations, which are slow to render, thereby hampering the GAN training and requiring multi-view-inconsistent 2D upsamplers.
arXiv Detail & Related papers (2023-11-29T18:04:07Z)
Compact 3D Gaussian Representation for Radiance Field [14.729871192785696]
We propose a learnable mask strategy to reduce the number of 3D Gaussian points without sacrificing performance. We also propose a compact but effective representation of view-dependent color by employing a grid-based neural field. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
arXiv Detail & Related papers (2023-11-22T20:31:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.