SqueezeMe: Mobile-Ready Distillation of Gaussian Full-Body Avatars
- URL: http://arxiv.org/abs/2412.15171v3
- Date: Mon, 17 Feb 2025 23:29:25 GMT
- Title: SqueezeMe: Mobile-Ready Distillation of Gaussian Full-Body Avatars
- Authors: Forrest Iandola, Stanislav Pidhorskyi, Igor Santesteban, Divam Gupta, Anuj Pahuja, Nemanja Bartolovic, Frank Yu, Emanuel Garbin, Tomas Simon, Shunsuke Saito
- Abstract summary: We present SqueezeMe, a framework to convert high-fidelity 3D Gaussian full-body avatars into a lightweight representation. We achieve, for the first time, simultaneous animation and rendering of 3 Gaussian avatars in real-time (72 FPS) on a Meta Quest 3 VR headset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gaussian-based human avatars have achieved an unprecedented level of visual fidelity. However, existing approaches based on high-capacity neural networks typically require a desktop GPU to achieve real-time performance for a single avatar, and it remains non-trivial to animate and render such avatars on mobile devices including a standalone VR headset due to substantially limited memory and computational bandwidth. In this paper, we present SqueezeMe, a simple and highly effective framework to convert high-fidelity 3D Gaussian full-body avatars into a lightweight representation that supports both animation and rendering with mobile-grade compute. Our key observation is that the decoding of pose-dependent Gaussian attributes from a neural network creates non-negligible memory and computational overhead. Inspired by blendshapes and linear pose correctives widely used in Computer Graphics, we address this by distilling the pose correctives learned with neural networks into linear layers. Moreover, we further reduce the parameters by sharing the correctives among nearby Gaussians. Combining them with a custom splatting pipeline based on Vulkan, we achieve, for the first time, simultaneous animation and rendering of 3 Gaussian avatars in real-time (72 FPS) on a Meta Quest 3 VR headset.
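The core technique in the abstract (distilling an MLP-based pose-corrective decoder into a linear, blendshape-style layer whose outputs are shared across clusters of nearby Gaussians) can be sketched concretely. Below is a minimal PyTorch sketch of that idea, not the paper's actual implementation; the names (`PoseCorrective`, `cluster_ids`, `distill_step`), the toy teacher MLP, and all dimensions are illustrative assumptions.

```python
# Minimal sketch: distill a pose-corrective MLP (the "teacher") into a
# linear blendshape-style layer whose correctives are shared among
# clusters of nearby Gaussians. Illustrative only; all names and sizes
# are assumptions, not SqueezeMe's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseCorrective(nn.Module):
    """Linear pose correctives with one corrective per Gaussian cluster."""

    def __init__(self, pose_dim: int, num_clusters: int, attr_dim: int):
        super().__init__()
        # One linear "blendshape basis" per cluster: pose -> attribute offsets.
        self.linear = nn.Linear(pose_dim, num_clusters * attr_dim, bias=False)
        self.num_clusters, self.attr_dim = num_clusters, attr_dim

    def forward(self, pose: torch.Tensor, cluster_ids: torch.Tensor) -> torch.Tensor:
        # pose: (pose_dim,); cluster_ids: (num_gaussians,) long tensor mapping
        # each Gaussian to the cluster whose corrective it shares.
        deltas = self.linear(pose).view(self.num_clusters, self.attr_dim)
        return deltas[cluster_ids]  # (num_gaussians, attr_dim)

def distill_step(student: PoseCorrective, teacher: nn.Module,
                 pose: torch.Tensor, cluster_ids: torch.Tensor,
                 opt: torch.optim.Optimizer) -> float:
    """One distillation step: regress the linear layer toward the frozen MLP."""
    with torch.no_grad():
        target = teacher(pose)               # (num_gaussians, attr_dim)
    loss = F.mse_loss(student(pose, cluster_ids), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    # Toy sizes: attr_dim = 11 could cover offsets for position (3),
    # rotation (4), scale (3), and opacity (1) per Gaussian.
    num_gaussians, pose_dim, num_clusters, attr_dim = 10_000, 69, 512, 11
    cluster_ids = torch.randint(0, num_clusters, (num_gaussians,))
    teacher = nn.Sequential(                 # stand-in for the learned decoder
        nn.Linear(pose_dim, 256), nn.ReLU(),
        nn.Linear(256, num_gaussians * attr_dim),
        nn.Unflatten(-1, (num_gaussians, attr_dim)),
    )
    student = PoseCorrective(pose_dim, num_clusters, attr_dim)
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(100):
        pose = torch.randn(pose_dim)         # in practice: poses from training motions
        distill_step(student, teacher, pose, cluster_ids, opt)
```

Once distilled, the teacher MLP is discarded; animating each frame then reduces to a single small matrix multiply plus an index gather, consistent with the memory and compute savings the abstract attributes to the linear correctives and cluster sharing.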
Related papers
- LAM: Large Avatar Model for One-shot Animatable Gaussian Head
We present LAM, an innovative Large Avatar Model for animatable Gaussian head reconstruction from a single image.
LAM creates an animatable Gaussian head in a single forward pass, enabling reenactment and rendering without additional networks or post-processing steps.
Our results demonstrate that LAM outperforms state-of-the-art methods on existing benchmarks.
arXiv Detail & Related papers (2025-02-25T02:57:45Z)
- GASP: Gaussian Avatars with Synthetic Priors
We propose GASP: Gaussian Avatars with Synthetic Priors. We exploit the pixel-perfect nature of synthetic data to train a Gaussian Avatar prior. Our prior is only required for fitting, not inference, enabling real-time application.
arXiv Detail & Related papers (2024-12-10T18:36:21Z)
- MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting
Reconstructing high-fidelity 3D head avatars is crucial in various applications such as virtual reality. Recent methods based on 3D Gaussian Splatting (3DGS) significantly improve the efficiency of training and rendering. We propose a novel method named MixedGaussianAvatar for realistically and geometrically accurate head avatar reconstruction.
arXiv Detail & Related papers (2024-12-06T11:17:25Z)
- Generalizable and Animatable Gaussian Head Avatar
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z)
- NPGA: Neural Parametric Gaussian Avatars
We propose a data-driven approach to create high-fidelity controllable avatars from multi-view video recordings.
We build our method around 3D Gaussian splatting for its highly efficient rendering and to inherit the topological flexibility of point clouds.
We evaluate our method on the public NeRSemble dataset, demonstrating that NPGA significantly outperforms the previous state-of-the-art avatars on the self-reenactment task by 2.6 dB in PSNR.
arXiv Detail & Related papers (2024-05-29T17:58:09Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently, even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning
Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations.
In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions.
Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts.
arXiv Detail & Related papers (2023-12-18T18:59:12Z)
- ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering
We propose an animatable Gaussian splatting approach for photorealistic rendering of dynamic humans in real-time.
We parameterize the clothed human as animatable 3D Gaussians, which can be efficiently splatted into image space to generate the final rendering.
We benchmark ASH with competing methods on pose-controllable avatars, demonstrating that our method outperforms existing real-time methods by a large margin and shows comparable or even better results than offline methods.
arXiv Detail & Related papers (2023-12-10T17:07:37Z)
- HUGS: Human Gaussian Splats
We introduce Human Gaussian Splats (HUGS), which represents an animatable human together with the scene using 3D Gaussian Splatting (3DGS).
Our method takes only a monocular video with a small number (50-100) of frames, and it automatically learns to disentangle the static scene and a fully animatable human avatar within 30 minutes.
We achieve state-of-the-art rendering quality with a rendering speed of 60 FPS while being 100x faster to train than previous work.
arXiv Detail & Related papers (2023-11-29T18:56:32Z)
- Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling
We introduce a new avatar representation that leverages powerful 2D CNNs and 3D Gaussian splatting to create high-fidelity avatars.
Our method can create lifelike avatars with dynamic, realistic, generalized and relightable appearances.
arXiv Detail & Related papers (2023-11-27T18:59:04Z)
- FLARE: Fast Learning of Animatable and Relightable Mesh Avatars
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z)