LAM: Large Avatar Model for One-shot Animatable Gaussian Head
- URL: http://arxiv.org/abs/2502.17796v2
- Date: Fri, 04 Apr 2025 06:30:27 GMT
- Title: LAM: Large Avatar Model for One-shot Animatable Gaussian Head
- Authors: Yisheng He, Xiaodong Gu, Xiaodan Ye, Chao Xu, Zhengyi Zhao, Yuan Dong, Weihao Yuan, Zilong Dong, Liefeng Bo
- Abstract summary: We present LAM, an innovative Large Avatar Model for animatable Gaussian head reconstruction from a single image. LAM creates an animatable Gaussian head in a single forward pass, enabling reenactment and rendering without additional networks or post-processing steps. Our results demonstrate that LAM outperforms state-of-the-art methods on existing benchmarks.
- Score: 20.503641046404184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present LAM, an innovative Large Avatar Model for animatable Gaussian head reconstruction from a single image. Unlike previous methods that require extensive training on captured video sequences or rely on auxiliary neural networks for animation and rendering during inference, our approach generates Gaussian heads that are immediately animatable and renderable. Specifically, LAM creates an animatable Gaussian head in a single forward pass, enabling reenactment and rendering without additional networks or post-processing steps. This capability allows for seamless integration into existing rendering pipelines, ensuring real-time animation and rendering across a wide range of platforms, including mobile phones. The centerpiece of our framework is the canonical Gaussian attributes generator, which utilizes FLAME canonical points as queries. These points interact with multi-scale image features through a Transformer to accurately predict Gaussian attributes in the canonical space. The reconstructed canonical Gaussian avatar can then be animated using standard linear blend skinning (LBS) with corrective blendshapes, as in the FLAME model, and rendered in real time on various platforms. Our experimental results demonstrate that LAM outperforms state-of-the-art methods on existing benchmarks. Our code and video are available at https://aigc3d.github.io/projects/LAM/
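The generator described in the abstract lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch rendition of the idea: FLAME canonical points act as queries that cross-attend to multi-scale image features through a Transformer decoder, and a linear head predicts per-point Gaussian attributes. This is not the authors' code; the module names, dimensions, attribute layout, and the one-feature-scale-per-layer routing are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CanonicalGaussianGenerator(nn.Module):
    """Hypothetical sketch of a canonical Gaussian attributes generator."""

    def __init__(self, feat_dim=256, num_layers=4, num_heads=8):
        super().__init__()
        self.point_embed = nn.Linear(3, feat_dim)  # embed canonical xyz as queries
        self.layers = nn.ModuleList([
            nn.TransformerDecoderLayer(d_model=feat_dim, nhead=num_heads,
                                       batch_first=True)
            for _ in range(num_layers)
        ])
        # Assumed per-point attribute layout: 3 position offset, 4 rotation
        # (quaternion), 3 log-scale, 1 opacity logit, 3 color logits.
        self.head = nn.Linear(feat_dim, 3 + 4 + 3 + 1 + 3)

    def forward(self, canonical_points, image_features):
        # canonical_points: (B, N, 3) FLAME canonical points
        # image_features: list of (B, H_i * W_i, feat_dim) multi-scale features
        q = self.point_embed(canonical_points)
        for layer, feats in zip(self.layers, image_features):
            q = layer(q, feats)  # queries cross-attend to one scale per layer
        offset, rot, log_scale, opacity, color = self.head(q).split(
            [3, 4, 3, 1, 3], dim=-1)
        return {
            "xyz": canonical_points + offset,      # canonical-space positions
            "rotation": F.normalize(rot, dim=-1),  # unit quaternions
            "scale": log_scale.exp(),              # strictly positive scales
            "opacity": torch.sigmoid(opacity),
            "color": torch.sigmoid(color),
        }

# Usage: one feature scale per decoder layer (4 scales for 4 layers).
points = torch.randn(2, 5023, 3)  # 5023 = FLAME vertex count
feats = [torch.randn(2, s * s, 256) for s in (8, 16, 32, 64)]
gaussians = CanonicalGaussianGenerator()(points, feats)
print({k: v.shape for k, v in gaussians.items()})
```

The animation step then follows FLAME's standard formulation: in FLAME's notation, M(β, θ, ψ) = W(T̄ + B_S(β) + B_E(ψ) + B_P(θ), J(β), θ, 𝒲), where the canonical template T̄ is offset by shape, expression, and pose corrective blendshapes and posed by the skinning function W with blend weights 𝒲. Because this is plain LBS over explicit Gaussians, the output can run in stock rendering pipelines without extra networks.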
Related papers
- SqueezeMe: Mobile-Ready Distillation of Gaussian Full-Body Avatars [19.249226899376943]
We present SqueezeMe, a framework to convert high-fidelity 3D Gaussian full-body avatars into a lightweight representation. We achieve, for the first time, simultaneous animation and rendering of 3 Gaussian avatars in real-time (72 FPS) on a Meta Quest 3 VR headset.
arXiv Detail & Related papers (2024-12-19T18:46:55Z) - NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model [57.92709692193132]
NovelGS is a diffusion model for Gaussian Splatting given sparse-view images.
We leverage novel-view denoising through a transformer-based network to generate 3D Gaussians.
arXiv Detail & Related papers (2024-11-25T07:57:17Z) - Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z) - Interactive Rendering of Relightable and Animatable Gaussian Avatars [37.73483372890271]
We propose a simple and efficient method to decouple body materials and lighting from multi-view or monocular avatar videos.
Our method can render higher quality results at a faster speed on both synthetic and real datasets.
arXiv Detail & Related papers (2024-07-15T13:25:07Z) - Gaussian Eigen Models for Human Heads [28.49783203616257]
Current personalized neural head avatars face a trade-off: lightweight models lack detail and realism, while high-quality, animatable avatars require significant computational resources. We introduce Gaussian Eigen Models (GEM), which provide high-quality, lightweight, and easily controllable head avatars.
arXiv Detail & Related papers (2024-07-05T14:30:24Z) - GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh [97.47701169876272]
GoMAvatar is a novel approach for real-time, memory-efficient, high-quality human modeling.
GoMAvatar matches or surpasses current monocular human modeling algorithms in rendering quality.
arXiv Detail & Related papers (2024-04-11T17:59:57Z) - SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting [26.849406891462557]
We present SplattingAvatar, a hybrid 3D representation of human avatars with Gaussian Splatting embedded on a triangle mesh.
SplattingAvatar renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.
arXiv Detail & Related papers (2024-03-08T06:28:09Z) - GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning [60.33970027554299]
Gaussian splatting has emerged as a powerful 3D representation that harnesses the advantages of both explicit (mesh) and implicit (NeRF) 3D representations.
In this paper, we seek to leverage Gaussian splatting to generate realistic animatable avatars from textual descriptions.
Our proposed method, GAvatar, enables the large-scale generation of diverse animatable avatars using only text prompts.
arXiv Detail & Related papers (2023-12-18T18:59:12Z) - GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis [16.733855781461802]
Implicit deformable representations commonly model motion with a canonical space and a time-dependent deformation field. GauFRe uses a forward-warping deformation to explicitly model non-rigid transformations of scene geometry. Experiments show our method achieves competitive results and higher efficiency than previous state-of-the-art NeRF- and Gaussian-based methods.
arXiv Detail & Related papers (2023-12-18T18:59:03Z) - ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering [62.81677824868519]
We propose an animatable Gaussian splatting approach for photorealistic rendering of dynamic humans in real-time.
We parameterize the clothed human as animatable 3D Gaussians, which can be efficiently splatted into image space to generate the final rendering.
We benchmark ASH with competing methods on pose-controllable avatars, demonstrating that our method outperforms existing real-time methods by a large margin and shows comparable or even better results than offline methods.
arXiv Detail & Related papers (2023-12-10T17:07:37Z) - FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn, from videos, personalized animatable 3D head avatars that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z) - PointAvatar: Deformable Point-based Head Avatars from Videos [103.43941945044294]
PointAvatar is a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading.
We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources.
arXiv Detail & Related papers (2022-12-16T10:05:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.