RGBAvatar: Reduced Gaussian Blendshapes for Online Modeling of Head Avatars
- URL: http://arxiv.org/abs/2503.12886v1
- Date: Mon, 17 Mar 2025 07:31:21 GMT
- Title: RGBAvatar: Reduced Gaussian Blendshapes for Online Modeling of Head Avatars
- Authors: Linzhou Li, Yumeng Li, Yanlin Weng, Youyi Zheng, Kun Zhou
- Abstract summary: We present Reduced Gaussian Blendshapes Avatar (RGBAvatar), a method for reconstructing photorealistic, animatable head avatars at speeds sufficient for on-the-fly reconstruction. Our method maps tracked 3DMM parameters into reduced blendshape weights with an MLP, leading to a compact set of blendshape bases. We also propose a local-global sampling strategy that enables direct on-the-fly reconstruction, immediately reconstructing the model as video streams in real time while achieving quality comparable to offline settings.
- Score: 30.56664313203195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Reduced Gaussian Blendshapes Avatar (RGBAvatar), a method for reconstructing photorealistic, animatable head avatars at speeds sufficient for on-the-fly reconstruction. Unlike prior approaches that utilize linear bases from 3D morphable models (3DMM) to model Gaussian blendshapes, our method maps tracked 3DMM parameters into reduced blendshape weights with an MLP, leading to a compact set of blendshape bases. The learned compact base composition effectively captures essential facial details for specific individuals, and does not rely on the fixed base composition weights of 3DMM, leading to enhanced reconstruction quality and higher efficiency. To further expedite the reconstruction process, we develop a novel color initialization estimation method and a batch-parallel Gaussian rasterization process, achieving state-of-the-art quality with training throughput of about 630 images per second. Moreover, we propose a local-global sampling strategy that enables direct on-the-fly reconstruction, immediately reconstructing the model as video streams in real time while achieving quality comparable to offline settings. Our source code is available at https://github.com/gapszju/RGBAvatar.
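As a rough illustration of the reduced-blendshape idea described in the abstract, the sketch below maps tracked 3DMM parameters to a compact set of blendshape weights with an MLP and composes per-Gaussian attributes as a weighted sum of learned bases. The class name, dimensions, attribute layout, and composition rule are assumptions for illustration only; the authors' actual implementation is at https://github.com/gapszju/RGBAvatar.

```python
# Hedged sketch of reduced Gaussian blendshapes (assumed structure, not the official code).
import torch
import torch.nn as nn

class ReducedGaussianBlendshapes(nn.Module):
    def __init__(self, num_gaussians: int, num_bases: int = 20, dim_3dmm: int = 64):
        super().__init__()
        # Small MLP mapping tracked 3DMM expression parameters to a compact
        # ("reduced") set of blendshape weights.
        self.weight_mlp = nn.Sequential(
            nn.Linear(dim_3dmm, 128),
            nn.ReLU(),
            nn.Linear(128, num_bases),
        )
        # Per-Gaussian attributes of the neutral avatar, e.g. position (3),
        # rotation quaternion (4), scale (3), opacity (1), color (3) = 14 dims.
        self.mean_attrs = nn.Parameter(torch.zeros(num_gaussians, 14))
        # Learned per-Gaussian offsets for each reduced blendshape basis.
        self.basis_offsets = nn.Parameter(torch.zeros(num_bases, num_gaussians, 14))

    def forward(self, params_3dmm: torch.Tensor) -> torch.Tensor:
        # params_3dmm: (dim_3dmm,) tracked parameters for one frame.
        weights = self.weight_mlp(params_3dmm)  # (num_bases,)
        # Linear blendshape composition: neutral attributes plus a weighted
        # sum of basis offsets gives the posed Gaussian attributes.
        offsets = torch.einsum("b,bgd->gd", weights, self.basis_offsets)
        return self.mean_attrs + offsets  # (num_gaussians, 14)
```

The composed attributes would then be passed to a Gaussian rasterizer; the color initialization, batch-parallel rasterization, and local-global sampling strategy mentioned in the abstract are not covered by this sketch.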
Related papers
- DiMeR: Disentangled Mesh Reconstruction Model [24.07380724530745]
We introduce DiMeR, a novel disentangled dual-stream feed-forward model for sparse-view mesh reconstruction.
We demonstrate robust capabilities across various tasks, including sparse-view reconstruction, single-image-to-3D, and text-to-3D.
arXiv Detail & Related papers (2025-04-24T15:39:20Z) - HORT: Monocular Hand-held Objects Reconstruction with Transformers [61.36376511119355]
Reconstructing hand-held objects in 3D from monocular images is a significant challenge in computer vision.
We propose a transformer-based model to efficiently reconstruct dense 3D point clouds of hand-held objects.
Our method achieves state-of-the-art accuracy with much faster inference speed, while generalizing well to in-the-wild images.
arXiv Detail & Related papers (2025-03-27T09:45:09Z) - Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation [1.0027737736304287]
We introduce a hybrid approach that combines the strengths of neural reconstruction with physics-based rendering.
Our approach significantly enhances novel view synthesis quality, especially for road surfaces and lane markings.
We achieve this by training a customized NeRF model on the original images with depth regularization derived from a noisy LiDAR point cloud.
arXiv Detail & Related papers (2025-03-12T15:18:50Z) - 3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering [8.59572577251833]
We propose a novel regularization method using the gradients of a signed distance function estimated from the Gaussians.
We demonstrate the effectiveness of our approach on datasets such as Mip-NeRF360, Tanks and Temples, and Deep Blending.
arXiv Detail & Related papers (2025-01-14T18:40:33Z) - MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction [84.07233691641193]
We introduce MonoGSDF, a novel method that couples primitives with a neural Signed Distance Field (SDF) for high-quality reconstruction.
To handle arbitrary-scale scenes, we propose a scaling strategy for robust generalization.
Experiments on real-world datasets show that MonoGSDF outperforms prior methods while maintaining efficiency.
arXiv Detail & Related papers (2024-11-25T20:07:07Z) - GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement [51.97726804507328]
We propose a novel approach for 3D mesh reconstruction from multi-view images.
Our method takes inspiration from large reconstruction models that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
arXiv Detail & Related papers (2024-06-09T05:19:24Z) - InstantSplat: Sparse-view Gaussian Splatting in Seconds [91.77050739918037]
We introduce InstantSplat, a novel approach for addressing sparse-view 3D scene reconstruction at lightning-fast speed.
InstantSplat employs a self-supervised framework that optimizes the 3D scene representation and camera poses.
It achieves an acceleration of over 30x in reconstruction and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D-GS.
arXiv Detail & Related papers (2024-03-29T17:29:58Z) - Hybrid Explicit Representation for Ultra-Realistic Head Avatars [55.829497543262214]
We introduce a novel approach to creating ultra-realistic head avatars and rendering them in real time.
A UV-mapped 3D mesh is utilized to capture sharp and rich textures on smooth surfaces, while 3D Gaussian Splatting is employed to represent complex geometric structures.
Experiments show that our modeled results exceed those of state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-18T04:01:26Z) - CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model [37.75256020559125]
We present a high-fidelity feed-forward single image-to-3D generative model.
We highlight the necessity of integrating geometric priors into network design.
Our model delivers a high-fidelity textured mesh from an image in just 10 seconds, without any test-time optimization.
arXiv Detail & Related papers (2024-03-08T04:25:29Z) - GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting [14.937297984020821]
We propose a novel clothed human reconstruction method called GaussianBody, based on 3D Gaussian Splatting.
Applying the static 3D Gaussian Splatting model to the dynamic human reconstruction problem is non-trivial due to complicated non-rigid deformations and rich cloth details.
We show that our method can achieve state-of-the-art photorealistic novel-view rendering results with high-quality details for dynamic clothed human bodies.
arXiv Detail & Related papers (2024-01-18T04:48:13Z) - NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support [43.5015470997138]
We present a method for generating high-quality watertight manifold meshes from multi-view input images.
Our method combines the benefits of both worlds; we take the geometry obtained from neural fields, and further optimize the geometry as well as a compact neural texture representation.
arXiv Detail & Related papers (2023-05-26T17:59:21Z) - $PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction [97.06927852165464]
Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision.
We propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process.
arXiv Detail & Related papers (2023-02-21T13:37:07Z)