Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos
- URL: http://arxiv.org/abs/2402.03723v1
- Date: Tue, 6 Feb 2024 05:40:53 GMT
- Title: Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos
- Authors: Alfredo Rivero, ShahRukh Athar, Zhixin Shu, Dimitris Samaras
- Abstract summary: We introduce Rig3DGS to create controllable 3D human portraits from casual smartphone videos.
Our key innovation is a carefully designed deformation method guided by a learnable prior derived from a 3D morphable model.
We demonstrate the effectiveness of our learned deformation through extensive quantitative and qualitative experiments.
- Score: 33.779636707618785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creating controllable 3D human portraits from casual smartphone videos is
highly desirable due to their immense value in AR/VR applications. The recent
development of 3D Gaussian Splatting (3DGS) has shown improvements in rendering
quality and training efficiency. However, it still remains a challenge to
accurately model and disentangle head movements and facial expressions from a
single-view capture to achieve high-quality renderings. In this paper, we
introduce Rig3DGS to address this challenge. We represent the entire scene,
including the dynamic subject, using a set of 3D Gaussians in a canonical
space. Using a set of control signals, such as head pose and expressions, we
transform them to the 3D space with learned deformations to generate the
desired rendering. Our key innovation is a carefully designed deformation
method which is guided by a learnable prior derived from a 3D morphable model.
This approach is highly efficient in training and effective in controlling
facial expressions, head positions, and view synthesis across various captures.
We demonstrate the effectiveness of our learned deformation through extensive
quantitative and qualitative experiments. The project page can be found at
http://shahrukhathar.github.io/2024/02/05/Rig3DGS.html
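As a rough, unofficial illustration of the deformation described in the abstract, the sketch below assumes a small MLP that predicts a per-Gaussian residual on top of a displacement suggested by the 3D-morphable-model prior; the network architecture, tensor shapes, and control-signal dimensions are all assumptions and not taken from the paper.

```python
# Minimal sketch (not the authors' code): deforming canonical 3D Gaussians with a
# learned residual on top of a 3DMM-derived prior, driven by expression and head pose.
import torch
import torch.nn as nn

class DeformationField(nn.Module):
    def __init__(self, expr_dim=50, pose_dim=6, hidden=128):
        super().__init__()
        # small MLP: canonical position + control signals -> per-Gaussian offset
        self.mlp = nn.Sequential(
            nn.Linear(3 + expr_dim + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, xyz_canon, prior_offset, expr, pose):
        # xyz_canon:    (N, 3) canonical Gaussian centers
        # prior_offset: (N, 3) displacement suggested by the 3DMM prior
        #               (e.g. motion of a nearby face-mesh vertex; hypothetical here)
        # expr, pose:   1-D control signals broadcast to every Gaussian
        n = xyz_canon.shape[0]
        cond = torch.cat([expr, pose]).unsqueeze(0).expand(n, -1)
        residual = self.mlp(torch.cat([xyz_canon, cond], dim=-1))
        # the prior guides the deformation; the learned residual refines it
        return xyz_canon + prior_offset + residual

# toy usage with random tensors standing in for a real capture
field = DeformationField()
xyz = torch.randn(1000, 3)
prior = 0.01 * torch.randn(1000, 3)
expr, pose = torch.randn(50), torch.randn(6)
xyz_deformed = field(xyz, prior, expr, pose)  # (1000, 3) centers to rasterize
```

In the actual method the deformed centers would be passed to the 3DGS rasterizer to produce the controlled rendering; the snippet only shows the deformation step.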
Related papers
- iHuman: Instant Animatable Digital Humans From Monocular Videos [16.98924995658091]
We present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos.
This work demonstrates the need for accurate 3D mesh-type modelling of the human body.
Our method is faster by an order of magnitude (in terms of training time) than its closest competitor.
arXiv Detail & Related papers (2024-07-15T18:51:51Z) - GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z) - SuperGaussian: Repurposing Video Models for 3D Super Resolution [67.19266415499139]
We present a simple, modular, and generic method that upsamples coarse 3D models by adding geometric and appearance details.
We demonstrate that it is possible to directly repurpose existing (pretrained) video models for 3D super-resolution.
arXiv Detail & Related papers (2024-06-02T03:44:50Z) - Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh [44.57625460339714]
We propose using a triangular mesh to manipulate 3DGS directly with self-adaptation (a generic sketch of this mesh-binding idea appears after this list).
Our approach is capable of handling large deformations, local manipulations, and soft body simulations while keeping high-fidelity rendering.
arXiv Detail & Related papers (2024-05-28T04:13:21Z) - Bootstrap 3D Reconstructed Scenes from 3D Gaussian Splatting [10.06208115191838]
We present a bootstrapping method to enhance the rendering of novel views using trained 3D-GS.
Our results indicate that bootstrapping effectively reduces artifacts and yields clear improvements on the evaluation metrics.
arXiv Detail & Related papers (2024-04-29T12:57:05Z) - VOODOO 3D: Volumetric Portrait Disentanglement for One-Shot 3D Head Reenactment [17.372274738231443]
We present a 3D-aware one-shot head reenactment method based on a fully neural disentanglement framework for source appearance and driver expressions.
Our method is real-time and produces high-fidelity and view-consistent output, suitable for 3D teleconferencing systems based on holographic displays.
arXiv Detail & Related papers (2023-12-07T19:19:57Z) - DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation [66.7719069053058]
DeformGS is an approach to recover scene flow in highly deformable scenes using simultaneous video captures of a dynamic scene from multiple cameras.
DeformGS improves 3D tracking by an average of 55.8% compared to the state-of-the-art.
With sufficient texture, DeformGS achieves a median tracking error of 3.3 mm on a cloth of 1.5 x 1.5 m in area.
arXiv Detail & Related papers (2023-11-30T18:53:03Z) - Drivable 3D Gaussian Avatars [26.346626608626057]
Current drivable avatars require either accurate 3D registrations during training, dense input images during testing, or both.
This work uses the recently presented 3D Gaussian Splatting (3DGS) technique to render realistic humans at real-time framerates.
We drive these deformations with joint angles and keypoints, whose compact size makes them more suitable for communication applications.
arXiv Detail & Related papers (2023-11-14T22:54:29Z) - PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, demonstrating its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z) - HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z) - AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars [71.00322191446203]
2D generative models often suffer from undesirable artifacts when rendering images from different camera viewpoints.
Recently, 3D-aware GANs have extended 2D GANs to explicitly disentangle camera pose by leveraging 3D scene representations.
We propose an animatable 3D-aware GAN for multiview consistent face animation generation.
arXiv Detail & Related papers (2022-10-12T17:59:56Z)
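As referenced in the Mani-GS entry above, the following is a generic, hedged sketch of the mesh-binding idea behind driving 3D Gaussians (here, just their centers) with a triangular mesh. It is not Mani-GS's or Rig3DGS's actual formulation: the helper names are hypothetical, offsets are kept in world space so only mesh translations are handled faithfully, and a real binding would use each triangle's local frame or barycentric coordinates.

```python
# Toy illustration of binding points (e.g. Gaussian centers) to a triangular mesh
# so that deforming the mesh carries the points with it.
import numpy as np

def bind_to_triangles(points, verts, faces):
    """Attach each point to its nearest triangle (by centroid) and store the offset."""
    centroids = verts[faces].mean(axis=1)                     # (F, 3) triangle centroids
    d = np.linalg.norm(points[:, None] - centroids[None], axis=-1)
    tri_idx = d.argmin(axis=1)                                # nearest face per point
    offsets = points - centroids[tri_idx]                     # world-space offset
    return tri_idx, offsets

def deform_points(tri_idx, offsets, deformed_verts, faces):
    """Move each bound point with the centroid of its (deformed) triangle."""
    centroids = deformed_verts[faces].mean(axis=1)
    return centroids[tri_idx] + offsets

# toy usage: one triangle translated along z carries its bound point with it
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = np.array([[0, 1, 2]])
pts = np.array([[0.3, 0.3, 0.05]])
idx, off = bind_to_triangles(pts, verts, faces)
print(deform_points(idx, off, verts + np.array([0., 0., 1.]), faces))  # ~[0.3, 0.3, 1.05]
```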
This list is automatically generated from the titles and abstracts of the papers on this site.