Related papers: iHuman: Instant Animatable Digital Humans From Monocular Videos

URL: http://arxiv.org/abs/2407.11174v1
Date: Mon, 15 Jul 2024 18:51:51 GMT
Title: iHuman: Instant Animatable Digital Humans From Monocular Videos
Authors: Pramish Paudel, Anubhav Khanal, Ajad Chhatkuli, Danda Pani Paudel, Jyoti Tandukar
Abstract summary: We present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos. This work demonstrates the need for accurate 3D mesh-type modelling of the human body. Our method is faster by an order of magnitude (in terms of training time) than its closest competitor.
Score: 16.98924995658091
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Personalized 3D avatars require an animatable representation of digital humans. Creating such a representation instantly from monocular videos offers scalability to a broad class of users and wide-scale applications. In this paper, we present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos. Our method utilizes the efficiency of Gaussian splatting to model both 3D geometry and appearance. However, we observed that naively optimizing Gaussian splats results in inaccurate geometry, thereby leading to poor animations. This work demonstrates the need for accurate 3D mesh-type modelling of the human body for animatable digitization through Gaussian splats, and achieves it. This is done by developing a novel pipeline that benefits from three key aspects: (a) implicit modelling of the surface's displacements and the color's spherical harmonics; (b) binding of 3D Gaussians to the respective triangular faces of the body template; (c) a novel technique to render normals followed by their auxiliary supervision. Our exhaustive experiments on three different benchmark datasets demonstrate the state-of-the-art results of our method in limited time settings. In fact, our method is faster by an order of magnitude (in terms of training time) than its closest competitor. At the same time, we achieve superior rendering and 3D reconstruction performance under the change of poses.
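
Illustration (not from the paper): aspect (b) of the abstract, binding 3D Gaussians to triangular faces of a body template, can be sketched roughly as follows. The abstract does not state the exact parameterization, so everything here is an assumption made for illustration: the names BoundGaussian, face_normal and gaussian_center are hypothetical, and the choice of barycentric weights plus a scalar offset along the face normal is one plausible way to tie a Gaussian center to a face, not necessarily the authors' formulation.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def face_normal(verts: Sequence[Vec3], face: Face) -> Vec3:
    """Unit normal of a triangle, following the vertex winding order."""
    a, b, c = (verts[i] for i in face)
    n = _cross(_sub(b, a), _sub(c, a))
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if length == 0.0:
        raise ValueError("degenerate triangle has no normal")
    return (n[0] / length, n[1] / length, n[2] / length)


@dataclass(frozen=True)
class BoundGaussian:
    """Hypothetical record: a Gaussian attached to one template face."""
    face_index: int   # which triangle the Gaussian is bound to
    bary: Vec3        # barycentric weights on that triangle (sum to 1)
    offset: float     # assumed displacement along the face normal


def gaussian_center(g: BoundGaussian, verts: Sequence[Vec3],
                    faces: Sequence[Face]) -> Vec3:
    """Recompute the Gaussian center from the current vertex positions."""
    face = faces[g.face_index]
    a, b, c = (verts[i] for i in face)
    n = face_normal(verts, face)
    w0, w1, w2 = g.bary
    return tuple(
        w0 * a[k] + w1 * b[k] + w2 * c[k] + g.offset * n[k] for k in range(3)
    )


if __name__ == "__main__":
    faces = [(0, 1, 2)]
    rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # The same triangle after the template has been moved by a new pose.
    posed = [(x, y, z + 2.0) for (x, y, z) in rest]
    g = BoundGaussian(face_index=0, bary=(1 / 3, 1 / 3, 1 / 3), offset=0.1)
    print(gaussian_center(g, rest, faces))
    print(gaussian_center(g, posed, faces))
```

Because the center is a function of the face's current vertices, re-posing the template moves the bound Gaussian with it without re-optimizing the Gaussian itself. That is the general motivation for mesh binding in this sketch.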

Related papers

EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text [39.16924298167778]
We introduce a novel rendering model that generates fully animatable scenes aligned with textual descriptions. Our method generates fully animatable 3D avatars by combining deformable 3D Gaussian Splatting with text-to-3D score distillation.
arXiv Detail & Related papers (2025-02-17T10:36:36Z)
3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling [37.11454674584874]
We introduce 3D$^2$-Actor, a pose-conditioned 3D-aware human modeling pipeline that integrates 2D denoising and 3D rectifying steps. Experimental results demonstrate that 3D$^2$-Actor excels in high-fidelity avatar modeling and robustly generalizes to novel poses.
arXiv Detail & Related papers (2024-12-16T09:37:52Z)
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses [57.17501809717155]
We present DreamDance, a novel method for animating human images using only skeleton pose sequences as conditional inputs. Our key insight is that human images naturally exhibit multiple levels of correlation. We construct the TikTok-Dance5K dataset, comprising 5K high-quality dance videos with detailed frame annotations.
arXiv Detail & Related papers (2024-11-30T08:42:13Z)
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers [23.96688843662126]
Reconstructing posed 3D human models from monocular images has important applications in the sports industry. We combine 3D human pose and shape estimation with 3D Gaussian Splatting (3DGS), a representation of the scene composed of a mixture of Gaussians. We show that this combination can achieve near real-time inference of 3D human models from a single image without expensive diffusion models or 3D points supervision.
arXiv Detail & Related papers (2024-09-06T11:34:24Z)
Gaussian Eigen Models for Human Heads [28.49783203616257]
We present personalized Gaussian Eigen Models (GEMs) for human heads, a novel method that compresses dynamic 3D Gaussians into low-dimensional linear spaces. Our approach is inspired by the seminal work of Blanz and Vetter, where a mesh-based 3D morphable model (3DMM) is constructed from registered meshes. We show and compare self-reenactment and cross-person reenactment to state-of-the-art 3D avatar methods, demonstrating higher quality and better control.
arXiv Detail & Related papers (2024-07-05T14:30:24Z)
UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling [71.87807614875497]
We propose UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures. We collect and process a new dataset of human motion, which includes multi-view images, scanned models, parametric model registration, and corresponding texture maps. Experimental results demonstrate that our method achieves state-of-the-art synthesis of novel view and novel pose.
arXiv Detail & Related papers (2024-03-18T09:03:56Z)
Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos [33.779636707618785]
We introduce Rig3DGS to create controllable 3D human portraits from casual smartphone videos. Key innovation is a carefully designed deformation method which is guided by a learnable prior derived from a 3D morphable model. We demonstrate the effectiveness of our learned deformation through extensive quantitative and qualitative experiments.
arXiv Detail & Related papers (2024-02-06T05:40:53Z)
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data [36.51674664590734]
We present En3D, an enhanced generative scheme for high-quality 3D human avatars. Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and pose priors, our approach aims to develop a zero-shot 3D generative model capable of producing 3D humans.
arXiv Detail & Related papers (2024-01-02T12:06:31Z)
ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering [62.81677824868519]
We propose an animatable Gaussian splatting approach for photorealistic rendering of dynamic humans in real-time. We parameterize the clothed human as animatable 3D Gaussians, which can be efficiently splatted into image space to generate the final rendering. We benchmark ASH with competing methods on pose-controllable avatars, demonstrating that our method outperforms existing real-time methods by a large margin and shows comparable or even better results than offline methods.
arXiv Detail & Related papers (2023-12-10T17:07:37Z)
Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing [53.05069432989608]
We present a novel framework for generating 3D human heads with remarkable flexibility. Our method facilitates the creation of diverse and realistic 3D human heads with fine-grained editing over facial features and expressions.
arXiv Detail & Related papers (2023-12-05T19:05:58Z)
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos [58.553979884950834]
GauHuman is a 3D human model with Gaussian Splatting for both fast training (1 to 2 minutes) and real-time rendering (up to 189 FPS). GauHuman encodes Gaussian Splatting in the canonical space and transforms 3D Gaussians from canonical space to posed space with linear blend skinning (LBS). Experiments on ZJU_Mocap and MonoCap datasets demonstrate that GauHuman achieves state-of-the-art performance quantitatively and qualitatively with fast training and real-time rendering speed.
arXiv Detail & Related papers (2023-12-05T18:59:14Z)
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
We present an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video. GaussianAvatar is validated on both the public dataset and our collected dataset.
arXiv Detail & Related papers (2023-12-04T18:55:45Z)
Drivable 3D Gaussian Avatars [26.346626608626057]
Current drivable avatars require either accurate 3D registrations during training, dense input images during testing, or both. This work uses the recently presented 3D Gaussian Splatting (3DGS) technique to render realistic humans at real-time framerates. Given their smaller size, we drive these deformations with joint angles and keypoints, which are more suitable for communication applications.
arXiv Detail & Related papers (2023-11-14T22:54:29Z)
DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars. It exploits the advantages of both the 2D and 3D neural rendering techniques. Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.