Efficient Neural Implicit Representation for 3D Human Reconstruction
- URL: http://arxiv.org/abs/2410.17741v1
- Date: Wed, 23 Oct 2024 10:16:01 GMT
- Title: Efficient Neural Implicit Representation for 3D Human Reconstruction
- Authors: Zexu Huang, Sarah Monazam Erfani, Siying Lu, Mingming Gong
- Abstract summary: Conventional methods for reconstructing 3D human motion frequently require expensive hardware and incur high processing costs.
This study presents HumanAvatar, an innovative approach that efficiently reconstructs precise human avatars from monocular video sources.
- Abstract: High-fidelity digital human representations are increasingly in demand, particularly for interactive telepresence, AR/VR, 3D graphics, and the rapidly evolving metaverse. Although they work well in small spaces, conventional methods for reconstructing 3D human motion frequently require expensive hardware and incur high processing costs. This study presents HumanAvatar, an approach that efficiently reconstructs precise human avatars from monocular video. At its core, our method integrates the pre-trained HuMoR model, known for its accuracy in human motion estimation, with the fast neural radiance field framework Instant-NGP and the articulated deformation model Fast-SNARF to improve both reconstruction fidelity and speed. Combining these components yields a system that renders quickly and effectively while estimating human pose parameters with high accuracy. We further equip the system with a posture-sensitive space reduction technique that balances rendering quality against computational efficiency. In a detailed experimental analysis on both synthetic and real-world monocular videos, HumanAvatar consistently matches or surpasses contemporary state-of-the-art reconstruction techniques in quality, and it completes these reconstructions in minutes, a fraction of the time required by existing methods. Our models train 110x faster than state-of-the-art (SoTA) NeRF-based models, outperform SoTA dynamic human NeRF methods under an identical runtime limit, and deliver usable visuals after only 30 seconds of training.
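To make the described pipeline concrete, here is a minimal, hypothetical PyTorch sketch of the combination the abstract outlines: a posed-space query point is pulled back to canonical space with a linear-blend-skinning warp, encoded with an Instant-NGP-style multiresolution hash grid, and decoded by a small MLP into density and color. All class names, shapes, and hyperparameters are our assumptions, not the paper's code; real Instant-NGP trilinearly interpolates grid corners (the sketch hashes only the nearest vertex), and real Fast-SNARF finds canonical correspondences iteratively rather than by inverting the blended transform.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HashGridEncoding(nn.Module):
    """Simplified Instant-NGP-style multiresolution hash encoding (sketch)."""
    def __init__(self, n_levels=8, feat_dim=2, table_size=2**14, base_res=16):
        super().__init__()
        self.resolutions = [int(base_res * 1.5 ** i) for i in range(n_levels)]
        self.tables = nn.ParameterList(
            [nn.Parameter(1e-4 * torch.randn(table_size, feat_dim))
             for _ in range(n_levels)])
        self.table_size = table_size

    def forward(self, x):  # x: (N, 3) in [0, 1]
        primes = torch.tensor([1, 2654435761, 805459861])
        feats = []
        for res, table in zip(self.resolutions, self.tables):
            idx = (x * res).long()                        # nearest grid vertex
            h = (idx * primes).sum(-1) % self.table_size  # spatial hash
            feats.append(table[h])
        return torch.cat(feats, dim=-1)                   # (N, n_levels*feat_dim)

class CanonicalRadianceField(nn.Module):
    """Hash-encoded radiance field queried in canonical space after an LBS warp."""
    def __init__(self):
        super().__init__()
        self.encoding = HashGridEncoding()
        self.mlp = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                                 nn.Linear(64, 4))        # -> (sigma, rgb)

    def forward(self, x_posed, skin_weights, bone_transforms):
        # Blend per-bone canonical->posed transforms, then invert the blend
        # to pull posed-space samples back into the canonical pose
        # (a simplification of Fast-SNARF's iterative correspondence search).
        T = torch.einsum('nb,bij->nij', skin_weights, bone_transforms)
        x_h = torch.cat([x_posed, torch.ones_like(x_posed[:, :1])], dim=-1)
        x_canon = torch.einsum('nij,nj->ni', torch.linalg.inv(T), x_h)[:, :3]
        out = self.mlp(self.encoding(x_canon.clamp(0, 1)))
        return F.softplus(out[:, :1]), out[:, 1:].sigmoid()  # density, color

# Toy usage with identity bone transforms and random skinning weights.
field = CanonicalRadianceField()
sigma, rgb = field(torch.rand(1024, 3),
                   torch.softmax(torch.randn(1024, 24), dim=-1),
                   torch.eye(4).repeat(24, 1, 1))
```

The speed claims in the abstract rest largely on this style of encoding: most model capacity sits in constant-time table lookups rather than a deep MLP, which is what lets such models converge in minutes.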
Related papers
- GauHuman: Articulated Gaussian Splatting from Monocular Human Videos [58.553979884950834]
GauHuman is a 3D human model with Gaussian Splatting for both fast training (1-2 minutes) and real-time rendering (up to 189 FPS).
GauHuman encodes Gaussian Splatting in the canonical space and transforms 3D Gaussians from canonical space to posed space with linear blend skinning (LBS); a minimal sketch of this step appears after this entry.
Experiments on ZJU_Mocap and MonoCap datasets demonstrate that GauHuman achieves state-of-the-art performance quantitatively and qualitatively with fast training and real-time rendering speed.
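As a rough illustration of the LBS step mentioned above (not GauHuman's actual implementation; function names and shapes are assumed), each Gaussian's mean is moved by the skin-weight-blended bone transform, and its covariance is conjugated by the blended rotation, as is standard for Gaussians under a linear map:

```python
import torch

def lbs_transform_gaussians(means, covs, skin_weights, bone_transforms):
    """means: (N, 3), covs: (N, 3, 3), skin_weights: (N, B),
    bone_transforms: (B, 4, 4) canonical-to-posed rigid transforms."""
    # Blend per-bone transforms into one 4x4 matrix per Gaussian.
    T = torch.einsum('nb,bij->nij', skin_weights, bone_transforms)  # (N, 4, 4)
    R, t = T[:, :3, :3], T[:, :3, 3]
    posed_means = torch.einsum('nij,nj->ni', R, means) + t
    # Covariances transform as R @ cov @ R^T under a linear map.
    posed_covs = R @ covs @ R.transpose(1, 2)
    return posed_means, posed_covs
```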
arXiv Detail & Related papers (2023-12-05T18:59:14Z) - Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day [63.96075838322437]
We propose a framework to learn a single-image novel-view synthesizer.
Our framework is able to reduce the total training time from 10 days to less than 1 day.
arXiv Detail & Related papers (2023-10-04T17:57:07Z) - InstantAvatar: Efficient 3D Head Reconstruction via Surface Rendering [13.85652935706768]
We introduce InstantAvatar, a method that recovers full-head avatars from a few images (down to just one) in a few seconds on commodity hardware.
We present a novel statistical model that learns a prior distribution over 3D head signed distance functions using a voxel-grid based architecture.
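A voxel-grid signed distance function of this kind can be queried with ordinary trilinear interpolation; the snippet below is a generic, assumed illustration using PyTorch's grid_sample (the random grid is only a placeholder standing in for the learned prior):

```python
import torch
import torch.nn.functional as F

def query_sdf(sdf_grid, points):
    """sdf_grid: (1, 1, D, H, W) voxel grid of signed distances.
    points: (N, 3) query coordinates in [-1, 1] (grid_sample convention,
    last dim ordered x, y, z)."""
    coords = points.view(1, -1, 1, 1, 3)       # reshape for 5D grid_sample
    sdf = F.grid_sample(sdf_grid, coords, align_corners=True)  # (1,1,N,1,1)
    return sdf.view(-1)                         # signed distance per point

sdf_grid = torch.randn(1, 1, 64, 64, 64)        # placeholder learned grid
pts = torch.rand(1024, 3) * 2 - 1               # random points in [-1, 1]^3
print(query_sdf(sdf_grid, pts).shape)           # torch.Size([1024])
```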
arXiv Detail & Related papers (2023-08-09T11:02:00Z) - Efficient Meshy Neural Fields for Animatable Human Avatars [87.68529918184494]
Efficiently digitizing high-fidelity animatable human avatars from videos is a challenging and active research topic.
Recent rendering-based neural representations open a new way for human digitization with their ease of use and photo-realistic reconstruction quality.
We present EMA, a method that Efficiently learns Meshy neural fields to reconstruct animatable human Avatars.
arXiv Detail & Related papers (2023-03-23T00:15:34Z) - Real-time volumetric rendering of dynamic humans [83.08068677139822]
We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos.
Our method can reconstruct a dynamic human in less than 3h using a single GPU, compared to recent state-of-the-art alternatives that take up to 72h.
A novel local ray marching rendering allows visualizing the neural human on a mobile VR device at 40 frames per second with minimal loss of visual quality.
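For context, the quadrature such ray marchers accelerate is the standard NeRF-style compositing shown below; this is a generic sketch with assumed names, not the paper's "local" variant:

```python
import torch

def composite_along_ray(sigmas, rgbs, deltas):
    """sigmas: (S,) densities at S samples along one ray, rgbs: (S, 3) colors,
    deltas: (S,) distances between consecutive samples."""
    alphas = 1.0 - torch.exp(-sigmas * deltas)             # opacity per step
    trans = torch.cumprod(                                  # transmittance up
        torch.cat([torch.ones(1), 1.0 - alphas + 1e-10])[:-1], dim=0)
    weights = alphas * trans                                # per-step weight
    return (weights[:, None] * rgbs).sum(dim=0)             # final pixel color
```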
arXiv Detail & Related papers (2023-03-21T14:41:25Z) - InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds [43.41503529747328]
We propose a system that can reconstruct human avatars from a monocular video within seconds, and these avatars can be animated and rendered at an interactive rate.
Compared to existing methods, InstantAvatar converges 130x faster and can be trained in minutes instead of hours.
arXiv Detail & Related papers (2022-12-20T18:53:58Z) - HDHumans: A Hybrid Approach for High-fidelity Digital Humans [107.19426606778808]
HDHumans is the first method for HD human character synthesis that jointly produces an accurate and temporally coherent 3D deforming surface.
Our method is carefully designed to achieve a synergy between classical surface deformation and neural radiance fields (NeRF)
arXiv Detail & Related papers (2022-10-21T14:42:11Z) - SelfNeRF: Fast Training NeRF for Human from Monocular Self-rotating Video [29.50059002228373]
SelfNeRF is an efficient neural radiance field based novel view synthesis method for human performance.
It can train from scratch and achieve high-fidelity results in about twenty minutes.
arXiv Detail & Related papers (2022-10-04T14:54:40Z) - Physics to the Rescue: Deep Non-line-of-sight Reconstruction for High-speed Imaging [13.271762773872476]
We present a novel deep model that incorporates the complementary physics priors of wave propagation and volume rendering into a neural network for high-quality and robust NLOS reconstruction.
Our method outperforms prior physics- and learning-based approaches on both synthetic and real measurements.
arXiv Detail & Related papers (2022-05-03T02:47:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.