Neural Body: Implicit Neural Representations with Structured Latent
Codes for Novel View Synthesis of Dynamic Humans
- URL: http://arxiv.org/abs/2012.15838v2
- Date: Mon, 29 Mar 2021 14:13:59 GMT
- Title: Neural Body: Implicit Neural Representations with Structured Latent
Codes for Novel View Synthesis of Dynamic Humans
- Authors: Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai,
Hujun Bao, Xiaowei Zhou
- Abstract summary: This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.
We propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh.
Experiments on ZJU-MoCap show that our approach outperforms prior works by a large margin in terms of novel view synthesis quality.
- Score: 56.63912568777483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the challenge of novel view synthesis for a human
performer from a very sparse set of camera views. Some recent works have shown
that learning implicit neural representations of 3D scenes achieves remarkable
view synthesis quality given dense input views. However, the representation
learning will be ill-posed if the views are highly sparse. To solve this
ill-posed problem, our key idea is to integrate observations over video frames.
To this end, we propose Neural Body, a new human body representation which
assumes that the learned neural representations at different frames share the
same set of latent codes anchored to a deformable mesh, so that the
observations across frames can be naturally integrated. The deformable mesh
also provides geometric guidance for the network to learn 3D representations
more efficiently. To evaluate our approach, we create a multi-view dataset
named ZJU-MoCap that captures performers with complex motions. Experiments on
ZJU-MoCap show that our approach outperforms prior works by a large margin in
terms of novel view synthesis quality. We also demonstrate the capability of
our approach to reconstruct a moving person from a monocular video on the
People-Snapshot dataset. The code and dataset are available at
https://zju3dv.github.io/neuralbody/.
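The core mechanism in the abstract, a single set of latent codes anchored to the vertices of a deformable (SMPL-style) body mesh and shared by every frame, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: the paper diffuses the vertex codes with a sparse 3D convolutional network before decoding, whereas this sketch substitutes a simple nearest-vertex lookup, and all names and dimensions here are illustrative.

```python
# Minimal sketch of structured latent codes anchored to a deformable mesh.
# Assumes a SMPL-like mesh with 6890 vertices; the nearest-vertex lookup
# stands in for the paper's sparse-convolution code diffusion.
import torch
import torch.nn as nn

class StructuredLatentBody(nn.Module):
    def __init__(self, num_vertices=6890, code_dim=16, hidden=256):
        super().__init__()
        # One learnable code per mesh vertex, shared by ALL frames:
        # posing the mesh moves the codes, but the codes never change,
        # so observations across frames train the same parameters.
        self.codes = nn.Embedding(num_vertices, code_dim)
        self.mlp = nn.Sequential(
            nn.Linear(code_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (r, g, b, sigma) per query point
        )

    def forward(self, query_xyz, posed_vertices):
        # query_xyz:      (N, 3) sample points along camera rays
        # posed_vertices: (V, 3) mesh vertices deformed to this frame's pose
        d = torch.cdist(query_xyz, posed_vertices)   # (N, V) distances
        nearest = d.argmin(dim=1)                    # index of closest vertex
        z = self.codes(nearest)                      # (N, code_dim)
        out = self.mlp(torch.cat([z, query_xyz], dim=-1))
        rgb = torch.sigmoid(out[:, :3])              # colors in [0, 1]
        sigma = torch.relu(out[:, 3])                # non-negative density
        return rgb, sigma

model = StructuredLatentBody()
pts = torch.rand(1024, 3)        # ray samples in world space
verts = torch.rand(6890, 3)      # posed mesh for the current frame
rgb, sigma = model(pts, verts)   # inputs to standard volume rendering
```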
Related papers
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives produces a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
- One-Shot Neural Fields for 3D Object Understanding [112.32255680399399]
We present a unified and compact scene representation for robotics.
Each object in the scene is depicted by a latent code capturing geometry and appearance.
This representation can be decoded for various tasks such as novel view rendering, 3D reconstruction, and stable grasp prediction.
arXiv Detail & Related papers (2022-10-21T17:33:14Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering (see the volume rendering sketch after this list).
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Remote Sensing Novel View Synthesis with Implicit Multiplane Representations [26.33490094119609]
We propose a novel remote sensing view synthesis method by leveraging the recent advances in implicit neural representations.
Considering the overhead and far depth imaging of remote sensing images, we represent the 3D space by combining implicit multiplane images (MPI) representation and deep neural networks.
Images from any novel views can be freely rendered on the basis of the reconstructed model.
arXiv Detail & Related papers (2022-05-18T13:03:55Z)
- Neural Rendering of Humans in Novel View and Pose from Monocular Video [68.37767099240236]
We introduce a new method that generates photo-realistic humans under novel views and poses given a monocular video as input.
Our method significantly outperforms existing approaches under unseen poses and novel views given monocular videos as input.
arXiv Detail & Related papers (2022-04-04T03:09:20Z)
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D.
An enhancer network improves overall fidelity, even in areas occluded from the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- Human Pose Manipulation and Novel View Synthesis using Differentiable Rendering [46.04980667824064]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives produces a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2021-11-24T19:00:07Z)
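Several NeRF-style entries above (e.g., the Vision Transformer paper) decode a learned 3D representation into per-sample colors and densities and then volume-render them into pixels. A minimal sketch of the standard volume rendering quadrature, with illustrative tensor shapes and names:

```python
# Alpha-composite per-sample colors along each ray, weighted by the
# accumulated transmittance (the standard NeRF rendering quadrature).
import torch

def volume_render(rgb, sigma, deltas):
    # rgb:    (R, S, 3) colors of S samples along each of R rays
    # sigma:  (R, S)    volume density at each sample
    # deltas: (R, S)    distances between consecutive samples
    alpha = 1.0 - torch.exp(-sigma * deltas)  # opacity per sample
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]),
                   1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]
    weights = alpha * trans                             # (R, S)
    return (weights.unsqueeze(-1) * rgb).sum(dim=1)     # (R, 3) pixel colors

# Example: render 2 rays with 64 samples each.
rgb = torch.rand(2, 64, 3)
sigma = torch.rand(2, 64)
deltas = torch.full((2, 64), 0.01)
pixels = volume_render(rgb, sigma, deltas)  # shape (2, 3)
```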