Generalizable Neural Performer: Learning Robust Radiance Fields for
Human Novel View Synthesis
- URL: http://arxiv.org/abs/2204.11798v1
- Date: Mon, 25 Apr 2022 17:14:22 GMT
- Title: Generalizable Neural Performer: Learning Robust Radiance Fields for
Human Novel View Synthesis
- Authors: Wei Cheng, Su Xu, Jingtan Piao, Chen Qian, Wayne Wu, Kwan-Yee Lin,
Hongsheng Li
- Abstract summary: This work targets using a general deep learning framework to synthesize free-viewpoint images of arbitrary human performers.
We present a simple yet powerful framework, named Generalizable Neural Performer (GNR), that learns a generalizable and robust neural body representation.
Experiments on GeneBody-1.0 and ZJU-Mocap show that our method is more robust than recent state-of-the-art generalizable methods.
- Score: 52.720314035084215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work targets using a general deep learning framework to
synthesize free-viewpoint images of arbitrary human performers, requiring only
a sparse set of camera views as input and skirting per-case fine-tuning. The
large variations in geometry and appearance caused by articulated body poses,
shapes and clothing types are the key bottlenecks of this task. To overcome
these challenges, we present a simple yet powerful framework, named
Generalizable Neural Performer (GNR), that learns a generalizable and robust
neural body representation over varied geometry and appearance. Specifically,
we compress the light fields for novel-view human rendering into conditional
implicit neural radiance fields, conditioning on both geometry and appearance.
We first introduce an Implicit Geometric Body Embedding strategy to enhance
robustness, based on hints from both a parametric 3D human body model and the
multi-view images. We further propose a Screen-Space Occlusion-Aware Appearance
Blending technique to preserve high-quality appearance, interpolating
source-view appearance into the radiance fields under relaxed but approximate
geometric guidance.
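The abstract specifies neither the network architecture nor the blending weights, but its two ingredients, a per-point geometric body embedding and a blending of source-view appearance features feeding a conditional radiance field, can be sketched as follows. This is a minimal PyTorch illustration under assumed shapes and heuristics; every module name, feature size, and the direction-based visibility proxy are our assumptions, not GNR's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalRadianceField(nn.Module):
    """NeRF-style MLP conditioned on per-point geometry and appearance
    features. Layer sizes are illustrative assumptions, not GNR's."""
    def __init__(self, geo_dim=8, app_dim=16, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 3 + geo_dim + app_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 RGB channels + 1 density
        )

    def forward(self, x, d, geo, app):
        out = self.mlp(torch.cat([x, d, geo, app], dim=-1))
        return torch.sigmoid(out[..., :3]), F.relu(out[..., 3])

def geometry_embedding(x, body_verts, k=8):
    """Stand-in for the Implicit Geometric Body Embedding: describe each
    query point by its distances to the k nearest body-model vertices."""
    return torch.cdist(x, body_verts).topk(k, largest=False).values

def blend_appearance(view_feats, view_dirs, query_dir):
    """Stand-in for Screen-Space Occlusion-Aware Appearance Blending:
    weight per-view features by agreement between each source viewing
    direction and the query ray (a crude visibility proxy we assume)."""
    scores = torch.stack([(query_dir * vd).sum(-1) for vd in view_dirs])
    w = F.softmax(scores, dim=0)                       # (V, N)
    return (w.unsqueeze(-1) * torch.stack(view_feats)).sum(0)

# Toy usage on random tensors, just to show the shapes flowing through.
N, V = 1024, 4
x = torch.randn(N, 3)                                  # ray sample positions
d = F.normalize(torch.randn(N, 3), dim=-1)             # query ray directions
body_verts = torch.randn(6890, 3)                      # 6890 verts, e.g. SMPL
view_feats = [torch.randn(N, 16) for _ in range(V)]    # pixel-aligned feats
view_dirs = [F.normalize(torch.randn(N, 3), dim=-1) for _ in range(V)]
geo = geometry_embedding(x, body_verts)                # (N, 8)
app = blend_appearance(view_feats, view_dirs, d)       # (N, 16)
rgb, sigma = ConditionalRadianceField()(x, d, geo, app)
print(rgb.shape, sigma.shape)                          # (1024, 3), (1024,)
```

In the paper's setting, the appearance features would come from pixel-aligned encodings of the sparse source views and the weights from screen-space occlusion reasoning; the softmax over view directions above is only a placeholder for that step.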
To evaluate our method, we present our ongoing effort of constructing a
dataset of remarkable complexity and diversity. The dataset, GeneBody-1.0,
includes over 360M frames of 370 subjects captured under multi-view cameras,
performing a large variety of pose actions, along with diverse body shapes,
clothing, accessories and hairdos. Experiments on GeneBody-1.0 and ZJU-Mocap
show that our method is more robust than recent state-of-the-art generalizable
methods across all cross-dataset, unseen-subject and unseen-pose settings. We
also demonstrate the competitiveness of our model compared with cutting-edge
case-specific ones. The dataset, code and model will be made publicly
available.
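The abstract does not name the metrics behind these robustness comparisons; novel-view synthesis methods are conventionally compared with image-reconstruction metrics such as PSNR against held-out ground-truth views. A minimal sketch of that metric, assuming rendered and ground-truth images normalized to [0, 1]:

```python
import torch

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio between a rendered view and ground
    truth, both (H, W, 3) tensors scaled to [0, 1]."""
    mse = torch.mean((pred - gt) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

# Example: Gaussian noise of std 0.05 lands near 26 dB.
gt = torch.rand(512, 512, 3)
noisy = (gt + 0.05 * torch.randn_like(gt)).clamp(0, 1)
print(psnr(noisy, gt))
```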
Related papers
- TriHuman: A Real-time and Controllable Tri-plane Representation for
Detailed Human Geometry and Appearance Synthesis [76.73338151115253]
TriHuman is a novel human-tailored, deformable, and efficient tri-plane representation.
We non-rigidly warp global ray samples into our undeformed tri-plane texture space.
We show how such a tri-plane feature representation can be conditioned on the skeletal motion to account for dynamic appearance and geometry changes.
arXiv Detail & Related papers (2023-12-08T16:40:38Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from
Multi-view Images [79.39247661907397]
We introduce an effective framework, Generalizable Model-based Neural Radiance Fields (GM-NeRF), to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- Self-supervised Neural Articulated Shape and Appearance Models [18.99030452836038]
We propose a novel approach for learning a representation of the geometry, appearance, and motion of a class of articulated objects.
Our representation learns shape, appearance, and articulation codes that enable independent control of these semantic dimensions.
arXiv Detail & Related papers (2022-05-17T17:50:47Z)
- Neural Rendering of Humans in Novel View and Pose from Monocular Video [68.37767099240236]
We introduce a new method that generates photo-realistic humans under novel views and poses given a monocular video as input.
Our method significantly outperforms existing approaches under unseen poses and novel views given monocular videos as input.
arXiv Detail & Related papers (2022-04-04T03:09:20Z)
- Neural Human Performer: Learning Generalizable Radiance Fields for Human
Performance Rendering [34.80975358673563]
We propose a novel approach that learns generalizable neural radiance fields based on a parametric human body model for robust performance capture.
Experiments on the ZJU-MoCap and AIST datasets show that our method significantly outperforms recent generalizable NeRF methods on unseen identities and poses.
arXiv Detail & Related papers (2021-09-15T17:32:46Z)
- Neural Actor: Neural Free-view Synthesis of Human Actors with Pose
Control [80.79820002330457]
We propose a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses.
Our method achieves better quality than the state of the art on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses.
arXiv Detail & Related papers (2021-06-03T17:40:48Z)
- Monocular Real-time Full Body Capture with Inter-part Correlations [66.22835689189237]
We present the first method for real-time full body capture that estimates shape and motion of body and hands together with a dynamic 3D face model from a single color image.
Our approach uses a new neural network architecture that exploits correlations between body and hands at high computational efficiency.
arXiv Detail & Related papers (2020-12-11T02:37:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.