Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering
- URL: http://arxiv.org/abs/2109.07448v1
- Date: Wed, 15 Sep 2021 17:32:46 GMT
- Title: Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering
- Authors: Youngjoong Kwon and Dahun Kim and Duygu Ceylan and Henry Fuchs
- Abstract summary: We propose a novel approach that learns generalizable neural radiance fields based on a parametric human body model for robust performance capture.
Experiments on the ZJU-MoCap and AIST datasets show that our method significantly outperforms recent generalizable NeRF methods on unseen identities and poses.
- Score: 34.80975358673563
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim at synthesizing a free-viewpoint video of an arbitrary
human performance using sparse multi-view cameras. Recently, several works have
addressed this problem by learning person-specific neural radiance fields
(NeRF) to capture the appearance of a particular human. In parallel, other
works have proposed using pixel-aligned features to generalize radiance fields
to arbitrary new scenes and objects. Applying such generalization approaches to
humans, however, is highly challenging due to the heavy occlusions and dynamic
articulations of body parts. To tackle this, we propose Neural Human Performer,
a novel approach that learns generalizable neural radiance fields based on a
parametric human body model for robust performance capture. Specifically, we
first introduce a temporal transformer that aggregates tracked visual features
based on the skeletal body motion over time. Moreover, a multi-view transformer
is proposed to perform cross-attention between the temporally-fused features
and the pixel-aligned features at each time step to integrate observations on
the fly from multiple views. Experiments on the ZJU-MoCap and AIST datasets
show that our method significantly outperforms recent generalizable NeRF
methods on unseen identities and poses. The video results and code are
available at https://youngjoongunc.github.io/nhp.
Related papers
- GHuNeRF: Generalizable Human NeRF from a Monocular Video [63.741714198481354]
  GHuNeRF learns a generalizable human NeRF model from a monocular video.
  We validate our approach on the widely-used ZJU-MoCap dataset.
  arXiv Detail & Related papers (2023-08-31T09:19:06Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
  We introduce an effective framework, Generalizable Model-based Neural Radiance Fields (GM-NeRF), to synthesize free-viewpoint images.
  Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
  arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
  We present a learning algorithm for human activity recognition in videos.
  Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
  We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
  arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
  We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
  The learned representation can be used to synthesize novel-view images of an arbitrary person from a sparse set of cameras and to animate them with the user's pose control.
  arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis [52.720314035084215]
  This work targets using a general deep learning framework to synthesize free-viewpoint images of arbitrary human performers.
  We present a simple yet powerful framework, named Generalizable Neural Performer (GNR), that learns a generalizable and robust neural body representation.
  Experiments on GeneBody-1.0 and ZJU-MoCap show better robustness of our method than recent state-of-the-art generalizable methods.
  arXiv Detail & Related papers (2022-04-25T17:14:22Z)
- Neural Rendering of Humans in Novel View and Pose from Monocular Video [68.37767099240236]
  We introduce a new method that generates photo-realistic humans under novel views and poses given a monocular video as input.
  Our method significantly outperforms existing approaches under unseen poses and novel views given monocular videos as input.
  arXiv Detail & Related papers (2022-04-04T03:09:20Z)
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
  We present a novel view synthesis framework that generates realistic renders from unseen views of any human captured by a single-view, sparse RGB-D sensor.
  An enhancer network improves the overall fidelity, even in areas occluded in the original view, producing crisp renders with fine details.
  arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- HumanNeRF: Generalizable Neural Human Radiance Field from Sparse Inputs [35.77939325296057]
  Recent neural human representations can produce high-quality multi-view rendering but require dense multi-view inputs and costly training.
  We present HumanNeRF, a generalizable neural representation for high-fidelity free-view synthesis of dynamic humans.
  arXiv Detail & Related papers (2021-12-06T05:22:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.