MPS-NeRF: Generalizable 3D Human Rendering from Multiview Images
- URL: http://arxiv.org/abs/2203.16875v1
- Date: Thu, 31 Mar 2022 08:09:03 GMT
- Title: MPS-NeRF: Generalizable 3D Human Rendering from Multiview Images
- Authors: Xiangjun Gao, Jiaolong Yang, Jongyoo Kim, Sida Peng, Zicheng Liu, Xin
Tong
- Abstract summary: This paper deals with rendering novel views and novel poses for a person unseen in training, using only multiview images as input.
The key ingredient is a dedicated representation combining a canonical NeRF and a volume deformation scheme.
Experiments on both real and synthetic data with the novel view synthesis and pose animation tasks collectively demonstrate the efficacy of our method.
- Score: 32.84481902544513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been rapid progress recently on 3D human rendering, including novel
view synthesis and pose animation, based on the advances of neural radiance
fields (NeRF). However, most existing methods focus on person-specific training
and their training typically requires multi-view videos. This paper deals with
a new challenging task -- rendering novel views and novel poses for a person
unseen in training, using only multiview images as input. For this task, we
propose a simple yet effective method to train a generalizable NeRF with
multiview images as conditional input. The key ingredient is a dedicated
representation combining a canonical NeRF and a volume deformation scheme.
Using a canonical space enables our method to learn shared properties of humans
and easily generalize to different people. Volume deformation is used to
connect the canonical space with input and target images and query image
features for radiance and density prediction. We leverage the parametric 3D
human model fitted on the input images to derive the deformation, which works
quite well in practice when combined with our canonical NeRF. The experiments
on both real and synthetic data with the novel view synthesis and pose
animation tasks collectively demonstrate the efficacy of our method.
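To make the canonical-space idea concrete, below is a minimal sketch, not the authors' code: a sample point in the target pose is warped into a shared canonical pose by inverting blended per-bone transforms from a fitted parametric body model (SMPL-style), and a conditional NeRF MLP then predicts density and color from the canonical coordinate plus a feature queried from the input images. All function names, shapes, and the identity rest-pose transforms here are illustrative assumptions.

    import numpy as np

    def warp_to_canonical(x_target, bone_transforms, skinning_weights):
        """Inverse-LBS warp of a 3D point from the target pose to the
        canonical pose. bone_transforms: (J, 4, 4) canonical->target
        transforms; skinning_weights: (J,) weights for this point."""
        # Blend per-bone transforms, then invert to go target->canonical.
        blended = np.einsum("j,jab->ab", skinning_weights, bone_transforms)
        x_h = np.append(x_target, 1.0)              # homogeneous coords
        x_canonical = np.linalg.inv(blended) @ x_h
        return x_canonical[:3]

    def query_canonical_nerf(x_canonical, image_feature, mlp):
        """Predict (density, rgb) at a canonical point, conditioned on a
        feature sampled from the input multiview images."""
        inp = np.concatenate([x_canonical, image_feature])
        out = mlp(inp)                              # stand-in for a trained MLP
        sigma, rgb = out[0], out[1:4]
        return sigma, rgb

    # Toy usage with random stand-ins for the fitted body model and features.
    rng = np.random.default_rng(0)
    J = 24                                          # e.g. SMPL has 24 joints
    bones = np.tile(np.eye(4), (J, 1, 1))           # identity (rest) transforms
    weights = rng.dirichlet(np.ones(J))             # valid skinning weights
    x_c = warp_to_canonical(np.array([0.1, 0.2, 0.3]), bones, weights)
    sigma, rgb = query_canonical_nerf(x_c, rng.normal(size=32),
                                      mlp=lambda v: rng.normal(size=4))

In the actual method, the image feature would come from projecting the deformed point into the input views, which is what connects the canonical space to the conditional multiview input.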
Related papers
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while efficiently inferring full-resolution images even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- PixelHuman: Animatable Neural Radiance Fields from Few Images [27.932366091437103]
We propose PixelHuman, a novel rendering model that generates animatable human scenes from a few images of a person.
Our method differs from existing methods in that it can generalize to any input image for animatable human synthesis.
Our experiments show that our method achieves state-of-the-art performance in multiview and novel pose synthesis from few-shot images.
arXiv Detail & Related papers (2023-07-18T08:41:17Z)
- ActorsNeRF: Animatable Few-shot Human Rendering with Generalizable NeRFs [61.677180970486546]
We propose a novel animatable NeRF called ActorsNeRF.
It is first pre-trained on diverse human subjects, and then adapted with few-shot monocular video frames for a new actor with unseen poses.
We quantitatively and qualitatively demonstrate that ActorsNeRF significantly outperforms the existing state-of-the-art on few-shot generalization to new people and poses on multiple datasets.
arXiv Detail & Related papers (2023-04-27T17:58:48Z)
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
- HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering (see the volume-rendering sketch after this list).
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers [34.4824364161812]
Novel view synthesis is a problem where we are given only a few context views sparsely covering a scene or an object.
The goal is to predict novel viewpoints in the scene, which requires learning priors.
We propose a 2D-only method that maps multiple context views and a query pose to a new image in a single pass of a neural network.
arXiv Detail & Related papers (2022-03-18T21:08:23Z)
- HumanNeRF: Generalizable Neural Human Radiance Field from Sparse Inputs [35.77939325296057]
Recent neural human representations can produce high-quality multi-view rendering but require dense multi-view inputs and costly training.
We present HumanNeRF, a generalizable neural representation for high-fidelity free-view synthesis of dynamic humans.
arXiv Detail & Related papers (2021-12-06T05:22:09Z)
- pixelNeRF: Neural Radiance Fields from One or Few Images [20.607712035278315]
pixelNeRF is a learning framework that predicts a continuous neural scene representation conditioned on one or few input images.
We conduct experiments on ShapeNet benchmarks for single image novel view synthesis tasks with held-out objects.
In all cases, pixelNeRF outperforms current state-of-the-art baselines for novel view synthesis and single image 3D reconstruction.
arXiv Detail & Related papers (2020-12-03T18:59:54Z)
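Several of the papers above (e.g. the single-image Vision Transformer method) rely on standard NeRF volume rendering. The following minimal sketch shows the usual quadrature from the original NeRF formulation: per-sample densities and colors along a ray are alpha-composited, weighted by the accumulated transmittance, into a single pixel color. Variable names and the random stand-in inputs are illustrative.

    import numpy as np

    def composite_ray(sigmas, rgbs, deltas):
        """sigmas: (N,) densities; rgbs: (N, 3) colors; deltas: (N,)
        distances between consecutive samples along the ray."""
        alphas = 1.0 - np.exp(-sigmas * deltas)          # per-sample opacity
        # Transmittance: probability the ray reaches each sample unoccluded.
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
        weights = alphas * trans
        return (weights[:, None] * rgbs).sum(axis=0)     # final pixel color

    rng = np.random.default_rng(0)
    color = composite_ray(rng.uniform(0, 2, 64),         # densities
                          rng.uniform(0, 1, (64, 3)),    # per-sample colors
                          np.full(64, 0.05))             # sample spacing

The same weights also yield depth or opacity maps when applied to sample depths or summed directly, which is how many of these methods supervise or visualize geometry.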