Human View Synthesis using a Single Sparse RGB-D Input
- URL: http://arxiv.org/abs/2112.13889v1
- Date: Mon, 27 Dec 2021 20:13:53 GMT
- Title: Human View Synthesis using a Single Sparse RGB-D Input
- Authors: Phong Nguyen, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkila, Tony Tung
- Abstract summary: We present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D.
An enhancer network leverages the overall fidelity, even in occluded areas from the original view, producing crisp renders with fine details.
- Score: 16.764379184593256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novel view synthesis for humans in motion is a challenging computer vision
problem that enables applications such as free-viewpoint video. Existing
methods typically use complex setups with multiple input views, 3D supervision,
or pre-trained models that do not generalize well to new identities. Aiming to
address these limitations, we present a novel view synthesis framework to
generate realistic renders from unseen views of any human captured from a
single-view sensor with sparse RGB-D, similar to a low-cost depth camera, and
without actor-specific models. We propose an architecture to learn dense
features in novel views obtained by sphere-based neural rendering, and create
complete renders using a global context inpainting model. Additionally, an
enhancer network leverages the overall fidelity, even in occluded areas from
the original view, producing crisp renders with fine details. We show our
method generates high-quality novel views of synthetic and real human actors
given a single sparse RGB-D input. It generalizes to unseen identities and new
poses, and faithfully reconstructs facial expressions. Our approach outperforms
prior human view synthesis methods and is robust to different levels of input
sparsity.
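The starting point described in the abstract, a sparse RGB-D frame viewed from a new camera, can be illustrated with a minimal NumPy sketch. This is not the paper's sphere-based neural renderer: it is a plain pinhole unprojection followed by a nearest-point splat, and the function names `unproject` and `reproject`, the intrinsics `K`, and the pose `(R, t)` are all assumptions made for illustration. It only shows why a warped sparse input leaves holes (the `mask` output) that a later inpainting stage would have to fill.

```python
import numpy as np

def unproject(depth, K):
    """Lift pixels with a valid (non-zero) depth sample to 3D points in the source camera frame."""
    v, u = np.nonzero(depth > 0)            # rows (v) and columns (u) that carry a depth sample
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1), (v, u)

def reproject(points, colors, K, R, t, height, width):
    """Splat colored 3D points into a novel view, keeping the nearest point per pixel."""
    cam = points @ R.T + t                  # source frame -> novel camera frame (p' = R p + t)
    front = cam[:, 2] > 1e-6                # discard points behind the novel camera
    cam, colors = cam[front], colors[front]
    u = np.round(K[0, 0] * cam[:, 0] / cam[:, 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * cam[:, 1] / cam[:, 2] + K[1, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, colors = u[inside], v[inside], cam[inside, 2], colors[inside]

    # Z-buffer: sort nearest first, then keep the first point that lands on each pixel.
    order = np.argsort(z, kind="stable")
    flat = (v * width + u)[order]
    _, first = np.unique(flat, return_index=True)
    keep = order[first]

    image = np.zeros((height, width, 3), dtype=colors.dtype)
    mask = np.zeros((height, width), dtype=bool)   # True where a sample landed; False marks holes
    image[v[keep], u[keep]] = colors[keep]
    mask[v[keep], u[keep]] = True
    return image, mask
```

With an identity pose (`R = np.eye(3)`, `t = np.zeros(3)`) the splat reproduces the input samples at their original pixels; with any other pose, `mask` is typically sparse and contains disoccluded regions, which is the gap the abstract's dense feature learning and inpainting stages are meant to address.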
Related papers
- Cafca: High-quality Novel View Synthesis of Expressive Faces from Casual Few-shot Captures [33.463245327698]
We present a novel volumetric prior on human faces that allows for high-fidelity expressive face modeling.
We leverage a 3D Morphable Face Model to synthesize a large training set, rendering each identity with different expressions.
We then train a conditional Neural Radiance Field prior on this synthetic dataset and, at inference time, fine-tune the model on a very sparse set of real images of a single subject.
arXiv Detail & Related papers (2024-10-01T12:24:50Z)
- InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation [61.62346472443454]
InceptionHuman is a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities to generate photorealistic 3D humans.
InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space.
arXiv Detail & Related papers (2023-11-27T15:49:41Z)
- GenLayNeRF: Generalizable Layered Representations with 3D Model Alignment for Multi-Human View Synthesis [1.6574413179773757]
GenLayNeRF is a generalizable layered scene representation for free-viewpoint rendering of multiple human subjects.
We divide the scene into multi-human layers anchored by the 3D body meshes.
We extract point-wise image-aligned and human-anchored features which are correlated and fused.
arXiv Detail & Related papers (2023-09-20T20:37:31Z)
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
- SHERF: Generalizable Human NeRF from a Single Image [59.10589479808622]
SHERF is the first generalizable Human NeRF model for recovering animatable 3D humans from a single input image.
We propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
arXiv Detail & Related papers (2023-03-22T17:59:12Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Human Pose Manipulation and Novel View Synthesis using Differentiable Rendering [46.04980667824064]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2021-11-24T19:00:07Z)
- Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans [56.63912568777483]
This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.
We propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh.
Experiments on ZJU-MoCap show that our approach outperforms prior works by a large margin in terms of novel view synthesis quality.
arXiv Detail & Related papers (2020-12-31T18:55:38Z)