PVA: Pixel-aligned Volumetric Avatars
- URL: http://arxiv.org/abs/2101.02697v1
- Date: Thu, 7 Jan 2021 18:58:46 GMT
- Title: PVA: Pixel-aligned Volumetric Avatars
- Authors: Amit Raj, Michael Zollhoefer, Tomas Simon, Jason Saragih, Shunsuke
Saito, James Hays and Stephen Lombardi
- Abstract summary: We devise a novel approach for predicting volumetric avatars of the human head given just a small number of inputs.
Our approach is trained in an end-to-end manner solely based on a photometric re-rendering loss without requiring explicit 3D supervision.
- Score: 34.929560973779466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acquisition and rendering of photo-realistic human heads is a highly
challenging research problem of particular importance for virtual telepresence.
Currently, the highest quality is achieved by volumetric approaches trained in
a person specific manner on multi-view data. These models better represent fine
structure, such as hair, compared to simpler mesh-based models. Volumetric
models typically employ a global code to represent facial expressions, such
that they can be driven by a small set of animation parameters. While such
architectures achieve impressive rendering quality, they can not easily be
extended to the multi-identity setting. In this paper, we devise a novel
approach for predicting volumetric avatars of the human head given just a small
number of inputs. We enable generalization across identities by a novel
parameterization that combines neural radiance fields with local, pixel-aligned
features extracted directly from the inputs, thus sidestepping the need for
very deep or complex networks. Our approach is trained in an end-to-end manner
solely based on a photometric re-rendering loss without requiring explicit 3D
supervision.We demonstrate that our approach outperforms the existing state of
the art in terms of quality and is able to generate faithful facial expressions
in a multi-identity setting.
Related papers
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textures and mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of high fidelity, and this explicit result supports deployment on common triangulars.
arXiv Detail & Related papers (2024-05-18T11:49:09Z) - Preface: A Data-driven Volumetric Prior for Few-shot Ultra
High-resolution Face Synthesis [0.0]
NeRFs have enabled highly realistic synthesis of human faces including complex appearance and reflectance effects of hair and skin.
We propose a novel human face prior that enables the synthesis of ultra high-resolution novel views of subjects that are not part of the prior's training distribution.
arXiv Detail & Related papers (2023-09-28T21:21:44Z) - Neural Point-based Volumetric Avatar: Surface-guided Neural Points for
Efficient and Photorealistic Volumetric Head Avatar [62.87222308616711]
We propose fullname (name), a method that adopts the neural point representation and the neural volume rendering process.
Specifically, the neural points are strategically constrained around the surface of the target expression via a high-resolution UV displacement map.
By design, our name is better equipped to handle topologically changing regions and thin structures while also ensuring accurate expression control when animating avatars.
arXiv Detail & Related papers (2023-07-11T03:40:10Z) - Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z) - Drivable Volumetric Avatars using Texel-Aligned Features [52.89305658071045]
Photo telepresence requires both high-fidelity body modeling and faithful driving to enable dynamically synthesized appearance.
We propose an end-to-end framework that addresses two core challenges in modeling and driving full-body avatars of real people.
arXiv Detail & Related papers (2022-07-20T09:28:16Z) - Vision Transformer for NeRF-Based View Synthesis from a Single Input
Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z) - Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D.
An enhancer network leverages the overall fidelity, even in occluded areas from the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z) - MetaAvatar: Learning Animatable Clothed Human Models from Few Depth
Images [60.56518548286836]
To generate realistic cloth deformations from novel input poses, watertight meshes or dense full-body scans are usually needed as inputs.
We propose an approach that can quickly generate realistic clothed human avatars, represented as controllable neural SDFs, given only monocular depth images.
arXiv Detail & Related papers (2021-06-22T17:30:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.