Pixel Codec Avatars
- URL: http://arxiv.org/abs/2104.04638v1
- Date: Fri, 9 Apr 2021 23:17:36 GMT
- Title: Pixel Codec Avatars
- Authors: Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li,
Fernando De La Torre, Yaser Sheikh
- Abstract summary: Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered in real time in the same scene.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Telecommunication with photorealistic avatars in virtual or augmented reality
is a promising path for achieving authentic face-to-face communication in 3D
over remote physical distances. In this work, we present the Pixel Codec
Avatars (PiCA): a deep generative model of 3D human faces that achieves state
of the art reconstruction performance while being computationally efficient and
adaptive to the rendering conditions during execution. Our model combines two
core ideas: (1) a fully convolutional architecture for decoding spatially
varying features, and (2) a rendering-adaptive per-pixel decoder. Both
techniques are integrated via a dense surface representation that is learned in
a weakly-supervised manner from low-topology mesh tracking over training
images. We demonstrate that PiCA improves reconstruction over existing
techniques across testing expressions and views on persons of different gender
and skin tone. Importantly, we show that the PiCA model is much smaller than
the state-of-the-art baseline model and makes multi-person telecommunication
possible: on a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered
in real time in the same scene.
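The abstract's second core idea, rendering-adaptive per-pixel decoding, can be sketched roughly as follows: a small shared MLP is evaluated only at pixels the rasterizer marks visible, taking interpolated surface coordinates and convolutionally decoded features as input. Everything below (array sizes, the positional encoding, the random weights) is an illustrative assumption, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

H, W = 64, 64          # framebuffer size
F = 8                  # per-pixel feature channels (rasterized from the mesh)
HID = 16               # hidden width of the tiny shared MLP

# Rasterization stand-ins: a visibility mask, interpolated surface (u, v)
# coordinates, and features sampled from a convolutional decoder's output.
visible = rng.random((H, W)) < 0.4
uv      = rng.random((H, W, 2)).astype(np.float32)
feats   = rng.standard_normal((H, W, F)).astype(np.float32)

def posenc(x, n_freqs=4):
    """Sin/cos positional encoding of (u, v) at a few frequencies."""
    freqs = (2.0 ** np.arange(n_freqs)) * np.pi
    ang = x[..., None, :] * freqs[:, None]          # (..., n_freqs, 2)
    return np.concatenate([np.sin(ang), np.cos(ang)],
                          axis=-1).reshape(*x.shape[:-1], -1)

# Tiny shared MLP weights (random here; learned in the real model).
IN = F + 2 * 4 * 2                                  # features + encoding dims
W1 = rng.standard_normal((IN, HID)).astype(np.float32) * 0.1
b1 = np.zeros(HID, np.float32)
W2 = rng.standard_normal((HID, 3)).astype(np.float32) * 0.1
b2 = np.zeros(3, np.float32)

def decode_pixels(uv_vis, feat_vis):
    x = np.concatenate([posenc(uv_vis), feat_vis], axis=-1)
    h = np.maximum(x @ W1 + b1, 0.0)                # ReLU
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # RGB in [0, 1]

# Gather only the visible pixels, decode, and scatter back into the image:
# compute scales with the number of visible pixels, not the full framebuffer.
img = np.zeros((H, W, 3), np.float32)
ys, xs = np.nonzero(visible)
img[ys, xs] = decode_pixels(uv[ys, xs], feats[ys, xs])
```

The gather/decode/scatter pattern is what makes the decoder "rendering-adaptive": cost tracks the on-screen footprint of each avatar rather than a fixed texture resolution.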
Related papers
- R2Human: Real-Time 3D Human Appearance Rendering from a Single Image [42.74145788079571]
R2Human is the first approach for real-time inference and rendering of 3D human appearance from a single image.
We present an end-to-end network that performs high-fidelity color reconstruction of visible areas and provides reliable color inference for occluded regions.
arXiv Detail & Related papers (2023-12-10T08:59:43Z)
- GETAvatar: Generative Textured Meshes for Animatable Human Avatars [69.56959932421057]
We study the problem of 3D-aware full-body human generation, aiming at creating animatable human avatars with high-quality geometries and textures.
We propose GETAvatar, a Generative model that directly generates Explicit Textured 3D meshes for animatable human Avatars.
arXiv Detail & Related papers (2023-10-04T10:30:24Z)
- HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space [90.74976459491303]
We introduce a prior model that is conditioned on the runtime inputs and tie this prior space to the 3D face model via a normalizing flow in the latent space.
A normalizing flow bridges the two representation spaces and transforms latent samples from one domain to another, allowing us to define a latent likelihood objective.
We show that our approach leads to an expressive and effective prior, capturing facial dynamics and subtle expressions better.
arXiv Detail & Related papers (2022-03-15T13:22:57Z)
- Robust Egocentric Photo-realistic Facial Expression Transfer for Virtual Reality [68.18446501943585]
Social presence will fuel the next generation of communication systems driven by digital humans in virtual reality (VR).
The best 3D video-realistic VR avatars that minimize the uncanny effect rely on person-specific (PS) models.
This paper makes progress in overcoming these limitations by proposing an end-to-end multi-identity architecture.
arXiv Detail & Related papers (2021-04-10T15:48:53Z)
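The LiP-Flow entry above describes bridging two latent spaces with a normalizing flow so that a latent likelihood objective can be defined. As a rough illustration of the change-of-variables idea, here is a single invertible affine map standing in for a full flow; all parameters and names are assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # latent dimensionality

# Invertible affine map z_face = A @ z_prior + b, carrying samples from a
# runtime-conditioned prior space into the face model's latent space.
# (Random parameters here; learned end-to-end in the real model.)
A = np.eye(D) + 0.1 * rng.standard_normal((D, D))
b = rng.standard_normal(D)

def forward(z_prior):                 # prior space -> face-model space
    return z_prior @ A.T + b

def inverse(z_face):                  # face-model space -> prior space
    return np.linalg.solve(A, (z_face - b).T).T

def log_likelihood(z_face):
    """Latent likelihood via change of variables:
    log p(z_face) = log N(inverse(z_face); 0, I) - log|det A|."""
    z0 = inverse(z_face)
    log_base = -0.5 * np.sum(z0 ** 2, axis=-1) - 0.5 * D * np.log(2 * np.pi)
    _, logdet = np.linalg.slogdet(A)
    return log_base - logdet

z_prior = rng.standard_normal((5, D))
z_face = forward(z_prior)             # map prior samples into face space
z_back = inverse(z_face)              # exact round trip, since A is invertible
ll = log_likelihood(z_face)           # per-sample latent log-likelihood
```

The invertibility is what lets such a model evaluate an exact likelihood in the target latent space, which is the property the LiP-Flow summary refers to; a real flow would stack many such layers with nonlinear couplings.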
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.