Expressive Telepresence via Modular Codec Avatars
- URL: http://arxiv.org/abs/2008.11789v1
- Date: Wed, 26 Aug 2020 20:16:43 GMT
- Title: Expressive Telepresence via Modular Codec Avatars
- Authors: Hang Chu, Shugao Ma, Fernando De la Torre, Sanja Fidler, Yaser Sheikh
- Abstract summary: VR telepresence consists of interacting with another human in a virtual space represented by an avatar.
This paper takes a step in this direction and presents Modular Codec Avatars (MCA), a method to generate hyper-realistic faces driven by the cameras in the VR headset.
MCA extends traditional Codec Avatars (CA) by replacing the holistic models with a learned modular representation.
- Score: 148.212743312768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: VR telepresence consists of interacting with another human in a virtual space
represented by an avatar. Today most avatars are cartoon-like, but soon the
technology will allow video-realistic ones. This paper takes a step in this direction
and presents Modular Codec Avatars (MCA), a method to generate hyper-realistic
faces driven by the cameras in the VR headset. MCA extends traditional Codec
Avatars (CA) by replacing the holistic models with a learned modular
representation. Note that traditional person-specific CAs are learned from
few training samples, and typically lack robustness and have limited
expressiveness when transferring facial expressions. MCAs solve these
issues by learning a modulated adaptive blending of different facial components
as well as an exemplar-based latent alignment. We demonstrate that MCA achieves
improved expressiveness and robustness w.r.t. CA on a variety of real-world
datasets and practical scenarios. Finally, we showcase new applications in VR
telepresence enabled by the proposed model.
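The "modulated adaptive blending of different facial components" and "exemplar-based latent alignment" described in the abstract can be sketched in heavily simplified form. The sketch below is an assumption-laden illustration, not the paper's actual architecture: the module split, dimensions, parameter names (`W_code`, `W_blend`), and the half-way pull toward the nearest exemplar are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_MODULES = 4     # e.g. left eye, right eye, mouth, rest of face (assumed split)
CODE_DIM = 8      # per-module latent code size (illustrative)
FEAT_DIM = 16     # headset-camera feature size (illustrative)

# Hypothetical "learned" parameters: per-module code projections and
# a blend-weight head. Random stand-ins for trained weights.
W_code = rng.standard_normal((N_MODULES, FEAT_DIM, CODE_DIM)) * 0.1
W_blend = rng.standard_normal((N_MODULES, FEAT_DIM)) * 0.1

# Small bank of exemplar latent codes, standing in for codes captured
# during the person-specific enrollment.
exemplars = rng.standard_normal((N_MODULES, 5, CODE_DIM))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def encode_frame(features):
    """Blend per-module codes with adaptive weights, then snap each
    blended code toward its nearest exemplar -- a crude stand-in for
    exemplar-based latent alignment."""
    codes = np.einsum("f,mfc->mc", features, W_code)   # (M, C) raw module codes
    weights = softmax(features @ W_blend.T)            # (M,) adaptive blend weights
    blended = weights[:, None] * codes                 # modulate each module's code
    aligned = np.empty_like(blended)
    for m in range(N_MODULES):
        dists = np.linalg.norm(exemplars[m] - blended[m], axis=1)
        aligned[m] = 0.5 * (blended[m] + exemplars[m][dists.argmin()])
    return aligned

face_code = encode_frame(rng.standard_normal(FEAT_DIM))
print(face_code.shape)  # (4, 8): one aligned latent code per facial module
```

In the paper, the aligned per-module codes would then drive module-specific decoders whose outputs are composited into the final face; here the sketch stops at the latent representation.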
Related papers
- Universal Facial Encoding of Codec Avatars from VR Headsets [32.60236093340087]
We present a method that can animate a photorealistic avatar in realtime from head-mounted cameras (HMCs) on a consumer VR headset.
We present a lightweight expression calibration mechanism that increases accuracy with minimal additional cost to run-time efficiency.
arXiv Detail & Related papers (2024-07-17T22:08:15Z)
- MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video [7.648034937040346]
We present a system to create Mobile Realistic Fullbody (MoRF) avatars.
MoRF avatars are rendered in real-time on mobile devices, learned from monocular videos, and have high realism.
arXiv Detail & Related papers (2023-03-17T23:14:04Z)
- AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels [33.085274792188756]
We propose AvatarMAV, a fast 3D head avatar reconstruction method using Motion-Aware Neural Voxels.
AvatarMAV is the first to model both the canonical appearance and the decoupled expression motion by neural voxels for head avatar.
The proposed AvatarMAV can recover photo-realistic head avatars in just 5 minutes, which is significantly faster than the state-of-the-art facial reenactment methods.
arXiv Detail & Related papers (2022-11-23T18:49:31Z)
- Capturing and Animation of Body and Clothing from Monocular Video [105.87228128022804]
We present SCARF, a hybrid model combining a mesh-based body with a neural radiance field.
Integrating the mesh into the rendering enables us to optimize SCARF directly from monocular videos.
We demonstrate that SCARF reconstructs clothing with higher visual quality than existing methods, that the clothing deforms with changing body pose and body shape, and that clothing can be successfully transferred between avatars of different subjects.
arXiv Detail & Related papers (2022-10-04T19:34:05Z)
- Robust Egocentric Photo-realistic Facial Expression Transfer for Virtual Reality [68.18446501943585]
Social presence will fuel the next generation of communication systems driven by digital humans in virtual reality (VR).
The best 3D video-realistic VR avatars that minimize the uncanny effect rely on person-specific (PS) models.
This paper makes progress in overcoming these limitations by proposing an end-to-end multi-identity architecture.
arXiv Detail & Related papers (2021-04-10T15:48:53Z)
- Pixel Codec Avatars [99.36561532588831]
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered in realtime in the same scene.
arXiv Detail & Related papers (2021-04-09T23:17:36Z)
- High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation [117.32310997522394]
3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR.
Existing person-specific 3D models are not robust to lighting, hence their results typically miss subtle facial behaviors and cause artifacts in the avatar.
This paper addresses these limitations by learning a deep lighting model that, in combination with a high-quality 3D face tracking algorithm, enables subtle and robust facial motion transfer from regular video to a 3D photo-realistic avatar.
arXiv Detail & Related papers (2021-03-29T18:33:49Z)
- Audio- and Gaze-driven Facial Animation of Codec Avatars [149.0094713268313]
We describe the first approach to animate Codec Avatars in real-time using audio and/or eye tracking.
Our goal is to display expressive conversations between individuals that exhibit important social signals.
arXiv Detail & Related papers (2020-08-11T22:28:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.