Expressive Telepresence via Modular Codec Avatars
- URL: http://arxiv.org/abs/2008.11789v1
- Date: Wed, 26 Aug 2020 20:16:43 GMT
- Title: Expressive Telepresence via Modular Codec Avatars
- Authors: Hang Chu, Shugao Ma, Fernando De la Torre, Sanja Fidler, Yaser Sheikh
- Abstract summary: VR telepresence consists of interacting with another human in a virtual space represented by an avatar.
This paper takes a step in this direction and presents Modular Codec Avatars (MCA), a method to generate hyper-realistic faces driven by the cameras in the VR headset.
MCA extends traditional Codec Avatars (CA) by replacing the holistic models with a learned modular representation.
- Score: 148.212743312768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: VR telepresence consists of interacting with another human in a virtual space
represented by an avatar. Today most avatars are cartoon-like, but soon the
technology will allow video-realistic ones. This paper takes a step in this direction
and presents Modular Codec Avatars (MCA), a method to generate hyper-realistic
faces driven by the cameras in the VR headset. MCA extends traditional Codec
Avatars (CA) by replacing the holistic models with a learned modular
representation. It is important to note that traditional person-specific CAs
are learned from few training samples, and typically lack robustness and
exhibit limited expressiveness when transferring facial expressions. MCAs solve these
issues by learning a modulated adaptive blending of different facial components
as well as an exemplar-based latent alignment. We demonstrate that MCA achieves
improved expressiveness and robustness w.r.t. CA in a variety of real-world
datasets and practical scenarios. Finally, we showcase new applications in VR
telepresence enabled by the proposed model.
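The abstract names two mechanisms, modulated adaptive blending of facial components and exemplar-based latent alignment, without further detail. Below is a minimal sketch of one plausible reading of both; the module names, tensor shapes, and softmax formulations are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two mechanisms named in the abstract;
# all names, shapes, and formulations are illustrative assumptions.
import torch
import torch.nn as nn

class ModularBlend(nn.Module):
    """One reading of "modulated adaptive blending": combine per-component
    face codes (e.g. eyes, mouth) with weights predicted from the
    headset-camera features."""
    def __init__(self, n_parts: int, feat_dim: int):
        super().__init__()
        self.weight_net = nn.Linear(feat_dim, n_parts)  # one weight per part

    def forward(self, part_codes: torch.Tensor, hmc_feats: torch.Tensor):
        # part_codes: (B, n_parts, code_dim); hmc_feats: (B, feat_dim)
        w = torch.softmax(self.weight_net(hmc_feats), dim=-1)  # (B, n_parts)
        return (w.unsqueeze(-1) * part_codes).sum(dim=1)       # (B, code_dim)

def exemplar_align(code: torch.Tensor, exemplars: torch.Tensor, temp: float = 0.1):
    """One reading of "exemplar-based latent alignment": pull a predicted
    code toward a convex combination of K stored exemplar codes, keeping
    it on the manifold of expressions seen during training."""
    sims = code @ exemplars.t() / temp               # (B, K) similarities
    return torch.softmax(sims, dim=-1) @ exemplars   # (B, code_dim)
```

A standard Codec Avatar decoder would then map the blended, aligned code to the avatar's geometry and texture.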
Related papers
- EgoAvatar: Egocentric View-Driven and Photorealistic Full-body Avatars [56.56236652774294]
We propose a person-specific egocentric telepresence approach, which jointly models the photoreal digital avatar while also driving it from a single egocentric video.
Our experiments demonstrate a clear step towards egocentric and photoreal telepresence as our method outperforms baselines as well as competing methods.
arXiv Detail & Related papers (2024-09-22T22:50:27Z)
- GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars [44.8290935585746]
Photo-realistic and controllable 3D avatars are crucial for various applications such as virtual and mixed reality (VR/MR), telepresence, gaming, and film production.
Traditional methods for avatar creation often involve time-consuming scanning and reconstruction processes for each avatar.
We propose a text-conditioned generative model that can generate photo-realistic facial avatars of diverse identities.
arXiv Detail & Related papers (2024-08-24T21:25:22Z)
- Universal Facial Encoding of Codec Avatars from VR Headsets [32.60236093340087]
We present a method that can animate a photorealistic avatar in realtime from head-mounted cameras (HMCs) on a consumer VR headset.
We present a lightweight expression calibration mechanism that increases accuracy with minimal additional cost to run-time efficiency.
arXiv Detail & Related papers (2024-07-17T22:08:15Z)
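The calibration mechanism in the entry above is only named, not specified. As a hedged illustration, one lightweight option would be a per-user affine correction of the predicted expression codes, fit in closed form from a few calibration frames; everything below (names, the least-squares formulation) is an assumption, not the paper's method.

```python
# Hypothetical sketch of a lightweight per-user expression calibration:
# an affine correction fit by least squares from a few frames. This is
# an illustration, not the paper's actual mechanism.
import numpy as np

def fit_calibration(pred_codes: np.ndarray, target_codes: np.ndarray):
    """pred_codes, target_codes: (n_frames, code_dim) arrays. Returns
    (A, b) such that target ~= pred @ A + b, in closed form."""
    X = np.hstack([pred_codes, np.ones((len(pred_codes), 1))])  # bias column
    W, *_ = np.linalg.lstsq(X, target_codes, rcond=None)
    return W[:-1], W[-1]  # A: (code_dim, code_dim), b: (code_dim,)

def apply_calibration(code: np.ndarray, A: np.ndarray, b: np.ndarray):
    # One matrix-vector product per frame: negligible run-time cost.
    return code @ A + b
```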
- Capturing and Animation of Body and Clothing from Monocular Video [105.87228128022804]
We present SCARF, a hybrid model combining a mesh-based body with a neural radiance field.
Integrating the mesh into the rendering enables us to optimize SCARF directly from monocular videos.
We demonstrate that SCARF reconstructs clothing with higher visual quality than existing methods, that the clothing deforms with changing body pose and body shape, and that clothing can be successfully transferred between avatars of different subjects.
arXiv Detail & Related papers (2022-10-04T19:34:05Z)
- Robust Egocentric Photo-realistic Facial Expression Transfer for Virtual Reality [68.18446501943585]
Social presence will fuel the next generation of communication systems driven by digital humans in virtual reality (VR).
The best 3D video-realistic VR avatars that minimize the uncanny effect rely on person-specific (PS) models.
This paper makes progress in overcoming the limitations of PS models by proposing an end-to-end multi-identity architecture.
arXiv Detail & Related papers (2021-04-10T15:48:53Z)
- Pixel Codec Avatars [99.36561532588831]
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered in realtime in the same scene.
arXiv Detail & Related papers (2021-04-09T23:17:36Z)
- High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation [117.32310997522394]
3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR.
Existing person-specific 3D models are not robust to lighting, hence their results typically miss subtle facial behaviors and cause artifacts in the avatar.
This paper addresses previous limitations by learning a deep lighting model that, in combination with a high-quality 3D face tracking algorithm, provides a method for subtle and robust facial motion transfer from a regular video to a 3D photo-realistic avatar.
arXiv Detail & Related papers (2021-03-29T18:33:49Z)
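The entry above describes combining a learned lighting model with face tracking only at a high level. As a hedged sketch of the general idea (not the paper's architecture), the decoder below factors appearance into an expression-driven texture and a multiplicative gain predicted from a lighting code, so illumination can vary without corrupting the transferred expression; all names and dimensions are assumptions.

```python
# Hypothetical sketch: factoring a lighting code out of avatar decoding so
# expression transfer is robust to illumination. Architecture, dimensions,
# and the multiplicative gating are assumptions, not the paper's model.
import torch
import torch.nn as nn

class RelightableDecoder(nn.Module):
    def __init__(self, expr_dim: int = 128, light_dim: int = 27, tex_res: int = 64):
        super().__init__()
        self.tex_res = tex_res
        # Expression branch: decodes an RGB albedo-like texture.
        self.expr_net = nn.Sequential(
            nn.Linear(expr_dim, 256), nn.ReLU(),
            nn.Linear(256, tex_res * tex_res * 3),
        )
        # Lighting branch: a per-pixel log-gain from the lighting code
        # (light_dim=27 assumes 9 spherical-harmonic coeffs x 3 channels).
        self.light_net = nn.Sequential(
            nn.Linear(light_dim, 256), nn.ReLU(),
            nn.Linear(256, tex_res * tex_res),
        )

    def forward(self, expr_code: torch.Tensor, light_code: torch.Tensor):
        b = expr_code.shape[0]
        albedo = self.expr_net(expr_code).view(b, 3, self.tex_res, self.tex_res)
        gain = self.light_net(light_code).view(b, 1, self.tex_res, self.tex_res)
        return torch.sigmoid(albedo) * torch.exp(gain)  # shaded texture
```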
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.