Relightable Neural Video Portrait
- URL: http://arxiv.org/abs/2107.14735v1
- Date: Fri, 30 Jul 2021 16:20:45 GMT
- Title: Relightable Neural Video Portrait
- Authors: Youjia Wang, Taotao Zhou, Minzhang Li, Teng Xu, Minye Wu, Lan Xu,
Jingyi Yu
- Abstract summary: Photo-realistic facial video portrait reenactment benefits virtual production and numerous VR/AR experiences.
We present a relightable neural video portrait, a simultaneous relighting and reenactment scheme that transfers the head pose and facial expressions from a source actor to a portrait video of a target actor with arbitrary new backgrounds and lighting conditions.
- Score: 36.67623086400362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Photo-realistic facial video portrait reenactment benefits virtual production
and numerous VR/AR experiences. The task remains challenging as the portrait
should maintain high realism and consistency with the target environment. In
this paper, we present a relightable neural video portrait, a simultaneous
relighting and reenactment scheme that transfers the head pose and facial
expressions from a source actor to a portrait video of a target actor with
arbitrary new backgrounds and lighting conditions. Our approach combines 4D
reflectance field learning, model-based facial performance capture and
target-aware neural rendering. Specifically, we adopt a rendering-to-video
translation network to first synthesize high-quality OLAT imagesets and alpha
mattes from hybrid facial performance capture results. We then design a
semantic-aware facial normalization scheme to enable reliable explicit control
as well as a multi-frame multi-task learning strategy to encode content,
segmentation and temporal information simultaneously for high-quality
reflectance field inference. After training, our approach further enables
photo-realistic and controllable video portrait editing of the target
performer. Reliable face poses and expression editing is obtained by applying
the same hybrid facial capture and normalization scheme to the source video
input, while our explicit alpha and OLAT output enable high-quality relit and
background editing. With the ability to achieve simultaneous relighting and
reenactment, we are able to improve the realism in a variety of virtual
production and video rewrite applications.
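The OLAT imagesets and alpha mattes described above are what make relighting and background editing explicit: light transport is linear, so a relit foreground is a weighted sum of the OLAT basis images, which is then alpha-composited over the new background. A minimal NumPy sketch of that composition step (array names and the environment-map sampling are illustrative assumptions, not the paper's code):

```python
import numpy as np

def relight_and_composite(olat_stack, light_weights, alpha, background):
    """Relight a portrait from its OLAT basis and composite onto a new background.

    olat_stack:    (L, H, W, 3) one-light-at-a-time images, one per light
    light_weights: (L, 3) per-light RGB intensities, e.g. sampled from the
                   target environment map (sampling step not shown)
    alpha:         (H, W, 1) alpha matte predicted by the network
    background:    (H, W, 3) arbitrary new background
    """
    # Image-based relighting: light transport is linear, so the relit
    # foreground is a weighted sum of the OLAT basis images.
    relit = np.einsum('lhwc,lc->hwc', olat_stack, light_weights)
    # Alpha-composite the relit foreground over the new background.
    return alpha * relit + (1.0 - alpha) * background
```

The same two outputs (OLAT stack and matte) thus support both edits independently: swap `light_weights` to relight, swap `background` to change the scene.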
Related papers
- Lite2Relight: 3D-aware Single Image Portrait Relighting [87.62069509622226]
Lite2Relight is a novel technique that can predict 3D-consistent head poses of portraits.
By utilizing a pre-trained geometry-aware encoder and a feature alignment module, we map input images into a relightable 3D space.
This includes producing 3D-consistent results of the full head, including hair, eyes, and expressions.
arXiv Detail & Related papers (2024-07-15T07:16:11Z)
- ReliTalk: Relightable Talking Portrait Generation from a Single Video [62.47116237654984]
ReliTalk is a novel framework for relightable audio-driven talking portrait generation from monocular videos.
Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.
arXiv Detail & Related papers (2023-09-05T17:59:42Z)
- HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces [47.27033282706179]
We present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity.
Our method operates under the one-shot setting (i.e., using a single source frame) and allows for cross-subject reenactment, without requiring subject-specific fine-tuning.
We compare our method both quantitatively and qualitatively against several state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and VoxCeleb2.
arXiv Detail & Related papers (2023-07-20T11:59:42Z)
- Learning to Relight Portrait Images via a Virtual Light Stage and Synthetic-to-Real Adaptation [76.96499178502759]
Relighting aims to re-illuminate the person in the image as if the person appeared in an environment with the target lighting.
Recent methods rely on deep learning to achieve high-quality results.
We propose a new approach that can perform on par with the state-of-the-art (SOTA) relighting methods without requiring a light stage.
arXiv Detail & Related papers (2022-09-21T17:15:58Z)
- NARRATE: A Normal Assisted Free-View Portrait Stylizer [42.38374601073052]
NARRATE is a novel pipeline that enables simultaneously editing portrait lighting and perspective in a photorealistic manner.
We experimentally demonstrate that NARRATE achieves more photorealistic, reliable results over prior works.
We showcase vivid free-view facial animations as well as 3D-aware relighting, which help facilitate various AR/VR applications.
arXiv Detail & Related papers (2022-07-03T07:54:05Z)
- High-Quality Real Time Facial Capture Based on Single Camera [0.0]
We train a convolutional neural network to produce high-quality continuous blendshape weight output from video training.
We demonstrate compelling animation inference in challenging areas such as eyes and lips.
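The entry above regresses continuous blendshape weights from video; such weights conventionally drive a linear blendshape rig, i.e. a neutral mesh plus a weighted sum of per-shape vertex offsets. A minimal NumPy sketch under that standard model (function and array names are hypothetical, not from the paper):

```python
import numpy as np

def apply_blendshapes(neutral, deltas, weights):
    """Deform a neutral face mesh with blendshape weights.

    neutral: (V, 3) vertex positions of the neutral face
    deltas:  (B, V, 3) per-blendshape vertex offsets from neutral
    weights: (B,) continuous weights, e.g. regressed per frame by a CNN
    """
    # Standard linear blendshape model: neutral + weighted sum of offsets.
    return neutral + np.einsum('b,bvc->vc', weights, deltas)
```

Because the model is linear in the weights, a network that outputs smooth, continuous weights per frame yields temporally smooth mesh animation, which matters for subtle regions like eyes and lips.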
arXiv Detail & Related papers (2021-11-15T06:42:27Z)
- Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video generation and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
- Neural Reflectance Fields for Appearance Acquisition [61.542001266380375]
We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene.
We combine this representation with a physically-based differentiable ray marching framework that can render images from a neural reflectance field under any viewpoint and light.
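The differentiable ray marching mentioned above typically follows the standard emission-absorption volume rendering model: per-sample opacities from density, cumulative transmittance, then front-to-back compositing. A minimal NumPy sketch for a single ray (names are illustrative; the actual framework also shades the per-sample colors from the reflectance field under the active light):

```python
import numpy as np

def march_ray(sigmas, colors, deltas):
    """Accumulate color along one ray via emission-absorption compositing.

    sigmas: (N,) volume densities at the ray samples
    colors: (N, 3) shaded RGB at the samples (assumed given here)
    deltas: (N,) distances between consecutive samples
    """
    # Per-sample opacity from density and step size.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    # Front-to-back alpha compositing of the samples.
    return np.sum((trans * alphas)[:, None] * colors, axis=0)
```

Every operation here is differentiable in `sigmas` and `colors`, which is what lets such a renderer be trained end-to-end from images.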
arXiv Detail & Related papers (2020-08-09T22:04:36Z)
- Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at near real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.