Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks
- URL: http://arxiv.org/abs/2008.07783v2
- Date: Fri, 18 Sep 2020 11:21:18 GMT
- Title: Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks
- Authors: Guangming Yao, Yi Yuan, Tianjia Shao, Kun Zhou
- Abstract summary: We introduce a method for one-shot face reenactment, which uses the reconstructed 3D meshes as guidance to learn the optical flow needed for the reenacted face synthesis.
We propose a motion net, an asymmetric autoencoder, to learn the face motion.
Our method can generate high-quality results and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.
- Score: 31.083072922977568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face reenactment aims to animate a source face image to a different pose and
expression provided by a driving image. Existing approaches are either designed
for a specific identity or suffer from the identity preservation problem in
one-shot or few-shot scenarios. In this paper, we introduce a method for
one-shot face reenactment, which uses the reconstructed 3D meshes (i.e., the
source mesh and driving mesh) as guidance to learn the optical flow needed for
the reenacted face synthesis. Technically, we explicitly exclude the driving
face's identity information from the reconstructed driving mesh. In this way, our
network can focus on the motion estimation for the source face without the
interference of the driving face's shape. We propose a motion net, an
asymmetric autoencoder, to learn the face motion. The encoder is a graph
convolutional network (GCN) that learns a latent motion vector from the meshes,
and the decoder serves to produce an optical flow image from the latent vector
with CNNs. Compared to previous methods using sparse keypoints to guide the
optical flow learning, our motion net learns the optical flow directly from
dense 3D meshes, which provide detailed shape and pose information for the
optical flow, so it can achieve more accurate expression and pose on the
reenacted face. Extensive experiments show that our method can generate
high-quality results and outperforms state-of-the-art methods in both
qualitative and quantitative comparisons.
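As a rough illustration of the motion net described in the abstract, the sketch below pairs a small graph convolutional encoder over mesh vertices with a CNN decoder that upsamples the latent motion vector into a dense optical flow field, which is then used to warp the source image. This is a minimal PyTorch sketch under assumed shapes; the vertex count, adjacency handling, latent size, and flow resolution are illustrative choices, not the paper's actual configuration.

```python
# Minimal sketch of a mesh-guided motion net: a GCN encoder over mesh
# vertices plus a CNN decoder producing a dense optical flow field that
# warps the source image. All layer sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphConv(nn.Module):
    """One graph convolution layer: X' = ReLU((A_hat X) W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_hat):
        # x: (B, V, in_dim), a_hat: (V, V) normalized mesh adjacency
        return F.relu(self.linear(a_hat @ x))


class MotionNet(nn.Module):
    """Asymmetric autoencoder: GCN encoder on source/driving meshes,
    CNN decoder emitting a 2-channel optical flow image."""
    def __init__(self, num_vertices, latent_dim=256):
        super().__init__()
        self.gc1 = GraphConv(3, 32)            # per-vertex (x, y, z)
        self.gc2 = GraphConv(32, 64)
        self.to_latent = nn.Linear(num_vertices * 64, latent_dim)
        # Decoder: concatenated latent vectors -> 64x64 flow (dx, dy)
        self.fc = nn.Linear(2 * latent_dim, 512 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.ReLU(),  # 8x8
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),  # 16x16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # 32x32
            nn.ConvTranspose2d(64, 2, 4, 2, 1),                # 64x64 flow
        )

    def encode(self, verts, a_hat):
        h = self.gc2(self.gc1(verts, a_hat), a_hat)   # (B, V, 64)
        return self.to_latent(h.flatten(1))           # (B, latent_dim)

    def forward(self, src_verts, drv_verts, a_hat):
        z = torch.cat([self.encode(src_verts, a_hat),
                       self.encode(drv_verts, a_hat)], dim=1)
        h = self.fc(z).view(-1, 512, 4, 4)
        return self.decoder(h)                        # (B, 2, 64, 64)


def warp(image, flow):
    """Warp an image (same spatial size as the flow) by bilinear sampling."""
    b, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)            # offsets in [-1, 1] coords
    return F.grid_sample(image, grid, align_corners=True)
```

In the paper the flow drives the synthesis of the reenacted face at full image resolution; the sketch only shows the encoder-decoder asymmetry (graph convolutions on mesh vertices versus transposed convolutions on the image grid) and the flow-based warping step.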
Related papers
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle the lack of 3D information in 2D-based face animation models.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features [17.531847357428454]
The task of face reenactment is to transfer the head motion and facial expressions from a driving video to the appearance of a source image.
Most existing methods are CNN-based and estimate optical flow from the source image to the current driving frame.
We propose a transformer-based encoder for computing a set-latent representation of the source image.
arXiv Detail & Related papers (2024-04-15T12:37:26Z)
- One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space [37.357842761713705]
We present a framework for neural face/head reenactment whose goal is to transfer the 3D head orientation and expression of a target face to a source face.
Our method features several favorable properties including using a single source image (one-shot) and enabling cross-person reenactment.
arXiv Detail & Related papers (2024-02-05T22:12:42Z)
- Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation [100.60938767993088]
One-shot face re-enactment is a challenging task due to the identity mismatch between source and driving faces.
This paper proposes to use 3D Morphable Model (3DMM) for explicit facial semantic decomposition and identity disentanglement.
arXiv Detail & Related papers (2022-11-23T03:02:34Z)
- Video2StyleGAN: Encoding Video in Latent Space for Manipulation [63.03250800510085]
We propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.
Our approach can significantly outperform existing single image methods, while achieving real-time (66 fps) speed.
arXiv Detail & Related papers (2022-06-27T06:48:15Z)
- Finding Directions in GAN's Latent Space for Neural Face Reenactment [45.67273942952348]
This paper is on face/head reenactment where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face.
We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs.
We show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces.
arXiv Detail & Related papers (2022-01-31T19:14:03Z)
- Face Forgery Detection by 3D Decomposition [72.22610063489248]
We consider a face image as the production of the intervention of the underlying 3D geometry and the lighting environment.
By disentangling the face image into 3D shape, common texture, identity texture, ambient light, and direct light, we find the devil lies in the direct light and the identity texture.
We propose to utilize facial detail, which is the combination of direct light and identity texture, as the clue to detect the subtle forgery patterns.
arXiv Detail & Related papers (2020-11-19T09:25:44Z)
- Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at nearly real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z)
- DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z)
- Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single-image deblurring is indeed feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
- Exploiting Semantics for Face Image Deblurring [121.44928934662063]
We propose an effective and efficient face deblurring algorithm by exploiting semantic cues via deep convolutional neural networks.
We incorporate face semantic labels as input priors and propose an adaptive structural loss to regularize facial local structures.
The proposed method restores sharp images with more accurate facial features and details.
arXiv Detail & Related papers (2020-01-19T13:06:27Z)