HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and
Retarget Faces
- URL: http://arxiv.org/abs/2307.10797v1
- Date: Thu, 20 Jul 2023 11:59:42 GMT
- Title: HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and
Retarget Faces
- Authors: Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis
Patras, Georgios Tzimiropoulos
- Abstract summary: We present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity.
Our method operates under the one-shot setting (i.e., using a single source frame) and allows for cross-subject reenactment, without requiring subject-specific fine-tuning.
We compare our method both quantitatively and qualitatively against several state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and VoxCeleb2.
- Score: 47.27033282706179
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we present our method for neural face reenactment, called
HyperReenact, that aims to generate realistic talking head images of a source
identity, driven by a target facial pose. Existing state-of-the-art face
reenactment methods train controllable generative models that learn to
synthesize realistic facial images, yet they either produce reenacted faces
that are prone to significant visual artifacts, especially under the
challenging condition of extreme head pose changes, or they require expensive
few-shot fine-tuning to better preserve the source identity characteristics.
We propose to address these limitations by leveraging the photorealistic
generation ability and the disentangled properties of a pretrained StyleGAN2
generator: we first invert the real images into its latent space and then use
a hypernetwork to perform (i) refinement of the source identity
characteristics and (ii) facial pose re-targeting, thereby eliminating the
dependence on external editing methods that typically produce artifacts. Our
method operates
under the one-shot setting (i.e., using a single source frame) and allows for
cross-subject reenactment, without requiring any subject-specific fine-tuning.
We compare our method both quantitatively and qualitatively against several
state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and
VoxCeleb2, demonstrating the superiority of our approach in producing
artifact-free images, exhibiting remarkable robustness even under extreme head
pose changes. We make the code and the pretrained models publicly available at
https://github.com/StelaBou/HyperReenact.
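To make the pipeline above concrete, here is a minimal PyTorch sketch of the refine-and-retarget idea, assuming a frozen toy generator whose convolutional weights are shifted by hypernetwork-predicted offsets. All module names, dimensions, and the offset scheme are illustrative assumptions, not the authors' implementation; the actual code lives in the linked repository.

```python
import torch
import torch.nn as nn

LATENT_DIM = 512   # StyleGAN2 W-space size (assumed W rather than W+ for brevity)
FEAT_DIM = 256     # identity/pose feature size (hypothetical)

class Encoder(nn.Module):
    """Stand-in for a GAN-inversion encoder: maps an image to a latent code
    plus a compact feature vector used to condition the hypernetwork."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_latent = nn.Linear(64, LATENT_DIM)
        self.to_feat = nn.Linear(64, FEAT_DIM)

    def forward(self, img):
        h = self.backbone(img)
        return self.to_latent(h), self.to_feat(h)

class FrozenGenerator(nn.Module):
    """Toy stand-in for the pretrained StyleGAN2 generator: its own weights
    stay frozen and are only perturbed by hypernetwork-predicted offsets."""
    def __init__(self):
        super().__init__()
        self.from_latent = nn.Linear(LATENT_DIM, 8 * 16 * 16)
        self.convs = nn.ModuleList(nn.Conv2d(8, 8, 3, padding=1) for _ in range(4))
        self.to_rgb = nn.Conv2d(8, 3, 1)
        for p in self.parameters():
            p.requires_grad_(False)

    def forward(self, w, offsets):
        x = self.from_latent(w).view(-1, 8, 16, 16)
        for conv, dw in zip(self.convs, offsets):
            # refine + retarget in one pass by shifting the frozen conv weights
            x = torch.relu(nn.functional.conv2d(x, conv.weight + dw,
                                                conv.bias, padding=1))
        return self.to_rgb(x)

class HyperNetwork(nn.Module):
    """Conditioned jointly on source-identity and target-pose features,
    predicts one weight offset per modulated generator layer."""
    def __init__(self, weight_shapes):
        super().__init__()
        self.weight_shapes = weight_shapes
        self.heads = nn.ModuleList(
            nn.Linear(2 * FEAT_DIM, int(torch.tensor(s).prod()))
            for s in weight_shapes
        )

    def forward(self, src_feat, tgt_feat):  # assumes batch size 1 for clarity
        cond = torch.cat([src_feat, tgt_feat], dim=-1)
        return [head(cond).view(shape)
                for head, shape in zip(self.heads, self.weight_shapes)]

# One-shot, cross-subject reenactment: single source frame, one driving frame.
enc, gen = Encoder(), FrozenGenerator()
hyper = HyperNetwork([tuple(c.weight.shape) for c in gen.convs])
src = torch.randn(1, 3, 64, 64)              # source identity frame
drv = torch.randn(1, 3, 64, 64)              # driving (target pose) frame
w_src, f_src = enc(src)                      # invert source into latent space
_, f_drv = enc(drv)                          # pose features of the driver
reenacted = gen(w_src, hyper(f_src, f_drv))  # no subject-specific fine-tuning
```

Because the generator stays frozen and the hypernetwork sees identity and pose features at once, a single forward pass both refines the source identity and retargets the pose, matching the one-shot, no-fine-tuning setting described in the abstract.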
Related papers
- AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models [33.39336530229545]
Face reenactment refers to the process of transferring the pose and facial expressions from a reference (driving) video onto a static facial (source) image.
Previous research in this domain has made significant progress by training controllable deep generative models to generate faces.
This paper proposes a new method based on Stable Diffusion, called AniFaceDiff, incorporating a new conditioning module for high-fidelity face reenactment.
arXiv Detail & Related papers (2024-06-19T07:08:48Z)
- DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment [34.821255203019554]
Video-driven neural face reenactment aims to synthesize realistic facial images that successfully preserve the identity and appearance of a source face.
Recent advances in Diffusion Probabilistic Models (DPMs) enable the generation of high-quality realistic images.
We present DiffusionAct, a novel method that leverages the photo-realistic image generation of diffusion models to perform neural face reenactment.
arXiv Detail & Related papers (2024-03-25T21:46:53Z)
- Effective Adapter for Face Recognition in the Wild [72.75516495170199]
We tackle the challenge of face recognition in the wild, where images often suffer from low quality and real-world distortions.
Traditional approaches, either training models directly on degraded images or on their counterparts enhanced with face restoration techniques, have proven ineffective.
We propose an effective adapter for augmenting existing face recognition models trained on high-quality facial datasets.
arXiv Detail & Related papers (2023-12-04T08:55:46Z)
- Attribute-preserving Face Dataset Anonymization via Latent Code Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images while, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z)
- Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation [100.60938767993088]
One-shot face re-enactment is a challenging task due to the identity mismatch between source and driving faces.
This paper proposes to use 3D Morphable Model (3DMM) for explicit facial semantic decomposition and identity disentanglement.
arXiv Detail & Related papers (2022-11-23T03:02:34Z)
- StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment [47.27033282706179]
We propose a framework that learns to disentangle the identity characteristics of the face from its pose.
We show that the proposed method produces higher quality results even on extreme pose variations.
arXiv Detail & Related papers (2022-09-27T13:22:35Z)
- Thinking the Fusion Strategy of Multi-reference Face Reenactment [4.1509697008011175]
We show that a simple extension using multiple reference images significantly improves generation quality.
We show this by 1) conducting the reconstruction task on a publicly available dataset, 2) conducting facial motion transfer on our original dataset, which consists of multiple people's head-movement video sequences, and 3) using a newly proposed evaluation metric to validate that our method achieves better quantitative results.
arXiv Detail & Related papers (2022-02-22T09:17:26Z)
- Finding Directions in GAN's Latent Space for Neural Face Reenactment [45.67273942952348]
This paper is on face/head reenactment where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face.
We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs.
We show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces (a minimal sketch of this latent-direction idea follows the list).
arXiv Detail & Related papers (2022-01-31T19:14:03Z)
- Head2Head: Video-based Neural Head Synthesis [50.32988828989691]
We propose a novel machine learning architecture for facial reenactment.
We show that the proposed method can transfer facial expressions, pose and gaze of a source actor to a target video in a photo-realistic fashion more accurately than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-22T00:44:43Z)
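The "Finding Directions in GAN's Latent Space" entry above refers to the following sketch: pose transfer expressed as a linear shift of the source latent code along learned directions. The direction matrix A and the 6-D pose parameterization (three head rotations plus three expression coefficients) are illustrative assumptions, not the paper's exact formulation.

```python
import torch

LATENT_DIM, POSE_DIM = 512, 6  # 6-D pose assumed: 3 rotations + 3 expression coeffs

# Learned matrix of latent directions, one column per pose parameter.
# In practice it would be trained against a differentiable pose estimator;
# it is random here purely for illustration.
A = torch.randn(LATENT_DIM, POSE_DIM) * 0.01

def retarget(w_src, p_src, p_drv):
    """Shift the source latent along the learned directions by the pose
    difference between driving and source frames; identity stays in w_src."""
    return w_src + (p_drv - p_src) @ A.T

w_src = torch.randn(1, LATENT_DIM)  # GAN inversion of the real source image
p_src = torch.zeros(1, POSE_DIM)    # source pose (e.g., from a 3D estimator)
p_drv = torch.tensor([[0.4, -0.1, 0.0, 0.2, 0.0, 0.1]])  # driving frame's pose
w_edit = retarget(w_src, p_src, p_drv)  # decode with the (fine-tuned) generator
```

Since identity is carried entirely by the inverted latent w_src and only the pose delta is added, this approach needs no editing network at inference, at the cost of assuming pose edits are approximately linear in the latent space.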