StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face
Reenactment
- URL: http://arxiv.org/abs/2209.13375v1
- Date: Tue, 27 Sep 2022 13:22:35 GMT
- Title: StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face
Reenactment
- Authors: Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis
Patras, Georgios Tzimiropoulos
- Abstract summary: We propose a framework that learns to disentangle the identity characteristics of the face from its pose.
We show that the proposed method produces higher-quality results even under extreme pose variations.
- Score: 47.27033282706179
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper we address the problem of neural face reenactment, where, given
a pair of a source and a target facial image, we need to transfer the target's
pose (defined as the head pose and its facial expressions) to the source image,
while at the same time preserving the source's identity characteristics (e.g.,
facial shape, hair style, etc.), even in the challenging case where the source
and the target faces belong to different identities. In doing so, we address
some of the limitations of the state-of-the-art works, namely, a) that they
depend on paired training data (i.e., source and target faces have the same
identity), b) that they rely on labeled data during inference, and c) that they
do not preserve identity in large head pose changes. More specifically, we
propose a framework that, using unpaired randomly generated facial images,
learns to disentangle the identity characteristics of the face from its pose by
incorporating the recently introduced style space $\mathcal{S}$ of StyleGAN2, a
latent representation space that exhibits remarkable disentanglement
properties. By capitalizing on this, we learn to successfully mix a pair of
source and target style codes using supervision from a 3D model. The resulting
latent code, that is subsequently used for reenactment, consists of latent
units corresponding to the facial pose of the target only and of units
corresponding to the identity of the source only, leading to notable
improvement in the reenactment performance compared to recent state-of-the-art
methods. In comparison to the state of the art, we quantitatively and qualitatively
show that the proposed method produces higher-quality results even under extreme
pose variations. Finally, we report results on real images by first embedding
them in the latent space of the pretrained generator. We make the code and
pretrained models publicly available at: https://github.com/StelaBou/StyleMask
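To make the core idea concrete, below is a minimal PyTorch sketch of mixing a source and a target style-space code with a learned channel-wise mask, in the spirit of what the abstract describes. This is not the authors' implementation: the names (`MaskPredictor`, `mix_style_codes`), the style dimensionality, and the MLP architecture are illustrative assumptions only; see the official repository linked above for the actual code.

```python
import torch
import torch.nn as nn

# Illustrative number of StyleGAN2 style-space channels; the exact size
# depends on output resolution and which layers are counted (assumption).
STYLE_DIM = 6048


class MaskPredictor(nn.Module):
    """Predicts a soft per-channel mask over the style space S:
    values near 1 take the channel from the target (pose/expression),
    values near 0 keep the channel from the source (identity)."""

    def __init__(self, dim: int = STYLE_DIM, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
            nn.Sigmoid(),
        )

    def forward(self, s_source: torch.Tensor, s_target: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s_source, s_target], dim=-1))


def mix_style_codes(s_source, s_target, mask_predictor):
    """Channel-wise blend: pose-related channels come from the target,
    identity-related channels stay with the source."""
    m = mask_predictor(s_source, s_target)
    return m * s_target + (1.0 - m) * s_source


# Usage (shapes only): in a full pipeline the mixed code would be decoded by a
# frozen StyleGAN2 generator, and the mask trained with identity losses and
# 3D-model-based pose/expression supervision on unpaired, randomly generated faces.
s_src = torch.randn(1, STYLE_DIM)
s_tgt = torch.randn(1, STYLE_DIM)
s_mixed = mix_style_codes(s_src, s_tgt, MaskPredictor())
print(s_mixed.shape)  # torch.Size([1, 6048])
```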
Related papers
- StableIdentity: Inserting Anybody into Anywhere at First Sight [57.99693188913382]
We propose StableIdentity, which allows identity-consistent recontextualization with just one face image.
We are the first to directly inject the identity learned from a single image into video/3D generation without finetuning.
arXiv Detail & Related papers (2024-01-29T09:06:15Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for
Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and
Retarget Faces [47.27033282706179]
We present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity.
Our method operates under the one-shot setting (i.e., using a single source frame) and allows for cross-subject reenactment, without requiring subject-specific fine-tuning.
We compare our method both quantitatively and qualitatively against several state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and VoxCeleb2.
arXiv Detail & Related papers (2023-07-20T11:59:42Z)
- Attribute-preserving Face Dataset Anonymization via Latent Code
Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images whilst, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z)
- Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with
Conditional StyleGAN [88.62422914645066]
We present an algorithm for re-rendering a person from a single image under arbitrary poses.
Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image.
We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
arXiv Detail & Related papers (2021-09-13T17:59:33Z)
- FaR-GAN for One-Shot Face Reenactment [20.894596219099164]
We present a one-shot face reenactment model, FaR-GAN, that takes only one face image of any given source identity and a target expression as input.
The proposed method makes no assumptions about the source identity, facial expression, head pose, or even image background.
arXiv Detail & Related papers (2020-05-13T16:15:37Z)
- One-Shot Identity-Preserving Portrait Reenactment [16.889479797252783]
We present a deep learning-based framework for portrait reenactment from a single picture of a target (one-shot) and a video of a driving subject.
We aim to address identity preservation in cross-subject portrait reenactment from a single picture.
arXiv Detail & Related papers (2020-04-26T18:30:33Z)
- ActGAN: Flexible and Efficient One-shot Face Reenactment [1.8431600219151503]
ActGAN is a novel end-to-end generative adversarial network (GAN) for one-shot face reenactment.
We introduce a "many-to-many" approach, which allows arbitrary persons as both source and target without additional retraining.
We also introduce a solution for preserving a person's identity between the synthesized and the target images by adopting a state-of-the-art approach from the deep face recognition domain.
arXiv Detail & Related papers (2020-03-30T22:03:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.