Multi-Modal Face Stylization with a Generative Prior
- URL: http://arxiv.org/abs/2305.18009v2
- Date: Mon, 25 Sep 2023 03:29:59 GMT
- Title: Multi-Modal Face Stylization with a Generative Prior
- Authors: Mengtian Li, Yi Dong, Minxuan Lin, Haibin Huang, Pengfei Wan,
Chongyang Ma
- Abstract summary: MMFS supports multi-modal face stylization by leveraging the strengths of StyleGAN.
We introduce a two-stage training strategy, where we train the encoder in the first stage to align the feature maps with StyleGAN.
In the second stage, the entire network is fine-tuned with artistic data for stylized face generation.
- Score: 27.79677001997915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce a new approach for face stylization. Despite
existing methods achieving impressive results in this task, there is still room
for improvement in generating high-quality artistic faces with diverse styles
and accurate facial reconstruction. Our proposed framework, MMFS, supports
multi-modal face stylization by leveraging the strengths of StyleGAN and
integrating it into an encoder-decoder architecture. Specifically, we use the
mid-resolution and high-resolution layers of StyleGAN as the decoder to
generate high-quality faces, while aligning its low-resolution layer with the
encoder to extract and preserve input facial details. We also introduce a
two-stage training strategy, where we train the encoder in the first stage to
align the feature maps with StyleGAN and enable a faithful reconstruction of
input faces. In the second stage, the entire network is fine-tuned with
artistic data for stylized face generation. To enable the fine-tuned model to
be applied in zero-shot and one-shot stylization tasks, we train an additional
mapping network from the large-scale Contrastive Language-Image Pre-training
(CLIP) space to the latent $w+$ space of the fine-tuned StyleGAN. Qualitative and
quantitative experiments show that our framework achieves superior performance
in both one-shot and zero-shot face stylization tasks, outperforming
state-of-the-art methods by a large margin.
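As a rough illustration of the architecture described in the abstract (an encoder aligned with StyleGAN's low-resolution layer, the mid- and high-resolution StyleGAN layers acting as the decoder, and a CLIP-to-$w+$ mapping network for zero-/one-shot stylization), the following minimal PyTorch sketch wires up the three components. It is a hypothetical simplification, not the authors' implementation: the StyleGAN synthesis blocks are replaced by plain upsampling convolutions, the CLIP embedding is a random 512-d vector, and all sizes (32x32 alignment resolution, 14 entries in $w+$) are assumptions.

```python
# Hypothetical sketch of an MMFS-style pipeline (not the authors' code).
# StyleGAN synthesis layers and the CLIP encoder are replaced with plain
# stand-in modules so the example runs with only PyTorch installed.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a 256x256 face to a 32x32 feature map, standing in for the
    features aligned with StyleGAN's low-resolution layer."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.LeakyReLU(0.2),        # 256 -> 128
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.LeakyReLU(0.2),   # 128 -> 64
            nn.Conv2d(ch * 2, ch * 8, 3, stride=2, padding=1), nn.LeakyReLU(0.2)  # 64 -> 32
        )
    def forward(self, x):
        return self.net(x)

class SynthesisStub(nn.Module):
    """Stand-in for the mid-/high-resolution StyleGAN layers used as the decoder."""
    def __init__(self, ch=512):
        super().__init__()
        blocks, c = [], ch
        for _ in range(3):  # 32 -> 64 -> 128 -> 256
            blocks += [nn.Upsample(scale_factor=2),
                       nn.Conv2d(c, c // 2, 3, padding=1), nn.LeakyReLU(0.2)]
            c //= 2
        self.blocks = nn.Sequential(*blocks)
        self.to_rgb = nn.Conv2d(c, 3, 1)
    def forward(self, feat):
        return torch.tanh(self.to_rgb(self.blocks(feat)))

class ClipToWPlus(nn.Module):
    """Maps a CLIP embedding (512-d assumed) to a w+ code, one w per synthesis layer."""
    def __init__(self, clip_dim=512, w_dim=512, num_ws=14):
        super().__init__()
        self.num_ws, self.w_dim = num_ws, w_dim
        self.mlp = nn.Sequential(
            nn.Linear(clip_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, num_ws * w_dim),
        )
    def forward(self, clip_emb):
        return self.mlp(clip_emb).view(-1, self.num_ws, self.w_dim)

if __name__ == "__main__":
    face = torch.randn(1, 3, 256, 256)
    feat = Encoder()(face)                        # low-res features aligned with StyleGAN
    stylized = SynthesisStub()(feat)              # stylized face from the mid/high-res decoder
    wplus = ClipToWPlus()(torch.randn(1, 512))    # zero-shot: CLIP embedding -> w+ code
    print(feat.shape, stylized.shape, wplus.shape)
```

In the full model the $w+$ code produced by the mapping network would modulate the synthesis layers; this stub omits style modulation and the two-stage training procedure for brevity.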
Related papers
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- Ada-adapter:Fast Few-shot Style Personlization of Diffusion Model with Pre-trained Image Encoder [57.574544285878794]
Ada-Adapter is a novel framework for few-shot style personalization of diffusion models.
Our method enables efficient zero-shot style transfer utilizing a single reference image.
We demonstrate the effectiveness of our approach on various artistic styles, including flat art, 3D rendering, and logo design.
arXiv Detail & Related papers (2024-07-08T02:00:17Z)
- E2F-Net: Eyes-to-Face Inpainting via StyleGAN Latent Space [4.110419543591102]
We propose a Generative Adversarial Network (GAN)-based model called Eyes-to-Face Network (E2F-Net).
The proposed approach extracts identity and non-identity features from the periocular region using two dedicated encoders.
We show that our method successfully reconstructs the whole face with high quality, surpassing current techniques.
arXiv Detail & Related papers (2024-03-18T19:11:34Z)
- High-Fidelity Face Swapping with Style Blending [16.024260677867076]
We propose an innovative end-to-end framework for high-fidelity face swapping.
First, we introduce a StyleGAN-based facial attributes encoder that extracts essential features from faces and inverts them into a latent style code.
Second, we introduce an attention-based style blending module to effectively transfer Face IDs from source to target.
arXiv Detail & Related papers (2023-12-17T23:22:37Z)
- Face Cartoonisation For Various Poses Using StyleGAN [0.7673339435080445]
This paper presents an innovative approach to achieve face cartoonisation while preserving the original identity and accommodating various poses.
We achieve this by introducing an encoder that captures both pose and identity information from images and generates a corresponding embedding within the StyleGAN latent space.
We show by extensive experimentation how our encoder adapts the StyleGAN output to better preserve identity when the objective is cartoonisation.
arXiv Detail & Related papers (2023-09-26T13:10:25Z)
- StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces [103.54337984566877]
We use dilated convolutions to rescale the receptive fields of shallow layers in StyleGAN without altering any model parameters.
This allows fixed-size small features at shallow layers to be extended into larger ones that can accommodate variable resolutions.
We validate our method using unaligned face inputs of various resolutions in a diverse set of face manipulation tasks.
arXiv Detail & Related papers (2023-03-10T18:59:33Z)
- End-to-end Face-swapping via Adaptive Latent Representation Learning [12.364688530047786]
This paper proposes a novel, end-to-end integrated framework for high-resolution, attribute-preserving face swapping.
Our framework, which integrates facial perceiving and blending into the end-to-end training and testing process, achieves highly realistic face swapping on wild faces.
arXiv Detail & Related papers (2023-03-07T19:16:20Z)
- StyleSwap: Style-Based Generator Empowers Robust Face Swapping [90.05775519962303]
We introduce a concise and effective framework named StyleSwap.
Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping.
We identify that with only minimal modifications, a StyleGAN2 architecture can successfully handle the desired information from both source and target.
arXiv Detail & Related papers (2022-09-27T16:35:16Z)
- VToonify: Controllable High-Resolution Portrait Video Style Transfer [103.54337984566877]
We introduce a novel VToonify framework for controllable high-resolution portrait video style transfer.
We leverage the mid- and high-resolution layers of StyleGAN to render artistic portraits based on the multi-scale content features extracted by an encoder.
Our framework is compatible with existing StyleGAN-based image toonification models to extend them to video toonification, and inherits appealing features of these models for flexible style control on color and intensity.
arXiv Detail & Related papers (2022-09-22T17:59:10Z)
- Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer [103.54337984566877]
Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data.
We introduce a novel DualStyleGAN with flexible control of dual styles of the original face domain and the extended artistic portrait domain.
Experiments demonstrate the superiority of DualStyleGAN over state-of-the-art methods in high-quality portrait style transfer and flexible style control.
arXiv Detail & Related papers (2022-03-24T17:57:11Z)
- BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation [9.370501805054344]
We propose BlendGAN for arbitrary stylized face generation.
We first train a self-supervised style encoder on a generic artistic dataset to extract the representations of arbitrary styles.
In addition, a weighted blending module (WBM) is proposed to blend face and style representations implicitly and control the arbitrary stylization effect.
arXiv Detail & Related papers (2021-10-22T12:00:27Z)