WEM-GAN: Wavelet transform based facial expression manipulation
- URL: http://arxiv.org/abs/2412.02530v1
- Date: Tue, 03 Dec 2024 16:23:02 GMT
- Title: WEM-GAN: Wavelet transform based facial expression manipulation
- Authors: Dongya Sun, Yunfei Hu, Xianzhe Zhang, Yingsong Hu
- Abstract summary: We propose WEM-GAN, short for wavelet-based expression manipulation GAN.
We take advantage of the wavelet transform and combine it with a generator built on a U-Net autoencoder backbone.
Our model better preserves identity features and achieves stronger editing capability and image generation quality on the AffectNet dataset.
- Score: 2.0918868193463207
- Abstract: Facial expression manipulation aims to change human facial expressions without affecting face recognition. To transform facial expressions into target expressions, previous methods relied on expression labels to guide the manipulation process. However, these methods failed to preserve the details of facial features, weakening or losing identity information in the output image. In our work, we propose WEM-GAN, short for wavelet-based expression manipulation GAN, which puts more effort into preserving the details of the original image during editing. First, we take advantage of the wavelet transform and combine it with a generator built on a U-Net autoencoder backbone, improving the generator's ability to preserve the details of facial features. Second, we introduce a high-frequency component discriminator and a high-frequency domain adversarial loss to further constrain the optimization of our model, providing the generated face images with richer details. Additionally, to narrow the gap between generated and target expressions, we use residual connections between the encoder and decoder and apply relative action units (AUs) multiple times. Extensive qualitative and quantitative experiments demonstrate that our model better preserves identity features while achieving stronger editing capability and image generation quality on the AffectNet dataset. It also shows superior performance on metrics such as Average Content Distance (ACD) and Expression Distance (ED).
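The abstract names two mechanisms concrete enough to sketch: a wavelet transform feeding the U-Net generator, and an adversarial loss on high-frequency components. Below is a minimal, hypothetical PyTorch sketch, assuming a single-level Haar DWT implemented as fixed depthwise strided convolutions and a hinge-style loss on the LH/HL/HH sub-bands; the names HaarDWT and hf_adversarial_loss, the filter sign conventions, the loss form, and the toy discriminator are illustrative assumptions, not details from the paper.

```python
# A minimal, hypothetical sketch of the wavelet-related pieces described above:
# a single-level Haar DWT as fixed depthwise strided convolutions, plus a
# hinge-style adversarial loss on the high-frequency (LH/HL/HH) sub-bands.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HaarDWT(nn.Module):
    """One-level 2D Haar wavelet transform via fixed 2x2 strided convolutions."""

    def __init__(self, channels: int):
        super().__init__()
        # Sign conventions vary between references; these are one common choice.
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])    # approximation (low-low)
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])  # horizontal detail
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])  # vertical detail
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])  # diagonal detail
        kernels = torch.stack([ll, lh, hl, hh]).unsqueeze(1)  # (4, 1, 2, 2)
        # Replicate the four filters for every input channel (depthwise conv).
        self.register_buffer("weight", kernels.repeat(channels, 1, 1, 1))
        self.channels = channels

    def forward(self, x: torch.Tensor):
        # x: (B, C, H, W) -> four sub-bands, each (B, C, H/2, W/2)
        out = F.conv2d(x, self.weight, stride=2, groups=self.channels)
        b, _, h, w = out.shape
        ll, lh, hl, hh = out.view(b, self.channels, 4, h, w).unbind(dim=2)
        return ll, lh, hl, hh


def hf_adversarial_loss(disc: nn.Module, dwt: HaarDWT,
                        real: torch.Tensor, fake: torch.Tensor):
    """Hinge-style discriminator and generator losses on LH/HL/HH sub-bands."""
    hf_real = torch.cat(dwt(real)[1:], dim=1)  # drop LL, keep high frequencies
    hf_fake = torch.cat(dwt(fake)[1:], dim=1)
    d_loss = (F.relu(1.0 - disc(hf_real)).mean()
              + F.relu(1.0 + disc(hf_fake.detach())).mean())
    g_loss = -disc(hf_fake).mean()
    return d_loss, g_loss


if __name__ == "__main__":
    dwt = HaarDWT(channels=3)
    # Toy high-frequency discriminator over the 3x3=9 concatenated HF channels.
    disc = nn.Sequential(nn.Conv2d(9, 32, 3, stride=2, padding=1),
                         nn.LeakyReLU(0.2),
                         nn.Conv2d(32, 1, 3, padding=1))
    real = torch.randn(2, 3, 128, 128)
    fake = torch.randn(2, 3, 128, 128, requires_grad=True)
    d_loss, g_loss = hf_adversarial_loss(disc, dwt, real, fake)
    print(f"d_loss={d_loss.item():.3f}  g_loss={g_loss.item():.3f}")
```

In a full training loop, the generator would minimize g_loss alongside its reconstruction and AU-conditioning objectives, while the high-frequency discriminator minimizes d_loss; the DWT itself has no trainable parameters.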
Related papers
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G2Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z)
- GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images [55.431697263581626]
We introduce a novel Geometry-aware Facial Expression Translation framework, which is based on parametric 3D facial representations and can stably decouple expression.
We achieve higher-quality and more accurate facial expression transfer results than state-of-the-art methods, and demonstrate applicability across various poses and complex textures.
arXiv Detail & Related papers (2023-08-07T09:03:35Z)
- SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses existing limitations from three perspectives.
We exploit a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z)
- More comprehensive facial inversion for more effective expression recognition [8.102564078640274]
We propose a novel generative method based on the image inversion mechanism for the FER task, termed Inversion FER (IFER).
ASIT is equipped with an image inversion discriminator that measures the cosine similarity of semantic features between source and generated images, constrained by a distribution alignment loss (a minimal sketch of such a similarity term appears after this list).
We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance.
arXiv Detail & Related papers (2022-11-24T12:31:46Z)
- Learning Disentangled Representation for One-shot Progressive Face Swapping [92.09538942684539]
We present a simple yet efficient method named FaceSwapper for one-shot face swapping based on Generative Adversarial Networks.
Our method consists of a disentangled representation module and a semantic-guided fusion module.
Our method achieves state-of-the-art results on benchmark datasets with fewer training samples.
arXiv Detail & Related papers (2022-03-24T11:19:04Z)
- Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer [51.926868759681014]
We present a framework to predict the facial parameters based on a bone-driven face model (BDFM) under different views.
The proposed framework consists of a feature extractor, a generator, and a facial parameter regressor.
arXiv Detail & Related papers (2020-08-20T09:49:13Z)
- An Efficient Integration of Disentangled Attended Expression and Identity Features for Facial Expression Transfer and Synthesis [6.383596973102899]
We present an Attention-based Identity Preserving Generative Adversarial Network (AIP-GAN) to overcome the identity leakage problem from a source image to a generated face image.
Our key insight is that the identity preserving network should be able to disentangle and compose shape, appearance, and expression information for efficient facial expression transfer and synthesis.
arXiv Detail & Related papers (2020-05-01T17:14:53Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim to transform an image of a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
- Deep Feature Consistent Variational Autoencoder [46.25741696270528]
We present a novel method for constructing a Variational Autoencoder (VAE).
Instead of using a pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of the VAE (sketched after this list).
We also show that our method can produce latent vectors that can capture the semantic information of face expressions.
arXiv Detail & Related papers (2016-10-02T15:48:36Z)
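The Deep Feature Consistent VAE entry closes the list, and its mechanism is simple enough to sketch: compare activations of a frozen pretrained network between the VAE's input and its reconstruction instead of comparing raw pixels. A minimal sketch, assuming torchvision's VGG-19 and an illustrative choice of layers and equal weighting, not necessarily the paper's exact recipe:

```python
# Minimal sketch of deep feature consistency: match hidden activations of a
# frozen pretrained network between a VAE's input and its reconstruction.
# Layer indices and equal weighting are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

_VGG = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()
for p in _VGG.parameters():
    p.requires_grad_(False)

_LAYERS = (1, 6, 11)  # relu1_1, relu2_1, relu3_1 in torchvision's indexing


def deep_feature_loss(x: torch.Tensor, x_rec: torch.Tensor) -> torch.Tensor:
    """Sum of per-layer MSEs between activations of input and reconstruction."""
    loss = x.new_zeros(())
    fx, fr = x, x_rec
    for i, layer in enumerate(_VGG):
        fx, fr = layer(fx), layer(fr)
        if i in _LAYERS:
            loss = loss + F.mse_loss(fr, fx)
        if i == max(_LAYERS):
            break  # later layers are not needed
    return loss


if __name__ == "__main__":
    x = torch.randn(2, 3, 64, 64)        # stand-in for a normalized input batch
    x_rec = torch.randn(2, 3, 64, 64, requires_grad=True)  # stand-in decoder output
    print(deep_feature_loss(x, x_rec).item())
```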
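Likewise, the similarity term referenced in the IFER/ASIT entry above can be sketched as a cosine penalty between semantic features of the source and generated images; the toy extractor, the detached source features, and the function name are illustrative assumptions rather than that paper's actual components.

```python
# Hypothetical sketch of a semantic-similarity constraint: embed source and
# generated images with a feature extractor and penalize low cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F


def inversion_similarity_loss(extractor: nn.Module,
                              source: torch.Tensor,
                              generated: torch.Tensor) -> torch.Tensor:
    """Returns 1 - mean cosine similarity of pooled semantic features."""
    f_src = extractor(source).flatten(1).detach()  # (B, D); no grad to source path
    f_gen = extractor(generated).flatten(1)
    return (1.0 - F.cosine_similarity(f_gen, f_src, dim=1)).mean()


if __name__ == "__main__":
    # Toy semantic embedder standing in for the inversion discriminator's trunk.
    extractor = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                              nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1))
    src = torch.randn(2, 3, 64, 64)
    gen = torch.randn(2, 3, 64, 64, requires_grad=True)
    print(inversion_similarity_loss(extractor, src, gen).item())
```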
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.