SARGAN: Spatial Attention-based Residuals for Facial Expression
Manipulation
- URL: http://arxiv.org/abs/2303.17212v1
- Date: Thu, 30 Mar 2023 08:15:18 GMT
- Title: SARGAN: Spatial Attention-based Residuals for Facial Expression
Manipulation
- Authors: Arbish Akram and Nazar Khan
- Abstract summary: We present a novel method named SARGAN that addresses the
limitations of existing encoder-decoder generators from three perspectives.
We exploited a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
- Score: 1.7056768055368383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Encoder-decoder based architectures have been widely used in the
generators of generative adversarial networks for facial manipulation. However,
we observe that current architectures fail to recover the input image's color
and rich facial details such as skin color or texture, and also introduce
artifacts. In this paper, we present a novel method named SARGAN that addresses
the above-mentioned limitations from three perspectives. First, we employed
spatial attention-based residual blocks instead of vanilla residual blocks to
properly capture the expression-related features to be changed while keeping
the other features unchanged. Second, we exploited a symmetric encoder-decoder
network to attend to facial features at multiple scales. Third, we proposed to
train the complete network with a residual connection that feeds the input
image directly to the end of the generator, relieving it of the pressure to
regenerate the input face so that it only has to produce the desired expression
change (a minimal sketch of these ideas follows the abstract). Both qualitative
and quantitative experimental results show that our proposed model performs
significantly better than state-of-the-art methods. In addition, existing
models require much larger datasets for training, yet their performance
degrades on out-of-distribution images. In contrast, SARGAN can be trained on
smaller facial expression datasets and generalizes well to out-of-distribution
images, including human photographs, portraits, avatars and statues.
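A minimal PyTorch sketch of the three ideas above follows: a residual block
gated by a spatial attention map, a small symmetric encoder-decoder, and a
long residual connection that adds the input image back at the end of the
generator. This is an illustration under assumptions, not the authors'
implementation: the class names (SpatialAttentionResidualBlock,
TinyGenerator), channel widths, block count, and the sigmoid-gated 1x1
attention head are all hypothetical choices.

import torch
import torch.nn as nn


class SpatialAttentionResidualBlock(nn.Module):
    # Residual block whose update is gated by a single-channel spatial
    # attention map, so expression-related regions can change while the
    # rest of the face passes through untouched.
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
        )
        # 1x1 convolution predicting a per-pixel gate in [0, 1] (assumed design).
        self.attention = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask = self.attention(x)        # (B, 1, H, W) spatial gate
        return x + mask * self.body(x)  # ungated regions are left unchanged


class TinyGenerator(nn.Module):
    # Symmetric encoder-decoder with a long residual connection feeding the
    # input image to the end of the network, so the generator only has to
    # synthesize the expression change, not the whole face.
    def __init__(self, base: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, base, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.bottleneck = nn.Sequential(
            *[SpatialAttentionResidualBlock(base * 2) for _ in range(4)]
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base, 3, 4, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.decoder(self.bottleneck(self.encoder(x)))
        return torch.tanh(x + delta)  # long residual: input + predicted change


if __name__ == "__main__":
    out = TinyGenerator()(torch.randn(1, 3, 128, 128))
    print(out.shape)  # torch.Size([1, 3, 128, 128])

Because the decoder predicts only a change that is added back to the input,
the network does not have to reproduce the input's color and texture, which
matches the motivation the abstract gives for the long residual connection.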
Related papers
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G²Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z) - 3D Facial Expressions through Analysis-by-Neural-Synthesis [30.2749903946587]
SMIRK (Spatial Modeling for Image-based Reconstruction of Kinesics) faithfully reconstructs expressive 3D faces from images.
We identify two key limitations in existing methods: shortcomings in their self-supervised training formulation, and a lack of expression diversity in the training images.
Our qualitative, quantitative and particularly our perceptual evaluations demonstrate that SMIRK achieves new state-of-the-art performance on accurate expression reconstruction.
arXiv Detail & Related papers (2024-04-05T14:00:07Z) - CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using
Score-Based Diffusion Models [57.9771859175664]
Recent generative-prior-based methods have shown promising blind face restoration performance.
Generating fine-grained facial details faithful to inputs remains a challenging problem.
We introduce a diffusion-based prior inside a VQGAN architecture that focuses on learning the distribution over uncorrupted latent embeddings.
arXiv Detail & Related papers (2024-02-08T23:51:49Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z) - Implicit Neural Deformation for Multi-View Face Reconstruction [43.88676778013593]
We present a new method for 3D face reconstruction from multi-view RGB images.
Unlike previous methods, which are built upon 3D morphable models, our method leverages an implicit representation to encode rich geometric features.
Our experimental results on several benchmark datasets demonstrate that our approach outperforms alternative baselines and achieves superior face reconstruction results compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-05T07:02:53Z) - FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer
Using Neural Generative Adversarial Networks [0.7043489166804575]
We present FaceTuneGAN, a new 3D face model representation that decomposes and separately encodes facial identity and facial expression.
We propose a first adaptation of image-to-image translation networks, which have been used successfully in the 2D domain, to 3D face geometry.
arXiv Detail & Related papers (2021-12-01T14:42:03Z) - Inverting Generative Adversarial Renderer for Face Reconstruction [58.45125455811038]
In this work, we introduce a novel Generative Adversarial Renderer (GAR).
Instead of relying on graphics rules, GAR learns to model complicated real-world images and is capable of producing realistic results.
Our method achieves state-of-the-art performance on multiple face reconstruction tasks.
arXiv Detail & Related papers (2021-05-06T04:16:06Z) - High Resolution Face Editing with Masked GAN Latent Code Optimization [0.0]
Face editing is a popular research topic in the computer vision community.
Recently proposed methods are based either on training a conditional encoder-decoder Generative Adversarial Network (GAN) in an end-to-end fashion or on defining an operation in the latent space of a pre-trained vanilla GAN generator model.
We propose a GAN embedding optimization procedure with spatial and semantic constraints.
arXiv Detail & Related papers (2021-03-20T08:39:41Z) - OSTeC: One-Shot Texture Completion [86.23018402732748]
We propose an unsupervised approach for one-shot 3D facial texture completion.
The proposed approach rotates an input image in 3D and fills in the unseen regions by reconstructing the rotated image in a 2D face generator.
We frontalize the target image by projecting the completed texture into the generator.
arXiv Detail & Related papers (2020-12-30T23:53:26Z) - Pose-Guided High-Resolution Appearance Transfer via Progressive Training [65.92031716146865]
We propose a pose-guided appearance transfer network for transferring a given reference appearance to a target pose at unprecedented image resolution.
Our network utilizes dense local descriptors including local perceptual loss and local discriminators to refine details.
Our model produces high-quality images, which can be further utilized in useful applications such as garment transfer between people.
arXiv Detail & Related papers (2020-08-27T03:18:44Z)