Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs
- URL: http://arxiv.org/abs/2402.09100v1
- Date: Wed, 14 Feb 2024 11:20:47 GMT
- Title: Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs
- Authors: Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr
- Abstract summary: This study introduces a network designed for expression-based video inpainting.
It employs generative adversarial networks (GANs) to handle static and moving occlusions across all frames.
We further enhance emotional preservation through a customized facial expression recognition (FER) loss function, ensuring detailed inpainted outputs.
- Score: 0.27624021966289597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial video inpainting plays a crucial role in a wide range of applications,
including but not limited to the removal of obstructions in video conferencing
and telemedicine, enhancement of facial expression analysis, privacy
protection, integration of graphical overlays, and virtual makeup. This domain
presents serious challenges due to the intricate nature of facial features and
the inherent human familiarity with faces, heightening the need for accurate
and persuasive completions. In addressing challenges specifically related to
occlusion removal in this context, our focus is on the progressive task of
generating complete images from facial data covered by masks, ensuring both
spatial and temporal coherence. Our study introduces a network designed for
expression-based video inpainting, employing generative adversarial networks
(GANs) to handle static and moving occlusions across all frames. By utilizing
facial landmarks and an occlusion-free reference image, our model maintains the
user's identity consistently across frames. We further enhance emotional
preservation through a customized facial expression recognition (FER) loss
function, ensuring detailed inpainted outputs. Our proposed framework adaptively
removes occlusions from facial videos, whether they remain static or move across
frames, while providing realistic and coherent results.
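
As a rough, unofficial illustration of the two mechanisms the abstract names, landmark-and-reference conditioning of the generator and an FER-based perceptual loss, the following PyTorch sketch may help. The paper does not publish code; `InpaintGenerator`, `fer_feature_loss`, the toy layer stacks, and the dummy FER backbone below are all illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch (NOT the authors' code): (1) a generator conditioned on the
# masked frame, its facial landmarks, and an occlusion-free reference image;
# (2) an FER-based perceptual loss for expression preservation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InpaintGenerator(nn.Module):
    # Toy encoder-decoder standing in for the paper's GAN generator.
    def __init__(self, in_ch=3 + 1 + 1 + 3):  # frame + mask + landmarks + reference
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, masked_frame, mask, landmark_map, reference):
        # Channel-wise conditioning: the network sees the corrupted frame,
        # where the hole is, the target face geometry, and the identity cue.
        return self.net(torch.cat([masked_frame, mask, landmark_map, reference], dim=1))

def fer_feature_loss(fer_backbone, fake, real):
    # Expression preservation: match features of a frozen, pretrained FER
    # network between the inpainted frame and the ground-truth frame.
    with torch.no_grad():
        target = fer_backbone(real)
    return F.l1_loss(fer_backbone(fake), target)

# Smoke test with random tensors and a dummy frozen "FER backbone".
frame = torch.rand(1, 3, 128, 128)
mask = (torch.rand(1, 1, 128, 128) > 0.8).float()   # 1 = occluded pixel
landmarks = torch.rand(1, 1, 128, 128)              # rasterized landmark heatmap
reference = torch.rand(1, 3, 128, 128)              # occlusion-free reference image
fer = nn.Sequential(nn.Conv2d(3, 16, 3), nn.AdaptiveAvgPool2d(1),
                    nn.Flatten()).eval().requires_grad_(False)
out = InpaintGenerator()(frame * (1 - mask), mask, landmarks, reference)
loss = fer_feature_loss(fer, out, frame)
```

The intuition behind such a loss is that frozen FER features react strongly to expression drift but little to other photometric changes, so penalizing their distance steers the generator toward expression-faithful completions.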
Related papers
- ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification [60.73617868629575]
The misuse of deep learning-based facial manipulation poses a potential threat to civil rights.
To prevent this fraud at its source, proactive defense technologies have been proposed to disrupt the manipulation process.
We propose a novel universal framework for combating facial manipulation, called ID-Guard.
arXiv Detail & Related papers (2024-09-20T09:30:08Z)
- Expression-aware video inpainting for HMD removal in XR applications [0.27624021966289597]
Head-mounted displays (HMDs) serve as indispensable devices for observing extended reality (XR) environments and virtual content.
HMDs present an obstacle to external recording techniques as they block the upper face of the user.
We propose a new network for expression-aware video inpainting for HMD removal based on generative adversarial networks (GANs).
arXiv Detail & Related papers (2024-01-25T12:32:21Z)
- Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image via generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
arXiv Detail & Related papers (2022-07-22T13:39:06Z)
- Graph-based Generative Face Anonymisation with Pose Preservation [49.18049578591058]
AnonyGAN is a GAN-based solution for face anonymisation.
It replaces the visual information corresponding to a source identity with that of a condition identity provided as a single image.
arXiv Detail & Related papers (2021-12-10T12:58:17Z)
- UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN.
Our framework is designed to handle face swapping and face reenactment simultaneously.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z)
- A Latent Transformer for Disentangled and Identity-Preserving Face Editing [3.1542695050861544]
We propose to edit facial attributes via the latent space of a StyleGAN generator.
We train a dedicated latent transformation network and incorporate explicit disentanglement and identity preservation terms in the loss function.
Our model achieves disentangled, controllable, and identity-preserving facial attribute editing, even in the challenging case of real (i.e., non-synthetic) images and videos.
arXiv Detail & Related papers (2021-06-22T16:04:30Z)
- Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video retargeting and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
- Foreground-guided Facial Inpainting with Fidelity Preservation [7.5089719291325325]
We propose a foreground-guided facial inpainting framework that can extract and generate facial features using convolutional neural network layers.
Specifically, we propose a new loss function with semantic reasoning capability over facial expressions and over natural and unnatural features (e.g., make-up).
Our proposed method achieves quantitative results comparable to the state of the art, while qualitatively demonstrating high-fidelity preservation of facial components.
arXiv Detail & Related papers (2021-05-07T15:50:58Z)
- Occlusion-Adaptive Deep Network for Robust Facial Expression Recognition [56.11054589916299]
We propose a landmark-guided attention branch to find and discard corrupted features from occluded regions.
An attention map is first generated to indicate if a specific facial part is occluded and guide our model to attend to non-occluded regions.
This results in more diverse and discriminative features, enabling the expression recognition system to recover even when the face is partially occluded (a minimal illustrative sketch of this gating idea follows this list).
arXiv Detail & Related papers (2020-05-12T20:42:55Z)
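
As an unofficial sketch of the occlusion-adaptive gating described in the last entry above, here is a minimal PyTorch module that predicts a per-pixel attention map and uses it to down-weight features from occluded regions before pooling. `OcclusionGate` and its layers are hypothetical stand-ins; the actual method additionally guides the attention with facial landmarks and region branches.

```python
# Minimal sketch (NOT the paper's implementation) of attention-gated pooling:
# a learned [0, 1] map suppresses features from occluded regions.
import torch
import torch.nn as nn

class OcclusionGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, 1), nn.Sigmoid(),  # per-pixel gate
        )

    def forward(self, feats):
        a = self.attn(feats)          # [B, 1, H, W], trained to be ~0 where occluded
        gated = feats * a             # corrupted features are discarded
        # Weighted average pooling: normalize by the attention mass so the
        # descriptor depends mostly on visible regions.
        return gated.sum(dim=(2, 3)) / a.sum(dim=(2, 3)).clamp(min=1e-6)

feats = torch.rand(2, 256, 14, 14)    # backbone features of a face crop
pooled = OcclusionGate(256)(feats)    # [2, 256] occlusion-robust descriptor
```

Normalizing by the attention mass is what lets recognition degrade gracefully: as more of the face is hidden, the pooled descriptor is renormalized over whatever remains visible rather than diluted by masked regions.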
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.