Mover: Mask and Recovery based Facial Part Consistency Aware Method for
Deepfake Video Detection
- URL: http://arxiv.org/abs/2303.01740v2
- Date: Sat, 6 May 2023 02:23:25 GMT
- Title: Mover: Mask and Recovery based Facial Part Consistency Aware Method for
Deepfake Video Detection
- Authors: Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin,
Mike Zheng Shou
- Abstract summary: Mover is a new Deepfake detection model that exploits unspecific facial part inconsistencies.
We propose a novel model with dual networks that utilize the pretrained encoder and masked autoencoder.
Our experiments on standard benchmarks demonstrate that Mover is highly effective.
- Score: 33.29744034340998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake techniques have been widely used for malicious purposes, prompting
extensive research interest in developing Deepfake detection methods. Deepfake
manipulations typically involve tampering with facial parts, which can result
in inconsistencies across different parts of the face. For instance, Deepfake
techniques may change smiling lips to upset lips, while the eyes remain
smiling. Existing detection methods depend on specific indicators of forgery,
which tend to disappear as forgery patterns improve. To address this
limitation, we propose Mover, a new Deepfake detection model that exploits
unspecific facial part inconsistencies, which are inevitable weaknesses of
Deepfake videos. Mover randomly masks regions of interest (ROIs) and recovers
faces to learn unspecific features, which makes it difficult for fake faces to
be recovered, while real faces can be easily recovered. Specifically, given a
real face image, we first pretrain a masked autoencoder to learn facial part
consistency by dividing faces into three parts and randomly masking ROIs, which
are then recovered based on the unmasked facial parts. Furthermore, to maximize
the discrepancy between real and fake videos, we propose a novel model with
dual networks that utilize the pretrained encoder and masked autoencoder,
respectively. 1) The pretrained encoder is finetuned for capturing the encoding
of inconsistent information in the given video. 2) The pretrained masked
autoencoder is utilized for mapping faces and distinguishing real and fake
videos. Our extensive experiments on standard benchmarks demonstrate that Mover
is highly effective.
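The mask-and-recover idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the three part names, their row boundaries, and the mean-fill `recover_from_context` function (a toy stand-in for the pretrained masked autoencoder) are all assumptions made for illustration. The sketch only shows why a face whose parts agree with each other is easier to recover from its unmasked parts than one with an inconsistent part.

```python
import random

# Hypothetical split of a face crop into three horizontal bands, in tenths of
# the image height. The abstract only says "three parts"; the names and
# boundaries here are assumptions.
PARTS = {"eyes": (0, 4), "nose_cheeks": (4, 7), "mouth": (7, 10)}


def part_rows(part, height):
    """Return the half-open row range [start, stop) covered by a facial part."""
    lo, hi = PARTS[part]
    return height * lo // 10, height * hi // 10


def mask_random_part(image, rng):
    """Zero out one randomly chosen part; return the masked copy and the part name."""
    part = rng.choice(sorted(PARTS))
    start, stop = part_rows(part, len(image))
    masked = [
        [0.0] * len(row) if start <= r < stop else list(row)
        for r, row in enumerate(image)
    ]
    return masked, part


def recover_from_context(masked, part):
    """Toy stand-in for a masked autoencoder (assumption): fill the masked
    rows with the mean intensity of the visible rows."""
    start, stop = part_rows(part, len(masked))
    visible = [
        v for r, row in enumerate(masked) if not (start <= r < stop) for v in row
    ]
    fill = sum(visible) / len(visible)
    return [
        [fill] * len(row) if start <= r < stop else list(row)
        for r, row in enumerate(masked)
    ]


def recovery_error(original, recovered, part):
    """Mean squared error restricted to the rows of the masked part."""
    start, stop = part_rows(part, len(original))
    diffs = [
        (o - p) ** 2
        for r in range(start, stop)
        for o, p in zip(original[r], recovered[r])
    ]
    return sum(diffs) / len(diffs)


def inconsistency_score(image, rng, trials=10):
    """Average recovery error over several random part masks."""
    total = 0.0
    for _ in range(trials):
        masked, part = mask_random_part(image, rng)
        recovered = recover_from_context(masked, part)
        total += recovery_error(image, recovered, part)
    return total / trials


# A 10x10 "face" whose parts agree, and a copy whose mouth band was altered.
consistent = [[0.5] * 10 for _ in range(10)]
tampered = [row[:] for row in consistent]
for r in range(*part_rows("mouth", 10)):
    tampered[r] = [0.9] * 10

print(inconsistency_score(consistent, random.Random(0)))
print(inconsistency_score(tampered, random.Random(0)))
```

In this toy setting the tampered image always scores higher, because whichever part is masked, the visible context no longer predicts it well. A real detector would replace the mean-fill step with a learned recovery model and would not rely on a hand-set threshold.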
Related papers
- Deepfake detection in videos with multiple faces using geometric-fakeness features [79.16635054977068]
Deepfakes of victims or public figures can be used by fraudsters for blackmail, extortion, and financial fraud.
In our research we propose to use geometric-fakeness features (GFF) that characterize a dynamic degree of a face presence in a video.
We employ our approach to analyze videos with multiple faces that are simultaneously present in a video.
arXiv Detail & Related papers (2024-10-10T13:10:34Z)
- Learning Expressive And Generalizable Motion Features For Face Forgery Detection [52.54404879581527]
We propose an effective sequence-based forgery detection framework based on an existing video classification method.
To make the motion features more expressive for manipulation detection, we propose an alternative motion consistency block.
We make a general video classification network achieve promising results on three popular face forgery datasets.
arXiv Detail & Related papers (2024-03-08T09:25:48Z)
- Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces [35.04806736119123]
We propose Recap, a novel Deepfake detection model that exposes unspecific facial part inconsistencies by recovering faces.
In the recovering stage, the model focuses on randomly masking regions of interest and reconstructing real faces without unpredictable tampered traces.
In the mapping stage, the output of the recovery phase serves as supervision to guide the facial mapping process.
arXiv Detail & Related papers (2023-08-19T06:18:11Z)
- Face Forgery Detection Based on Facial Region Displacement Trajectory Series [10.338298543908339]
We develop a method for detecting manipulated videos based on the trajectory of the facial region displacement.
This information was used to construct a network for exposing multidimensional artifacts in the trajectory sequences of manipulated videos.
arXiv Detail & Related papers (2022-12-07T14:47:54Z)
- Restricted Black-box Adversarial Attack Against DeepFake Face Swapping [70.82017781235535]
We introduce a practical adversarial attack that does not require any queries to the facial image forgery model.
Our method is built on a substitute model pursuing face reconstruction and then transfers adversarial examples from the substitute model directly to inaccessible black-box DeepFake models.
arXiv Detail & Related papers (2022-04-26T14:36:06Z)
- Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion [82.06128362686445]
We propose a multi-modal semantic forensic approach to handle both cheapfakes and visually persuasive deepfakes.
We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others.
Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation.
arXiv Detail & Related papers (2021-12-21T01:57:04Z)
- End2End Occluded Face Recognition by Masking Corrupted Features [82.27588990277192]
State-of-the-art general face recognition models do not generalize well to occluded face images.
This paper presents a novel face recognition method that is robust to occlusions based on a single end-to-end deep neural network.
Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features from the deep convolutional neural networks, and clean them by the dynamically learned masks.
arXiv Detail & Related papers (2021-08-21T09:08:41Z)
- Robust Face-Swap Detection Based on 3D Facial Shape Information [59.32489266682952]
Face-swap images and videos have attracted more and more malicious attackers to discredit some key figures.
Previous pixel-level artifacts based detection techniques always focus on some unclear patterns but ignore some available semantic clues.
We propose a biometric information based method to fully exploit the appearance and shape feature for face-swap detection of key figures.
arXiv Detail & Related papers (2021-04-28T09:35:48Z)
- ID-Reveal: Identity-aware DeepFake Video Detection [24.79483180234883]
ID-Reveal is a new approach that learns temporal facial features specific to how a person moves while talking.
We do not need any training data of fakes, but only train on real videos.
We obtain an average improvement of more than 15% in terms of accuracy for facial reenactment on highly compressed videos.
arXiv Detail & Related papers (2020-12-04T10:43:16Z)
- Deep Detection for Face Manipulation [10.551455590390418]
We introduce a deep learning method to detect face manipulation.
It consists of two stages: feature extraction and binary classification.
We show that it generates better performance than state-of-the-art techniques in most cases.
arXiv Detail & Related papers (2020-09-13T06:48:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.