Leveraging Real Talking Faces via Self-Supervision for Robust Forgery
Detection
- URL: http://arxiv.org/abs/2201.07131v1
- Date: Tue, 18 Jan 2022 17:14:54 GMT
- Title: Leveraging Real Talking Faces via Self-Supervision for Robust Forgery
Detection
- Authors: Alexandros Haliassos, Rodrigo Mira, Stavros Petridis, Maja Pantic
- Abstract summary: We develop a method to detect face-manipulated videos using real talking faces.
We show that our method achieves state-of-the-art performance on cross-manipulation generalisation and robustness experiments.
Our results suggest that leveraging natural and unlabelled videos is a promising direction for the development of more robust face forgery detectors.
- Score: 112.96004727646115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most pressing challenges for the detection of face-manipulated
videos is generalising to forgery methods not seen during training while
remaining effective under common corruptions such as compression. In this
paper, we question whether we can tackle this issue by harnessing videos of
real talking faces, which contain rich information on natural facial appearance
and behaviour and are readily available in large quantities online. Our method,
termed RealForensics, consists of two stages. First, we exploit the natural
correspondence between the visual and auditory modalities in real videos to
learn, in a self-supervised cross-modal manner, temporally dense video
representations that capture factors such as facial movements, expression, and
identity. Second, we use these learned representations as targets to be
predicted by our forgery detector along with the usual binary forgery
classification task; this encourages it to base its real/fake decision on said
factors. We show that our method achieves state-of-the-art performance on
cross-manipulation generalisation and robustness experiments, and examine the
factors that contribute to its performance. Our results suggest that leveraging
natural and unlabelled videos is a promising direction for the development of
more robust face forgery detectors.
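The abstract describes a two-stage recipe: self-supervised cross-modal pretraining on real talking-face videos, followed by forgery detection with an auxiliary task of predicting the pretrained representations. Below is a minimal PyTorch sketch of that idea; the toy 3D-convolutional backbone, the cosine-similarity losses, and all dimensions are illustrative assumptions rather than the authors' actual RealForensics architecture or objectives.

```python
# Minimal sketch of the two-stage idea in the abstract. Module sizes, the toy
# 3D-conv backbone, and the cosine/BCE losses are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VideoEncoder(nn.Module):
    """Tiny stand-in for a spatio-temporal video backbone (assumption)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, clips):                       # clips: (B, 3, T, H, W)
        return self.fc(self.conv(clips).flatten(1))  # (B, feat_dim)

# Stage 1 (sketch): cross-modal self-supervision on real videos.
# Video features are pulled toward audio-derived targets (placeholder here).
def stage1_loss(video_feats, audio_targets):
    return 1 - F.cosine_similarity(video_feats, audio_targets, dim=-1).mean()

# Stage 2 (sketch): forgery detection with an auxiliary target. The detector
# predicts real/fake AND regresses the stage-1 representation, encouraging
# decisions based on natural facial movement, expression, and identity.
class ForgeryDetector(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.backbone = VideoEncoder(feat_dim)
        self.cls_head = nn.Linear(feat_dim, 1)         # real/fake logit
        self.aux_head = nn.Linear(feat_dim, feat_dim)  # predicts stage-1 features

    def forward(self, clips):
        h = self.backbone(clips)
        return self.cls_head(h).squeeze(-1), self.aux_head(h)

def stage2_loss(logits, labels, aux_pred, stage1_feats, aux_weight=1.0):
    bce = F.binary_cross_entropy_with_logits(logits, labels.float())
    aux = 1 - F.cosine_similarity(aux_pred, stage1_feats, dim=-1).mean()
    return bce + aux_weight * aux

if __name__ == "__main__":
    clips = torch.randn(2, 3, 8, 64, 64)     # toy batch of face-video clips
    labels = torch.tensor([0, 1])            # 0 = real, 1 = fake

    # Stage 1 illustration: train the encoder against audio-derived targets.
    encoder = VideoEncoder()
    audio_targets = torch.randn(2, 256)      # placeholder for the audio branch
    stage1_loss(encoder(clips), audio_targets).backward()

    # Stage 2: the frozen stage-1 encoder provides auxiliary regression targets.
    detector = ForgeryDetector()
    with torch.no_grad():
        targets = encoder(clips)
    logits, aux_pred = detector(clips)
    stage2_loss(logits, labels, aux_pred, targets).backward()
```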
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and are not limited to forgery-specific artifacts, and thus generalize more strongly.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- Learning Natural Consistency Representation for Face Forgery Video Detection [23.53549629885891]
We propose to learn the Natural Consistency representation (NACO) of real face videos in a self-supervised manner.
Our method outperforms other state-of-the-art methods while exhibiting strong robustness.
arXiv Detail & Related papers (2024-07-15T09:00:02Z)
- Learning Expressive And Generalizable Motion Features For Face Forgery Detection [52.54404879581527]
We propose an effective sequence-based forgery detection framework based on an existing video classification method.
To make the motion features more expressive for manipulation detection, we propose an alternative motion consistency block.
We make a general video classification network achieve promising results on three popular face forgery datasets.
arXiv Detail & Related papers (2024-03-08T09:25:48Z)
- Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion [82.06128362686445]
We propose a multi-modal semantic forensic approach to handle both cheapfakes and visually persuasive deepfakes.
We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others.
Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation.
arXiv Detail & Related papers (2021-12-21T01:57:04Z)
- Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection [118.37239586697139]
LipForensics is a detection approach capable of both generalising across manipulations and withstanding various distortions.
It consists in first pretraining a spatio-temporal network to perform visual speech recognition (lipreading).
A temporal network is subsequently finetuned on fixed mouth embeddings of real and forged data in order to detect fake videos based on mouth movements without over-fitting to low-level, manipulation-specific artefacts. (A minimal code sketch of this pretrain-then-finetune pattern appears after this list.)
arXiv Detail & Related papers (2020-12-14T15:53:56Z)
- ID-Reveal: Identity-aware DeepFake Video Detection [24.79483180234883]
ID-Reveal is a new approach that learns temporal facial features specific to how a person moves while talking.
We do not need any training data of fakes, but only train on real videos.
We obtain an average improvement of more than 15% in accuracy for facial reenactment on highly compressed videos.
arXiv Detail & Related papers (2020-12-04T10:43:16Z)
- Spoof Face Detection Via Semi-Supervised Adversarial Training [34.99908561729825]
Face spoofing causes severe security threats in face recognition systems.
We propose a semi-supervised adversarial learning framework for spoof face detection.
Our approach does not rely on spoof faces during training, making it robust and general to different types of spoofing, even unknown ones.
arXiv Detail & Related papers (2020-05-22T04:32:33Z)
- VideoForensicsHQ: Detecting High-quality Manipulated Face Videos [77.60295082172098]
We show how the performance of forgery detectors depends on the presence of artefacts that the human eye can see.
We introduce a new benchmark dataset of unprecedented quality for face video forgery detection.
arXiv Detail & Related papers (2020-05-20T21:17:43Z)
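As noted in the LipForensics entry above, its summary follows a pretrain-then-finetune pattern: mouth embeddings from a network pretrained on lipreading stay fixed, and only a small temporal classifier is finetuned to separate real from forged clips. The sketch below illustrates that pattern under stated assumptions; the tiny per-frame CNN, the GRU head, and all dimensions are placeholders, not the paper's architecture.

```python
# Minimal sketch of a pretrain-then-finetune pipeline in the LipForensics style.
# The frozen per-frame encoder stands in for a lipreading-pretrained network;
# its layers and the GRU classifier are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrozenMouthEncoder(nn.Module):
    """Stand-in for a lipreading-pretrained per-frame mouth embedder (assumption)."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, emb_dim)
        for p in self.parameters():
            p.requires_grad = False          # embeddings stay fixed, as in the summary

    def forward(self, mouths):               # mouths: (B, T, 1, H, W) grayscale crops
        b, t = mouths.shape[:2]
        x = self.conv(mouths.flatten(0, 1)).flatten(1)   # (B*T, 32)
        return self.fc(x).view(b, t, -1)                 # (B, T, emb_dim)

class TemporalHead(nn.Module):
    """Lightweight temporal network finetuned on the fixed embeddings (GRU is an assumption)."""
    def __init__(self, emb_dim=128, hidden=64):
        super().__init__()
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, 1)

    def forward(self, embs):                 # embs: (B, T, emb_dim)
        _, h = self.gru(embs)                # h: (num_layers, B, hidden)
        return self.cls(h[-1]).squeeze(-1)   # one real/fake logit per clip

if __name__ == "__main__":
    mouths = torch.randn(2, 16, 1, 48, 48)   # toy batch: 2 clips of 16 mouth crops
    labels = torch.tensor([0.0, 1.0])        # 0 = real, 1 = fake
    logits = TemporalHead()(FrozenMouthEncoder()(mouths))
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    loss.backward()                          # only the temporal head receives gradients
```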
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.