Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption
- URL: http://arxiv.org/abs/2207.10402v2
- Date: Sun, 25 Jun 2023 13:26:20 GMT
- Title: Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption
- Authors: Jiazhi Guan, Hang Zhou, Mingming Gong, Errui Ding, Jingdong Wang,
Youjian Zhao
- Abstract summary: We propose to boost the generalization of deepfake detection by distinguishing the "regularity disruption" that does not appear in real videos.
Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator.
Such practice allows us to achieve deepfake detection without using fake videos and improves the generalization ability in a simple and efficient manner.
- Score: 94.5031244215761
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite encouraging progress in deepfake detection, generalization to unseen
forgery types remains a significant challenge due to the limited forgery clues
explored during training. In contrast, we notice a common phenomenon in
deepfake: fake video creation inevitably disrupts the statistical regularity in
original videos. Inspired by this observation, we propose to boost the
generalization of deepfake detection by distinguishing the "regularity
disruption" that does not appear in real videos. Specifically, by carefully
examining the spatial and temporal properties, we propose to disrupt a real
video through a Pseudo-fake Generator and create a wide range of pseudo-fake
videos for training. Such practice allows us to achieve deepfake detection
without using fake videos and improves the generalization ability in a simple
and efficient manner. To jointly capture the spatial and temporal disruptions,
we propose a Spatio-Temporal Enhancement block to learn the regularity
disruption across space and time on our self-created videos. Through
comprehensive experiments, our method exhibits excellent performance on several
datasets.
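To make the self-supervised training idea concrete, the minimal Python sketch below illustrates one plausible form of pseudo-fake generation: a real clip is disrupted spatially (a slightly shifted face-region patch is blended back into each frame) and temporally (frame order is locally jittered), and the result is labeled as fake. The specific perturbations, function names, and parameters are illustrative assumptions; the abstract does not specify the exact disruptions used by the Pseudo-fake Generator.

```python
# Minimal sketch, assuming two simple regularity disruptions (spatial patch
# blending and local frame-order jitter); the paper's actual Pseudo-fake
# Generator is not specified in the abstract.
import numpy as np

def spatial_disruption(frame: np.ndarray, box, shift=(4, 4), alpha=0.5) -> np.ndarray:
    """Blend a slightly shifted copy of the face-region patch back into the frame."""
    y0, y1, x0, x1 = box          # assumes the shifted patch stays inside the frame
    dy, dx = shift
    patch = frame[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(np.float32)
    out = frame.astype(np.float32).copy()
    out[y0:y1, x0:x1] = alpha * patch + (1 - alpha) * out[y0:y1, x0:x1]
    return out.astype(frame.dtype)

def temporal_disruption(frames: np.ndarray, max_jitter=2, seed=0) -> np.ndarray:
    """Locally shuffle the frame order to break temporal regularity."""
    rng = np.random.default_rng(seed)
    jitter = rng.integers(-max_jitter, max_jitter + 1, size=len(frames))
    order = np.argsort(np.arange(len(frames)) + jitter, kind="stable")
    return frames[order]

def make_pseudo_fake(frames: np.ndarray, face_box) -> np.ndarray:
    """Turn a real clip of shape (T, H, W, 3) into a pseudo-fake clip labeled as fake."""
    disrupted = np.stack([spatial_disruption(f, face_box) for f in frames])
    return temporal_disruption(disrupted)

# Usage: the face box would normally come from a face detector.
real_clip = np.random.randint(0, 255, size=(16, 128, 128, 3), dtype=np.uint8)
pseudo_fake_clip = make_pseudo_fake(real_clip, face_box=(32, 96, 32, 96))
```

A detector such as the proposed Spatio-Temporal Enhancement block would then be trained to separate real clips from these self-created pseudo-fakes, so no forged videos are needed during training.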
Related papers
- DIP: Diffusion Learning of Inconsistency Pattern for General DeepFake Detection [18.116004258266535]
A transformer-based framework for Diffusion Inconsistency Learning (DIP) is proposed, which exploits directional inconsistencies for deepfake video detection.
Our method could effectively identify forgery clues and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-10-31T06:26:00Z)
- Deepfake detection in videos with multiple faces using geometric-fakeness features [79.16635054977068]
Deepfakes of victims or public figures can be used by fraudsters for blackmail, extortion, and financial fraud.
In our research we propose geometric-fakeness features (GFF) that characterize the dynamic degree of a face's presence in a video.
We employ our approach to analyze videos in which multiple faces are simultaneously present.
arXiv Detail & Related papers (2024-10-10T13:10:34Z)
- Shaking the Fake: Detecting Deepfake Videos in Real Time via Active Probes [3.6308756891251392]
Real-time deepfake, a type of generative AI, is capable of "creating" non-existent content (e.g., swapping one's face with another) in a video.
It has been misused to produce deepfake videos for malicious purposes, including financial scams and political misinformation.
We propose SFake, a new real-time deepfake detection method that exploits deepfake models' inability to adapt to physical interference.
arXiv Detail & Related papers (2024-09-17T04:58:30Z)
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z)
- Latent Spatiotemporal Adaptation for Generalized Face Forgery Video Detection [22.536129731902783]
We propose a Latent Spatiotemporal Adaptation (LAST) approach to facilitate generalized face forgery video detection.
We first model the spatiotemporal patterns of face videos by incorporating a lightweight CNN to extract local spatial features of each frame.
Then we learn the long-term spatiotemporal representations of videos in latent space, which should contain more clues than the pixel space.
arXiv Detail & Related papers (2023-09-09T13:40:44Z)
- Undercover Deepfakes: Detecting Fake Segments in Videos [1.2609216345578933]
A new paradigm of deepfake generation produces mostly real videos that are altered only slightly to distort the truth.
In this paper, we present a deepfake detection method that can address this issue by performing deepfake prediction at the frame and video levels.
In particular, the paradigm we address will form a powerful tool for the moderation of deepfakes, where human oversight can be better targeted to the parts of videos suspected of being deepfakes.
arXiv Detail & Related papers (2023-05-11T04:43:10Z)
- Deepfake Video Detection with Spatiotemporal Dropout Transformer [32.577096083927884]
This paper proposes a simple yet effective patch-level approach to facilitate deepfake video detection via a dropout transformer.
The approach reorganizes each input video into a bag of patches, which is then fed into a vision transformer to achieve robust representation.
arXiv Detail & Related papers (2022-07-14T02:04:42Z)
- Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection [112.96004727646115]
We develop a method to detect face-manipulated videos using real talking faces.
We show that our method achieves state-of-the-art performance on cross-manipulation generalisation and robustness experiments.
Our results suggest that leveraging natural and unlabelled videos is a promising direction for the development of more robust face forgery detectors.
arXiv Detail & Related papers (2022-01-18T17:14:54Z)
- Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion [82.06128362686445]
We propose a multi-modal semantic forensic approach to handle both cheapfakes and visually persuasive deepfakes.
We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others.
Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation.
arXiv Detail & Related papers (2021-12-21T01:57:04Z)
- Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection [118.37239586697139]
LipForensics is a detection approach capable of both generalising to unseen manipulations and withstanding various distortions.
It consists of first pretraining a spatio-temporal network to perform visual speech recognition (lipreading).
A temporal network is subsequently finetuned on fixed mouth embeddings of real and forged data in order to detect fake videos based on mouth movements without over-fitting to low-level, manipulation-specific artefacts (a minimal sketch of this two-stage setup follows the list).
arXiv Detail & Related papers (2020-12-14T15:53:56Z)
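For comparison with the pseudo-fake strategy above, the sketch below illustrates the two-stage recipe described in the LipForensics entry: pretrain a spatio-temporal encoder on mouth crops for a lipreading task, then freeze it and finetune only a temporal head on its embeddings for real/fake classification. Module names, layer sizes, and the dummy inputs are placeholder assumptions, not the authors' implementation.

```python
# Minimal PyTorch sketch of a two-stage LipForensics-style pipeline; all
# names, layer sizes, and inputs are illustrative assumptions.
import torch
import torch.nn as nn

class MouthEncoder(nn.Module):
    """Spatio-temporal backbone producing one embedding per frame of a mouth-crop clip."""
    def __init__(self, dim=256):
        super().__init__()
        self.conv = nn.Conv3d(3, dim, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3))
        self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))  # keep the time axis, pool space

    def forward(self, clips):                            # clips: (B, 3, T, H, W)
        feats = self.pool(torch.relu(self.conv(clips)))
        return feats.squeeze(-1).squeeze(-1).transpose(1, 2)  # (B, T, dim)

class TemporalHead(nn.Module):
    """Temporal network finetuned on frozen mouth embeddings for real/fake classification."""
    def __init__(self, dim=256):
        super().__init__()
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.fc = nn.Linear(dim, 2)

    def forward(self, embeddings):                        # embeddings: (B, T, dim)
        _, h = self.gru(embeddings)
        return self.fc(h[-1])                             # real/fake logits

# Stage 1: pretrain MouthEncoder on a lipreading task, then freeze it.
# Stage 2: finetune only TemporalHead on embeddings of real and forged clips.
encoder, head = MouthEncoder(), TemporalHead()
for p in encoder.parameters():
    p.requires_grad = False
logits = head(encoder(torch.randn(2, 3, 16, 88, 88)))    # dummy mouth-crop clips
```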
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.