Spatio-temporal Features for Generalized Detection of Deepfake Videos
- URL: http://arxiv.org/abs/2010.11844v1
- Date: Thu, 22 Oct 2020 16:28:50 GMT
- Title: Spatio-temporal Features for Generalized Detection of Deepfake Videos
- Authors: Ipek Ganiyusufoglu, L. Minh Ngô, Nedko Savov, Sezer Karaoglu, Theo
Gevers
- Abstract summary: We propose spatio-temporal features, modeled by 3D CNNs, to extend the capabilities to detect new sorts of deepfake videos.
We show that our approach outperforms existing methods in terms of generalization capabilities.
- Score: 12.453288832098314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For deepfake detection, video-level detectors have not been explored as
extensively as image-level detectors, which do not exploit temporal data. In
this paper, we empirically show that existing approaches on image and sequence
classifiers generalize poorly to new manipulation techniques. To this end, we
propose spatio-temporal features, modeled by 3D CNNs, to extend the
generalization capabilities to detect new sorts of deepfake videos. We show
that spatial features learn distinct deepfake-method-specific attributes, while
spatio-temporal features capture shared attributes between deepfake methods. We
provide an in-depth analysis of how the sequential and spatio-temporal video
encoders utilize temporal information, using the DFDC dataset
arXiv:2006.07397. We reveal that our approach captures local
spatio-temporal relations and inconsistencies in deepfake videos, while
existing sequence encoders are indifferent to them. Through large-scale
experiments conducted on the FaceForensics++ arXiv:1901.08971 and Deeper
Forensics arXiv:2001.03024 datasets, we show that our approach outperforms
existing methods in terms of generalization capabilities.
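The paper's detectors are full 3D CNNs; as a toy illustration of why a kernel that spans both space and time can expose the local temporal inconsistencies the abstract describes (which per-frame 2D features cannot see), a minimal single-channel 3D convolution might look like the sketch below. The function name and the hand-built temporal-difference kernel are illustrative assumptions, not from the paper.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Minimal single-channel 3D convolution with valid padding.

    clip:   (T, H, W) grayscale video clip
    kernel: (kt, kh, kw) spatio-temporal filter
    Returns a (T-kt+1, H-kh+1, W-kw+1) feature map.
    """
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # The window covers kt frames at once, so the response
                # depends on change over time, not just spatial texture.
                out[t, y, x] = np.sum(clip[t:t+kt, y:y+kh, x:x+kw] * kernel)
    return out

# A temporal-difference kernel: responds only to frame-to-frame change
# (a crude stand-in for the local temporal inconsistencies the paper targets).
temporal_diff = np.zeros((2, 3, 3))
temporal_diff[0, 1, 1] = -1.0
temporal_diff[1, 1, 1] = 1.0

static_clip = np.ones((4, 8, 8))       # identical frames: no motion
flicker_clip = np.ones((4, 8, 8))
flicker_clip[1::2] *= 2.0              # alternating brightness: temporal inconsistency

print(np.abs(conv3d_valid(static_clip, temporal_diff)).max())   # → 0.0
print(np.abs(conv3d_valid(flicker_clip, temporal_diff)).max())  # → 1.0
```

A 2D convolution applied frame by frame would score both clips identically, since every individual frame is spatially uniform; only the 3D kernel distinguishes them.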
Related papers
- Unearthing Common Inconsistency for Generalisable Deepfake Detection [8.327980745153216]
Video-level detection shows the potential for both generalization across multiple domains and robustness to compression.
We propose a detection approach by capturing frame inconsistency that broadly exists in different forgery techniques.
We introduce a temporally-preserved module that applies spatial noise perturbations, directing the model's attention towards temporal information.
arXiv Detail & Related papers (2023-11-20T06:04:09Z) - Exploring Spatial-Temporal Features for Deepfake Detection and
Localization [0.0]
We propose a Deepfake network that simultaneously explores spatial and temporal features for detecting and localizing forged regions.
Specifically, we design a new Anchor-Mesh Motion (AMM) algorithm to extract temporal (motion) features by modeling the precise geometric movements of the facial micro-expression.
The superiority of our ST-DDL network is verified by experimental comparisons with several state-of-the-art competitors.
arXiv Detail & Related papers (2022-10-28T03:38:49Z) - Multimodal Graph Learning for Deepfake Detection [10.077496841634135]
Existing deepfake detectors face several challenges in achieving robustness and generalization.
We propose a novel framework, namely Multimodal Graph Learning (MGL), that leverages information from multiple modalities.
Our proposed method aims to effectively identify and utilize distinguishing features for deepfake detection.
arXiv Detail & Related papers (2022-09-12T17:17:49Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption [94.5031244215761]
We propose to boost the generalization of deepfake detection by distinguishing the "regularity disruption" that does not appear in real videos.
Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator.
Such practice allows us to achieve deepfake detection without using fake videos and improves the generalization ability in a simple and efficient manner.
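As a toy sketch of the pseudo-fake idea described above, one could disrupt the temporal regularity of a real clip, for example by swapping two adjacent frames, and label the result as fake, so a detector can be trained without any actual deepfakes. The function name and the single-swap disruption are illustrative assumptions; the paper's Pseudo-fake Generator is more elaborate.

```python
import random

def make_pseudo_fake(frames, seed=0):
    """Create a regularity-disrupted pseudo-fake clip from a real one
    by swapping two adjacent frames (one crude spatio-temporal disruption).

    frames: list of frames (any objects representing frame data)
    Returns (disrupted_frames, label) with label 1 meaning "fake".
    """
    rng = random.Random(seed)
    out = list(frames)                 # copy so the real clip is untouched
    i = rng.randrange(len(out) - 1)    # pick an adjacent pair at random
    out[i], out[i + 1] = out[i + 1], out[i]
    return out, 1

real_clip = list("ABCDE")              # stand-in for 5 video frames
fake_clip, label = make_pseudo_fake(real_clip)
```

Training on (real, 0) and (pseudo-fake, 1) pairs built this way is what lets such an approach avoid fake videos entirely.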
arXiv Detail & Related papers (2022-07-21T10:42:34Z) - Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information in the pursuit of generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z) - Voice-Face Homogeneity Tells Deepfake [56.334968246631725]
Existing detection approaches contribute to exploring the specific artifacts in deepfake videos.
We propose to perform the deepfake detection from an unexplored voice-face matching view.
Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
arXiv Detail & Related papers (2022-03-04T09:08:50Z) - Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - A Plug-and-play Scheme to Adapt Image Saliency Deep Model for Video Data [54.198279280967185]
This paper proposes a novel plug-and-play scheme to weakly retrain a pretrained image saliency deep model for video data.
Our method is simple yet effective for adapting any off-the-shelf pre-trained image saliency deep model to obtain high-quality video saliency detection.
arXiv Detail & Related papers (2020-08-02T13:23:14Z) - Dynamic texture analysis for detecting fake faces in video sequences [6.1356022122903235]
This work explores the analysis of the textural and temporal dynamics of the video signal.
The goal is to characterize and distinguish real from fake sequences.
We propose to build multiple binary decisions on the joint analysis of temporal segments.
arXiv Detail & Related papers (2020-07-30T07:21:24Z) - Deepfake Detection using Spatiotemporal Convolutional Networks [0.0]
Most deepfake detection methods use only individual frames and therefore fail to learn from temporal information.
We created a benchmark of performance using Celeb-DF dataset.
Our methods outperformed state-of-the-art frame-based detection methods.
arXiv Detail & Related papers (2020-06-26T01:32:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.