Spatio-temporal Features for Generalized Detection of Deepfake Videos
- URL: http://arxiv.org/abs/2010.11844v1
- Date: Thu, 22 Oct 2020 16:28:50 GMT
- Title: Spatio-temporal Features for Generalized Detection of Deepfake Videos
- Authors: Ipek Ganiyusufoglu, L. Minh Ngô, Nedko Savov, Sezer Karaoglu, Theo Gevers
- Abstract summary: We propose spatio-temporal features, modeled by 3D CNNs, to extend the capabilities to detect new sorts of deepfake videos.
We show that our approach outperforms existing methods in terms of generalization capabilities.
- Score: 12.453288832098314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For deepfake detection, video-level detectors have not been explored as
extensively as image-level detectors, which do not exploit temporal data. In
this paper, we empirically show that existing approaches on image and sequence
classifiers generalize poorly to new manipulation techniques. To this end, we
propose spatio-temporal features, modeled by 3D CNNs, to extend the
generalization capabilities to detect new sorts of deepfake videos. We show
that spatial features learn distinct deepfake-method-specific attributes, while
spatio-temporal features capture shared attributes between deepfake methods. We
provide an in-depth analysis of how sequential and spatio-temporal video
encoders utilize temporal information, using the DFDC dataset
arXiv:2006.07397. We show that our approach captures local
spatio-temporal relations and inconsistencies in deepfake videos, while
existing sequence encoders are indifferent to them. Through large-scale
experiments conducted on the FaceForensics++ arXiv:1901.08971 and Deeper
Forensics arXiv:2001.03024 datasets, we show that our approach outperforms
existing methods in terms of generalization capabilities.
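The key idea in the abstract is that 3D convolutions mix information across neighbouring frames, so they can respond to temporal inconsistencies that per-frame (2D) features miss. The following minimal numpy sketch illustrates this mechanism only; it is not the paper's architecture, and the video tensor, kernel, and "flicker" inconsistency are illustrative assumptions.

```python
import numpy as np

def conv3d_valid(video, kernel):
    """Naive 3D convolution (valid padding) over a (T, H, W) video.

    Unlike a per-frame 2D convolution, each output voxel aggregates
    a patch spanning several frames, which is what lets a 3D CNN
    respond to frame-to-frame inconsistencies.
    """
    kt, kh, kw = kernel.shape
    T, H, W = video.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = video[t:t + kt, i:i + kh, j:j + kw]
                out[t, i, j] = np.sum(patch * kernel)
    return out

# A hand-crafted temporal-difference kernel: zero on static content,
# non-zero wherever consecutive frames differ.
temporal_diff = np.zeros((2, 1, 1))
temporal_diff[0, 0, 0] = -1.0
temporal_diff[1, 0, 0] = 1.0

static = np.ones((4, 3, 3))        # static clip: identical frames
flicker = np.ones((4, 3, 3))
flicker[2] += 0.5                  # one brightened frame as a crude temporal artifact

print(np.abs(conv3d_valid(static, temporal_diff)).max())   # 0.0
print(np.abs(conv3d_valid(flicker, temporal_diff)).max())  # 0.5
```

A learned 3D CNN would discover such temporal filters from data rather than hand-crafting them, but the response pattern is the same: static (temporally consistent) content is suppressed, while frame-to-frame irregularities produce a signal.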
Related papers
- Deepfake Detection with Spatio-Temporal Consistency and Attention [46.1135899490656]
Deepfake videos are causing growing concerns among communities due to their ever-increasing realism.
Current methods for detecting forged videos rely mainly on global frame features.
We propose a neural Deepfake detector that focuses on the localized manipulative signatures of the forged videos.
arXiv Detail & Related papers (2025-02-12T08:51:33Z)
- Vulnerability-Aware Spatio-Temporal Learning for Generalizable and Interpretable Deepfake Video Detection [14.586314545834934]
Deepfake videos are highly challenging to detect due to the complex intertwined temporal and spatial artifacts in forged sequences.
Most recent approaches rely on binary classifiers trained on both real and fake data.
We introduce a multi-task learning framework with additional spatial and temporal branches that enable the model to focus on subtle artifacts.
Second, we propose a video-level algorithm that generates pseudo-fake videos with subtle artifacts, providing the model with high-quality samples and ground-truth data.
arXiv Detail & Related papers (2025-01-02T10:21:34Z)
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z)
- Exploring Spatial-Temporal Features for Deepfake Detection and Localization [0.0]
We propose a Deepfake detection network that simultaneously explores spatial and temporal features for detecting and localizing forged regions.
Specifically, we design a new Anchor-Mesh Motion (AMM) algorithm to extract temporal (motion) features by modeling the precise geometric movements of facial micro-expressions.
The superiority of our ST-DDL network is verified by experimental comparisons with several state-of-the-art competitors.
arXiv Detail & Related papers (2022-10-28T03:38:49Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption [94.5031244215761]
We propose to boost the generalization of deepfake detection by distinguishing the "regularity disruption" that does not appear in real videos.
Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator.
Such practice allows us to achieve deepfake detection without using fake videos and improves the generalization ability in a simple and efficient manner.
arXiv Detail & Related papers (2022-07-21T10:42:34Z)
- Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information for generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Deepfake Detection using Spatiotemporal Convolutional Networks [0.0]
Most deepfake detection methods use only individual frames and therefore fail to learn from temporal information.
We created a benchmark of performance using Celeb-DF dataset.
Our methods outperformed state-of-the-art frame-based detection methods.
arXiv Detail & Related papers (2020-06-26T01:32:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.