Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection
- URL: http://arxiv.org/abs/2403.10261v2
- Date: Wed, 20 Mar 2024 13:15:28 GMT
- Title: Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection
- Authors: Yuting Xu, Jian Liang, Lijun Sheng, Xiao-Yu Zhang,
- Abstract summary: Deepfake threats to society and cybersecurity have provoked significant public apprehension.
This paper introduces an elegantly simple yet effective strategy named Thumbnail Layout (TALL)
TALL transforms a video clip into a pre-defined layout to realize the preservation of spatial and temporal dependencies.
- Score: 41.35861722481721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The deepfake threats to society and cybersecurity have provoked significant public apprehension, driving intensified efforts within the realm of deepfake video detection. Current video-level methods are mostly based on {3D CNNs} resulting in high computational demands, although have achieved good performance. This paper introduces an elegantly simple yet effective strategy named Thumbnail Layout (TALL), which transforms a video clip into a pre-defined layout to realize the preservation of spatial and temporal dependencies. This transformation process involves sequentially masking frames at the same positions within each frame. These frames are then resized into sub-frames and reorganized into the predetermined layout, forming thumbnails. TALL is model-agnostic and has remarkable simplicity, necessitating only minimal code modifications. Furthermore, we introduce a graph reasoning block (GRB) and semantic consistency (SC) loss to strengthen TALL, culminating in TALL++. GRB enhances interactions between different semantic regions to capture semantic-level inconsistency clues. The semantic consistency loss imposes consistency constraints on semantic features to improve model generalization ability. Extensive experiments on intra-dataset, cross-dataset, diffusion-generated image detection, and deepfake generation method recognition show that TALL++ achieves results surpassing or comparable to the state-of-the-art methods, demonstrating the effectiveness of our approaches for various deepfake detection problems. The code is available at https://github.com/rainy-xu/TALL4Deepfake.
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - GRACE: Graph-Regularized Attentive Convolutional Entanglement with Laplacian Smoothing for Robust DeepFake Video Detection [7.591187423217017]
This paper introduces a novel method for robust DeepFake video detection based on graph convolutional network with graph Laplacian.
The proposed method delivers state-of-the-art performance in DeepFake video detection under noisy face sequences.
arXiv Detail & Related papers (2024-06-28T14:17:16Z) - TALL: Thumbnail Layout for Deepfake Video Detection [84.12790488801264]
This paper introduces a simple yet effective strategy named Thumbnail Layout (TALL)
TALL transforms a video clip into a pre-defined layout to realize the preservation of spatial and temporal dependencies.
Inspired by the success of vision transformers, we incorporate TALL into Swin Transformer, forming an efficient and effective method TALL-Swin.
arXiv Detail & Related papers (2023-07-14T17:27:22Z) - Glitch in the Matrix: A Large Scale Benchmark for Content Driven
Audio-Visual Forgery Detection and Localization [20.46053083071752]
We propose and benchmark a new dataset, Localized Visual DeepFake (LAV-DF)
LAV-DF consists of strategic content-driven audio, visual and audio-visual manipulations.
The proposed baseline method, Boundary Aware Temporal Forgery Detection (BA-TFD), is a 3D Convolutional Neural Network-based architecture.
arXiv Detail & Related papers (2023-05-03T08:48:45Z) - UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision
Transformer for Face Forgery Detection [52.91782218300844]
We propose a novel Unsupervised Inconsistency-Aware method based on Vision Transformer, called UIA-ViT.
Due to the self-attention mechanism, the attention map among patch embeddings naturally represents the consistency relation, making the vision Transformer suitable for the consistency representation learning.
arXiv Detail & Related papers (2022-10-23T15:24:47Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Block shuffling learning for Deepfake Detection [9.180904212520355]
Deepfake detection methods based on convolutional neural networks (CNN) have demonstrated high accuracy.
These methods often suffer from decreased performance when faced with unknown forgery methods and common transformations.
We propose a novel block shuffling regularization method to address this issue.
arXiv Detail & Related papers (2022-02-06T17:16:46Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, except for frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.