Delving into Sequential Patches for Deepfake Detection
- URL: http://arxiv.org/abs/2207.02803v1
- Date: Wed, 6 Jul 2022 16:46:30 GMT
- Title: Delving into Sequential Patches for Deepfake Detection
- Authors: Jiazhi Guan, Hang Zhou, Zhibin Hong, Errui Ding, Jingdong Wang,
Chengbin Quan, Youjian Zhao
- Abstract summary: Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information in the pursuit of generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
- Score: 64.19468088546743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in face forgery techniques produce nearly visually
untraceable deepfake videos, which could be leveraged with malicious
intentions. As a result, researchers have been devoted to deepfake detection.
Previous studies have identified the importance of local low-level cues and
temporal information in the pursuit of generalizing well across deepfake
methods; however, they still suffer from robustness problems against
post-processing. In
this work, we propose the Local- & Temporal-aware Transformer-based Deepfake
Detection (LTTD) framework, which adopts a local-to-global learning protocol
with a particular focus on the valuable temporal information within local
sequences. Specifically, we propose a Local Sequence Transformer (LST), which
models the temporal consistency on sequences of restricted spatial regions,
where low-level information is hierarchically enhanced with shallow layers of
learned 3D filters. Based on the local temporal embeddings, we then achieve the
final classification in a global contrastive way. Extensive experiments on
popular datasets validate that our approach effectively spots local forgery
cues and achieves state-of-the-art performance.
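The local-to-global protocol described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Local Sequence Transformer is replaced by a simple stand-in statistic (mean absolute frame-to-frame change per patch), and all function names (`patch_sequences`, `local_temporal_embedding`, `global_score`) are hypothetical. It shows only the data flow: split a video into patch sequences, embed each sequence's temporal behavior, then aggregate locally computed scores into one clip-level decision.

```python
# Illustrative sketch of a local-to-global deepfake-scoring pipeline.
# A video is a list of T frames; each frame is an H x W grid of floats.

def patch_sequences(video, patch_size):
    """Split each frame into non-overlapping patches, then regroup so that
    each patch location yields one temporal sequence of patches."""
    T = len(video)
    H, W = len(video[0]), len(video[0][0])
    seqs = {}
    for t in range(T):
        for i in range(0, H, patch_size):
            for j in range(0, W, patch_size):
                patch = [row[j:j + patch_size] for row in video[t][i:i + patch_size]]
                seqs.setdefault((i, j), []).append(patch)
    return seqs

def local_temporal_embedding(seq):
    """Stand-in for the Local Sequence Transformer: summarize temporal
    consistency as the mean absolute difference between consecutive patches."""
    diffs = []
    for a, b in zip(seq, seq[1:]):
        flat_a = [v for row in a for v in row]
        flat_b = [v for row in b for v in row]
        diffs.append(sum(abs(x - y) for x, y in zip(flat_a, flat_b)) / len(flat_a))
    return sum(diffs) / len(diffs)

def global_score(video, patch_size=2):
    """Local-to-global aggregation: take the max over local temporal
    embeddings, so a single temporally inconsistent region flags the clip."""
    seqs = patch_sequences(video, patch_size)
    return max(local_temporal_embedding(s) for s in seqs.values())
```

A perfectly static clip scores 0.0, while a clip with one flickering patch gets a positive score; in the actual LTTD framework the per-sequence statistic is learned by a transformer over 3D-filter-enhanced low-level features, and the final classification is contrastive rather than a max.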
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs)
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF)
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Towards Generalizable Deepfake Detection by Primary Region
Regularization [52.41801719896089]
This paper enhances the generalization capability from a novel regularization perspective.
Our method consists of two stages, namely the static localization for primary region maps, and the dynamic exploitation of primary region masks.
We conduct extensive experiments over three widely used deepfake datasets - DFDC, DF-1.0, and Celeb-DF with five backbones.
arXiv Detail & Related papers (2023-07-24T05:43:34Z) - Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and
Localization [30.317619885984005]
We introduce the well-trained vision segmentation foundation model, i.e., Segment Anything Model (SAM) in face forgery detection and localization.
Based on SAM, we propose the Detect Any Deepfakes (DADF) framework with the Multiscale Adapter.
The proposed framework seamlessly integrates end-to-end forgery localization and detection optimization.
arXiv Detail & Related papers (2023-06-29T16:25:04Z) - LatentForensics: Towards frugal deepfake detection in the StyleGAN latent space [2.629091178090276]
We propose a deepfake detection method that operates in the latent space of a state-of-the-art generative adversarial network (GAN) trained on high-quality face images.
Experimental results on standard datasets reveal that the proposed approach outperforms other state-of-the-art deepfake classification methods.
arXiv Detail & Related papers (2023-03-30T08:36:48Z) - Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z) - Exploring Spatial-Temporal Features for Deepfake Detection and
Localization [0.0]
We propose a Deepfake network that simultaneously explores spatial and temporal features for detecting and localizing forged regions.
Specifically, we design a new Anchor-Mesh Motion (AMM) algorithm to extract temporal (motion) features by modeling the precise geometric movements of the facial micro-expression.
The superiority of our ST-DDL network is verified by experimental comparisons with several state-of-the-art competitors.
arXiv Detail & Related papers (2022-10-28T03:38:49Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Spatio-temporal Features for Generalized Detection of Deepfake Videos [12.453288832098314]
We propose spatio-temporal features, modeled by 3D CNNs, to extend the capability to detect new sorts of deepfake videos.
We show that our approach outperforms existing methods in terms of generalization capabilities.
arXiv Detail & Related papers (2020-10-22T16:28:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.