Detection of Deepfake Videos Using Long Distance Attention
- URL: http://arxiv.org/abs/2106.12832v1
- Date: Thu, 24 Jun 2021 08:33:32 GMT
- Title: Detection of Deepfake Videos Using Long Distance Attention
- Authors: Wei Lu, Lingyi Liu, Junwei Luo, Xianfeng Zhao, Yicong Zhou, Jiwu Huang
- Abstract summary: Most existing detection methods treat the problem as a vanilla binary classification problem.
In this paper, the problem is treated as a special fine-grained classification problem since the differences between fake and real faces are very subtle.
A spatial-temporal model is proposed with two components that capture spatial and temporal forgery traces from a global perspective.
- Score: 73.6659488380372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid progress of deepfake techniques in recent years, facial video
forgery can generate highly deceptive video content and pose severe security
threats. Detecting such forged videos has become much more urgent and
challenging. Most existing detection methods treat the problem as a vanilla
binary classification problem. In this paper, the problem is treated as a
special fine-grained classification problem since the differences between fake
and real faces are very subtle. It is observed that most existing face forgery
methods leave common artifacts in both the spatial domain and the time domain,
including generative defects in the spatial domain and inter-frame
inconsistencies in the time domain. A spatial-temporal model is proposed
with two components that capture spatial and temporal forgery traces,
respectively, from a global perspective. Both components are designed using a
novel long distance attention mechanism. The spatial-domain component
captures artifacts within a single frame, and the time-domain component
captures artifacts across consecutive frames. They generate
attention maps in the form of patches. The attention mechanism has a broader
field of view, which helps it assemble global information and extract
local statistical information. Finally, the attention maps are used to guide the
network to focus on pivotal parts of the face, as in other fine-grained
classification methods. The experimental results on different public datasets
demonstrate that the proposed method achieves state-of-the-art performance,
and that the proposed long distance attention method can effectively capture the
parts of the face that are pivotal for forgery detection.
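The idea of patch-wise attention maps in the spatial and time domains can be illustrated with a minimal sketch. This is not the authors' network: it assumes raw pixel values as patch tokens and plain scaled dot-product self-attention, and every name in it (`split_into_patches`, `attention_received`, `spatial_attention_map`, `temporal_attention_map`, the patch size of 4) is hypothetical. It only shows how each patch can attend to every other patch in a frame, or to the same patch position in other frames, regardless of distance.

```python
import math
import random


def split_into_patches(frame, patch):
    """Cut a 2-D frame (list of rows) into non-overlapping patch x patch tiles.

    Returns the flattened tiles in row-major order and the (rows, cols) grid size.
    """
    grid_h, grid_w = len(frame) // patch, len(frame[0]) // patch
    tiles = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            tiles.append([frame[gy * patch + dy][gx * patch + dx]
                          for dy in range(patch) for dx in range(patch)])
    return tiles, (grid_h, grid_w)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_received(tokens):
    """Generic scaled dot-product self-attention (an assumption, not the paper's design).

    Every token attends to every other token; the result is the average
    attention weight each token receives, so the values sum to 1.
    """
    scale = math.sqrt(len(tokens[0]))
    rows = [softmax([sum(a * b for a, b in zip(q, k)) / scale for k in tokens])
            for q in tokens]
    n = len(tokens)
    return [sum(row[j] for row in rows) / n for j in range(n)]


def spatial_attention_map(frame, patch=4):
    """Attention among all patches of one frame, laid out on the patch grid."""
    tiles, (grid_h, grid_w) = split_into_patches(frame, patch)
    scores = attention_received(tiles)
    return [scores[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


def temporal_attention_map(clip, patch=4):
    """For each patch position, attend across frames; returns [T][grid_h][grid_w]."""
    per_frame = [split_into_patches(f, patch) for f in clip]
    grid_h, grid_w = per_frame[0][1]
    maps = [[[0.0] * grid_w for _ in range(grid_h)] for _ in clip]
    for idx in range(grid_h * grid_w):
        over_time = attention_received([tiles[idx] for tiles, _ in per_frame])
        for t, weight in enumerate(over_time):
            maps[t][idx // grid_w][idx % grid_w] = weight
    return maps


# Toy input: a clip of 3 random 16x16 grayscale frames (illustrative data only).
random.seed(0)
clip = [[[random.random() for _ in range(16)] for _ in range(16)] for _ in range(3)]
spatial = spatial_attention_map(clip[0])
temporal = temporal_attention_map(clip)
print(len(spatial), len(spatial[0]))                         # 4 4
print(len(temporal), len(temporal[0]), len(temporal[0][0]))  # 3 4 4
```

In a real detector the tokens would presumably be learned features rather than pixels, and the resulting maps would weight those features so that the classifier concentrates on the highest-scoring patches.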
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- AltFreezing for More General Video Face Forgery Detection [138.5732617371004]
We propose to capture both spatial and unseen temporal artifacts in one model for face forgery detection.
We present a novel training strategy called AltFreezing for more general face forgery detection.
arXiv Detail & Related papers (2023-07-17T08:24:58Z)
- Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z)
- Multimodal Graph Learning for Deepfake Detection [10.077496841634135]
Existing deepfake detectors face several challenges in achieving robustness and generalization.
We propose a novel framework, namely Multimodal Graph Learning (MGL), that leverages information from multiple modalities.
Our proposed method aims to effectively identify and utilize distinguishing features for deepfake detection.
arXiv Detail & Related papers (2022-09-12T17:17:49Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Multi-attentional Deepfake Detection [79.80308897734491]
Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns.
We propose a new multi-attentional deepfake detection network. Specifically, it consists of three key components: 1) multiple spatial attention heads that make the network attend to different local parts; 2) a textural feature enhancement block that zooms in on subtle artifacts in shallow features; 3) aggregation of the low-level textural features and high-level semantic features, guided by the attention maps.
arXiv Detail & Related papers (2021-03-03T13:56:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.