Deep Convolutional Pooling Transformer for Deepfake Detection
- URL: http://arxiv.org/abs/2209.05299v4
- Date: Wed, 29 Mar 2023 02:53:29 GMT
- Title: Deep Convolutional Pooling Transformer for Deepfake Detection
- Authors: Tianyi Wang, Harry Cheng, Kam Pui Chow, Liqiang Nie
- Abstract summary: We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
- Score: 54.10864860009834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Deepfake has drawn considerable public attention due to security
and privacy concerns in social media digital forensics. As the widely spreading
Deepfake videos on the Internet become more realistic, traditional detection
techniques fail to distinguish between real and fake content. Most existing
deep learning methods mainly focus on local features and relations within the
face image using convolutional neural networks as a backbone. However, local
features and relations are insufficient for model training to learn enough
general information for Deepfake detection. Therefore, the existing Deepfake
detection methods have reached a bottleneck to further improve the detection
performance. To address this issue, we propose a deep convolutional Transformer
to incorporate the decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the
extracted features and enhance efficacy. Moreover, we employ the barely
discussed image keyframes in model training for performance improvement and
visualize the feature quantity gap between the key and normal image frames
caused by video compression. We finally illustrate the transferability with
extensive experiments on several Deepfake benchmark datasets. The proposed
solution consistently outperforms several state-of-the-art baselines on both
within- and cross-dataset experiments.
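The two mechanisms named in the abstract, convolutional pooling and re-attention, can be illustrated with a minimal NumPy sketch. The function names, tensor shapes, and the head-mixing formulation below are illustrative assumptions (the re-attention form follows the DeepViT-style learned head mixing), not the paper's exact implementation: pooling shrinks the patch-token sequence on its 2-D grid between attention blocks, and re-attention blends per-head attention maps to keep them diverse in deeper layers.

```python
import numpy as np

def conv_pool_tokens(tokens, grid, k=2):
    """Average-pool patch tokens over their 2-D grid, shrinking the
    sequence from h*w tokens to (h//k)*(w//k) tokens (illustrative)."""
    h, w = grid
    n, d = tokens.shape
    assert n == h * w and h % k == 0 and w % k == 0
    # group the grid into non-overlapping k x k cells and average each cell
    x = tokens.reshape(h // k, k, w // k, k, d).mean(axis=(1, 3))
    return x.reshape(-1, d)

def re_attention(attn, theta):
    """Mix the per-head attention maps (H, N, N) with a learned (H, H)
    matrix, then renormalise each row to sum to one. This DeepViT-style
    formulation is an assumption, not the paper's exact block."""
    mixed = np.einsum('hg,gnm->hnm', theta, attn)
    mixed = np.clip(mixed, 1e-9, None)  # keep rows positive before renormalising
    return mixed / mixed.sum(axis=-1, keepdims=True)

# Usage: a 14x14 patch grid of 64-dim tokens pools down to 7x7 = 49 tokens.
tokens = np.ones((196, 64))
pooled = conv_pool_tokens(tokens, (14, 14))  # shape (49, 64)
```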
Related papers
- Contextual Cross-Modal Attention for Audio-Visual Deepfake Detection and Localization [3.9440964696313485]
In the digital age, the emergence of deepfakes and synthetic media presents a significant threat to societal and political integrity.
Deepfakes based on multi-modal manipulation, such as combined audio-visual forgeries, are more realistic and pose a greater threat.
We propose a novel multi-modal attention framework based on recurrent neural networks (RNNs) that leverages contextual information for audio-visual deepfake detection.
arXiv Detail & Related papers (2024-08-02T18:45:01Z)
- DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
- CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF)
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection [50.33525966541906]
Existing multimodal detection methods capture audio-visual inconsistencies to expose Deepfake videos.
We propose a novel Deepfake detection method to mine the correlation between Non-critical Phonemes and Visemes, termed NPVForensics.
Our model can be easily adapted to the downstream Deepfake datasets with fine-tuning.
arXiv Detail & Related papers (2023-06-12T06:06:05Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- Multi-attentional Deepfake Detection [79.80308897734491]
Face forgery by deepfake is widely spread over the Internet and has raised severe societal concerns.
We propose a new multi-attentional deepfake detection network. Specifically, it consists of three key components: 1) multiple spatial attention heads that make the network attend to different local parts; 2) a textural feature enhancement block that zooms in on the subtle artifacts in shallow features; and 3) attention-guided aggregation of low-level textural features and high-level semantic features.
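The attention-guided aggregation in component 3 can be sketched in a few lines of NumPy. The shapes and names here are illustrative assumptions (the paper's actual blocks are learned convolutional layers): each of M spatial attention maps pools the feature map into one descriptor, so regions highlighted by different heads contribute separate feature vectors.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_attention_pool(feat, att_logits):
    """Attention-guided pooling: feat is a (C, H, W) feature map and
    att_logits holds M spatial attention maps (M, H, W). Each map is
    normalised over space and pools feat into one C-dim descriptor,
    yielding an (M, C) matrix (illustrative sketch)."""
    c, h, w = feat.shape
    m = att_logits.shape[0]
    att = softmax(att_logits.reshape(m, h * w), axis=-1)  # each map sums to 1
    f = feat.reshape(c, h * w)
    return att @ f.T                                      # (M, C) descriptors
```

With uniform attention logits every descriptor reduces to the spatial average of the feature map, which is a quick sanity check on the normalisation.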
arXiv Detail & Related papers (2021-03-03T13:56:14Z)
- Improving DeepFake Detection Using Dynamic Face Augmentation [0.8793721044482612]
Most publicly available DeepFake detection datasets have limited variations.
Deep neural networks tend to overfit to the facial features instead of learning to detect manipulation features of DeepFake content.
We introduce Face-Cutout, a data augmentation method for training Convolutional Neural Networks (CNN) to improve DeepFake detection.
arXiv Detail & Related papers (2021-02-18T20:25:45Z)
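The idea behind Face-Cutout, occluding part of the face so the network cannot overfit to any single facial region, can be sketched as a Cutout variant restricted to the face bounding box. The box format, parameter names, and rectangle sizing below are assumptions for illustration, not the paper's actual augmentation policy.

```python
import numpy as np

def face_cutout(image, face_box, frac=0.3, rng=None):
    """Zero out a random rectangle inside the face bounding box of an
    (H, W, C) image. face_box = (x0, y0, x1, y1) and frac (rectangle
    side as a fraction of the box) are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    x0, y0, x1, y1 = face_box
    bw, bh = x1 - x0, y1 - y0
    cw, ch = max(1, int(bw * frac)), max(1, int(bh * frac))
    # sample the top-left corner so the rectangle stays inside the box
    cx = int(rng.integers(x0, x1 - cw + 1))
    cy = int(rng.integers(y0, y1 - ch + 1))
    out = image.copy()
    out[cy:cy + ch, cx:cx + cw] = 0  # occlude the sampled patch
    return out

# Usage: occlude half of a 16x16 face box inside a 32x32 dummy image.
img = np.ones((32, 32, 3))
aug = face_cutout(img, (8, 8, 24, 24), frac=0.5)
```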
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.