Hybrid Transformer Network for Deepfake Detection
- URL: http://arxiv.org/abs/2208.05820v1
- Date: Thu, 11 Aug 2022 13:30:42 GMT
- Title: Hybrid Transformer Network for Deepfake Detection
- Authors: Sohail Ahmed Khan and Duc-Tien Dang-Nguyen
- Abstract summary: We propose a novel hybrid transformer network utilizing early feature fusion strategy for deepfake video detection.
Our model achieves comparable results to other more advanced state-of-the-art approaches when evaluated on FaceForensics++ and DFDC benchmarks.
We also propose novel face cut-out augmentations, as well as random cut-out augmentations.
- Score: 2.644723682054489
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deepfake media is becoming widespread because of easily available tools and mobile apps that can generate realistic-looking deepfake videos and images without requiring any technical knowledge. As the technology advances, the quantity and quality of deepfake media are expected to grow further, making deepfakes a practical tool for spreading mis- and disinformation. Because of these concerns, deepfake detection tools are becoming a necessity. In this study, we propose a novel hybrid transformer network that uses an early feature fusion strategy for deepfake video detection. Our model employs two different CNN networks, i.e., (1) XceptionNet and (2) EfficientNet-B4, as feature extractors. We train both feature extractors along with the transformer in an end-to-end manner on the FaceForensics++ and DFDC benchmarks. Despite its relatively straightforward architecture, our model achieves results comparable to more advanced state-of-the-art approaches when evaluated on FaceForensics++ and DFDC. In addition, we propose novel face cut-out augmentations as well as random cut-out augmentations, and we show that these augmentations improve the detection performance of our model and reduce overfitting. Finally, we show that our model is capable of learning from a considerably small amount of data.
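The abstract only names the architecture at a high level. The sketch below shows one plausible way such an early-fusion hybrid transformer could be wired up in PyTorch, assuming `timm` backbones; the module names, embedding dimension, transformer depth, and use of a classification token are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an early-fusion hybrid transformer (illustrative, not the authors' code).
# Assumes the `timm` library for the XceptionNet and EfficientNet-B4 backbones.
import torch
import torch.nn as nn
import timm


class EarlyFusionHybridTransformer(nn.Module):
    def __init__(self, embed_dim=512, depth=4, num_heads=8, num_classes=2):
        super().__init__()
        # Two CNN feature extractors, trained end-to-end together with the transformer.
        self.xception = timm.create_model("xception", pretrained=True, num_classes=0)
        self.effnet = timm.create_model("efficientnet_b4", pretrained=True, num_classes=0)
        # Project each backbone's channel dimension to a shared embedding size.
        self.proj_x = nn.Conv2d(self.xception.num_features, embed_dim, kernel_size=1)
        self.proj_e = nn.Conv2d(self.effnet.num_features, embed_dim, kernel_size=1)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        # Extract spatial feature maps from both CNNs and flatten them into token sequences.
        fx = self.proj_x(self.xception.forward_features(x)).flatten(2).transpose(1, 2)
        fe = self.proj_e(self.effnet.forward_features(x)).flatten(2).transpose(1, 2)
        # Early fusion: concatenate the two token sets before the transformer encoder.
        tokens = torch.cat([fx, fe], dim=1)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.head(out[:, 0])


# Usage (299x299 crops suit Xception; EfficientNet-B4 is fully convolutional and tolerates them):
# logits = EarlyFusionHybridTransformer()(torch.randn(2, 3, 299, 299))
```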
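The face cut-out and random cut-out augmentations are likewise only named in the abstract. The following sketch illustrates the two ideas under simple assumptions: face crops as NumPy arrays and facial landmarks supplied by an external detector; the patch sizes and landmark handling are hypothetical, not the paper's exact scheme.

```python
# Illustrative sketch of random cut-out and landmark-anchored face cut-out augmentations.
import random
import numpy as np


def random_cutout(image, max_frac=0.3):
    """Black out one randomly placed rectangle covering up to max_frac of each side."""
    h, w = image.shape[:2]
    ch, cw = random.randint(1, int(h * max_frac)), random.randint(1, int(w * max_frac))
    y, x = random.randint(0, h - ch), random.randint(0, w - cw)
    out = image.copy()
    out[y:y + ch, x:x + cw] = 0
    return out


def face_cutout(image, landmarks, size=40):
    """Black out a square patch centred on a randomly chosen facial landmark.

    `landmarks` is assumed to be an (N, 2) array of (x, y) points, e.g. from dlib or
    MediaPipe; which landmark groups get masked is a design choice, not specified here.
    """
    h, w = image.shape[:2]
    x, y = landmarks[random.randrange(len(landmarks))]
    half = size // 2
    y0, y1 = max(0, int(y) - half), min(h, int(y) + half)
    x0, x1 = max(0, int(x) - half), min(w, int(x) + half)
    out = image.copy()
    out[y0:y1, x0:x1] = 0
    return out
```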
Related papers
- Data-Independent Operator: A Training-Free Artifact Representation
Extractor for Generalizable Deepfake Detection [105.9932053078449]
In this work, we show that, on the contrary, a small, training-free filter is sufficient to capture more general artifact representations.
Because it is unbiased toward both the training and test sources, we define it as the Data-Independent Operator (DIO), which achieves appealing improvements on unseen sources.
Our detector achieves a remarkable improvement of 13.3%, establishing a new state-of-the-art performance.
arXiv Detail & Related papers (2024-03-11T15:22:28Z) - Deepfake Video Detection Using Generative Convolutional Vision
Transformer [3.8297637120486496]
We propose a Generative Convolutional Vision Transformer (GenConViT) for deepfake video detection.
Our model combines ConvNeXt and Swin Transformer models for feature extraction.
By learning from the visual artifacts and latent data distribution, GenConViT achieves improved performance in detecting a wide range of deepfake videos.
arXiv Detail & Related papers (2023-07-13T19:27:40Z) - Undercover Deepfakes: Detecting Fake Segments in Videos [1.2609216345578933]
A new paradigm of deepfake generation produces mostly real videos that are altered only slightly to distort the truth.
In this paper, we present a deepfake detection method that can address this issue by performing deepfake prediction at the frame and video levels.
In particular, the paradigm we address will form a powerful tool for the moderation of deepfakes, where human oversight can be better targeted to the parts of videos suspected of being deepfakes.
arXiv Detail & Related papers (2023-05-11T04:43:10Z) - Leveraging Deep Learning Approaches for Deepfake Detection: A Review [0.0]
Deepfakes are fabricated media generated by AI that are difficult to distinguish from real media.
This paper aims to explore different methodologies with an intention to achieve a cost-effective model.
arXiv Detail & Related papers (2023-04-04T16:04:42Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Cross-Forgery Analysis of Vision Transformers and CNNs for Deepfake
Image Detection [11.944111906027144]
We show that EfficientNetV2 has a greater tendency to specialize, often obtaining better results on the forgery methods seen during training.
We also show that Vision Transformers exhibit a superior generalization ability that makes them more competent even on images generated with new methodologies.
arXiv Detail & Related papers (2022-06-28T08:50:22Z) - Activating More Pixels in Image Super-Resolution Transformer [53.87533738125943]
Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution.
We propose a novel Hybrid Attention Transformer (HAT) to activate more input pixels for better reconstruction.
Our overall method significantly outperforms the state-of-the-art methods by more than 1 dB.
arXiv Detail & Related papers (2022-05-09T17:36:58Z) - M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z) - Adversarially robust deepfake media detection using fused convolutional
neural network predictions [79.00202519223662]
Current deepfake detection systems struggle against unseen data.
We employ three different deep Convolutional Neural Network (CNN) models to classify fake and real images extracted from videos.
The proposed technique outperforms state-of-the-art models with 96.5% accuracy.
arXiv Detail & Related papers (2021-02-11T11:28:00Z) - Two-branch Recurrent Network for Isolating Deepfakes in Videos [17.59209853264258]
We present a method for deepfake detection based on a two-branch network structure.
One branch propagates the original information, while the other branch suppresses the face content.
Our two novel components show promising results on the FaceForensics++, Celeb-DF, and Facebook's DFDC preview benchmarks.
arXiv Detail & Related papers (2020-08-08T01:38:56Z) - Artificial Fingerprinting for Generative Models: Rooting Deepfake
Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs).
Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation.
We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z)