Related papers: Common Sense Reasoning for Deepfake Detection

URL: http://arxiv.org/abs/2402.00126v2
Date: Thu, 18 Jul 2024 07:59:36 GMT
Title: Common Sense Reasoning for Deepfake Detection
Authors: Yue Zhang, Ben Colman, Xiao Guo, Ali Shahriyari, Gaurav Bharaj
Abstract summary: State-of-the-art deepfake detection approaches rely on image-based features extracted via neural networks. We frame deepfake detection as a Deepfake Detection VQA (DD-VQA) task and model human intuition. We introduce a new annotated dataset and propose a Vision and Language Transformer-based framework for the DD-VQA task.
Score: 13.502008402754658
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: State-of-the-art deepfake detection approaches rely on image-based features extracted via neural networks. While these approaches trained in a supervised manner extract likely fake features, they may fall short in representing unnatural 'non-physical' semantic facial attributes -- blurry hairlines, double eyebrows, rigid eye pupils, or unnatural skin shading. However, such facial attributes are easily perceived by humans and used to discern the authenticity of an image based on human common sense. Furthermore, image-based feature extraction methods that provide visual explanations via saliency maps can be hard to interpret for humans. To address these challenges, we frame deepfake detection as a Deepfake Detection VQA (DD-VQA) task and model human intuition by providing textual explanations that describe common sense reasons for labeling an image as real or fake. We introduce a new annotated dataset and propose a Vision and Language Transformer-based framework for the DD-VQA task. We also incorporate text and image-aware feature alignment formulation to enhance multi-modal representation learning. As a result, we improve upon existing deepfake detection models by integrating our learned vision representations, which reason over common sense knowledge from the DD-VQA task. We provide extensive empirical results demonstrating that our method enhances detection performance, generalization ability, and language-based interpretability in the deepfake detection task.

Related papers

TruthLens: A Training-Free Paradigm for DeepFake Detection [4.64982780843177]
We introduce TruthLens, a training-free framework that reimagines deepfake detection as a visual question-answering (VQA) task. TruthLens utilizes state-of-the-art large vision-language models (LVLMs) to observe and describe visual artifacts. By adopting a multimodal approach, TruthLens seamlessly integrates visual and semantic reasoning to not only classify images as real or fake but also provide interpretable explanations.
arXiv Detail & Related papers (2025-03-19T15:41:32Z)
Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection [54.26588902144298]
We propose a knowledge-guided prompt learning method for deepfake facial image detection. Specifically, we retrieve forgery-related prompts from large language models as expert knowledge to guide the optimization of learnable prompts. Our proposed approach notably outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-01-01T02:18:18Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
EEG-Features for Generalized Deepfake Detection [3.7117930046173173]
We explore a novel approach to Deepfake detection by utilizing electroencephalography (EEG) measured from the neural processing of a human. Preliminary results indicate that human neural processing signals can be successfully integrated into Deepfake detection frameworks. Our study provides next steps towards the understanding of how digital realism is embedded in the human cognitive system.
arXiv Detail & Related papers (2024-05-14T12:06:44Z)
Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method [77.65459419417533]
We put face forgery in a semantic context and define that computational methods that alter semantic face attributes are sources of face forgery. We construct a large face forgery image dataset, where each image is associated with a set of labels organized in a hierarchical graph. We propose a semantics-oriented face forgery detection method that captures label relations and prioritizes the primary task.
arXiv Detail & Related papers (2024-05-14T10:24:19Z)
FakeBench: Probing Explainable Fake Image Detection via Large Multimodal Models [62.66610648697744]
We introduce a taxonomy of generative visual forgery concerning human perception, based on which we collect forgery descriptions in human natural language. FakeBench examines LMMs with four evaluation criteria: detection, reasoning, interpretation and fine-grained forgery analysis. This research presents a paradigm shift towards transparency for the fake image detection area.
arXiv Detail & Related papers (2024-04-20T07:28:55Z)
Individualized Deepfake Detection Exploiting Traces Due to Double Neural-Network Operations [32.33331065408444]
Existing deepfake detectors are not optimized for the case where an image is associated with a specific and identifiable individual. This study focuses on the deepfake detection of facial images of individual public figures. We demonstrate that the detection performance can be improved by exploiting the idempotency property of neural networks.
arXiv Detail & Related papers (2023-12-13T10:21:00Z)
DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos. We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection [51.66174565170112]
We introduce a novel approach to utilize the strengths of large language models in understanding contextual appearance variations. We propose to formulate language-derived appearance elements and incorporate them with visual cues in pedestrian detection.
arXiv Detail & Related papers (2023-11-02T06:38:19Z)
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors [24.78672820633581]
Deep generative models can create remarkably realistic fake images while raising concerns about misinformation and copyright infringement. Deepfake detection techniques are developed to distinguish between real and fake images. We propose a novel approach called AntifakePrompt, using Vision-Language Models and prompt tuning techniques.
arXiv Detail & Related papers (2023-10-26T14:23:45Z)
ImaginaryNet: Learning Object Detectors without Real Images and Annotations [66.30908705345973]
We propose a framework to synthesize images by combining a pretrained language model and a text-to-image model. With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish Imaginary-Supervised Object Detection (ISOD). Experiments show that ImaginaryNet can obtain about 70% performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data.
arXiv Detail & Related papers (2022-10-13T10:25:22Z)
Detect and Locate: A Face Anti-Manipulation Approach with Semantic and Noise-level Supervision [67.73180660609844]
We propose a conceptually simple but effective method to efficiently detect forged faces in an image. The proposed scheme relies on a segmentation map that delivers meaningful high-level semantic information clues about the image. The proposed model achieves state-of-the-art detection accuracy and remarkable localization performance.
arXiv Detail & Related papers (2021-07-13T02:59:31Z)
Fighting Deepfake by Exposing the Convolutional Traces on Images [0.0]
Mobile apps like FACEAPP make use of the most advanced Generative Adversarial Networks (GAN) to produce extreme transformations on human face photos. This kind of media object came to be called a Deepfake and raised a new challenge in the multimedia forensics field: the Deepfake detection challenge. In this paper, a new approach aimed at extracting a Deepfake fingerprint from images is proposed.
arXiv Detail & Related papers (2020-08-07T08:49:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.