TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
- URL: http://arxiv.org/abs/2503.15867v1
- Date: Thu, 20 Mar 2025 05:40:42 GMT
- Title: TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
- Authors: Rohit Kundu, Athula Balachandran, Amit K. Roy-Chowdhury,
- Abstract summary: We propose TruthLens, a novel framework for DeepFake detection.<n>TruthLens handles both face-manipulated DeepFakes and fully AI-generated content.<n>It provides detailed textual reasoning for its predictions.
- Score: 21.315907061821424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting DeepFakes has become a crucial research area as the widespread use of AI image generators enables the effortless creation of face-manipulated and fully synthetic content, yet existing methods are often limited to binary classification (real vs. fake) and lack interpretability. To address these challenges, we propose TruthLens, a novel and highly generalizable framework for DeepFake detection that not only determines whether an image is real or fake but also provides detailed textual reasoning for its predictions. Unlike traditional methods, TruthLens effectively handles both face-manipulated DeepFakes and fully AI-generated content while addressing fine-grained queries such as "Does the eyes/nose/mouth look real or fake?" The architecture of TruthLens combines the global contextual understanding of multimodal large language models like PaliGemma2 with the localized feature extraction capabilities of vision-only models like DINOv2. This hybrid design leverages the complementary strengths of both models, enabling robust detection of subtle manipulations while maintaining interpretability. Extensive experiments on diverse datasets demonstrate that TruthLens outperforms state-of-the-art methods in detection accuracy (by 2-14%) and explainability, in both in-domain and cross-data settings, generalizing effectively across traditional and emerging manipulation techniques.
Related papers
- FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics [66.14786900470158]
We propose FakeScope, an expert multimodal model (LMM) tailored for AI-generated image forensics.
FakeScope identifies AI-synthetic images with high accuracy and provides rich, interpretable, and query-driven forensic insights.
FakeScope achieves state-of-the-art performance in both closed-ended and open-ended forensic scenarios.
arXiv Detail & Related papers (2025-03-31T16:12:48Z) - TruthLens:A Training-Free Paradigm for DeepFake Detection [4.64982780843177]
We introduce TruthLens, a training-free framework that reimagines deepfake detection as a visual question-answering (VQA) task.<n>TruthLens utilizes state-of-the-art large vision-language models (LVLMs) to observe and describe visual artifacts.<n>By adopting a multimodal approach, TruthLens seamlessly integrates visual and semantic reasoning to not only classify images as real or fake but also provide interpretable explanations.
arXiv Detail & Related papers (2025-03-19T15:41:32Z) - Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation [15.442558725312976]
We introduce FakeVLM, a specialized large multimodal model for both general synthetic image and DeepFake detection tasks.<n>FakeVLM excels in distinguishing real from fake images and provides clear, natural language explanations for image artifacts.<n>We present FakeClue, a comprehensive dataset containing over 100,000 images across seven categories, annotated with fine-grained artifact clues in natural language.
arXiv Detail & Related papers (2025-03-19T05:14:44Z) - HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection [4.908389661988192]
HFMF is a comprehensive two-stage deepfake detection framework.<n>It integrates vision Transformers and convolutional nets through a hierarchical feature fusion mechanism.<n>We demonstrate that our architecture achieves superior performance across diverse dataset benchmarks.
arXiv Detail & Related papers (2025-01-10T00:20:29Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.<n>In this paper, we investigate how detection performance varies across model backbones, types, and datasets.<n>We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion [94.46904504076124]
Deepfake technology has made face swapping highly realistic, raising concerns about the malicious use of fabricated facial content.
Existing methods often struggle to generalize to unseen domains due to the diverse nature of facial manipulations.
We introduce DiffusionFake, a novel framework that reverses the generative process of face forgeries to enhance the generalization of detection models.
arXiv Detail & Related papers (2024-10-06T06:22:43Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on deepfake detection generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z) - Real Face Foundation Representation Learning for Generalized Deepfake
Detection [74.4691295738097]
The emergence of deepfake technologies has become a matter of social concern as they pose threats to individual privacy and public security.
It is almost impossible to collect sufficient representative fake faces, and it is hard for existing detectors to generalize to all types of manipulation.
We propose Real Face Foundation Representation Learning (RFFR), which aims to learn a general representation from large-scale real face datasets.
arXiv Detail & Related papers (2023-03-15T08:27:56Z) - Voice-Face Homogeneity Tells Deepfake [56.334968246631725]
Existing detection approaches contribute to exploring the specific artifacts in deepfake videos.
We propose to perform the deepfake detection from an unexplored voice-face matching view.
Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
arXiv Detail & Related papers (2022-03-04T09:08:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.