Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning
- URL: http://arxiv.org/abs/2508.21048v1
- Date: Thu, 28 Aug 2025 17:53:05 GMT
- Title: Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning
- Authors: Hao Tan, Jun Lan, Zichang Tan, Ajian Liu, Chuanbiao Song, Senyuan Shi, Huijia Zhu, Weiqiang Wang, Jun Wan, Zhen Lei,
- Abstract summary: We introduce HydraFake, a dataset that simulates real-world challenges with hierarchical generalization testing.<n>Specifically, HydraFake involves diversified deepfake techniques and in-the-wild forgeries, along with rigorous training and evaluation protocol.<n>We propose Veritas, a multi-modal large language model (MLLM) based deepfake detector.
- Score: 45.99344620383706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake detection remains a formidable challenge due to the complex and evolving nature of fake content in real-world scenarios. However, existing academic benchmarks suffer from severe discrepancies from industrial practice, typically featuring homogeneous training sources and low-quality testing images, which hinder the practical deployments of current detectors. To mitigate this gap, we introduce HydraFake, a dataset that simulates real-world challenges with hierarchical generalization testing. Specifically, HydraFake involves diversified deepfake techniques and in-the-wild forgeries, along with rigorous training and evaluation protocol, covering unseen model architectures, emerging forgery techniques and novel data domains. Building on this resource, we propose Veritas, a multi-modal large language model (MLLM) based deepfake detector. Different from vanilla chain-of-thought (CoT), we introduce pattern-aware reasoning that involves critical reasoning patterns such as "planning" and "self-reflection" to emulate human forensic process. We further propose a two-stage training pipeline to seamlessly internalize such deepfake reasoning capacities into current MLLMs. Experiments on HydraFake dataset reveal that although previous detectors show great generalization on cross-model scenarios, they fall short on unseen forgeries and data domains. Our Veritas achieves significant gains across different OOD scenarios, and is capable of delivering transparent and faithful detection outputs.
Related papers
- Deep Learning for Contextualized NetFlow-Based Network Intrusion Detection: Methods, Data, Evaluation and Deployment [5.402853794565817]
This paper synthesizes recent research on context-aware deep learning for flow-based intrusion detection.<n>We organize existing methods into a four-dimensional taxonomy covering temporal context, graph or relational context, multimodal context, and multi-resolution context.<n>We review common failure modes that can inflate reported results, including temporal leakage, data splitting, dataset design flaws, limited dataset diversity, and weak cross-dataset generalization.
arXiv Detail & Related papers (2026-02-05T12:25:18Z) - Improving Generalization in Deepfake Detection with Face Foundation Models and Metric Learning [11.097006771680896]
We present a robust video deepfake detection framework with strong generalization.<n>Our method is built on top of FSFM, a self-supervised model trained on real face data.<n>We incorporate triplet loss variants during training, guiding the model to produce more separable embeddings between real and fake samples.
arXiv Detail & Related papers (2025-08-27T09:46:45Z) - DDL: A Large-Scale Datasets for Deepfake Detection and Localization in Diversified Real-World Scenarios [51.916287988122406]
We present a novel large-scale deepfake detection and localization (textbfDDL) dataset containing over $textbf1.4M+$ forged samples.<n>Our DDL not only provides a more challenging benchmark for complex real-world forgeries but also offers crucial support for building next-generation deepfake detection, localization, and interpretability methods.
arXiv Detail & Related papers (2025-06-29T15:29:03Z) - Adaptive Meta-Learning for Robust Deepfake Detection: A Multi-Agent Framework to Data Drift and Model Generalization [6.589206192038365]
This paper proposes an adversarial meta-learning algorithm using task-specific adaptive sample synthesis and consistency regularization.
It boosts both robustness and generalization of the model.
Experimental results demonstrate the model's consistent performance across various datasets, outperforming the models in comparison.
arXiv Detail & Related papers (2024-11-12T19:55:07Z) - GM-DF: Generalized Multi-Scenario Deepfake Detection [49.072106087564144]
Existing face forgery detection usually follows the paradigm of training models in a single domain.
In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets.
arXiv Detail & Related papers (2024-06-28T17:42:08Z) - DF40: Toward Next-Generation Deepfake Detection [62.073997142001424]
existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset and testing them on other prevalent deepfake datasets.
But can these stand-out "winners" be truly applied to tackle the myriad of realistic and diverse deepfakes lurking in the real world?
We construct a highly diverse deepfake detection dataset called DF40, which comprises 40 distinct deepfake techniques.
arXiv Detail & Related papers (2024-06-19T12:35:02Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF)
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection [55.70982767084996]
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark.
We present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions.
DeepfakeBench contains 15 state-of-the-art detection methods, 9CL datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations.
arXiv Detail & Related papers (2023-07-04T01:34:41Z) - SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for
Exposing Deepfakes [7.553507857251396]
We propose a novel deepfake detector, called SeeABLE, that formalizes the detection problem as a (one-class) out-of-distribution detection task.
SeeABLE pushes perturbed faces towards predefined prototypes using a novel regression-based bounded contrastive loss.
We show that our model convincingly outperforms competing state-of-the-art detectors, while exhibiting highly encouraging generalization capabilities.
arXiv Detail & Related papers (2022-11-21T09:38:30Z) - Metamorphic Testing-based Adversarial Attack to Fool Deepfake Detectors [2.0649235321315285]
There is a dire need for deepfake detection technology to help spot deepfake media.
Current deepfake detection models are able to achieve outstanding accuracy (>90%)
This study identifies makeup application as an adversarial attack that could fool deepfake detectors.
arXiv Detail & Related papers (2022-04-19T02:24:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.