RAIDX: A Retrieval-Augmented Generation and GRPO Reinforcement Learning Framework for Explainable Deepfake Detection
- URL: http://arxiv.org/abs/2508.04524v1
- Date: Wed, 06 Aug 2025 15:08:16 GMT
- Title: RAIDX: A Retrieval-Augmented Generation and GRPO Reinforcement Learning Framework for Explainable Deepfake Detection
- Authors: Tianxiao Li, Zhenglin Huang, Haiquan Wen, Yiwei He, Shuchang Lyu, Baoyuan Wu, Guangliang Cheng,
- Abstract summary: RAIDX is a novel deepfake detection framework that integrates Retrieval-Augmented Generation (RAG) and Group Relative Policy Optimization ( GRPO)<n>RAG incorporates external knowledge for improved detection accuracy and employs GRPO to autonomously generate fine-grained textual explanations and saliency maps.<n>Experiments on multiple benchmarks demonstrate RAIDX's effectiveness in identifying real or fake, and providing interpretable rationales in both textual descriptions and saliency maps.
- Score: 32.48195434906769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of AI-generation models has enabled the creation of hyperrealistic imagery, posing ethical risks through widespread misinformation. Current deepfake detection methods, categorized as face specific detectors or general AI-generated detectors, lack transparency by framing detection as a classification task without explaining decisions. While several LLM-based approaches offer explainability, they suffer from coarse-grained analyses and dependency on labor-intensive annotations. This paper introduces RAIDX (Retrieval-Augmented Image Deepfake Detection and Explainability), a novel deepfake detection framework integrating Retrieval-Augmented Generation (RAG) and Group Relative Policy Optimization (GRPO) to enhance detection accuracy and decision explainability. Specifically, RAIDX leverages RAG to incorporate external knowledge for improved detection accuracy and employs GRPO to autonomously generate fine-grained textual explanations and saliency maps, eliminating the need for extensive manual annotations. Experiments on multiple benchmarks demonstrate RAIDX's effectiveness in identifying real or fake, and providing interpretable rationales in both textual descriptions and saliency maps, achieving state-of-the-art detection performance while advancing transparency in deepfake identification. RAIDX represents the first unified framework to synergize RAG and GRPO, addressing critical gaps in accuracy and explainability. Our code and models will be publicly available.
Related papers
- Unveiling Perceptual Artifacts: A Fine-Grained Benchmark for Interpretable AI-Generated Image Detection [95.08316274158165]
X-AIGD provides pixel-level, categorized annotations of perceptual artifacts, spanning low-level distortions, high-level semantics, and cognitive-level counterfactuals.<n>Existing AIGI detectors demonstrate negligible reliance on perceptual artifacts, even at the most basic distortion level.<n>Explicitly aligning model attention with artifact regions can increase the interpretability and generalization of detectors.
arXiv Detail & Related papers (2026-01-27T10:09:17Z) - Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection [76.91230292971115]
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks.<n>XG-Guard is an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS.
arXiv Detail & Related papers (2025-12-21T13:46:36Z) - HKRAG: Holistic Knowledge Retrieval-Augmented Generation Over Visually-Rich Documents [18.42875699937102]
We propose HKRAG, a new holistic RAG framework designed to explicitly capture and integrate both knowledge types.<n>Our framework features two key components: (1) a Hybrid Masking-based Holistic Retriever that employs explicit masking strategies to separately model salient and fine-print knowledge, ensuring a query-relevant holistic information retrieval; and (2) an Uncertainty-guided Agentic Generator that dynamically assesses the uncertainty of initial answers and actively decides how to integrate the two distinct knowledge streams for optimal response generation.
arXiv Detail & Related papers (2025-11-25T11:59:52Z) - LLM-Powered Text-Attributed Graph Anomaly Detection via Retrieval-Augmented Reasoning [20.426942100536003]
Anomaly detection on text-attributed graphs (TAGs) plays an essential role in applications such as fraud detection, intrusion monitoring, and misinformation analysis.<n>We introduce TAG-AD, a comprehensive benchmark for anomaly node detection on TAGs.<n>We propose a retrieval-augmented generation (RAG)-assisted, LLM-based zero-shot anomaly detection framework.
arXiv Detail & Related papers (2025-11-16T05:21:14Z) - Seeing the Unseen: Towards Zero-Shot Inspection for Wind Turbine Blades using Knowledge-Augmented Vision Language Models [10.230967860299504]
We propose a zero-shot-oriented inspection framework that integrates Retrieval-Augmented Generation with Vision-Language Models.<n>A multimodal knowledge base is constructed, comprising technical documentation, representative reference images, and domain-specific guidelines.<n>We evaluate the framework on 30 labeled blade images covering diverse damage categories.
arXiv Detail & Related papers (2025-10-26T23:19:28Z) - Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images [96.43608872116347]
AnomReason is a large-scale benchmark with structured annotations as quadruple textbfAnomAgent<n>AnomReason and AnomAgent serve as a foundation for measuring and improving the semantic plausibility of AI-generated images.
arXiv Detail & Related papers (2025-10-11T14:09:24Z) - Enhancing Retrieval Augmentation via Adversarial Collaboration [50.117273835877334]
We propose the Adrial Collaboration RAG (AC-RAG) framework to address "Retrieval Hallucinations"<n>AC-RAG employs two heterogeneous agents: a generalist Detector that identifies knowledge gaps, and a domain-specialized Resolver that provides precise solutions.<n>Experiments show that AC-RAG significantly improves retrieval accuracy and outperforms state-of-the-art RAG methods across various vertical domains.
arXiv Detail & Related papers (2025-09-18T08:54:20Z) - AuthGuard: Generalizable Deepfake Detection via Language Guidance [39.18916434250689]
Existing deepfake detection techniques struggle to keep-up with the ever-evolving novel, unseen forgeries methods.<n>We propose that incorporating language guidance can improve deepfake detection generalization.<n>We train an expert deepfake vision encoder by combining discriminative classification with image-text contrastive learning.
arXiv Detail & Related papers (2025-06-04T22:50:07Z) - AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection [44.66668435489055]
AGENT-X is a zero-shot multi-agent framework for AI-generated text detection.<n>We organize detection guidelines into semantic, stylistic, and structural dimensions, each independently evaluated by specialized linguistic agents.<n>A meta agent integrates these assessments through confidence-aware aggregation, enabling threshold-free, interpretable classification.<n>Experiments on diverse datasets demonstrate that AGENT-X substantially surpasses state-of-the-art supervised and zero-shot approaches in accuracy, interpretability, and generalization.
arXiv Detail & Related papers (2025-05-21T08:39:18Z) - FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics [66.14786900470158]
We propose FakeScope, an expert multimodal model (LMM) tailored for AI-generated image forensics.<n>FakeScope identifies AI-synthetic images with high accuracy and provides rich, interpretable, and query-driven forensic insights.<n>FakeScope achieves state-of-the-art performance in both closed-ended and open-ended forensic scenarios.
arXiv Detail & Related papers (2025-03-31T16:12:48Z) - DeepRAG: Thinking to Retrieve Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP)<n>By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step.<n> Experiments show that DeepRAG improves retrieval efficiency and boosts answer accuracy by 26.4%, demonstrating its effectiveness in enhancing retrieval-augmented reasoning.
arXiv Detail & Related papers (2025-02-03T08:22:45Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.<n>In this paper, we investigate how detection performance varies across model backbones, types, and datasets.<n>We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - Conditioned Prompt-Optimization for Continual Deepfake Detection [11.634681724245933]
This paper introduces Prompt2Guard, a novel solution for photorealistic-free continual deepfake detection of images.
We leverage a prediction ensembling technique with read-only prompts, mitigating the need for multiple forward passes.
Our method exploits a text-prompt conditioning tailored to deepfake detection, which we demonstrate is beneficial in our setting.
arXiv Detail & Related papers (2024-07-31T12:22:57Z) - Self-RAG: Learning to Retrieve, Generate, and Critique through
Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG)
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF)
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and
Localization [30.317619885984005]
We introduce the well-trained vision segmentation foundation model, i.e., Segment Anything Model (SAM) in face forgery detection and localization.
Based on SAM, we propose the Detect Any Deepfakes (DADF) framework with the Multiscale Adapter.
The proposed framework seamlessly integrates end-to-end forgery localization and detection optimization.
arXiv Detail & Related papers (2023-06-29T16:25:04Z) - Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual
Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z) - SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for
Exposing Deepfakes [7.553507857251396]
We propose a novel deepfake detector, called SeeABLE, that formalizes the detection problem as a (one-class) out-of-distribution detection task.
SeeABLE pushes perturbed faces towards predefined prototypes using a novel regression-based bounded contrastive loss.
We show that our model convincingly outperforms competing state-of-the-art detectors, while exhibiting highly encouraging generalization capabilities.
arXiv Detail & Related papers (2022-11-21T09:38:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.