RADAR: Retrieval-Augmented Detector with Adversarial Refinement for Robust Fake News Detection
- URL: http://arxiv.org/abs/2601.03981v1
- Date: Wed, 07 Jan 2026 14:52:15 GMT
- Title: RADAR: Retrieval-Augmented Detector with Adversarial Refinement for Robust Fake News Detection
- Authors: Song-Duo Ma, Yi-Hung Liu, Hsin-Yu Lin, Pin-Yu Chen, Hong-Yan Huang, Shau-Yung Hsu, Yun-Nung Chen
- Abstract summary: We present RADAR, a retrieval-augmented detector with adversarial refinement for robust fake news detection. Our approach employs a generator that rewrites real articles with factual perturbations, paired with a lightweight detector that verifies claims using dense passage retrieval.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To efficiently combat the spread of LLM-generated misinformation, we present RADAR, a retrieval-augmented detector with adversarial refinement for robust fake news detection. Our approach employs a generator that rewrites real articles with factual perturbations, paired with a lightweight detector that verifies claims using dense passage retrieval. To enable effective co-evolution, we introduce verbal adversarial feedback (VAF). Rather than relying on scalar rewards, VAF issues structured natural-language critiques; these guide the generator toward more sophisticated evasion attempts, compelling the detector to adapt and improve. On a fake news detection benchmark, RADAR achieves 86.98% ROC-AUC, significantly outperforming general-purpose LLMs with retrieval. Ablation studies confirm that detector-side retrieval yields the largest gains, while VAF and few-shot demonstrations provide critical signals for robust training.
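The co-evolution loop the abstract describes — a generator perturbing real articles, a retrieval-backed detector, and verbal adversarial feedback (VAF) in place of scalar rewards — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: every function body is a stand-in (a real system would use an LLM generator, dense passage retrieval, and LLM-written critiques), and all names below are hypothetical.

```python
# Toy sketch of RADAR-style adversarial refinement with verbal feedback.
# All components are stand-ins for the LLM/retrieval modules in the paper.

def generate_perturbed(article, critique=None):
    """Stub generator: introduces one factual perturbation. A real
    generator would condition on the critique to refine its evasion."""
    del critique  # unused in this toy version
    return article.replace("2019", "2021")  # swap a factual detail

def detect(article, evidence):
    """Stub detector: flags the article as fake if any token in the
    retrieved evidence is missing from it (crude contradiction check;
    the paper uses dense passage retrieval and claim verification)."""
    return any(tok not in article for ev in evidence for tok in ev.split())

def make_critique(article, caught):
    """Stub critic: returns a structured natural-language critique
    rather than a scalar reward, as in VAF."""
    if caught:
        return "Detected: the perturbed date conflicts with retrieved evidence."
    return "Evaded: the perturbation was not grounded against evidence."

real_article = "The summit took place in 2019 in Geneva."
evidence = ["summit 2019 Geneva"]  # retrieved passages (toy)

critique = None
for _ in range(2):  # alternate generator/detector rounds
    fake = generate_perturbed(real_article, critique)
    caught = detect(fake, evidence)
    critique = make_critique(fake, caught)

print(caught, critique)
```

The point of the sketch is the control flow: the critique (not a number) is fed back to the generator each round, which is what VAF replaces scalar rewards with.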
Related papers
- Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation [64.09110141948693]
Retrieval-Augmented Generation (RAG) systems remain brittle under realistic retrieval noise. We propose BAR-RAG, which reframes the reranker as a boundary-aware evidence selector that targets the generator's Goldilocks Zone. BAR-RAG consistently improves end-to-end performance under noisy retrieval.
arXiv Detail & Related papers (2026-02-03T16:08:23Z) - Self-Disguise Attack: Induce the LLM to disguise itself for AIGT detection evasion [16.94434185181644]
Self-Disguise Attack (SDA) is a novel approach that enables large language models to actively disguise their output. We show that SDA effectively reduces the average detection accuracy of various AIGT detectors across texts generated by three different LLMs.
arXiv Detail & Related papers (2025-08-20T04:17:03Z) - Injecting External Knowledge into the Reasoning Process Enhances Retrieval-Augmented Generation [41.28340070471627]
Retrieval-augmented generation (RAG) has been widely adopted to augment large language models (LLMs) with external knowledge for knowledge-intensive tasks. RAG's effectiveness is often undermined by the presence of noisy (i.e., low-quality) retrieved passages. We propose Passage Injection to enhance RAG's ability to recognize and resist noisy passages.
arXiv Detail & Related papers (2025-07-25T14:43:31Z) - Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector [97.92369017531038]
We build a new laRge-scale Adversarial images dataset with Diverse hArmful Responses (RADAR).
We then develop a novel iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector distilled from the hidden states of Visual Language Models (VLMs) to distinguish adversarial images from benign ones in the input.
arXiv Detail & Related papers (2024-10-30T10:33:10Z) - Real-time Factuality Assessment from Adversarial Feedback [11.742257531343814]
We show that evaluations for assessing the factuality of news from conventional sources result in high accuracies over time for LLM-based detectors. We argue that a proper factuality evaluation dataset should test a model's ability to reason about current events by retrieving and reading related evidence.
arXiv Detail & Related papers (2024-10-18T17:47:11Z) - RAFT: Realistic Attacks to Fool Text Detectors [16.749257564123194]
Large language models (LLMs) have exhibited remarkable fluency across various tasks.
Their unethical applications, such as disseminating disinformation, have become a growing concern.
We present RAFT: a grammar error-free black-box attack against existing LLM detectors.
arXiv Detail & Related papers (2024-10-04T17:59:00Z) - OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples [44.118047780553006]
OUTFOX is a framework that improves the robustness of LLM-generated-text detectors by allowing both the detector and the attacker to consider each other's output.
Experiments show that the proposed detector improves the detection performance on the attacker-generated texts by up to +41.3 points F1-score.
The detector shows a state-of-the-art detection performance: up to 96.9 points F1-score, beating existing detectors on non-attacked texts.
arXiv Detail & Related papers (2023-07-21T17:40:47Z) - RADAR: Robust AI-Text Detection via Adversarial Learning [69.5883095262619]
RADAR is based on adversarial training of a paraphraser and a detector.
The paraphraser's goal is to generate realistic content to evade AI-text detection.
RADAR uses the feedback from the detector to update the paraphraser, and vice versa.
arXiv Detail & Related papers (2023-07-07T21:13:27Z) - MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs).
We show that a larger number of words in general leads to better performance and most detection methods can achieve similar performance with much fewer training samples.
Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z) - Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z)