Related papers: FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations

URL: http://arxiv.org/abs/2512.07015v1
Date: Sun, 07 Dec 2025 21:28:42 GMT
Title: FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations
Authors: Mayank Ravishankara
Abstract summary: Falsification-Verification Alignment RAG (FVA-RAG) is a framework that shifts the retrieval paradigm from Inductive Verification (seeking support) to Deductive Falsification (seeking disproof). We introduce a dual-verification mechanism that explicitly weighs the draft answer against this "Anti-Context". Preliminary experiments on a dataset of common misconceptions demonstrate that FVA-RAG significantly improves robustness against sycophantic hallucinations compared to standard RAG baselines.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-Augmented Generation (RAG) systems have significantly reduced hallucinations in Large Language Models (LLMs) by grounding responses in external context. However, standard RAG architectures suffer from a critical vulnerability: Retrieval Sycophancy. When presented with a query based on a false premise or a common misconception, vector-based retrievers tend to fetch documents that align with the user's bias rather than objective truth, leading the model to "hallucinate with citations." In this work, we introduce Falsification-Verification Alignment RAG (FVA-RAG), a framework that shifts the retrieval paradigm from Inductive Verification (seeking support) to Deductive Falsification (seeking disproof). Unlike existing "Self-Correction" methods that rely on internal consistency, FVA-RAG deploys a distinct Adversarial Retrieval Policy that actively generates "Kill Queries": targeted search terms designed to surface contradictory evidence. We introduce a dual-verification mechanism that explicitly weighs the draft answer against this "Anti-Context." Preliminary experiments on a dataset of common misconceptions demonstrate that FVA-RAG significantly improves robustness against sycophantic hallucinations compared to standard RAG baselines, effectively acting as an inference-time "Red Team" for factual generation.
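Illustration: the abstract describes the pipeline only at a high level, so the following is a minimal sketch of the idea under stated assumptions, not the paper's implementation. The toy corpus, the keyword-overlap retriever, the cue-word list, and the names `retrieve`, `kill_queries`, `contradicts`, and `fva_rag` are all hypothetical. The draft answer is passed in by hand where a real system would have an LLM produce it, and the rule "reject the draft if any contradicting document is found" is an assumed stand-in for the dual-verification step.

```python
import re
from dataclasses import dataclass

# Toy corpus (illustrative text, not data from the paper).
CORPUS = [
    "Many people say humans use only 10 percent of their brain.",
    "Brain imaging studies show the 10 percent brain claim is a myth; "
    "nearly all regions are active.",
    "Bananas are a good source of potassium.",
]

STOPWORDS = {"a", "an", "the", "is", "are", "of", "do", "their"}
# Hypothetical cue words used both to build kill queries and to flag contradiction.
NEGATION_CUES = ("myth", "debunked", "false", "misconception")


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query: str, k: int = 2) -> list[str]:
    """Keyword-overlap retriever standing in for a vector retriever (assumption)."""
    q = tokens(query)
    scored = [(len(q & tokens(doc)), doc) for doc in CORPUS]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [doc for _, doc in ranked[:k]]


def kill_queries(draft: str) -> list[str]:
    """Rewrite the draft claim into searches that seek disproof, not support."""
    return [f"{draft} {cue}" for cue in NEGATION_CUES]


def contradicts(doc: str, draft: str) -> bool:
    """Crude contradiction test: on-topic document that contains a negation cue."""
    doc_toks = tokens(doc)
    on_topic = len(doc_toks & tokens(draft)) >= 2
    return on_topic and any(cue in doc_toks for cue in NEGATION_CUES)


@dataclass
class Verdict:
    draft: str
    support: list[str]
    anti_context: list[str]
    accepted: bool


def fva_rag(query: str, draft: str) -> Verdict:
    support = retrieve(query)  # standard, support-seeking retrieval
    anti_context: list[str] = []
    for kq in kill_queries(draft):  # adversarial, disproof-seeking retrieval
        for doc in retrieve(kq):
            if contradicts(doc, draft) and doc not in anti_context:
                anti_context.append(doc)
    # Assumed verification rule: any surviving anti-context rejects the draft.
    return Verdict(draft, support, anti_context, accepted=not anti_context)


if __name__ == "__main__":
    print(fva_rag("Do humans use only 10 percent of their brain?",
                  "Humans use only 10 percent of their brain"))
```

In this sketch the misconception query retrieves the repeating document first, which is the sycophancy failure the abstract names; only the kill queries bring the contradicting document into the decision.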

Related papers

Abductive Inference in Retrieval-Augmented Language Models: Generating and Validating Missing Premises [0.0]
We propose a framework that integrates abductive inference into retrieval-augmented LLMs. Experimental results on abductive reasoning and multi-hop QA benchmarks show that our approach improves both answer accuracy and reasoning faithfulness. This work highlights abductive inference as a promising direction for enhancing the robustness and explainability of RAG systems.
arXiv Detail & Related papers (2025-11-06T03:37:24Z)
Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses [71.34350093068473]
This paper introduces a new paradigm for generative error correction (GER) in audio-visual speech recognition (AVSR). Our framework, DualHyp, empowers a large language model (LLM) to compose independent N-best hypotheses from separate automatic speech recognition (ASR) and visual speech recognition (VSR) models. Our framework attains up to 57.7% error rate gain on the LRS2 benchmark over the standard ASR baseline, in contrast to single-stream GER approaches that achieve only 10% gain.
arXiv Detail & Related papers (2025-10-15T08:27:16Z)
Enhancing Retrieval Augmentation via Adversarial Collaboration [50.117273835877334]
We propose the Adversarial Collaboration RAG (AC-RAG) framework to address "Retrieval Hallucinations". AC-RAG employs two heterogeneous agents: a generalist Detector that identifies knowledge gaps, and a domain-specialized Resolver that provides precise solutions. Experiments show that AC-RAG significantly improves retrieval accuracy and outperforms state-of-the-art RAG methods across various vertical domains.
arXiv Detail & Related papers (2025-09-18T08:54:20Z)
MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems [0.0]
We present MetaRAG, a testing framework for hallucination detection in Retrieval-Augmented Generation (RAG) systems. MetaRAG operates in a real-time, unsupervised, black-box setting, requiring neither ground-truth references nor access to model internals. Crucially for identity-aware AI, MetaRAG localizes unsupported claims at the factoid span where they occur.
arXiv Detail & Related papers (2025-09-11T11:18:23Z)
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation [108.13261761812517]
We introduce FRANQ (Faithfulness-based Retrieval Augmented UNcertainty Quantification), a novel method for hallucination detection in RAG outputs. We present a new long-form Question Answering (QA) dataset annotated for both factuality and faithfulness.
arXiv Detail & Related papers (2025-05-27T11:56:59Z)
The Silent Saboteur: Imperceptible Adversarial Attacks against Black-Box Retrieval-Augmented Generation Systems [101.68501850486179]
We explore adversarial attacks against retrieval-augmented generation (RAG) systems to identify their vulnerabilities. This task aims to find imperceptible perturbations that retrieve a target document, originally excluded from the initial top-$k$ candidate set. We propose ReGENT, a reinforcement learning-based framework that tracks interactions between the attacker and the target RAG.
arXiv Detail & Related papers (2025-05-24T08:19:25Z)
Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z)
CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation [3.8808821719659763]
We introduce Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation (CDF-RAG). CDF-RAG iteratively refines queries, retrieves structured causal graphs, and enables multi-hop causal reasoning across interconnected knowledge sources. We evaluate CDF-RAG on four diverse datasets, demonstrating its ability to improve response accuracy and causal correctness over existing RAG-based methods.
arXiv Detail & Related papers (2025-04-17T01:15:13Z)
Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals [5.605770511387228]
RAGuard is the first benchmark to evaluate the robustness of RAG systems against misleading retrievals. Unlike prior benchmarks that rely on synthetic noise, our fact-checking dataset captures naturally occurring misinformation.
arXiv Detail & Related papers (2025-02-22T05:50:15Z)
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework [77.45983464131977]
We focus on how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications. Our research identifies two critical latent factors affecting RAG's confidence in its predictions. We develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers.
arXiv Detail & Related papers (2024-09-24T14:52:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.