Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2406.13692v2
- Date: Thu, 03 Oct 2024 20:55:18 GMT
- Title: Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation
- Authors: Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, Kai-Wei Chang,
- Abstract summary: Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks.
This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences.
We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
- Score: 96.78845113346809
- License:
- Abstract: Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. However, there are significant trustworthiness concerns as RALMs are prone to generating unfaithful outputs, including baseless information or contradictions with the retrieved context. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics including sequence likelihood, uncertainty quantification, context influence, and semantic alignment to synchronously detect unfaithful sentences. By integrating efficiently measurable and complementary signals, SynCheck enables accurate and immediate feedback and intervention, achieving 0.85 AUROC in detecting faithfulness errors across six long-form retrieval-augmented generation tasks, improving prior best method by 4%. Leveraging SynCheck, we further introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation. Empirical results demonstrate that FOD outperforms traditional strategies such as abstention, reranking, or contrastive decoding significantly in terms of faithfulness, achieving over 10% improvement across six datasets.
Related papers
- Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge [90.8674158031845]
We propose Crowd-based Comparative Evaluation, which introduces additional crowd responses to compare with the candidate responses.
This process effectively guides LLM-as-a-Judge to provide a more detailed chain-of-thought (CoT) judgment.
Our method produces higher-quality CoTs that facilitate judge distillation and exhibit superior performance in rejection sampling.
arXiv Detail & Related papers (2025-02-18T03:31:06Z) - RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects [12.5122702720856]
We propose Robust Fine-Tuning (RbFT) to enhance the resilience of large language models against retrieval defects.
Experimental results demonstrate that RbFT significantly improves the robustness of RAG systems across diverse retrieval conditions.
arXiv Detail & Related papers (2025-01-30T14:15:09Z) - Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization [35.269343563526675]
We propose RHIO, a framework to teach large language models to explicitly discriminate between faithful and unfaithful generations.
RHIO first augments unfaithful samples that simulate realistic model-intrinsic errors by selectively masking retrieval heads.
These samples are incorporated into joint training, enabling the model to distinguish unfaithful outputs from faithful ones conditioned on control tokens.
arXiv Detail & Related papers (2025-01-23T11:23:25Z) - Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models [26.51079570548107]
Large language models (LLMs) often suffer from context faithfulness hallucinations.
We propose Dynamic Attention-Guided Context Decoding (DAGCD)
DAGCD integrates attention distributions and uncertainty signals in a single-pass decoding process.
arXiv Detail & Related papers (2025-01-02T05:07:06Z) - UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation [93.38604803625294]
We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG)
We use Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks.
UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-03T17:39:38Z) - Disentangled Noisy Correspondence Learning [56.06801962154915]
Cross-modal retrieval is crucial in understanding latent correspondences across modalities.
DisNCL is a novel information-theoretic framework for feature Disentanglement in Noisy Correspondence Learning.
arXiv Detail & Related papers (2024-08-10T09:49:55Z) - Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor [4.35807211471107]
This work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models.
The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks.
arXiv Detail & Related papers (2024-06-04T12:43:23Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely textbfSelf-textbfReinforcing textbfErrors textbfMitigation (SREM)
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.