Related papers: Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation

URL: http://arxiv.org/abs/2404.06809v3
Date: Wed, 09 Oct 2024 17:16:15 GMT
Title: Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation
Authors: Ruotong Pan, Boxi Cao, Hongyu Lin, Xianpei Han, Jia Zheng, Sirui Wang, Xunliang Cai, Le Sun
Abstract summary: Credibility-aware Generation (CAG) aims to equip models with the ability to discern and process information based on its credibility. Our model can effectively understand and utilize credibility for generation, significantly outperform other models with retrieval augmentation, and exhibit resilience against the disruption caused by noisy documents.
Score: 47.42366169887162
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phase, thereby diminishing the reliability and correctness of the generated outcomes. In this paper, we propose Credibility-aware Generation (CAG), a universally applicable framework designed to mitigate the impact of flawed information in RAG. At its core, CAG aims to equip models with the ability to discern and process information based on its credibility. To this end, we propose an innovative data transformation framework that generates data based on credibility, thereby effectively endowing models with the capability of CAG. Furthermore, to accurately evaluate the models' capabilities of CAG, we construct a comprehensive benchmark covering three critical real-world scenarios. Experimental results demonstrate that our model can effectively understand and utilize credibility for generation, significantly outperform other models with retrieval augmentation, and exhibit resilience against the disruption caused by noisy documents, thereby maintaining robust performance. Moreover, our model supports customized credibility, offering a wide range of potential applications.

Related papers

RADIANT: Retrieval AugmenteD entIty-context AligNmenT -- Introducing RAG-ability and Entity-Context Divergence [5.066415370344766]
Retrieval-Augmented Generation (RAG) is a technique to enhance factual accuracy by integrating external knowledge into the generation process. This paper introduces Radiant, a framework that merges RAG with alignment designed to optimize the interplay between retrieved evidence and generated content.
arXiv Detail & Related papers (2025-06-28T21:40:35Z)
KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG [63.82127103851471]
Retrieval-Augmented Generation (RAG) enables large language models to access broader knowledge sources. We demonstrate that enhancing generative models' capacity to process noisy content is equally critical for robust performance. We present KARE-RAG, which improves knowledge utilization through three key innovations.
arXiv Detail & Related papers (2025-06-03T06:31:17Z)
Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs [12.923119372847834]
This paper presents the first systematic evaluation of Retrieval-Augmented Generation (RAG) models for fact-checking. Experiments reveal critical vulnerabilities in state-of-the-art RAG methods, particularly in resolving conflicts stemming from differences in media source credibility. Our results show that effectively incorporating source credibility significantly enhances the ability of RAG models to resolve conflicting evidence and improve fact-checking performance.
arXiv Detail & Related papers (2025-05-23T11:35:03Z)
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [48.15636223774418]
Large language models (LLMs) frequently hallucinate due to misaligned self-awareness. Existing approaches mitigate hallucinations via uncertainty estimation or query rejection. We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems.
arXiv Detail & Related papers (2025-03-04T03:16:02Z)
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation [91.20492150248106]
We investigate the internal mechanisms behind unfaithful generation and identify a subset of mid-to-deep feed-forward networks (FFNs) that are disproportionately activated in such cases. We propose Parametric Knowledge Muting through FFN Suppression (ParamMute), a framework that improves contextual faithfulness by suppressing the activation of unfaithfulness-associated FFNs. Experimental results show that ParamMute significantly enhances faithfulness across both CoFaithfulQA and the established ConFiQA benchmark, achieving substantial reductions in reliance on parametric memory.
arXiv Detail & Related papers (2025-02-21T15:50:41Z)
Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC). RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
arXiv Detail & Related papers (2025-02-08T06:50:47Z)
RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects [12.5122702720856]
We propose Robust Fine-Tuning (RbFT) to enhance the resilience of large language models against retrieval defects. Experimental results demonstrate that RbFT significantly improves the robustness of RAG systems across diverse retrieval conditions.
arXiv Detail & Related papers (2025-01-30T14:15:09Z)
Towards More Robust Retrieval-Augmented Generation: Evaluating RAG Under Adversarial Poisoning Attacks [45.07581174558107]
Retrieval-Augmented Generation (RAG) systems have emerged as a promising solution to mitigate hallucinations. RAG systems are vulnerable to adversarial poisoning attacks, where malicious passages injected into retrieval databases can mislead the model into generating factually incorrect outputs. This paper investigates both the retrieval and the generation components of RAG systems to understand how to enhance their robustness against such attacks.
arXiv Detail & Related papers (2024-12-21T17:31:52Z)
KBAlign: Efficient Self Adaptation on Specific Knowledge Bases [73.34893326181046]
We present KBAlign, a self-supervised framework that enhances RAG systems through efficient model adaptation. Our key insight is to leverage the model's intrinsic capabilities for knowledge alignment through two innovative mechanisms. Experiments demonstrate that KBAlign can achieve 90% of the performance gain obtained through GPT-4-supervised adaptation.
arXiv Detail & Related papers (2024-11-22T08:21:03Z)
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework [77.45983464131977]
We focus on how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications. Our research identifies two critical latent factors affecting RAG's confidence in its predictions. We develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers.
arXiv Detail & Related papers (2024-09-24T14:52:14Z)
Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs). We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
Graph Retrieval Augmented Trustworthiness Reasoning [1.1660282484277826]
We introduce the Graph Retrieval Augmented Reasoning (GRATR) framework to bolster trustworthiness reasoning in agents. GRATR constructs a dynamic trustworthiness graph, updating it in real-time with evidential information. Our results demonstrate GRATR surpasses the baseline methods by over 30% in winning rate, with superior reasoning performance.
arXiv Detail & Related papers (2024-08-22T12:21:22Z)
Improving Retrieval Augmented Language Model with Self-Reasoning [20.715106330314605]
We propose a novel self-reasoning framework aimed at improving the reliability and traceability of RALMs. The framework involves constructing self-reason trajectories with three processes: a relevance-aware process, an evidence-aware selective process, and a trajectory analysis process. We have evaluated our framework across four public datasets to demonstrate the superiority of our method.
arXiv Detail & Related papers (2024-07-29T09:05:10Z)
Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models [21.01313168005792]
We reveal the vulnerabilities of Retrieval-Enhanced Generative (RAG) models when faced with black-box attacks for opinion manipulation. We explore the impact of such attacks on user cognition and decision-making.
arXiv Detail & Related papers (2024-07-18T17:55:55Z)
"Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs). In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z)
The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model. We introduce three robustness indicators and conduct experiments across diverse robust datasets. Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric [5.592917884093537]
The potential of quality estimation (QE) metrics is introduced and evaluated as a novel tool to enhance explainable artificial intelligence (XAI) in ASR systems. The capabilities of the NoRefER metric are explored in identifying word-level errors to aid post-editors in refining ASR hypotheses.
arXiv Detail & Related papers (2024-01-20T16:48:55Z)
Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions. We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.