To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
- URL: http://arxiv.org/abs/2410.14675v2
- Date: Mon, 17 Mar 2025 04:47:58 GMT
- Title: To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
- Authors: Yukun Huang, Sanxing Chen, Hongyi Cai, Bhuwan Dhingra
- Abstract summary: Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG). We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information. We propose two approaches: Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR).
- Score: 10.748768620243982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG). However, these contexts can be inaccurate or intentionally misleading, leading to conflicts with the model's internal knowledge. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information based on their confidence in the internal knowledge and the external context to resolve knowledge conflicts. To benchmark this capability, we evaluate LLMs across several QA datasets, including a newly created dataset featuring in-the-wild incorrect contexts sourced from Reddit posts. We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information, regardless of its factual accuracy. To enhance situated faithfulness, we propose two approaches: Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR). SCR enables models to self-assess the confidence of external information relative to their own internal knowledge to produce the most accurate answer. RCR, in contrast, extracts explicit confidence signals from the LLM and determines the final answer using predefined rules. Our results show that for LLMs with strong reasoning capabilities, such as GPT-4o and GPT-4o mini, SCR outperforms RCR, achieving improvements of up to 24.2% over a direct input augmentation baseline. Conversely, for a smaller model like Llama-3-8B, RCR outperforms SCR. Fine-tuning SCR with our proposed Confidence Reasoning Direct Preference Optimization (CR-DPO) method improves performance on both seen and unseen datasets, yielding an average improvement of 8.9% on Llama-3-8B. In addition to quantitative results, we offer insights into the relative strengths of SCR and RCR.
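As a concrete illustration, here is a minimal Python sketch of an RCR-style decision as described in the abstract: elicit a confidence score for the internal answer and for the context-supported answer, then apply a fixed rule to pick one. The margin rule and its value are illustrative assumptions, not the paper's actual rule set.

```python
# A sketch of a rule-based (RCR-style) decision; the margin rule is an
# illustrative assumption, not the paper's actual rule set.

def rcr_answer(internal_answer: str, internal_conf: float,
               context_answer: str, context_conf: float,
               margin: float = 0.1) -> str:
    """Trust the external context only when its elicited confidence beats
    the model's internal confidence by at least `margin` (assumed)."""
    if context_conf >= internal_conf + margin:
        return context_answer
    return internal_answer

# The model is 0.9 confident in its parametric answer but only 0.4 confident
# in the retrieved context, so the rule keeps the internal answer.
print(rcr_answer("Paris", 0.9, "Lyon", 0.4))  # -> Paris
```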
Related papers
- A Context-Aware Dual-Metric Framework for Confidence Estimation in Large Language Models [6.62851757612838]
Current confidence estimation methods for large language models (LLMs) neglect the relevance between responses and contextual information. We propose CRUX, which integrates context faithfulness and consistency for confidence estimation via two novel metrics. Experiments across three benchmark datasets demonstrate CRUX's effectiveness, achieving higher AUROC than existing baselines.
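A hedged sketch of the evaluation setup implied here: fuse two per-response scores (simple stand-ins for the paper's faithfulness and consistency metrics, which are not reproduced) into one confidence value and measure how well it separates correct from incorrect responses with AUROC.

```python
# Hedged illustration: fuse two per-response scores and evaluate with AUROC.
import numpy as np
from sklearn.metrics import roc_auc_score

def fused_confidence(faithfulness, consistency, alpha=0.5):
    # alpha is an assumed mixing weight, not a value from the paper.
    return alpha * np.asarray(faithfulness) + (1 - alpha) * np.asarray(consistency)

faith = [0.9, 0.2, 0.7, 0.4]   # stand-in for the faithfulness metric
cons = [0.8, 0.3, 0.6, 0.5]    # stand-in for the consistency metric
correct = [1, 0, 1, 0]         # 1 = the response was factually correct
print(roc_auc_score(correct, fused_confidence(faith, cons)))  # 1.0 on this toy data
```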
arXiv Detail & Related papers (2025-08-01T12:58:34Z)
- Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints [18.10515528600634]
We propose Deliberative Searcher, the first framework to integrate certainty calibration with retrieval-based search for open-domain question answering. The agent performs multi-step reflection and verification over Wikipedia data and is trained with a reinforcement learning algorithm that optimizes for accuracy under a soft reliability constraint.
arXiv Detail & Related papers (2025-07-22T16:09:34Z)
- Understanding and Benchmarking the Trustworthiness in Multimodal LLMs for Video Understanding [59.50808215134678]
This study introduces Trust-videoLLMs, the first comprehensive benchmark evaluating 23 state-of-the-art videoLLMs. Results reveal significant limitations in dynamic scene comprehension, cross-modal resilience, and real-world risk mitigation.
arXiv Detail & Related papers (2025-06-14T04:04:54Z)
- Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when the query is out of the knowledge boundary. DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
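A minimal sketch of the abstention behavior described here, assuming some confidence estimate is available; the threshold is a hypothetical stand-in for DTA's learned knowledge-boundary decision.

```python
# Minimal abstention sketch; the confidence source and threshold are assumed.

def answer_or_abstain(answer: str, confidence: float,
                      threshold: float = 0.6) -> str:
    return answer if confidence >= threshold else "I don't know"

print(answer_or_abstain("1969", confidence=0.85))  # inside the boundary: answers
print(answer_or_abstain("1969", confidence=0.30))  # outside: abstains
```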
arXiv Detail & Related papers (2025-05-27T08:21:21Z)
- After Retrieval, Before Generation: Enhancing the Trustworthiness of Large Language Models in RAG [13.603907803297561]
RAG systems face challenges in balancing internal (parametric) and external (retrieved) knowledge. We propose the BRIDGE framework, which dynamically determines a comprehensive response strategy for large language models. Experiments show BRIDGE outperforms baselines by 5-15% in accuracy while maintaining balanced performance across all scenarios.
arXiv Detail & Related papers (2025-05-21T16:29:19Z)
- Assessing Judging Bias in Large Reasoning Models: An Empirical Study [99.86300466350013]
Large Reasoning Models (LRMs) like DeepSeek-R1 and OpenAI-o1 have demonstrated remarkable reasoning capabilities.
We present a benchmark comparing judging biases between LLMs and LRMs across both subjective preference-alignment datasets and objective fact-based datasets.
arXiv Detail & Related papers (2025-04-14T07:14:27Z)
- Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models [39.73834207174728]
Retrieval-Augmented Generation (RAG) mitigates hallucinations in Large Language Models (LLMs) by integrating external knowledge.
Conflicts between parametric knowledge and retrieved context pose challenges for RAG systems.
We propose CK-PLUG, a plug-and-play method for controlling RAG's reliance on parametric and contextual knowledge.
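The general token-level idea can be sketched as interpolating next-token distributions computed with and without the retrieved context; the simple probability blend and the `reliance` knob below are illustrative assumptions, not CK-PLUG's actual mechanism.

```python
# Illustrative blend of next-token distributions with/without context.
import torch
import torch.nn.functional as F

def blended_next_token_logprobs(logits_with_ctx: torch.Tensor,
                                logits_without_ctx: torch.Tensor,
                                reliance: float = 0.5) -> torch.Tensor:
    """reliance=1.0 trusts only the retrieved context, 0.0 only parametric
    knowledge; a hypothetical knob, analogous in spirit to CK-PLUG's."""
    p_ctx = F.softmax(logits_with_ctx, dim=-1)
    p_param = F.softmax(logits_without_ctx, dim=-1)
    mixed = reliance * p_ctx + (1.0 - reliance) * p_param
    return torch.log(mixed + 1e-12)

vocab_size = 32000
out = blended_next_token_logprobs(torch.randn(vocab_size), torch.randn(vocab_size))
print(out.shape)  # torch.Size([32000])
```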
arXiv Detail & Related papers (2025-03-20T06:26:28Z)
- ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation [91.20492150248106]
We investigate the internal mechanisms behind unfaithful generation and identify a subset of mid-to-deep feed-forward networks (FFNs) that are disproportionately activated in such cases. We propose Parametric Knowledge Muting through FFN Suppression (ParamMute), a framework that improves contextual faithfulness by suppressing the activation of unfaithfulness-associated FFNs. Experimental results show that ParamMute significantly enhances faithfulness across both CoFaithfulQA and the established ConFiQA benchmark, achieving substantial reductions in reliance on parametric memory.
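A hedged sketch of the suppression mechanism on a toy model: PyTorch forward hooks scale down the output of selected FFN (MLP) sub-layers. The toy architecture, layer choice, and scale factor are assumptions; ParamMute's released implementation may differ.

```python
# Toy decoder block; hooks dampen chosen FFN outputs without editing weights.
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        return x + self.mlp(x)  # residual + FFN, attention omitted for brevity

def mute_ffns(blocks, layer_indices, scale: float = 0.1):
    """Register forward hooks that scale the FFN output of selected layers."""
    handles = []
    for i in layer_indices:
        def hook(module, inputs, output, s=scale):
            return output * s  # returned value replaces the module's output
        handles.append(blocks[i].mlp.register_forward_hook(hook))
    return handles  # call .remove() on each handle to restore the model

blocks = nn.ModuleList(ToyBlock(16) for _ in range(4))
handles = mute_ffns(blocks, layer_indices=[2, 3])  # assumed "unfaithful" layers
x = torch.randn(1, 16)
for block in blocks:
    x = block(x)
print(x.shape)  # forward pass runs with layers 2-3 suppressed
```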
arXiv Detail & Related papers (2025-02-21T15:50:41Z)
- CER: Confidence Enhanced Reasoning in LLMs [2.4392539322920763]
We introduce an uncertainty-aware framework designed to enhance the accuracy of Large Language Model responses.
We quantify the confidence of intermediate answers such as numerical results in mathematical reasoning and proper nouns in open-domain generation.
Results consistently validate the effectiveness of our novel confidence aggregation method.
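One plausible aggregation rule, sketched under the assumption that per-step confidences are available; the geometric mean is an illustrative choice, not necessarily the paper's.

```python
# Geometric-mean aggregation of per-step confidences (illustrative choice).
import math

def aggregate_confidence(step_confidences):
    """One very uncertain intermediate answer drags the whole chain down."""
    logs = [math.log(max(c, 1e-12)) for c in step_confidences]
    return math.exp(sum(logs) / len(logs))

# e.g. two confident arithmetic steps and one shaky entity lookup:
print(aggregate_confidence([0.95, 0.90, 0.40]))  # ~0.70
```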
arXiv Detail & Related papers (2025-02-20T15:16:42Z)
- Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception [58.62352010928591]
Large language models (LLMs) exhibit impressive performance across diverse tasks but often struggle to accurately gauge their knowledge boundaries.
This paper explores leveraging LLMs' internal states to enhance their perception of knowledge boundaries from efficiency and risk perspectives.
arXiv Detail & Related papers (2025-02-17T11:11:09Z)
- Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models [0.6091702876917281]
Large Language Models (LLMs) show remarkable proficiency in natural language tasks. Overconfidence, the misalignment between predicted confidence and true correctness, poses significant risks in critical decision-making applications. We present a comprehensive analysis of calibration across nine LLMs and three factual question-answering datasets.
arXiv Detail & Related papers (2025-02-16T07:46:09Z)
- Aligning Large Language Models for Faithful Integrity Against Opposing Argument [71.33552795870544]
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks.
They can be easily misled by unfaithful arguments during conversations, even when their original statements are correct.
We propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation.
arXiv Detail & Related papers (2025-01-02T16:38:21Z)
- Fact-Level Confidence Calibration and Self-Correction [64.40105513819272]
We propose a Fact-Level framework that calibrates confidence to relevance-weighted correctness at the fact level.
We also develop Confidence-Guided Fact-level Self-Correction (ConFix), which uses high-confidence facts within a response as additional knowledge to improve low-confidence ones.
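A minimal sketch of the correction loop described here: facts whose confidence clears a bar are treated as trusted evidence, and a revision prompt asks the model to fix the rest. The threshold and prompt template are hypothetical.

```python
# Hypothetical revision prompt built from confidence-partitioned facts.

def build_revision_prompt(facts, threshold: float = 0.8) -> str:
    trusted = [f for f, conf in facts if conf >= threshold]
    doubtful = [f for f, conf in facts if conf < threshold]
    return ("Trusted facts:\n- " + "\n- ".join(trusted) +
            "\n\nRevise these possibly wrong statements so they are "
            "consistent with the trusted facts:\n- " + "\n- ".join(doubtful))

facts = [("Mount Everest is 8,849 m tall.", 0.95),
         ("It was first summited in 1960.", 0.35)]  # low confidence, likely wrong
print(build_revision_prompt(facts))
```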
arXiv Detail & Related papers (2024-11-20T14:15:18Z)
- Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output [49.893971654861424]
We present a light-weight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG).
We compute a factuality score that can be thresholded to yield a binary decision.
Our experiments show high area under the ROC curve (AUC) across a wide range of relevant open source datasets.
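A hedged sketch of the score-then-threshold pipeline: a support scorer (here a trivial lexical-overlap stand-in for the paper's learned fact-checking model) rates the answer against the retrieved context, and a threshold yields the binary decision.

```python
# Lexical-overlap stand-in for the learned factuality scorer, plus threshold.
import re

def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def support_score(context: str, answer: str) -> float:
    answer_toks = _tokens(answer)
    return len(answer_toks & _tokens(context)) / max(len(answer_toks), 1)

def is_factual(context: str, answer: str, threshold: float = 0.7) -> bool:
    return support_score(context, answer) >= threshold

ctx = "The Eiffel Tower was completed in 1889 in Paris."
print(is_factual(ctx, "It was completed in 1889."))  # True  (score 0.8)
print(is_factual(ctx, "It was completed in 1925."))  # False (score 0.6)
```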
arXiv Detail & Related papers (2024-11-01T20:44:59Z)
- UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation [93.38604803625294]
We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG).
We use Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks.
UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-03T17:39:38Z)
- SFR-RAG: Towards Contextually Faithful LLMs [57.666165819196486]
Retrieval Augmented Generation (RAG) is a paradigm that integrates external contextual information with large language models (LLMs) to enhance factual accuracy and relevance.
We introduce SFR-RAG, a small LLM that is instruction-tuned with an emphasis on context-grounded generation and hallucination minimization.
We also present ConBench, a new evaluation framework compiling multiple popular and diverse RAG benchmarks.
arXiv Detail & Related papers (2024-09-16T01:08:18Z)
- Confidence Estimation for LLM-Based Dialogue State Tracking [9.305763502526833]
Estimation of a model's confidence in its outputs is critical for Conversational AI systems based on large language models (LLMs).
We provide an exhaustive exploration of methods, including approaches proposed for open- and closed-weight LLMs.
Our findings suggest that fine-tuning open-weight LLMs can result in enhanced AUC performance, indicating better confidence score calibration.
arXiv Detail & Related papers (2024-09-15T06:44:26Z)
- Improving Retrieval Augmented Language Model with Self-Reasoning [20.715106330314605]
We propose a novel self-reasoning framework aimed at improving the reliability and traceability of RALMs.
The framework involves constructing self-reason trajectories with three processes: a relevance-aware process, an evidence-aware selective process, and a trajectory analysis process.
We have evaluated our framework across four public datasets to demonstrate the superiority of our method.
arXiv Detail & Related papers (2024-07-29T09:05:10Z)
- SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales [29.33581578047835]
SaySelf is a training framework that teaches large language models to express more accurate fine-grained confidence estimates.
In addition, SaySelf directs LLMs to produce self-reflective rationales that clearly identify gaps in their parametric knowledge.
We show that the generated self-reflective rationales are reasonable and can further contribute to the calibration.
arXiv Detail & Related papers (2024-05-31T16:21:16Z)
- Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models [14.5291643644017]
We introduce the concept of Confidence-Probability Alignment.
We probe the alignment between models' internal and expressed confidence.
Among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment.
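Alignment of this kind is typically quantified with a rank correlation; a minimal sketch with toy numbers follows (obtaining real answer-token probabilities is model-specific and out of scope here).

```python
# Rank correlation between stated confidence and answer-token probability.
from scipy.stats import spearmanr

verbalized = [0.90, 0.60, 0.95, 0.30, 0.75]  # stated confidence per answer
token_prob = [0.85, 0.55, 0.97, 0.40, 0.70]  # probability assigned to the answer
rho, p_value = spearmanr(verbalized, token_prob)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```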
arXiv Detail & Related papers (2024-05-25T15:42:04Z)
- TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness [58.721012475577716]
Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications.
This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge.
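A minimal sketch of the behavioral-consistency idea: sample the model several times and use agreement with the primary response as a reference-free trust signal. `ask_model` is a hypothetical stub, and this is not TrustScore's exact scoring function.

```python
# Agreement between resampled answers and the primary response as a trust signal.
import random

def ask_model(question: str) -> str:
    # Hypothetical stub for a sampled LLM call (temperature > 0).
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

def trust_score(question: str, primary_answer: str, n_samples: int = 8) -> float:
    samples = [ask_model(question) for _ in range(n_samples)]
    return samples.count(primary_answer) / n_samples

random.seed(0)
print(trust_score("What is the capital of France?", "Paris"))
```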
arXiv Detail & Related papers (2024-02-19T21:12:14Z)
- The Calibration Gap between Model and Human Confidence in Large Language Models [14.539888672603743]
Large language models (LLMs) need to be well-calibrated in the sense that they can accurately assess and communicate how likely it is that their predictions are correct.
Recent work has focused on the quality of internal LLM confidence assessments.
This paper explores the disparity between external human confidence in an LLM's responses and the internal confidence of the model.
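The standard quantity behind such calibration studies is the Expected Calibration Error (ECE): bin predictions by confidence and average the per-bin gap between accuracy and mean confidence, weighted by bin size. A compact sketch (equal-width binning is the usual, assumed choice):

```python
# Expected Calibration Error with equal-width confidence bins.
import numpy as np

def expected_calibration_error(confidence, correct, n_bins: int = 10) -> float:
    confidence = np.asarray(confidence)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

conf = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55]
hits = [1, 1, 0, 1, 0, 1]
print(expected_calibration_error(conf, hits))
```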
arXiv Detail & Related papers (2024-01-24T22:21:04Z)
- Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models [53.859446823312126]
State-of-the-art open-source models of varying sizes, from 7B to 65B, improve on average by 8.2% over their baseline performance.
Strikingly, even models with extremely small memory footprints, such as Vicuna-7B, show an 11.74% improvement overall and up to a 25.39% improvement in high-creativity, open-ended tasks.
arXiv Detail & Related papers (2023-10-11T15:56:00Z)