Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization
- URL: http://arxiv.org/abs/2501.13573v1
- Date: Thu, 23 Jan 2025 11:23:25 GMT
- Title: Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization
- Authors: Lei Huang, Xiaocheng Feng, Weitao Ma, Yuchun Fan, Xiachong Feng, Yangfan Ye, Weihong Zhong, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Bing Qin
- Abstract summary: We propose RHIO, a framework to teach large language models to explicitly discriminate between faithful and unfaithful generations. RHIO first augments unfaithful samples that simulate realistic model-intrinsic errors by selectively masking retrieval heads. These samples are incorporated into joint training, enabling the model to distinguish unfaithful outputs from faithful ones conditioned on control tokens.
- Score: 35.269343563526675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring contextual faithfulness in retrieval-augmented large language models (LLMs) is crucial for building trustworthy information-seeking systems, particularly in long-form question-answering (LFQA) scenarios. In this work, we identify a salient correlation between LFQA faithfulness and retrieval heads, a set of attention heads responsible for retrieving contextual information. Leveraging this insight, we propose RHIO, a framework designed to teach LLMs to explicitly discriminate between faithful and unfaithful generations. RHIO first augments unfaithful samples that simulate realistic model-intrinsic errors by selectively masking retrieval heads. Then, these samples are incorporated into joint training, enabling the model to distinguish unfaithful outputs from faithful ones conditioned on control tokens. Furthermore, these control tokens are leveraged to self-induce contrastive outputs, amplifying their difference through contrastive decoding. Additionally, to facilitate the evaluation of contextual faithfulness, we introduce GroundBench, a comprehensive benchmark compiled from five existing LFQA datasets. Extensive experimental results on GroundBench demonstrate that RHIO significantly improves faithfulness, even outperforming GPT-4o.
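To make the mechanism concrete, here is a minimal PyTorch sketch of the two operations the abstract describes: zeroing out selected retrieval heads to synthesize unfaithful samples, and contrastive decoding between the control-token-conditioned outputs. The tensor layout, head indices, and the `alpha` weighting are illustrative assumptions, not the paper's implementation.
```python
import torch

def mask_retrieval_heads(attn_output: torch.Tensor, head_dim: int,
                         retrieval_heads: list[int]) -> torch.Tensor:
    """Zero out the output slices of selected attention heads.

    attn_output: (batch, seq_len, num_heads * head_dim), before the output
    projection; zeroing a head's slice removes its contribution entirely.
    """
    masked = attn_output.clone()
    for h in retrieval_heads:  # hypothetical indices of identified retrieval heads
        masked[..., h * head_dim:(h + 1) * head_dim] = 0.0
    return masked

def contrastive_decode(logits_faithful: torch.Tensor,
                       logits_unfaithful: torch.Tensor,
                       alpha: float = 1.0) -> torch.Tensor:
    """Amplify the gap between the two control-token-conditioned outputs."""
    return (1 + alpha) * logits_faithful - alpha * logits_unfaithful

# Toy usage: contrast two next-token distributions and pick the argmax.
next_token = contrastive_decode(torch.randn(32000), torch.randn(32000)).argmax()
```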
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z) - FaithLens: Detecting and Explaining Faithfulness Hallucination [63.905100627300925]
We introduce FaithLens, a cost-efficient and effective faithfulness hallucination detection model. We apply a well-defined data filtering strategy to ensure label correctness, explanation quality, and data diversity. FaithLens can produce high-quality explanations, delivering a distinctive balance of trustworthiness, efficiency, and effectiveness.
arXiv Detail & Related papers (2025-12-23T09:20:32Z) - ReFIne: A Framework for Trustworthy Large Reasoning Models with Reliability, Faithfulness, and Interpretability [23.70973331911138]
We argue that usable reasoning systems must be trustworthy, characterized by three properties: interpretability, faithfulness, and reliability. We propose ReFIne, a new training framework that integrates supervised fine-tuning with GRPO to encourage models to improve interpretability. Our experimental results show that ReFIne models generate clearer and better-structured reasoning traces.
arXiv Detail & Related papers (2025-10-10T07:08:44Z) - SI-FACT: Mitigating Knowledge Conflict via Self-Improving Faithfulness-Aware Contrastive Tuning [0.0]
Large Language Models often generate unfaithful responses in knowledge-intensive tasks due to knowledge conflict. The framework uses a self-instruct mechanism that allows the base LLM to automatically generate high-quality, structured contrastive learning data. Experiments on the knowledge conflict evaluation benchmarks ECARE KRE and COSE KRE show that the SI-FACT model based on Llama3 8B Instruct improves the Contextual Recall Rate by 6.2% over the best baseline method.
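A minimal sketch of what such self-generated contrastive data could look like, assuming a generic `generate` callable for the base LLM; the prompts and record format are illustrative, not the paper's pipeline.
```python
from typing import Callable

def build_contrastive_pair(generate: Callable[[str], str],
                           question: str, context: str) -> dict:
    """Self-generate a (faithful, unfaithful) response pair for one question."""
    faithful = generate(
        f"Answer using ONLY the context.\nContext: {context}\nQ: {question}\nA:")
    unfaithful = generate(
        f"Answer from memory and ignore any provided context.\nQ: {question}\nA:")
    # Record in a preference-pair layout suitable for contrastive tuning.
    return {"question": question, "context": context,
            "chosen": faithful, "rejected": unfaithful}
```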
arXiv Detail & Related papers (2025-09-12T12:56:14Z) - Rethinking LLM Parametric Knowledge as Post-retrieval Confidence for Dynamic Retrieval and Reranking [23.1400319714807]
Large Language Models (LLMs) often generate inaccurate responses (hallucinations) when faced with questions beyond their knowledge scope. Retrieval-Augmented Generation (RAG) addresses this by leveraging external knowledge, but a critical challenge remains: determining whether retrieved contexts effectively enhance the model's ability to answer specific queries. This challenge underscores the importance of knowledge boundary awareness, which current methods, relying on discrete labels or limited signals, fail to address adequately.
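As a rough illustration of a continuous confidence signal (in contrast to the discrete labels the entry criticizes), one could gate retrieval on the model's own next-token confidence; both the signal and the threshold here are assumptions, not this paper's method.
```python
import torch

def should_retrieve(next_token_logits: torch.Tensor, tau: float = 0.5) -> bool:
    """Trigger retrieval only when the model's parametric confidence is low."""
    confidence = torch.softmax(next_token_logits, dim=-1).max().item()
    return confidence < tau
```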
arXiv Detail & Related papers (2025-09-08T09:37:20Z) - ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation [91.20492150248106]
We investigate the internal mechanisms behind unfaithful generation and identify a subset of mid-to-deep feed-forward networks (FFNs) that are disproportionately activated in such cases. We propose Parametric Knowledge Muting through FFN Suppression (ParamMute), a framework that improves contextual faithfulness by suppressing the activation of unfaithfulness-associated FFNs. Experimental results show that ParamMute significantly enhances faithfulness across both CoFaithfulQA and the established ConFiQA benchmark, achieving substantial reductions in reliance on parametric memory.
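A minimal sketch of FFN suppression via PyTorch forward hooks, assuming the unfaithfulness-associated FFN modules have already been identified; the module paths and scaling factor are hypothetical.
```python
import torch.nn as nn

def suppress_ffns(ffn_layers: list[nn.Module], scale: float = 0.1) -> list:
    """Attach hooks that down-scale each suspect FFN sub-layer's output."""
    def hook(module, inputs, output):
        return output * scale  # mute the parametric-knowledge pathway
    return [ffn.register_forward_hook(hook) for ffn in ffn_layers]

# Hypothetical usage on an HF-style decoder; call handle.remove() to undo.
# handles = suppress_ffns([model.model.layers[i].mlp for i in range(20, 28)])
```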
arXiv Detail & Related papers (2025-02-21T15:50:41Z) - Aligning Large Language Models for Faithful Integrity Against Opposing Argument [71.33552795870544]
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks. They can be easily misled by unfaithful arguments during conversations, even when their original statements are correct. We propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation.
arXiv Detail & Related papers (2025-01-02T16:38:21Z) - Graph-based Confidence Calibration for Large Language Models [22.394717844099684]
We propose using an auxiliary learning model to assess response correctness based on the self-consistency of multiple outputs generated by the large language model. Our method builds a consistency graph to represent the agreement among multiple responses and uses a graph neural network (GNN) to estimate the likelihood that each response is correct.
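A toy version of the consistency graph, scoring each response by how many other sampled responses agree with it; exact-match agreement and the degree-based score are stand-ins for the learned GNN.
```python
import itertools
import torch

def consistency_confidence(responses: list[str]) -> torch.Tensor:
    """One-hop agreement score per response on a consistency graph."""
    n = len(responses)
    adj = torch.zeros(n, n)
    for i, j in itertools.combinations(range(n), 2):
        # Crude agreement test; a real system would use semantic matching.
        if responses[i].strip().lower() == responses[j].strip().lower():
            adj[i, j] = adj[j, i] = 1.0
    return adj.sum(dim=1) / max(n - 1, 1)  # fraction of peers that agree
```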
arXiv Detail & Related papers (2024-11-03T20:36:44Z) - FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows" [74.7488607599921]
FaithEval is a benchmark to evaluate the faithfulness of large language models (LLMs) in contextual scenarios.
FaithEval comprises 4.9K high-quality problems in total, validated through a rigorous four-stage context construction and validation framework.
arXiv Detail & Related papers (2024-09-30T06:27:53Z) - Improving Retrieval Augmented Language Model with Self-Reasoning [20.715106330314605]
We propose a novel self-reasoning framework aimed at improving the reliability and traceability of RALMs. The framework involves constructing self-reason trajectories with three processes: a relevance-aware process, an evidence-aware selective process, and a trajectory analysis process. We have evaluated our framework across four public datasets to demonstrate the superiority of our method.
arXiv Detail & Related papers (2024-07-29T09:05:10Z) - Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation [96.78845113346809]
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks.
This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences.
We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
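A minimal sketch of one decoding-dynamics signal of the kind such a monitor could track: flagging sentences whose mean token log-probability is unusually low. The threshold and sentence boundaries are assumptions; SynCheck combines several fine-grained signals, and FOD additionally steers beam search with them.
```python
import torch

def flag_low_confidence_sentences(token_logprobs: torch.Tensor,
                                  sentence_bounds: list[tuple[int, int]],
                                  threshold: float = -2.5) -> list[bool]:
    """True where a sentence's mean token log-prob falls below threshold."""
    return [token_logprobs[s:e].mean().item() < threshold
            for s, e in sentence_bounds]
```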
arXiv Detail & Related papers (2024-06-19T16:42:57Z) - Test-Time Fairness and Robustness in Large Language Models [17.758735680493917]
Frontier Large Language Models (LLMs) can be socially discriminatory or sensitive to spurious features of their inputs.
Existing solutions, which instruct the LLM to be fair or robust, rely on the model's implicit understanding of bias.
We show that our prompting strategy, unlike implicit instructions, consistently reduces the bias of frontier LLMs.
arXiv Detail & Related papers (2024-06-11T20:05:15Z) - Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor [4.35807211471107]
This work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models.
The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks.
arXiv Detail & Related papers (2024-06-04T12:43:23Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is preferred by human annotators over the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations [111.88727295707454]
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP.
We propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts.
We conduct experiments on pre-trained language models for analysis and evaluation of OOD robustness.
arXiv Detail & Related papers (2023-06-07T17:47:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.