CoCoA: Confidence and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models
- URL: http://arxiv.org/abs/2508.17670v2
- Date: Wed, 27 Aug 2025 08:29:14 GMT
- Title: CoCoA: Confidence and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models
- Authors: Anant Khandelwal, Manish Gupta, Puneet Agrawal
- Abstract summary: CoCoA (Confidence- and Context-Aware Adaptive Decoding) is a novel token-level algorithm for principled conflict resolution and enhanced faithfulness. CoCoA resolves conflict by utilizing confidence-aware measures (entropy gap and contextual peakedness) and the generalized divergence between the parametric and contextual distributions.
- Score: 24.693047847053023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Faithful generation in large language models (LLMs) is challenged by knowledge conflicts between parametric memory and external context. Existing contrastive decoding methods tuned specifically to handle conflict often lack adaptability and can degrade performance in low conflict settings. We introduce CoCoA (Confidence- and Context-Aware Adaptive Decoding), a novel token-level algorithm for principled conflict resolution and enhanced faithfulness. CoCoA resolves conflict by utilizing confidence-aware measures (entropy gap and contextual peakedness) and the generalized divergence between the parametric and contextual distributions. Crucially, CoCoA maintains strong performance even in low conflict settings. Extensive experiments across multiple LLMs on diverse Question Answering (QA), Summarization, and Long-Form Question Answering (LFQA) benchmarks demonstrate CoCoA's state-of-the-art performance over strong baselines like AdaCAD. It yields significant gains in QA accuracy, up to 9.2 points on average compared to the strong baseline AdaCAD, and improves factuality in summarization and LFQA by up to 2.5 points on average across key benchmarks. Additionally, it demonstrates superior sensitivity to conflict variations. CoCoA enables more informed, context-aware, and ultimately more faithful token generation.
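To make the mechanics concrete, here is a minimal sketch of one decoding step of confidence- and context-aware adaptive contrastive decoding. It is not CoCoA's exact algorithm: the abstract names the entropy gap, contextual peakedness, and a generalized divergence as ingredients, but the specific formulas below (max-probability peakedness, a Jensen-Shannon divergence, a sigmoid-weighted mixing coefficient `alpha`, and a CAD-style contrastive combination) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def adaptive_contrastive_step(logits_ctx, logits_param, eps=1e-9):
    """One decoding step that combines the context-conditioned distribution
    with the parametric (context-free) distribution.

    logits_ctx:   logits given (retrieved context + query), shape [vocab]
    logits_param: logits given the query alone,             shape [vocab]
    Returns adjusted logits to sample or argmax from.

    The entropy gap, peakedness, divergence, and alpha mapping below are
    illustrative choices, not CoCoA's published definitions.
    """
    p_ctx = F.softmax(logits_ctx, dim=-1)
    p_param = F.softmax(logits_param, dim=-1)

    # Confidence signals: how much the context sharpens the distribution
    # (entropy gap) and how peaked the contextual distribution is.
    h_ctx = -(p_ctx * (p_ctx + eps).log()).sum()
    h_param = -(p_param * (p_param + eps).log()).sum()
    entropy_gap = torch.clamp(h_param - h_ctx, min=0.0)
    peakedness = p_ctx.max()

    # Conflict signal: Jensen-Shannon divergence between the two
    # distributions (one bounded instance of a generalized divergence).
    m = 0.5 * (p_ctx + p_param)
    jsd = 0.5 * (p_ctx * ((p_ctx + eps) / (m + eps)).log()).sum() \
        + 0.5 * (p_param * ((p_param + eps) / (m + eps)).log()).sum()

    # Adaptive weight: lean on the context when it is confident and
    # disagrees with parametric memory; stay near plain decoding otherwise.
    alpha = torch.sigmoid(entropy_gap) * peakedness * jsd

    # CAD-style contrastive combination of the two logit vectors.
    return (1.0 + alpha) * logits_ctx - alpha * logits_param
```

When the two distributions agree, the divergence term drives `alpha` toward zero and the adjusted logits reduce to ordinary context-conditioned decoding, which matches the low-conflict behavior the abstract emphasizes.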
Related papers
- CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering [53.7094431951084]
Knowledge-based visual question answering (KB-VQA) demonstrates significant potential for handling knowledge-intensive tasks. Conflicts arise between static parametric knowledge in vision language models and dynamically retrieved information. We propose CC-VQA as a training-free, conflict- and correlation-aware method for KB-VQA.
arXiv Detail & Related papers (2026-02-27T11:56:26Z) - Diagnosing Knowledge Conflict in Multimodal Long-Chain Reasoning [78.86309644343295]
Multimodal large language models (MLLMs) in long chain-of-thought reasoning often fail when different knowledge sources provide conflicting signals. We formalize these failures under a unified notion of knowledge conflict, distinguishing input-level objective conflict from process-level effective conflict. Our findings provide a mechanism-level view of multimodal reasoning under knowledge conflict and enable principled diagnosis and control of long-CoT failures.
arXiv Detail & Related papers (2026-02-16T07:10:44Z) - Listen to the Layers: Mitigating Hallucinations with Inter-Layer Disagreement [0.24443539255794253]
Pretrained Large Language Models (LLMs) are prone to generating fluent yet factually incorrect text, a phenomenon known as hallucination. We propose a novel, training-free decoding algorithm that mitigates hallucinations at inference time by listening to inter-layer disagreement signals in the middle layers.
arXiv Detail & Related papers (2026-02-10T07:32:37Z) - That's Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation [55.78914774437411]
We study how large language models (LLMs) behave when faced with discrepancies between their parametric knowledge and conflicting information contained in a prompt. We propose a domain-agnostic framework for constructing and interpreting such conflicts. We show that activation-level steering can achieve up to a 12.6% improvement in steering success over a random baseline.
arXiv Detail & Related papers (2025-10-21T22:27:56Z) - Harnessing Consistency for Robust Test-Time LLM Ensemble [88.55393815158608]
CoRE is a plug-and-play technique that harnesses model consistency for robust LLM ensemble. Token-level consistency captures fine-grained disagreements by applying a low-pass filter to downweight uncertain tokens. Model-level consistency models global agreement by promoting model outputs with high self-confidence.
arXiv Detail & Related papers (2025-10-12T04:18:45Z) - Conflict-Aware Soft Prompting for Retrieval-Augmented Generation [7.20732238547724]
Retrieval-augmented generation (RAG) enhances the capabilities of large language models (LLMs) by incorporating external knowledge into their input prompts. However, RAG often fails to resolve the conflict between incorrect external context and correct parametric knowledge, known as context-memory conflict. We introduce Conflict-Aware REtrieval-Augmented Generation (CARE), consisting of a context assessor and a base LLM. CARE effectively mitigates context-memory conflicts, leading to an average performance gain of 5.0% on QA and fact-checking benchmarks.
arXiv Detail & Related papers (2025-08-21T05:36:29Z) - KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance [83.99974309930072]
Supervised fine-tuning (SFT) is a common approach to improve the domain-specific question-answering (QA) performance of large language models (LLMs).
arXiv Detail & Related papers (2025-05-21T12:55:28Z) - ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation [91.20492150248106]
We investigate the internal mechanisms behind unfaithful generation and identify a subset of mid-to-deep feed-forward networks (FFNs) that are disproportionately activated in such cases. We propose Parametric Knowledge Muting through FFN Suppression (ParamMute), a framework that improves contextual faithfulness by suppressing the activation of unfaithfulness-associated FFNs. Experimental results show that ParamMute significantly enhances faithfulness across both CoFaithfulQA and the established ConFiQA benchmark, achieving substantial reductions in reliance on parametric memory. (A generic sketch of this style of FFN suppression appears after this list.)
arXiv Detail & Related papers (2025-02-21T15:50:41Z) - AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge [57.66282463340297]
Knowledge conflict arises from discrepancies between information in the context of a large language model and the knowledge stored in its parameters. We propose a fine-grained, instance-level approach called AdaCAD, which dynamically infers the weight of adjustment based on the degree of conflict. We show that AdaCAD consistently outperforms other decoding baselines, with average QA accuracy gains of 14.21% (absolute) over a static contrastive baseline, and improves the factuality of summaries by 6.19 points (AlignScore).
arXiv Detail & Related papers (2024-09-11T16:35:18Z) - Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint [20.543282448771336]
We propose an adaptive decoding method, COIECD, to discern whether knowledge conflicts occur and to resolve them.
Experiments show that COIECD exhibits strong performance and robustness over knowledge conflicts in realistic datasets.
arXiv Detail & Related papers (2024-02-19T07:10:30Z)