Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict
- URL: http://arxiv.org/abs/2506.06485v2
- Date: Thu, 11 Sep 2025 15:55:28 GMT
- Title: Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict
- Authors: Kaiser Sun, Fan Bai, Mark Dredze
- Abstract summary: Large Language Models require both contextual knowledge and parametric memory, but these sources can disagree. We study this question with a model-agnostic diagnostic framework that automatically detects disagreements between a model's beliefs and a curated knowledge set. We find that performance degradation from conflict correlates with a task's knowledge reliance.
- Score: 13.091464232666835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models require both contextual knowledge and parametric memory, but these sources can disagree. Prior investigations on contextual question answering tasks report a preference toward parametric knowledge under conflict, yet they focus almost exclusively on tasks that should always rely on the given passage, leaving open how this behavior manifests when tasks demand different amounts and kinds of knowledge. We study this question with a model-agnostic diagnostic framework that (i) automatically detects disagreements between a model's beliefs and a curated knowledge set, and (ii) injects controlled conflicts into tasks. The resulting datasets span two orthogonal dimensions: task knowledge reliance and conflict plausibility. Evaluating representative open-source LLMs, we find that: (1) performance degradation from conflict correlates with a task's knowledge reliance; (2) explanatory rationales and simple reiteration both increase context reliance, which is helpful for context-only tasks but harmful when parametric knowledge should dominate; (3) these behaviors raise concerns about the validity of model-based evaluation and underscore the need to account for knowledge conflict in the deployment of LLMs.
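The two framework steps named in the abstract, (i) detecting disagreements between a model's beliefs and a curated knowledge set and (ii) injecting a controlled conflict, can be illustrated with a toy sketch. This is not the authors' implementation: the `Fact` class, the helper names, the prompt template, and the dictionary standing in for a model are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Fact:
    question: str
    answer: str  # answer stored in the (hypothetical) curated knowledge set


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not count."""
    return " ".join(text.lower().split())


def find_disagreements(facts: List[Fact], ask_model: Callable[[str], str]) -> List[Fact]:
    """Step (i): keep facts whose closed-book model answer differs from the curated answer."""
    return [f for f in facts if normalize(ask_model(f.question)) != normalize(f.answer)]


def inject_conflict(fact: Fact, conflicting_answer: str) -> str:
    """Step (ii): build a prompt whose context asserts an answer that contradicts the curated one."""
    context = f"According to the passage, the answer to '{fact.question}' is {conflicting_answer}."
    return f"Context: {context}\nQuestion: {fact.question}\nAnswer:"


# Toy stand-in for a model (assumption); a real study would query an LLM here.
toy_beliefs = {
    "What is the capital of Australia?": "Sydney",
    "What is the capital of France?": "paris",
}
facts = [
    Fact("What is the capital of Australia?", "Canberra"),
    Fact("What is the capital of France?", "Paris"),
]

disagreements = find_disagreements(facts, lambda q: toy_beliefs.get(q, ""))
prompt = inject_conflict(facts[1], "Lyon")
```

In this toy setup only the Australia fact is flagged as a disagreement, and `prompt` carries a context passage that contradicts the curated answer for the France fact. Exact string matching is a simplification; a real pipeline would need a more robust answer comparison.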
Related papers
- CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering [53.7094431951084]
Knowledge-based visual question answering (KB-VQA) demonstrates significant potential for handling knowledge-intensive tasks. Conflicts arise between static parametric knowledge in vision language models and dynamically retrieved information. We propose CC-VQA as a training-free, conflict- and correlation-aware method for KB-VQA.
arXiv Detail & Related papers (2026-02-27T11:56:26Z)
- That's Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation [55.78914774437411]
We study how large language models (LLMs) behave when faced with discrepancies between their parametric knowledge and conflicting information contained in a prompt. We propose a domain-agnostic framework for constructing and interpreting such conflicts. We show that activation-level steering can achieve up to a 12.6% improvement in steering success over a random baseline.
arXiv Detail & Related papers (2025-10-21T22:27:56Z)
- Probing Latent Knowledge Conflict for Faithful Retrieval-Augmented Generation [46.03923254984181]
Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm to enhance the factuality of Large Language Models (LLMs). Existing approaches to improving contextual faithfulness rely on external interventions, such as prompt engineering, decoding constraints, or reward-based fine-tuning. We propose CLEAR (Conflict-Localized and Enhanced Attention for RAG), a framework that decomposes context into fine-grained sentence-level knowledge.
arXiv Detail & Related papers (2025-10-14T12:48:24Z)
- FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation [37.28571879699906]
Large language models (LLMs) augmented with retrieval systems have demonstrated significant potential in handling knowledge-intensive tasks. This paper proposes FaithfulRAG, a novel framework that resolves knowledge conflicts by explicitly modeling discrepancies between the model's parametric knowledge and retrieved context.
arXiv Detail & Related papers (2025-06-10T16:02:54Z)
- Conflicts in Texts: Data, Implications and Challenges [58.03478157713084]
Conflicts could reflect the complexity of situations, changes that need to be explained and dealt with, difficulties in data annotation, and mistakes in generated outputs. This survey categorizes these conflicts into three key areas: (1) natural texts on the web, where factual inconsistencies, subjective biases, and multiple perspectives introduce contradictions; (2) human-annotated data, where annotator disagreements, mistakes, and societal biases impact model training; and (3) model interactions, where hallucinations and knowledge conflicts emerge during deployment. We highlight key challenges and future directions for developing conflict-aware NLP systems that can reason over and reconcile conflicting information more effectively.
arXiv Detail & Related papers (2025-04-28T04:24:01Z)
- Disentangling Memory and Reasoning Ability in Large Language Models [97.26827060106581]
We propose a new inference paradigm that decomposes the complex inference process into two distinct and clear actions. Our experiment results show that this decomposition improves model performance and enhances the interpretability of the inference process.
arXiv Detail & Related papers (2024-11-20T17:55:38Z)
- Mitigating Knowledge Conflicts in Language Model-Driven Question Answering [15.29366851382021]
Two fundamental knowledge sources play crucial roles in document-based question answering and document summarization systems. Recent studies revealed a significant challenge: when there exists a misalignment between the model's inherent knowledge and the ground truth answers in training data, the system may exhibit problematic behaviors during inference. Our investigation proposes a strategy to minimize hallucination by building an explicit connection between source inputs and generated outputs.
arXiv Detail & Related papers (2024-11-18T07:33:10Z)
- Analysing the Residual Stream of Language Models Under Knowledge Conflicts [23.96385393039587]
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters. However, their parametric knowledge may conflict with the information provided in the context. This can lead to undesirable model behaviour, such as reliance on outdated or incorrect information.
arXiv Detail & Related papers (2024-10-21T15:12:51Z)
- ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems.
This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
arXiv Detail & Related papers (2024-10-05T07:41:17Z)
- Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models [33.76903352835436]
Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities for capturing and reasoning over multimodal inputs.
These models are prone to parametric knowledge conflicts, which arise from inconsistencies of represented knowledge between their vision and language components.
We present a systematic approach to detect, interpret, and mitigate them.
arXiv Detail & Related papers (2024-10-04T17:59:28Z)
- DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models [42.776896363518844]
We study the effect of intra-memory conflict on an LM's ability to accept relevant context.
We utilize two knowledge conflict measures and a novel dataset containing inherently conflicting data, DynamicQA.
We verify that LMs exhibit a greater degree of intra-memory conflict with dynamic facts compared to facts that have a single truth value.
arXiv Detail & Related papers (2024-07-24T06:06:07Z)
- Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents [54.953320616069654]
Retrieval-augmented generation mitigates many problems of fully parametric language models.
In RAG, the model's knowledge can be updated from documents provided in context.
We present a framework for studying such knowledge conflicts in a realistic setup.
arXiv Detail & Related papers (2024-04-24T17:59:36Z)
- LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements [59.71218039095155]
The task of reading comprehension (RC) provides a primary means to assess language models' natural language understanding (NLU) capabilities.
If the context aligns with the models' internal knowledge, it is hard to discern whether the models' answers stem from context comprehension or from internal information.
To address this issue, we suggest using RC on imaginary data, based on fictitious facts and entities.
arXiv Detail & Related papers (2024-04-09T13:08:56Z)
- Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z)
- Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint [20.543282448771336]
We propose an adaptive decoding method to discern whether the knowledge conflicts occur and resolve them.
Experiments show that COIECD exhibits strong performance and robustness over knowledge conflicts in realistic datasets.
arXiv Detail & Related papers (2024-02-19T07:10:30Z)
- Resolving Knowledge Conflicts in Large Language Models [46.903549751371415]
Large language models (LLMs) often encounter knowledge conflicts.
We ask what are the desiderata for LLMs when a knowledge conflict arises and whether existing LLMs fulfill them.
We introduce an evaluation framework for simulating contextual knowledge conflicts.
arXiv Detail & Related papers (2023-10-02T06:57:45Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Context-faithful Prompting for Large Language Models [51.194410884263135]
Large language models (LLMs) encode parametric knowledge about world facts.
Their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks.
We assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention.
arXiv Detail & Related papers (2023-03-20T17:54:58Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- elBERto: Self-supervised Commonsense Learning for Question Answering [131.51059870970616]
We propose a Self-supervised Bidirectional Representation Learning of Commonsense framework, which is compatible with off-the-shelf QA model architectures.
The framework comprises five self-supervised tasks to force the model to fully exploit the additional training signals from contexts containing rich commonsense.
elBERto achieves substantial improvements on out-of-paragraph and no-effect questions where simple lexical similarity comparison does not help.
arXiv Detail & Related papers (2022-03-17T16:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.