Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents
- URL: http://arxiv.org/abs/2404.16032v2
- Date: Tue, 08 Oct 2024 18:07:33 GMT
- Title: Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents
- Authors: Evgenii Kortukov, Alexander Rubinstein, Elisa Nguyen, Seong Joon Oh
- Abstract summary: Retrieval-augmented generation mitigates many problems of fully parametric language models.
In RAG, the model's knowledge can be updated from documents provided in context.
We present a framework for studying such knowledge conflicts in a realistic setup.
- Score: 54.953320616069654
- Abstract: Retrieval-augmented generation (RAG) mitigates many problems of fully parametric language models, such as temporal degradation, hallucinations, and lack of grounding. In RAG, the model's knowledge can be updated from documents provided in context. This leads to cases of conflict between the model's parametric knowledge and the contextual information, where the model may not always update its knowledge. Previous work studied context-memory knowledge conflicts by creating synthetic documents that contradict the model's correct parametric answers. We present a framework for studying such knowledge conflicts in a realistic setup. We update incorrect parametric knowledge using real conflicting documents. This reflects how knowledge conflicts arise in practice. In this realistic scenario, we find that knowledge updates fail less often than previously reported. In cases where the models still fail to update their answers, we find a parametric bias: the incorrect parametric answer appearing in context makes the knowledge update likelier to fail. These results suggest that the factual parametric knowledge of LLMs can negatively influence their reading abilities and behaviors. Our code is available at https://github.com/kortukov/realistic_knowledge_conflicts/ .
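The evaluation loop described in the abstract can be illustrated with a short sketch: first elicit the model's closed-book (parametric) answer, keep only cases where it is incorrect, then provide a real conflicting document in context and check whether the answer updates; finally, test for the parametric bias by checking whether the incorrect parametric answer also appears in the document. This is a minimal sketch, not the authors' released code; `generate_answer` and the dataset fields (`question`, `gold_answer`, `document`) are hypothetical placeholders. See https://github.com/kortukov/realistic_knowledge_conflicts/ for the actual implementation.

```python
# Sketch of a context-memory conflict study with real documents.
# All helper names and dataset fields below are illustrative assumptions.

def generate_answer(model, prompt: str) -> str:
    """Placeholder: query an LLM and return its short answer string."""
    raise NotImplementedError


def study_conflicts(model, dataset):
    """dataset: iterable of dicts with 'question', 'gold_answer', 'document'."""
    results = []
    for ex in dataset:
        # 1) Closed-book answer reveals the model's parametric knowledge.
        closed_book = generate_answer(model, ex["question"])
        if ex["gold_answer"].lower() in closed_book.lower():
            continue  # keep only cases where parametric knowledge is incorrect

        # 2) Open-book answer: a real retrieved document that contradicts
        #    the parametric answer is placed in context.
        prompt = f"Context: {ex['document']}\nQuestion: {ex['question']}"
        open_book = generate_answer(model, prompt)
        updated = ex["gold_answer"].lower() in open_book.lower()

        # 3) Parametric bias: does the incorrect parametric answer also
        #    appear somewhere in the context document?
        parametric_in_context = closed_book.lower() in ex["document"].lower()

        results.append({
            "updated": updated,
            "parametric_in_context": parametric_in_context,
        })
    return results
```

Comparing update-failure rates between examples with and without the parametric answer present in the context is one simple way to surface the parametric bias reported in the paper.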
Related papers
- Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance [68.56701216210617]
In principle, one would expect models to adapt to the user context better after instruction finetuning.
We observe a surprising failure mode: during instruction tuning, the context reliance under knowledge conflicts initially increases as expected, but then gradually decreases.
arXiv Detail & Related papers (2024-10-14T17:57:09Z) - Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z) - A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia [57.31074448586854]
Large language models (LLMs) have an impressive ability to draw on novel information supplied in their context.
Yet the mechanisms underlying this contextual grounding remain unknown.
We present a novel method to study grounding abilities using Fakepedia.
arXiv Detail & Related papers (2023-12-04T17:35:42Z) - R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance, but they still face challenges.
Previous instruction tuning methods force the model to complete a sentence regardless of whether it possesses the relevant knowledge.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate that R-Tuning effectively improves a model's ability to answer known questions and to refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z) - RECKONING: Reasoning through Dynamic Knowledge Encoding [51.076603338764706]
We show that language models can answer questions by reasoning over knowledge provided as part of the context.
In these situations, the model can fail to distinguish the knowledge that is necessary to answer the question from irrelevant information in the context.
We propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model's parameters.
arXiv Detail & Related papers (2023-05-10T17:54:51Z) - DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering [34.70206857546496]
Question answering models commonly have access to two sources of "knowledge" at inference time: parametric knowledge stored in the model's weights and contextual (non-parametric) knowledge provided in the input.
It is often unclear whether a given answer stems from the provided contextual knowledge or from the model's parametric knowledge.
We propose a new paradigm in which QA models are trained to disentangle the two sources of knowledge.
arXiv Detail & Related papers (2022-11-10T15:34:44Z) - Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence [37.18100697469402]
We simulate knowledge conflicts where parametric knowledge suggests one answer and different passages suggest different answers.
We find that retrieval performance heavily impacts which sources models rely on, and that current models mostly rely on non-parametric knowledge.
We present a new calibration study in which models are discouraged from presenting any single answer when shown multiple conflicting answer candidates.
arXiv Detail & Related papers (2022-10-25T01:46:00Z)