Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating
Models to Reflect Conflicting Evidence
- URL: http://arxiv.org/abs/2210.13701v1
- Date: Tue, 25 Oct 2022 01:46:00 GMT
- Title: Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating
Models to Reflect Conflicting Evidence
- Authors: Hung-Ting Chen, Michael J.Q. Zhang, Eunsol Choi
- Abstract summary: We simulate knowledge conflicts where parametric knowledge suggests one answer and different passages suggest different answers.
We find retrieval performance heavily impacts which sources models rely on, and current models mostly rely on non-parametric knowledge in their best-performing settings.
We present a new calibration study, where models are discouraged from presenting any single answer when presented with multiple conflicting answer candidates.
- Score: 37.18100697469402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Question answering models can use rich knowledge sources -- up to one hundred
retrieved passages and parametric knowledge in the large-scale language model
(LM). Prior work assumes information in such knowledge sources is consistent
with each other, paying little attention to how models blend information stored
in their LM parameters with that from retrieved evidence documents. In this
paper, we simulate knowledge conflicts (i.e., where parametric knowledge
suggests one answer and different passages suggest different answers) and
examine model behaviors. We find retrieval performance heavily impacts which
sources models rely on, and current models mostly rely on non-parametric
knowledge in their best-performing settings. We discover a troubling trend that
contradictions among knowledge sources affect model confidence only marginally.
To address this issue, we present a new calibration study, where models are
discouraged from presenting any single answer when presented with multiple
conflicting answer candidates in retrieved evidence.
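To make the calibration idea above concrete, here is a minimal sketch of reader-side abstention under conflicting evidence. This is not the authors' code; the function name `aggregate_with_abstention`, the majority-vote aggregation, and the 0.5 agreement threshold are illustrative assumptions standing in for whatever answer-aggregation and calibration scheme a real QA pipeline uses.

```python
from collections import Counter

def aggregate_with_abstention(parametric_answer, per_passage_answers, agreement_threshold=0.5):
    """Commit to a single answer only when the retrieved evidence largely agrees.

    parametric_answer: the closed-book answer (no retrieved passages).
    per_passage_answers: the answer produced when reading each retrieved passage.
    agreement_threshold: minimum fraction of passages that must support the
        majority candidate before a single answer is returned (illustrative value).
    """
    counts = Counter(a.strip().lower() for a in per_passage_answers if a)
    if not counts:
        # No usable evidence: fall back to parametric knowledge.
        return parametric_answer

    majority, support = counts.most_common(1)[0]
    agreement = support / len(per_passage_answers)

    # Conflicting evidence: several distinct candidates and only a weak majority.
    # Abstain instead of presenting any single answer.
    if len(counts) > 1 and agreement < agreement_threshold:
        return None

    return majority


# Three passages supporting three different answers yield low agreement,
# so the system abstains rather than committing to one candidate.
print(aggregate_with_abstention("1912", ["1912", "1997", "2004"]))  # -> None
```

A model calibrated in the sense of the abstract would behave analogously: its confidence in any single answer should drop when its retrieved evidence is self-contradictory, rather than being affected only marginally.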
Related papers
- Analysing the Residual Stream of Language Models Under Knowledge Conflicts [23.96385393039587]
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters.
However, their parametric knowledge may conflict with the information provided in the context.
This can lead to undesirable model behaviour, such as reliance on outdated or incorrect information.
arXiv Detail & Related papers (2024-10-21T15:12:51Z)
- Studying Large Language Model Behaviors Under Context-Memory Conflicts With Real Documents [54.953320616069654]
Retrieval-augmented generation mitigates many problems of fully parametric language models.
In RAG, the model's knowledge can be updated from documents provided in context.
We present a framework for studying such knowledge conflicts in a realistic setup.
arXiv Detail & Related papers (2024-04-24T17:59:36Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence regardless of whether it actually possesses the relevant knowledge.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z)
- The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems [87.3207729953778]
We evaluate state-of-the-art coreference resolution models on our dataset.
Several models struggle to reason on-the-fly over knowledge observed both at pretrain time and at inference time.
Still, even the best-performing models seem to have difficulty reliably integrating knowledge presented only at inference time.
arXiv Detail & Related papers (2022-12-15T23:26:54Z)
- DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering [34.70206857546496]
Question answering models commonly have access to two sources of "knowledge" during inference time.
It is unclear whether the answer stems from the given non-parametric knowledge or not.
We propose a new paradigm in which QA models are trained to disentangle the two sources of knowledge.
arXiv Detail & Related papers (2022-11-10T15:34:44Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Entity-Based Knowledge Conflicts in Question Answering [29.973926661540524]
We formalize the problem of knowledge conflicts, where the contextual information contradicts the learned information.
We propose a method to mitigate over-reliance on parametric knowledge, which minimizes hallucination and improves out-of-distribution generalization by 4%-7%.
Our findings demonstrate the importance of practitioners evaluating a model's tendency to hallucinate rather than read, and show that our mitigation strategy encourages generalization to evolving information.
arXiv Detail & Related papers (2021-09-10T18:29:44Z)
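The entity-substitution construction behind such context-memory conflicts (as in the entry directly above) can be sketched as follows. This is an illustrative reconstruction rather than code from the paper; the helper names `make_conflict_example` and `classify_behavior` and the example QA pair are hypothetical.

```python
def make_conflict_example(question, context, gold_answer, substitute_answer):
    """Create a context-memory conflict by swapping the answer entity in the context."""
    assert gold_answer in context, "the gold answer must appear in the original context"
    perturbed_context = context.replace(gold_answer, substitute_answer)
    return {
        "question": question,
        "context": perturbed_context,
        "context_answer": substitute_answer,  # what a faithful reader should return
        "memorized_answer": gold_answer,      # what a purely parametric model tends to return
    }


def classify_behavior(model_answer, example):
    """Label how a model resolved the conflict on one perturbed example."""
    if model_answer == example["context_answer"]:
        return "reads the context"
    if model_answer == example["memorized_answer"]:
        return "relies on parametric memory"
    return "other"


example = make_conflict_example(
    question="Who wrote Hamlet?",
    context="Hamlet is a tragedy written by William Shakespeare around 1600.",
    gold_answer="William Shakespeare",
    substitute_answer="Christopher Marlowe",
)
print(classify_behavior("William Shakespeare", example))  # -> relies on parametric memory
```

Models that frequently land in the "relies on parametric memory" bucket on such perturbed examples exhibit exactly the over-reliance that the mitigation method above targets.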