Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and
Mitigating Knowledge Conflicts in Language Models
- URL: http://arxiv.org/abs/2402.18154v1
- Date: Wed, 28 Feb 2024 08:34:41 GMT
- Title: Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and
Mitigating Knowledge Conflicts in Language Models
- Authors: Zhuoran Jin, Pengfei Cao, Hongbang Yuan, Yubo Chen, Jiexin Xu, Huaijun
Li, Xiaojian Jiang, Kang Liu, Jun Zhao
- Abstract summary: Internal memory and external context inevitably clash, leading to knowledge conflicts within language models (LMs)
We propose a novel method called PatH PatcHing (PH3), which can efficiently mitigate knowledge conflicts by pruning conflicting attention heads without updating model parameters.
- Score: 18.2500350157507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, retrieval augmentation and tool augmentation have demonstrated a
remarkable capability to expand the internal memory boundaries of language
models (LMs) by providing external context. However, internal memory and
external context inevitably clash, leading to knowledge conflicts within LMs.
In this paper, we aim to interpret the mechanism of knowledge conflicts through
the lens of information flow, and then mitigate conflicts by precise
interventions at the pivotal point. We find there are some attention heads with
opposite effects in the later layers, where memory heads can recall knowledge
from internal memory, and context heads can retrieve knowledge from external
context. Moreover, we reveal that the pivotal point at which knowledge
conflicts emerge in LMs is the integration of inconsistent information flows by
memory heads and context heads. Inspired by the insights, we propose a novel
method called Pruning Head via PatH PatcHing (PH3), which can efficiently
mitigate knowledge conflicts by pruning conflicting attention heads without
updating model parameters. PH3 can flexibly control eight LMs to use internal
memory ($\uparrow$ 44.0%) or external context ($\uparrow$ 38.5%). Moreover, PH3
can also improve the performance of LMs on open-domain QA tasks. We also
conduct extensive experiments to demonstrate the cross-model, cross-relation,
and cross-format generalization of our method.
Related papers
- Analysing the Residual Stream of Language Models Under Knowledge Conflicts [23.96385393039587]
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters.
However, their parametric knowledge may conflict with the information provided in the context.
This can lead to undesirable model behaviour, such as reliance on outdated or incorrect information.
arXiv Detail & Related papers (2024-10-21T15:12:51Z) - Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering [23.96385393039587]
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters.
LLMs can internally register the signals of knowledge conflict at mid-layers.
We propose textscSpARE, a representation engineering method that uses pre-trained sparse auto-encoders.
arXiv Detail & Related papers (2024-10-21T13:30:47Z) - Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs [55.74117540987519]
This paper explores the problem of commonsense-level vision-knowledge conflict in Multimodal Large Language Models (MLLMs)
We introduce an automated pipeline, augmented with human-in-the-loop quality control, to establish a benchmark aimed at simulating and assessing the conflicts in MLLMs.
We evaluate the conflict-resolution capabilities of nine representative MLLMs across various model families and find a noticeable over-reliance on textual queries.
arXiv Detail & Related papers (2024-10-10T17:31:17Z) - ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems.
This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
arXiv Detail & Related papers (2024-10-05T07:41:17Z) - DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models [42.776896363518844]
We study the effect of intra-memory conflict on an LM's ability to accept relevant context.
We utilize two knowledge conflict measures and a novel dataset containing inherently conflicting data, DynamicQA.
We verify that LMs exhibit a greater degree of intra-memory conflict with dynamic facts compared to facts that have a single truth value.
arXiv Detail & Related papers (2024-07-24T06:06:07Z) - LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements [59.71218039095155]
Task of reading comprehension (RC) provides a primary means to assess language models' natural language understanding (NLU) capabilities.
If the context aligns with the models' internal knowledge, it is hard to discern whether the models' answers stem from context comprehension or from internal information.
To address this issue, we suggest to use RC on imaginary data, based on fictitious facts and entities.
arXiv Detail & Related papers (2024-04-09T13:08:56Z) - Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z) - Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint [20.543282448771336]
We propose an adaptive decoding method to discern whether the knowledge conflicts occur and resolve them.
Experiments show that COIECD exhibits strong performance and robustness over knowledge conflicts in realistic datasets.
arXiv Detail & Related papers (2024-02-19T07:10:30Z) - Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs)
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z) - A Framework for Inference Inspired by Human Memory Mechanisms [9.408704431898279]
We propose a PMI framework that consists of perception, memory and inference components.
The memory module comprises working and long-term memory, with the latter endowed with a higher-order structure to retain extensive and complex relational knowledge and experience.
We apply our PMI to improve prevailing Transformers and CNN models on question-answering tasks like bAbI-20k and Sort-of-CLEVR datasets.
arXiv Detail & Related papers (2023-10-01T08:12:55Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.