Related papers: Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

URL: http://arxiv.org/abs/2405.14117v1
Date: Thu, 23 May 2024 02:44:12 GMT
Title: Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
Authors: Yuheng Chen, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
Abstract summary: The Knowledge Neuron (KN) thesis is a prominent theory for explaining how large language models store and express factual knowledge. We re-examine the knowledge localization (KL) assumption and confirm the existence of facts that do not adhere to it from both statistical and knowledge modification perspectives. We propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification.
Score: 19.16542466297147
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) store extensive factual knowledge, but the mechanisms behind how they store and express this knowledge remain unclear. The Knowledge Neuron (KN) thesis is a prominent theory for explaining these mechanisms. This theory is based on the knowledge localization (KL) assumption, which suggests that a fact can be localized to a few knowledge storage units, namely knowledge neurons. However, this assumption may be overly strong regarding knowledge storage and neglects knowledge expression mechanisms. Thus, we re-examine the KL assumption and confirm the existence of facts that do not adhere to it from both statistical and knowledge modification perspectives. Furthermore, we propose the Query Localization (QL) assumption. (1) Query-KN Mapping: The localization results are associated with the query rather than the fact. (2) Dynamic KN Selection: The attention module contributes to the selection of KNs for answering a query. Based on this, we further propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification. We conduct 39 sets of experiments, along with additional visualization experiments, to rigorously validate our conclusions.

Related papers

CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners [88.35958039968081]
CaKE (Circuit-aware Knowledge Editing) is a novel method that enables more effective integration of updated knowledge in large language models. Results show that CaKE enables more accurate and consistent use of updated knowledge across related reasoning tasks.
arXiv Detail & Related papers (2025-03-20T17:14:34Z)
Inside-Out: Hidden Factual Knowledge in LLMs [50.79758420289131]
This work presents a framework for assessing whether large language models (LLMs) encode more factual knowledge in their parameters than what they express in their outputs. We first propose a formal definition of knowledge, quantifying it for a given question as the fraction of correct-incorrect answer pairs where the correct one is ranked higher. We then present a case study, applying this framework to three popular open-weights LLMs in a closed-book QA setup.
arXiv Detail & Related papers (2025-03-19T15:21:48Z)
Knowledge Mechanisms in Large Language Models: A Survey and Perspective [88.51320482620679]
This paper reviews knowledge mechanism analysis from a novel taxonomy including knowledge utilization and evolution. We discuss what knowledge LLMs have learned, the reasons for the fragility of parametric knowledge, and the potential dark knowledge (hypothesis) that will be challenging to address.
arXiv Detail & Related papers (2024-07-22T06:15:59Z)
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning. CoK includes methodologies for both dataset construction and model learning. We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z)
Can Language Models Act as Knowledge Bases at Scale? [24.99538360485476]
Large language models (LLMs) have demonstrated remarkable proficiency in understanding and generating responses to complex queries. Our research investigates whether LLMs can effectively store, recall, and reason with knowledge on a large scale comparable to the latest knowledge bases (KBs) such as Wikidata.
arXiv Detail & Related papers (2024-02-22T04:20:14Z)
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models [23.11132761945838]
Large language models (LLMs) store extensive factual knowledge, but the underlying mechanisms remain unclear. Previous research suggests that factual knowledge is stored within multi-layer perceptron weights. Some storage units exhibit degeneracy, referred to as Degenerate Knowledge Neurons.
arXiv Detail & Related papers (2024-02-21T11:50:32Z)
Stable Knowledge Editing in Large Language Models [68.98582618305679]
We introduce StableKE, a knowledge editing method based on knowledge augmentation rather than knowledge localization. To overcome the expense of human labeling, StableKE integrates two automated knowledge augmentation strategies. StableKE surpasses other knowledge editing methods, demonstrating stability in both edited knowledge and multi-hop knowledge.
arXiv Detail & Related papers (2024-02-20T14:36:23Z)
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs. KnowTuning generates more facts with a lower factual error rate under fine-grained facts evaluation.
arXiv Detail & Related papers (2024-02-17T02:54:32Z)
Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs. We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge. We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z)
DeepEdit: Knowledge Editing as Decoding with Constraints [118.78008395850888]
How to edit the knowledge in multi-step reasoning has become the major challenge in the knowledge editing (KE) of large language models (LLMs). We propose a new KE framework: DEEPEDIT, which enhances LLMs' ability to generate coherent reasoning chains with new knowledge through depth-first search. In addition to DEEPEDIT, we propose two new KE benchmarks: MQUAKE-2002 and MQUAKE-HARD, which provide more precise and challenging assessments of KE approaches.
arXiv Detail & Related papers (2024-01-19T03:48:27Z)
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons [20.56154830853632]
This paper delves into the complex task of understanding how factual knowledge is stored in multilingual language models. We introduce the Architecture-adapted Multilingual Integrated Gradients method, which successfully localizes knowledge neurons more precisely. We also conduct an in-depth exploration of knowledge neurons, leading to the following two important discoveries.
arXiv Detail & Related papers (2023-08-25T06:26:05Z)
Decker: Double Check with Heterogeneous Knowledge for Commonsense Fact Verification [80.31112722910787]
We propose Decker, a commonsense fact verification model that is capable of bridging heterogeneous knowledge. Experimental results on two commonsense fact verification benchmark datasets, CSQA2.0 and CREAK, demonstrate the effectiveness of Decker.
arXiv Detail & Related papers (2023-05-10T06:28:16Z)
Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or propagate those facts). We find that existing methods for updating knowledge show little propagation of injected knowledge. Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
Structured Knowledge Grounding for Question Answering [0.23068481501673416]
We propose to leverage language and knowledge for knowledge-based question answering with flexibility, breadth of coverage, and structured reasoning. Specifically, we devise a knowledge construction method that retrieves the relevant context with a dynamic hop. We also devise a deep fusion mechanism to further bridge the information-exchange bottleneck between language and knowledge.
arXiv Detail & Related papers (2022-09-17T08:48:50Z)
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models [28.82149012250609]
We propose a benchmark named the Knowledge Memorization, Identification, and Reasoning test (KMIR). KMIR covers 3 types of knowledge, including general knowledge, domain-specific knowledge, and commonsense, and provides 184,348 well-designed questions. Preliminary experiments with various representative pre-training language models on KMIR reveal many interesting phenomena.
arXiv Detail & Related papers (2022-02-28T03:52:57Z)
Incremental Knowledge Based Question Answering [52.041815783025186]
We propose a new incremental KBQA learning framework that can progressively expand learning capacity as humans do. Specifically, it comprises a margin-distilled loss and a collaborative selection method to overcome the catastrophic forgetting problem. Comprehensive experiments demonstrate its effectiveness and efficiency when working with an evolving knowledge base.
arXiv Detail & Related papers (2021-01-18T09:03:38Z)
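
Illustrative note on the "Inside-Out: Hidden Factual Knowledge in LLMs" entry above: its summary defines knowledge for a given question as the fraction of correct-incorrect answer pairs in which the correct answer is ranked higher. The following is a minimal sketch of that ratio only, not the paper's implementation. The function name `knowledge_score`, the use of plain numeric scores as the ranking signal, and the choice to count ties as "not ranked higher" are all assumptions made for illustration.

```python
from itertools import product


def knowledge_score(correct_scores, incorrect_scores):
    """Fraction of (correct, incorrect) answer pairs where the correct
    answer receives the strictly higher score.

    Assumptions (not from the source): answers are represented only by
    numeric scores, and a tie does not count as a win for the correct answer.
    """
    # Every correct answer is paired with every incorrect answer.
    pairs = list(product(correct_scores, incorrect_scores))
    if not pairs:
        raise ValueError("need at least one correct and one incorrect answer")
    # Count the pairs in which the correct answer is ranked higher.
    wins = sum(1 for correct, incorrect in pairs if correct > incorrect)
    return wins / len(pairs)


# Hypothetical scores for one question: two correct and two incorrect answers.
example = knowledge_score([0.9, 0.4], [0.5, 0.1])
print(example)
```

With these hypothetical scores, three of the four pairs rank the correct answer higher, so the ratio is 0.75. Which scoring signal to plug in (for example, output probabilities or an internal probe) is left open here, since the summary above does not specify it.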

This list is automatically generated from the titles and abstracts of the papers on this site.