Related papers: Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods

Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods

URL: http://arxiv.org/abs/2404.18655v1
Date: Mon, 29 Apr 2024 12:38:26 GMT
Title: Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods
Authors: Haeun Yu, Pepa Atanasova, Isabelle Augenstein,
Abstract summary: Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. Instance Attribution (IA) and Neuron Attribution (NA) offer insights into this training-acquired knowledge. Our study introduces a novel evaluation framework to quantify and compare the knowledge revealed by IA and NA.
Score: 45.1662948487385
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing scalability of LMs, however, poses significant challenges for understanding a model's inner workings and further for updating or correcting this embedded knowledge without the significant cost of retraining. This underscores the importance of unveiling exactly what knowledge is stored and its association with specific model components. Instance Attribution (IA) and Neuron Attribution (NA) offer insights into this training-acquired knowledge, though they have not been compared systematically. Our study introduces a novel evaluation framework to quantify and compare the knowledge revealed by IA and NA. To align the results of the methods we introduce the attribution method NA-Instances to apply NA for retrieving influential training instances, and IA-Neurons to discover important neurons of influential instances discovered by IA. We further propose a comprehensive list of faithfulness tests to evaluate the comprehensiveness and sufficiency of the explanations provided by both methods. Through extensive experiments and analysis, we demonstrate that NA generally reveals more diverse and comprehensive information regarding the LM's parametric knowledge compared to IA. Nevertheless, IA provides unique and valuable insights into the LM's parametric knowledge, which are not revealed by NA. Our findings further suggest the potential of a synergistic approach of combining the diverse findings of IA and NA for a more holistic understanding of an LM's parametric knowledge.

Related papers

Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning. This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
Evaluating the External and Parametric Knowledge Fusion of Large Language Models [72.40026897037814]
We develop a systematic pipeline for data construction and knowledge infusion to simulate knowledge fusion scenarios. Our investigation reveals that enhancing parametric knowledge within LLMs can significantly bolster their capability for knowledge integration. Our findings aim to steer future explorations on harmonizing external and parametric knowledge within LLMs.
arXiv Detail & Related papers (2024-05-29T11:48:27Z)
C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations. Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
EpiK-Eval: Evaluation for Language Models as Epistemic Models [16.485951373967502]
We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. We argue that these shortcomings stem from the intrinsic nature of prevailing training objectives. The findings from this study offer insights for developing more robust and reliable LLMs.
arXiv Detail & Related papers (2023-10-23T21:15:54Z)
UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge. With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity [27.84415856657607]
We study how and why domain knowledge benefits the performance of informed learning. We propose a generalized informed training objective to better exploit the benefits of knowledge and balance the label and knowledge imperfectness.
arXiv Detail & Related papers (2022-07-02T06:28:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.