Measuring and Modifying Factual Knowledge in Large Language Models
- URL: http://arxiv.org/abs/2306.06264v1
- Date: Fri, 9 Jun 2023 21:25:48 GMT
- Title: Measuring and Modifying Factual Knowledge in Large Language Models
- Authors: Pouya Pezeshkpour
- Abstract summary: Large Language Models store an extensive amount of factual knowledge obtained from vast collections of text.
We employ information theory-based measurements to provide a framework estimating the factual knowledge contained within large language models.
- Score: 2.8427946758947304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) store an extensive amount of factual knowledge
obtained from vast collections of text. To effectively utilize these models for
downstream tasks, it is crucial to have reliable methods for measuring their
knowledge. However, existing approaches for knowledge measurement have certain
limitations, and despite recent efforts, they fail to provide accurate
measurements and the necessary insights for modifying the knowledge within
LLMs. In this work, we employ information theory-based measurements to provide
a framework estimating the factual knowledge contained within large language
models. More specifically, we measure knowledge by analyzing the LLM's
prediction probability distribution before and after instilling the target
knowledge, employing metrics such as entropy and KL-divergence. Introducing our
metrics, we first assess their accuracy in comparison to previous ranking-based
methods, surpassing them by over $35\%$ in a synthetic experiment. Then, we
explore two prominent methods of knowledge instillation, discovering that LLMs
exhibit limitations in capturing new knowledge under specific circumstances for
one of these methods. Lastly, we demonstrate the applicability of our methods
in extracting unlearned and mislearned facts in LLMs through their application
to in-context learning. We make code and data for all methods and experiments
in this paper publicly available.
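As a rough illustration of the two metrics the abstract names, the sketch below computes the entropy of a prediction distribution and the KL-divergence between a distribution taken before and one taken after knowledge instillation. It is a minimal sketch, not the paper's implementation: the toy four-word vocabulary, the probability values, the function names, and the direction of the KL term are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against zero entries in q."""
    return sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0)


# Hypothetical next-token distributions over a toy 4-word vocabulary.
# "before": the model is maximally uncertain about the answer.
# "after": the same prompt once the target fact has been instilled.
before = [0.25, 0.25, 0.25, 0.25]
after = [0.85, 0.05, 0.05, 0.05]

print("entropy before:", entropy(before))
print("entropy after: ", entropy(after))
print("KL(after || before):", kl_divergence(after, before))
```

In this toy case the entropy drops after instillation and the KL term is positive, which is the kind of shift such measurements are meant to capture. With real models the distributions would come from the model's output probabilities rather than hand-written lists.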
Related papers
- To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models [39.39428450239399]
Large Language Models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material.
Recent advancements in knowledge unlearning involve updating LLM parameters to erase specific knowledge.
We introduce KnowUnDo to evaluate if the unlearning process inadvertently erases essential knowledge.
arXiv Detail & Related papers (2024-07-02T03:34:16Z)
- Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction [15.534647327246239]
We propose an approach for estimating the latent knowledge embedded inside large language models (LLMs).
We leverage the in-context learning abilities of LLMs to estimate the extent to which an LLM knows the facts stored in a knowledge base.
arXiv Detail & Related papers (2024-04-19T15:40:39Z)
- KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs.
KnowTuning generates more facts with a lower factual error rate under fine-grained fact evaluation.
arXiv Detail & Related papers (2024-02-17T02:54:32Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Enabling Large Language Models to Learn from Rules [99.16680531261987]
We are inspired by the observation that humans can also learn new tasks or knowledge by learning from rules.
We propose rule distillation, which first uses the strong in-context abilities of LLMs to extract the knowledge from the textual rules.
Our experiments show that having LLMs learn from rules with our method is much more efficient than example-based learning in both sample size and generalization ability.
arXiv Detail & Related papers (2023-11-15T11:42:41Z)
- Give Me the Facts! A Survey on Factual Knowledge Probing in Pre-trained Language Models [2.3981254787726067]
Pre-trained Language Models (PLMs) are trained on vast unlabeled data, rich in world knowledge.
This has sparked the interest of the community in quantifying the amount of factual knowledge present in PLMs.
In this work, we survey methods and datasets that are used to probe PLMs for factual knowledge.
arXiv Detail & Related papers (2023-10-25T11:57:13Z)
- Knowledge Editing for Large Language Models: A Survey [51.01368551235289]
One major drawback of large language models (LLMs) is their substantial computational cost for pre-training.
Knowledge-based Model Editing (KME), which aims to precisely modify LLMs to incorporate specific knowledge, has attracted increasing attention.
arXiv Detail & Related papers (2023-10-24T22:18:13Z)
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs [54.22416829200613]
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that current methods for knowledge editing using raw documents do not yield satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
- Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models [26.342351417963965]
Pre-trained language models (PLMs) have shown evidence of acquiring vast amounts of knowledge.
It remains unclear how much of this parametric knowledge is actually usable in performing downstream tasks.
We propose a systematic framework to measure parametric knowledge utilization in PLMs.
arXiv Detail & Related papers (2023-05-24T06:26:11Z)
- Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)