Can LMs Learn New Entities from Descriptions? Challenges in Propagating
Injected Knowledge
- URL: http://arxiv.org/abs/2305.01651v1
- Date: Tue, 2 May 2023 17:59:46 GMT
- Title: Can LMs Learn New Entities from Descriptions? Challenges in Propagating
Injected Knowledge
- Authors: Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett,
Eunsol Choi
- Abstract summary: We study LMs' abilities to make inferences based on injected facts (or propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
- Score: 72.63368052592004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (LMs) are used for knowledge intensive tasks like
question answering, but their knowledge gets continuously outdated as the world
changes. Prior work has studied targeted updates to LMs, injecting individual
facts and evaluating whether the model learns these facts while not changing
predictions on other contexts. We take a step forward and study LMs' abilities
to make inferences based on injected facts (or propagate those facts): for
example, after learning that something is a TV show, does an LM predict that
you can watch it? We study this with two cloze-style tasks: an existing dataset
of real-world sentences about novel entities (ECBD) as well as a new controlled
benchmark with manually designed templates requiring varying levels of
inference about injected knowledge. Surprisingly, we find that existing methods
for updating knowledge (gradient-based fine-tuning and modifications of this
approach) show little propagation of injected knowledge. These methods improve
performance on cloze instances only when there is lexical overlap between
injected facts and target inferences. Yet, prepending entity definitions in an
LM's context improves performance across all settings, suggesting that there is
substantial headroom for parameter-updating approaches for knowledge injection.
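
The contrast the abstract draws between parameter updates and in-context prepending can be made concrete with a small probe. The sketch below is not the paper's released code: it scores a cloze-style probe sentence about a hypothetical novel entity with and without the entity's definition prepended to the context, using an off-the-shelf causal LM from HuggingFace Transformers. The model name ("gpt2"), the invented entity "Riverhollow", the probe sentence, and the per-token loss metric are all illustrative assumptions.

```python
# Minimal sketch of the "prepend definition" baseline described in the abstract.
# The model, entity, definition, and probe below are placeholders, not the
# paper's actual data or evaluation code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute any causal LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Hypothetical novel entity, its definition, and a cloze-style probe sentence
# (mirroring the abstract's "it's a TV show, so you can watch it" example).
definition = "Riverhollow is a 2023 television drama series that airs weekly."
probe = "Last night I watched the newest episode of Riverhollow."

def probe_loss(context: str, target: str) -> float:
    """Average cross-entropy of the target tokens, optionally conditioned on a context prefix."""
    target_text = (" " + target) if context else target
    prefix_ids = tok(context, return_tensors="pt").input_ids if context else None
    target_ids = tok(target_text, return_tensors="pt").input_ids
    if prefix_ids is not None:
        input_ids = torch.cat([prefix_ids, target_ids], dim=1)
        n_prefix = prefix_ids.shape[1]
    else:
        input_ids, n_prefix = target_ids, 0
    labels = input_ids.clone()
    labels[:, :n_prefix] = -100  # only the probe tokens contribute to the loss
    with torch.no_grad():
        return model(input_ids=input_ids, labels=labels).loss.item()

print("probe loss, no definition   :", probe_loss("", probe))
print("probe loss, with definition :", probe_loss(definition, probe))
```

A lower loss with the definition prepended corresponds to the in-context gains the abstract reports; a parameter-updating method would instead fine-tune on the definition and then score the probe with an empty context, which is where the paper finds little propagation.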
Related papers
- Gradient Localization Improves Lifelong Pretraining of Language Models [32.29298047707914]
Large Language Models (LLMs) trained on web-scale text corpora have been shown to capture world knowledge in their parameters.
In this work, we examine two types of knowledge relating to temporally sensitive entities and demonstrate that each type is localized to different sets of parameters within the LLMs.
arXiv Detail & Related papers (2024-11-07T05:43:50Z)
- Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning [2.8972337324168014]
We study how PLMs may learn and remember new world knowledge facts that do not occur in their pre-training corpus.
We first propose Novel-WD, a new dataset consisting of sentences containing novel facts extracted from recent Wikidata updates.
We make this dataset freely available to the community, and release a procedure to later build new versions of similar datasets with up-to-date information.
arXiv Detail & Related papers (2024-08-30T07:54:50Z)
- Detecting Edited Knowledge in Language Models [5.260519479124422]
Knowledge editing methods (KEs) can update language models' obsolete or inaccurate knowledge learned from pre-training.
Knowing whether a generated output is based on edited knowledge or first-hand knowledge from pre-training can increase users' trust in generative models.
We propose a novel task: detecting edited knowledge in language models.
arXiv Detail & Related papers (2024-05-04T22:02:24Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Propagating Knowledge Updates to LMs Through Distillation [97.3628651636153]
We show that a context-based approach can both impart knowledge about entities and propagate that knowledge to enable broader inferences.
Our experiments demonstrate that this approach is more effective at propagating knowledge updates than fine-tuning and other gradient-based knowledge-editing methods.
arXiv Detail & Related papers (2023-06-15T17:39:50Z)
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory (a generic sketch of this idea appears after this list).
PlugLM obtains 3.95 F1 improvements across four domains on average without any in-domain pre-training.
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Enhancing Language Models with Plug-and-Play Large-Scale Commonsense [2.1248439796866228]
We study how to enhance language models (LMs) with textual commonsense knowledge.
We propose a plug-and-play method for large-scale commonsense integration without pre-training.
arXiv Detail & Related papers (2021-09-06T16:16:10Z)
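
The PlugLM entry above refers forward to the following sketch: a generic, editable key-value memory kept outside the LM's weights, intended only to illustrate the "decouple knowledge from parameters" idea. It is not PlugLM's released implementation; the slot count, dimensions, and top-k retrieval rule are assumptions.

```python
# Generic sketch of an editable key-value memory held outside the LM's
# parameters. Illustrative only; not PlugLM's actual architecture.
import torch
import torch.nn.functional as F

class KeyValueMemory(torch.nn.Module):
    def __init__(self, num_slots: int = 1024, key_dim: int = 256, value_dim: int = 256):
        super().__init__()
        # Memory slots live outside the host LM's weights, so they can be
        # edited or extended without retraining the model.
        self.keys = torch.nn.Parameter(torch.randn(num_slots, key_dim))
        self.values = torch.nn.Parameter(torch.randn(num_slots, value_dim))

    def forward(self, query: torch.Tensor, top_k: int = 4) -> torch.Tensor:
        # query: (batch, key_dim). Return a soft mixture of the top-k slots.
        scores = query @ self.keys.T                     # (batch, num_slots)
        top_scores, top_idx = scores.topk(top_k, dim=-1)
        weights = F.softmax(top_scores, dim=-1)          # (batch, top_k)
        picked = self.values[top_idx]                    # (batch, top_k, value_dim)
        return (weights.unsqueeze(-1) * picked).sum(dim=1)

    @torch.no_grad()
    def write(self, slot: int, key: torch.Tensor, value: torch.Tensor) -> None:
        # Editing knowledge = overwriting a slot; the LM's parameters are untouched.
        self.keys[slot] = key
        self.values[slot] = value
```

A host LM would form the query from one of its hidden states and fuse the returned value back into its representation; new or corrected facts are added by writing slots rather than by gradient updates to the model.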