Editing Factual Knowledge in Language Models
- URL: http://arxiv.org/abs/2104.08164v1
- Date: Fri, 16 Apr 2021 15:24:42 GMT
- Title: Editing Factual Knowledge in Language Models
- Authors: Nicola De Cao, Wilker Aziz, Ivan Titov
- Abstract summary: We present KnowledgeEditor, a method that can be used to edit the factual knowledge stored in a language model.
Besides being computationally efficient, KnowledgeEditor does not require any modifications in LM pre-training.
We show KnowledgeEditor's efficacy with two popular architectures and knowledge-intensive tasks.
- Score: 51.947280241185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The factual knowledge acquired during pretraining and stored in the
parameters of Language Models (LM) can be useful in downstream tasks (e.g.,
question answering or textual inference). However, some facts can be
incorrectly induced or become obsolete over time. We present KnowledgeEditor, a
method that can be used to edit this knowledge and, thus, fix 'bugs' or
unexpected predictions without the need for expensive re-training or
fine-tuning. Besides being computationally efficient, KnowledgeEditor does not
require any modifications in LM pre-training (e.g., the use of meta-learning).
In our approach, we train a hyper-network with constrained optimization to
modify a fact without affecting the rest of the knowledge; the trained
hyper-network is then used to predict the weight update at test time. We show
KnowledgeEditor's efficacy with two popular architectures and
knowledge-intensive tasks: i) a BERT model fine-tuned for fact-checking, and
ii) a sequence-to-sequence BART model for question answering. With our method,
changing a prediction on the specific wording of a query tends to result in a
consistent change in predictions also for its paraphrases. We show that this
can be further encouraged by exploiting (e.g., automatically-generated)
paraphrases during training. Interestingly, our hyper-network can be regarded
as a 'probe' revealing which components of a model need to be changed to
manipulate factual knowledge; our analysis shows that the updates tend to be
concentrated on a small subset of components. Code at
https://github.com/nicola-decao/KnowledgeEditor
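The weight-update idea can be illustrated in miniature. The sketch below is not the paper's hyper-network (which is trained with constrained optimization on top of a real LM); it edits one "fact" of a toy linear model by applying a gated gradient step, so the update only touches the components the edit actually uses. All function names and the gating scheme are invented for illustration.

```python
def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def edit_gradient(W, x, y_target):
    """dL/dW for L = 0.5 * ||W x - y_target||^2 is the outer product err * x^T."""
    err = [p - t for p, t in zip(matvec(W, x), y_target)]
    return [[e * xj for xj in x] for e in err]

def apply_edit(W, x, y_target, gate, lr=0.5, steps=30):
    """Stand-in for the hyper-network: a per-row gate decides how strongly each
    row moves, keeping the update concentrated on a few components."""
    for _ in range(steps):
        g = edit_gradient(W, x, y_target)
        W = [[wij - lr * gate[i] * gij for wij, gij in zip(W[i], g[i])]
             for i in range(len(W))]
    return W

# Edit the toy model so that input [1, 0] maps to [0, 1] instead of [1, 0].
W = [[1.0, 0.0], [0.0, 1.0]]
W_edited = apply_edit(W, x=[1.0, 0.0], y_target=[0.0, 1.0], gate=[1.0, 1.0])
```

Because the gradient for this input only touches the first column of `W`, the edited model's behavior on an unrelated input such as `[0, 1]` is unchanged — a crude analogue of editing a fact "without affecting the rest of the knowledge".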
Related papers
- Detecting Edited Knowledge in Language Models [5.260519479124422]
Knowledge editing methods (KEs) can update language models' obsolete or inaccurate knowledge learned from pre-training.
Knowing whether a generated output is based on edited knowledge or first-hand knowledge from pre-training can increase users' trust in generative models.
We propose a novel task: detecting edited knowledge in language models.
arXiv Detail & Related papers (2024-05-04T22:02:24Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- On the Robustness of Editing Large Language Models [57.477943944826904]
Large language models (LLMs) have played a pivotal role in building communicative AI, yet they encounter the challenge of efficient updates.
This work seeks to understand the strengths and limitations of editing methods, facilitating practical applications of communicative AI.
arXiv Detail & Related papers (2024-02-08T17:06:45Z)
- Massive Editing for Large Language Models via Meta Learning [27.972194696587813]
Large language models (LLMs) acquire knowledge from their pre-training corpora, but that knowledge may be fundamentally incorrect or become outdated over time.
We propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem.
Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B).
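The least-squares formulation can be sketched in miniature: given per-edit key vectors and the scalar value change each edit demands, a single shared parameter shift is found by solving the regularized normal equations. The 2-D closed-form solve below is purely illustrative (toy dimensions, invented names) and is not MALMEN's actual implementation.

```python
def aggregate_shifts(keys, values, lam=1e-6):
    """Solve w = argmin_w sum_i (k_i . w - v_i)^2 + lam * ||w||^2 via the
    normal equations (K^T K + lam*I) w = K^T v, in closed form for 2-D keys."""
    a = sum(k[0] * k[0] for k in keys) + lam   # (K^T K)[0][0] + lam
    b = sum(k[0] * k[1] for k in keys)         # (K^T K)[0][1]
    d = sum(k[1] * k[1] for k in keys) + lam   # (K^T K)[1][1] + lam
    r0 = sum(k[0] * v for k, v in zip(keys, values))  # (K^T v)[0]
    r1 = sum(k[1] * v for k, v in zip(keys, values))  # (K^T v)[1]
    det = a * d - b * b
    return [(d * r0 - b * r1) / det, (a * r1 - b * r0) / det]

# Three edits whose demands are mutually consistent: the aggregated shift
# satisfies all of them at once.
w = aggregate_shifts(keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                     values=[1.0, 2.0, 3.0])
```

When the edits conflict, the same solve returns the shift that minimizes the total squared residual across them, which is the point of aggregating many edits into one update.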
arXiv Detail & Related papers (2023-11-08T13:03:06Z)
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with a differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains 3.95 F1 improvements across four domains on average without any in-domain pre-training.
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or to propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC)
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
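The memory-plus-scope-classifier design can be sketched as follows. This is a toy stand-in, not SERAC's components: the scope classifier is replaced by a token-overlap score and the counterfactual model simply returns the stored edit answer; class and function names are invented for illustration.

```python
def token_overlap(a, b):
    """Jaccard overlap between the token sets of two strings (toy scope score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

class EditedModel:
    """Wraps a frozen base model with an explicit memory of edits."""

    def __init__(self, base_model, threshold=0.5):
        self.base_model = base_model
        self.threshold = threshold
        self.memory = []  # list of (edit_query, edit_answer) pairs

    def add_edit(self, query, answer):
        self.memory.append((query, answer))

    def __call__(self, query):
        # Scope check: find the most similar stored edit; if it scores above
        # the threshold, answer from the edit instead of the base model.
        best = max(self.memory, key=lambda e: token_overlap(query, e[0]),
                   default=None)
        if best is not None and token_overlap(query, best[0]) >= self.threshold:
            return best[1]              # in scope: edited answer
        return self.base_model(query)   # out of scope: base model untouched

base_model = lambda q: "<base answer for: " + q + ">"
edited = EditedModel(base_model)
edited.add_edit("capital of France", "Lyon")
```

Because the base model's parameters are never modified, out-of-scope queries behave exactly as before the edit, which is the property the entry above describes.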
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
- K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining [5.178964604577459]
We focus on improving model pretraining by leveraging explicit knowledge.
To be specific, we first match knowledge facts from a knowledge graph (KG) and then add a knowledge injection layer directly to the transformer.
The experimental results show that simply adding external knowledge to the transformer improves learning performance on many NLP tasks.
arXiv Detail & Related papers (2021-03-25T06:14:18Z)
- Knowledge-Aware Language Model Pretraining [29.56904859722379]
We incorporate knowledge-awareness in language model pretraining without changing the transformer architecture.
We observe improved language modeling accuracy, factual correctness in LAMA knowledge probing tasks, and semantics in the hidden representations through edge probing.
Our knowledge-aware language model (KALM) can serve as a drop-in replacement for GPT-2 models.
arXiv Detail & Related papers (2020-06-29T06:09:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.