Editing Factual Knowledge in Language Models
- URL: http://arxiv.org/abs/2104.08164v1
- Date: Fri, 16 Apr 2021 15:24:42 GMT
- Title: Editing Factual Knowledge in Language Models
- Authors: Nicola De Cao, Wilker Aziz, Ivan Titov
- Abstract summary: We present KnowledgeEditor, a method that can be used to edit the factual knowledge stored in a language model.
Besides being computationally efficient, KnowledgeEditor does not require any modifications in LM pre-training.
We show KnowledgeEditor's efficacy with two popular architectures and knowledge-intensive tasks.
- Score: 51.947280241185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The factual knowledge acquired during pretraining and stored in the
parameters of Language Models (LM) can be useful in downstream tasks (e.g.,
question answering or textual inference). However, some facts can be
incorrectly induced or become obsolete over time. We present KnowledgeEditor, a
method that can be used to edit this knowledge and, thus, fix 'bugs' or
unexpected predictions without the need for expensive re-training or
fine-tuning. Besides being computationally efficient, KnowledgeEditor does not
require any modifications in LM pre-training (e.g., the use of meta-learning).
In our approach, we train a hyper-network with constrained optimization to
modify a fact without affecting the rest of the knowledge; the trained
hyper-network is then used to predict the weight update at test time. We show
KnowledgeEditor's efficacy with two popular architectures and
knowledge-intensive tasks: i) a BERT model fine-tuned for fact-checking, and
ii) a sequence-to-sequence BART model for question answering. With our method,
changing a prediction on the specific wording of a query tends to result in a
consistent change in predictions also for its paraphrases. We show that this
can be further encouraged by exploiting (e.g., automatically-generated)
paraphrases during training. Interestingly, our hyper-network can be regarded
as a 'probe' revealing which components of a model need to be changed to
manipulate factual knowledge; our analysis shows that the updates tend to be
concentrated on a small subset of components. Code at
https://github.com/nicola-decao/KnowledgeEditor
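The weight-update idea can be illustrated in miniature. The sketch below is not the paper's hyper-network (which is trained with constrained optimization on top of a real LM); it edits one "fact" of a toy linear model by applying a gated gradient step, so the update only touches the components the edit actually uses. All function names and the gating scheme are invented for illustration.

```python
def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def edit_gradient(W, x, y_target):
    """dL/dW for L = 0.5 * ||W x - y_target||^2 is the outer product err * x^T."""
    err = [p - t for p, t in zip(matvec(W, x), y_target)]
    return [[e * xj for xj in x] for e in err]

def apply_edit(W, x, y_target, gate, lr=0.5, steps=30):
    """Stand-in for the hyper-network: a per-row gate decides how strongly each
    row moves, keeping the update concentrated on a few components."""
    for _ in range(steps):
        g = edit_gradient(W, x, y_target)
        W = [[wij - lr * gate[i] * gij for wij, gij in zip(W[i], g[i])]
             for i in range(len(W))]
    return W

# Edit the toy model so that input [1, 0] maps to [0, 1] instead of [1, 0].
W = [[1.0, 0.0], [0.0, 1.0]]
W_edited = apply_edit(W, x=[1.0, 0.0], y_target=[0.0, 1.0], gate=[1.0, 1.0])
```

Because the gradient for this input only touches the first column of `W`, the edited model's behavior on an unrelated input such as `[0, 1]` is unchanged — a crude analogue of editing a fact "without affecting the rest of the knowledge".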
Related papers
- Detecting Edited Knowledge in Language Models [5.260519479124422]
Knowledge editing methods (KEs) can update language models' obsolete or inaccurate knowledge learned from pre-training.
Knowing whether a generated output is based on edited knowledge or first-hand knowledge from pre-training can increase users' trust in generative models.
We propose a novel task: detecting edited knowledge in language models.
arXiv Detail & Related papers (2024-05-04T22:02:24Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- On the Robustness of Editing Large Language Models [57.477943944826904]
Large language models (LLMs) have played a pivotal role in building communicative AI, yet they encounter the challenge of efficient updates.
This work seeks to understand the strengths and limitations of editing methods, facilitating practical applications of communicative AI.
arXiv Detail & Related papers (2024-02-08T17:06:45Z)
- Massive Editing for Large Language Models via Meta Learning [27.972194696587813]
Large language models (LLMs) acquire knowledge from their pre-training corpora, but that knowledge may be fundamentally incorrect or become outdated over time.
We propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem.
Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B).
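The least-squares formulation can be sketched in miniature: given per-edit key vectors and the scalar value change each edit demands, a single shared parameter shift is found by solving the regularized normal equations. The 2-D closed-form solve below is purely illustrative (toy dimensions, invented names) and is not MALMEN's actual implementation.

```python
def aggregate_shifts(keys, values, lam=1e-6):
    """Solve w = argmin_w sum_i (k_i . w - v_i)^2 + lam * ||w||^2 via the
    normal equations (K^T K + lam*I) w = K^T v, in closed form for 2-D keys."""
    a = sum(k[0] * k[0] for k in keys) + lam   # (K^T K)[0][0] + lam
    b = sum(k[0] * k[1] for k in keys)         # (K^T K)[0][1]
    d = sum(k[1] * k[1] for k in keys) + lam   # (K^T K)[1][1] + lam
    r0 = sum(k[0] * v for k, v in zip(keys, values))  # (K^T v)[0]
    r1 = sum(k[1] * v for k, v in zip(keys, values))  # (K^T v)[1]
    det = a * d - b * b
    return [(d * r0 - b * r1) / det, (a * r1 - b * r0) / det]

# Three edits whose demands are mutually consistent: the aggregated shift
# satisfies all of them at once.
w = aggregate_shifts(keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                     values=[1.0, 2.0, 3.0])
```

When the edits conflict, the same solve returns the shift that minimizes the total squared residual across them, which is the point of aggregating many edits into one update.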
arXiv Detail & Related papers (2023-11-08T13:03:06Z)
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with a differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains 3.95 F1 improvements across four domains on average without any in-domain pre-training.
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or to propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC)
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
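The memory-plus-scope-classifier design can be sketched as follows. This is a toy stand-in, not SERAC's components: the scope classifier is replaced by a token-overlap score and the counterfactual model simply returns the stored edit answer; class and function names are invented for illustration.

```python
def token_overlap(a, b):
    """Jaccard overlap between the token sets of two strings (toy scope score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

class EditedModel:
    """Wraps a frozen base model with an explicit memory of edits."""

    def __init__(self, base_model, threshold=0.5):
        self.base_model = base_model
        self.threshold = threshold
        self.memory = []  # list of (edit_query, edit_answer) pairs

    def add_edit(self, query, answer):
        self.memory.append((query, answer))

    def __call__(self, query):
        # Scope check: find the most similar stored edit; if it scores above
        # the threshold, answer from the edit instead of the base model.
        best = max(self.memory, key=lambda e: token_overlap(query, e[0]),
                   default=None)
        if best is not None and token_overlap(query, best[0]) >= self.threshold:
            return best[1]              # in scope: edited answer
        return self.base_model(query)   # out of scope: base model untouched

base_model = lambda q: "<base answer for: " + q + ">"
edited = EditedModel(base_model)
edited.add_edit("capital of France", "Lyon")
```

Because the base model's parameters are never modified, out-of-scope queries behave exactly as before the edit, which is the property the entry above describes.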
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
- K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining [5.178964604577459]
We focus on improving model pretraining by leveraging explicit knowledge.
To be specific, we first match knowledge facts from a knowledge graph (KG) and then add a knowledge injection layer directly to the transformer.
The experimental results show that simply adding external knowledge to the transformer improves learning performance on many NLP tasks.
arXiv Detail & Related papers (2021-03-25T06:14:18Z)
- Knowledge-Aware Language Model Pretraining [29.56904859722379]
We incorporate knowledge-awareness in language model pretraining without changing the transformer architecture.
We observe improved language modeling accuracy, factual correctness in LAMA knowledge probing tasks, and semantics in the hidden representations through edge probing.
Our knowledge-aware language model (KALM) can serve as a drop-in replacement for GPT-2 models.
arXiv Detail & Related papers (2020-06-29T06:09:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.