Massive Editing for Large Language Models via Meta Learning
- URL: http://arxiv.org/abs/2311.04661v3
- Date: Thu, 25 Jan 2024 03:50:57 GMT
- Title: Massive Editing for Large Language Models via Meta Learning
- Authors: Chenmien Tan and Ge Zhang and Jie Fu
- Abstract summary: Large language models (LLMs) have enabled learning knowledge from the pre-training corpora, but the acquired knowledge may be fundamentally incorrect or outdated over time.
We propose the MAssive Language Model Editing Network (MALMEN), which formulates parameter-shift aggregation as a least-squares problem.
Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B).
- Score: 27.972194696587813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While large language models (LLMs) have enabled learning knowledge from the pre-training corpora, the acquired knowledge may be fundamentally incorrect or become outdated over time, which necessitates rectifying the knowledge of the language model (LM) after training. A promising approach employs a hyper-network to generate parameter shifts, but existing hyper-networks scale poorly in the number of edits applied simultaneously. To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates parameter-shift aggregation as a least-squares problem and subsequently updates the LM parameters using the normal equation. To accommodate editing multiple facts simultaneously under limited memory budgets, we separate the computation on the hyper-network and the LM, enabling arbitrary batch sizes on both neural networks. Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B), across various knowledge-intensive NLP tasks, i.e., closed-book fact-checking and question answering. Remarkably, MALMEN can edit hundreds of times more facts than strong baselines with the identical hyper-network architecture and outperforms an editor specifically designed for GPT. Our code is available at https://github.com/ChenmienTan/malmen.
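A minimal sketch of the normal-equation aggregation described above, assuming a ridge-regularized least-squares formulation over per-edit key vectors u_i and value shifts d_i for a single linear layer. The notation, the regularizer lam, and the function name aggregate_shifts are illustrative assumptions rather than the paper's exact formulation; see the repository above for the authors' implementation.

```python
import torch

def aggregate_shifts(keys: torch.Tensor, value_shifts: torch.Tensor,
                     lam: float = 1e-4) -> torch.Tensor:
    """Aggregate per-edit updates into one weight shift S by solving
        min_S  sum_i ||S u_i - d_i||^2 + lam * ||S||_F^2,
    whose normal-equation solution is S = D U^T (U U^T + lam I)^{-1}.

    keys:         (n_edits, d_in)  per-edit key vectors u_i (assumed inputs)
    value_shifts: (n_edits, d_out) per-edit value shifts d_i (assumed inputs)
    returns:      (d_out, d_in)    aggregated weight shift S
    """
    U = keys.T           # (d_in, n_edits)
    D = value_shifts.T   # (d_out, n_edits)
    A = U @ U.T + lam * torch.eye(U.shape[0], dtype=U.dtype, device=U.device)
    # Solve A X = U D^T and transpose; since A is symmetric, X^T = D U^T A^{-1}.
    return torch.linalg.solve(A, U @ D.T).T
```

The layer weight would then be updated as W + S. Because each (u_i, d_i) pair can be computed and cached before this aggregation, the hyper-network and the edited LM need not share a single forward pass, which is one reading of the abstract's claim that both networks can run with arbitrary batch sizes.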
Related papers
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs for downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Enhance Lifelong Model Editing with Continuous Data-Adapter Association [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors.
Current approaches manage sequential edits by freezing original parameters and allocating new adapters for each knowledge modification.
We propose ELDER, Enhancing Lifelong moDel Editing with a mixtuRe of Low-Rank Adapters (LoRA).
arXiv Detail & Related papers (2024-08-19T02:27:00Z) - DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models [32.598670876662375]
A Dynamic Auxiliary Fusion Network (DAFNet) is designed to enhance the semantic interaction among the factual knowledge within the entire sequence.
DAFNet significantly outperforms strong baselines in single-turn and sequential editing.
arXiv Detail & Related papers (2024-05-31T02:56:49Z) - Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z) - Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models.
We show how MAC can be combined with and improve the performance of popular alternatives such as retrieval-augmented generation.
arXiv Detail & Related papers (2024-03-07T08:34:57Z) - SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Model editing has recently emerged as a promising technique for efficiently updating a small amount of knowledge in large language models (LLMs).
We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching.
We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
arXiv Detail & Related papers (2024-01-31T13:08:45Z) - ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained
Language Models for Question Answering over Knowledge Graph [142.42275983201978]
We propose a subgraph-aware self-attention mechanism to imitate the GNN for performing structured reasoning.
We also adopt an adaptation tuning strategy to adapt the model parameters with 20,000 subgraphs with synthesized questions.
Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data.
arXiv Detail & Related papers (2023-12-30T07:18:54Z) - G-SPEED: General SParse Efficient Editing MoDel [25.48360227520061]
General SParse Efficient Editing MoDel (G-SPEED).
arXiv Detail & Related papers (2023-10-16T15:01:18Z) - Editing Factual Knowledge in Language Models [51.947280241185]
We present KnowledgeEditor, a method that can be used to edit this knowledge.
Besides being computationally efficient, KnowledgeEditor does not require any modifications in LM pre-training.
We show KnowledgeEditor's efficacy with two popular architectures and knowledge-intensive tasks.
arXiv Detail & Related papers (2021-04-16T15:24:42Z)