Related papers: MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA

URL: http://arxiv.org/abs/2312.11795v1
Date: Tue, 19 Dec 2023 02:11:01 GMT
Title: MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA
Authors: Lang Yu, Qin Chen, Jie Zhou, Liang He
Abstract summary: We propose a plug-in Model Editing method based on neuron-indexed dynamic LoRA (MELO). Our proposed MELO achieves state-of-the-art editing performance on three sequential editing tasks.
Score: 34.21194537887934
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have shown great success in various Natural Language Processing (NLP) tasks, while they still need updates after deployment to fix errors or keep pace with the changing knowledge in the world. Researchers formulate this problem as Model Editing and have developed various editors focusing on different axes of editing properties. However, current editors can hardly support all properties and rely on heavy computational resources. In this paper, we propose a plug-in Model Editing method based on neuron-indexed dynamic LoRA (MELO), which alters the behavior of language models by dynamically activating certain LoRA blocks according to the index built in an inner vector database. Our method satisfies various editing properties with high efficiency and can be easily integrated into multiple LLM backbones. Experimental results show that our proposed MELO achieves state-of-the-art editing performance on three sequential editing tasks (document classification, question answering and hallucination correction), while requiring the fewest trainable parameters and the lowest computational cost.
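The mechanism described in the abstract (look up an index, then activate the matching LoRA block) can be illustrated with a toy sketch. This is not the authors' implementation: the names `LoRABlock`, `ToyEditIndex`, and `edited_forward`, the Euclidean nearest-key lookup, and the fixed `radius` are all assumptions made for illustration, and the abstract does not say how the index is actually built or queried.

```python
import math
from dataclasses import dataclass, field


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


@dataclass
class LoRABlock:
    """One low-rank update B @ A, with A of shape (r, d_in) and B of shape (d_out, r)."""
    A: list
    B: list

    def delta(self, x):
        # Apply the low-rank update to x without forming B @ A explicitly.
        return matvec(self.B, matvec(self.A, x))


@dataclass
class ToyEditIndex:
    """Hypothetical stand-in for the inner vector database: key vector -> LoRA block id."""
    radius: float
    keys: list = field(default_factory=list)
    block_ids: list = field(default_factory=list)

    def add(self, key, block_id):
        self.keys.append(key)
        self.block_ids.append(block_id)

    def lookup(self, query):
        """Return the block id of the nearest key within `radius`, else None."""
        best_id, best_dist = None, self.radius
        for key, block_id in zip(self.keys, self.block_ids):
            dist = math.dist(key, query)
            if dist <= best_dist:
                best_id, best_dist = block_id, dist
        return best_id


def edited_forward(W, blocks, index, query, x):
    """Base output W @ x, plus the delta of the LoRA block selected by the index, if any."""
    y = matvec(W, x)
    block_id = index.lookup(query)
    if block_id is None:
        return y  # out-of-scope input: the frozen base layer is used unchanged
    return [yi + di for yi, di in zip(y, blocks[block_id].delta(x))]


# Usage: one rank-1 edit block, indexed under the key [1.0, 0.0].
W = [[1.0, 0.0], [0.0, 1.0]]
blocks = {0: LoRABlock(A=[[1.0, 1.0]], B=[[0.5], [0.0]])}
index = ToyEditIndex(radius=0.5)
index.add([1.0, 0.0], 0)

x = [2.0, 3.0]
near = edited_forward(W, blocks, index, [0.9, 0.1], x)   # [4.5, 3.0]: block 0 is active
far = edited_forward(W, blocks, index, [-1.0, 0.0], x)   # [2.0, 3.0]: base output only
```

In this sketch the base weights are never modified, so inputs that fall outside every indexed key's radius get the original model's output, which is one way to read the "plug-in" property the abstract claims.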

Related papers

Joint Localization and Activation Editing for Low-Resource Fine-Tuning [73.64004083269424]
We propose a joint localization and activation editing (JoLA) method. JoLA learns (1) which heads in the Transformer to edit, (2) whether the intervention should be additive, multiplicative, or both, and (3) the intervention parameters themselves. Through evaluations on three benchmarks spanning commonsense reasoning, natural language understanding, and natural language generation, we demonstrate that JoLA consistently outperforms existing methods.
arXiv Detail & Related papers (2025-02-03T09:13:09Z)
Neuron-Level Sequential Editing for Large Language Models [19.324852774144752]
We introduce Neuron-level Sequential Editing (NSE) for supporting sequential model editing. Specifically, we optimize the target layer's hidden states using the model's original weights to prevent model failure. Our experiments demonstrate that NSE significantly outperforms current parameter-modifying model editing methods.
arXiv Detail & Related papers (2024-10-05T05:52:22Z)
Enhance Lifelong Model Editing with Continuous Data-Adapter Association [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Current approaches manage sequential edits by freezing original parameters and allocating new adapters for each knowledge modification. We propose ELDER, Enhancing Lifelong moDel Editing with mixtuRe of Low-Rank Adapter (LoRA).
arXiv Detail & Related papers (2024-08-19T02:27:00Z)
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation [18.087144677674786]
We focus on multilingual knowledge editing (MKE), which requires propagating updates across multiple languages. We introduce the Multilingual Knowledge Editing Benchmark (MKEB), a novel dataset comprising 12 languages. We also propose a method that enhances knowledge Editing with neuron-Masked Low-Rank Adaptation (MEMLA).
arXiv Detail & Related papers (2024-06-17T14:03:50Z)
The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. Benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive. We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z)
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue [122.20016030723043]
We evaluate the side effects of model editing on large language models (LLMs). Our analysis reveals that the side effects are caused by model editing altering the original model weights excessively. To mitigate this, a method named RECT is proposed to regularize the edit update weights.
arXiv Detail & Related papers (2024-01-09T18:03:15Z)
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models [91.22477798288003]
This paper introduces SmartEdit, a novel approach to instruction-based image editing. It exploits Multimodal Large Language Models (MLLMs) to enhance their understanding and reasoning capabilities. We show that a small amount of complex instruction editing data can effectively stimulate SmartEdit's editing capabilities for more complex instructions.
arXiv Detail & Related papers (2023-12-11T17:54:11Z)
Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope. We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC). SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
