ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
- URL: http://arxiv.org/abs/2408.11869v3
- Date: Tue, 14 Jan 2025 04:25:23 GMT
- Title: ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
- Authors: Jiaang Li, Quan Wang, Zhongnan Wang, Yongdong Zhang, Zhendong Mao
- Abstract summary: Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors.
Previous approaches manage sequential edits by freezing original parameters and discretely allocating new parameters for each knowledge update.
We propose ELDER, a novel approach to create a continuous association between data and adapters.
- Score: 55.697627106315004
- Abstract: Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Most model editing methods are designed solely for single-time use and cause significant forgetting in lifelong editing scenarios, where sequential edits are conducted over time. Previous approaches manage sequential edits by freezing original parameters and discretely allocating new parameters for each knowledge update. However, these methods lack robustness to minor input variations due to the discrete mapping between data and parameters. To overcome this challenge, we propose ELDER, a novel approach that creates a continuous association between data and adapters. ELDER integrates multiple LoRAs through a router network and is trained to establish a smooth data-adapter association, thereby enhancing edit robustness and generalization to semantically equivalent inputs. To ensure that inputs containing the same knowledge are processed by the same LoRAs, we design a novel loss that guides the model to link LoRA allocations with edit knowledge. Furthermore, we propose a deferral mechanism to retain the original LLM capabilities post-edit. Extensive experiments on GPT-2 XL and LLaMA2-7B demonstrate that ELDER effectively edits models in the lifelong setting, outperforming eight baselines while exhibiting strong scalability and preserving LLMs' general abilities on downstream tasks. Our code is available at https://github.com/JiaangL/ELDER.
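To make the described mechanism concrete, below is a minimal, hypothetical PyTorch sketch of a mixture-of-LoRA layer with a router network and a deferral threshold. It is not the authors' implementation (the official code is at https://github.com/JiaangL/ELDER); the module names, dimensions, gating rule, and threshold are illustrative assumptions only.

```python
# Minimal sketch (not the authors' code) of the general mixture-of-LoRA idea from the
# abstract: a router produces soft weights over several LoRA adapters attached to one
# linear layer, and a deferral threshold falls back to the frozen base layer when no
# adapter is confident, preserving the original model's behaviour on unrelated inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfLoRALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, num_experts: int = 4,
                 rank: int = 8, defer_threshold: float = 0.5):
        super().__init__()
        self.base = base_linear                          # frozen pre-trained weights
        for p in self.base.parameters():
            p.requires_grad_(False)

        d_in, d_out = base_linear.in_features, base_linear.out_features
        # One low-rank (A, B) pair per expert; B starts at zero so edits begin as no-ops.
        self.lora_A = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_experts, rank, d_out))
        # Router: maps each token representation to soft weights over the experts.
        self.router = nn.Linear(d_in, num_experts)
        self.defer_threshold = defer_threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_in)
        base_out = self.base(x)

        gate = F.softmax(self.router(x), dim=-1)         # (batch, seq, num_experts)
        # Per-expert LoRA update x @ A_e @ B_e, then mix with the soft gate weights.
        lora_out = torch.einsum("bsd,edr,ero->bseo", x, self.lora_A, self.lora_B)
        mixed = torch.einsum("bse,bseo->bso", gate, lora_out)

        # Deferral: if the router's top weight for a token is below the threshold,
        # skip the adapters entirely and keep the frozen base layer's output.
        confident = (gate.max(dim=-1).values > self.defer_threshold).unsqueeze(-1)
        return base_out + confident.float() * mixed
```

During training, a guided-allocation objective like the one described in the abstract could additionally encourage inputs carrying the same edited fact to produce similar router distributions (for example, by penalizing divergence between their gate vectors), so that semantically equivalent queries are handled by the same LoRA experts; the exact formulation is given in the paper.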
Related papers
- Reinforced Lifelong Editing for Language Models [12.101856766731574]
Large language models (LLMs) acquire information from pre-training corpora, but their stored knowledge can become inaccurate or outdated over time.
Model editing addresses this challenge by modifying model parameters without retraining, and prevalent approaches leverage hypernetworks to generate these parameter updates.
We propose RLEdit, an RL-based editing method that captures changes at the full knowledge sequence level and generates appropriate parameter updates.
arXiv Detail & Related papers (2025-02-09T03:37:06Z) - Neuron-Level Sequential Editing for Large Language Models [19.324852774144752]
We introduce Neuron-level Sequential Editing (NSE) to support sequential model editing.
Specifically, we optimize the target layer's hidden states using the model's original weights to prevent model failure.
Our experiments demonstrate that NSE significantly outperforms current parameter-modifying model editing methods.
arXiv Detail & Related papers (2024-10-05T05:52:22Z) - Better Call SAUL: Fluent and Consistent Language Model Editing with Generation Regularization [48.07144492109635]
Large language models need to be updated regularly.
Model editing is challenging as it might also affect knowledge that is unrelated to the new data.
We propose SAUL, a streamlined model editing method that uses sentence concatenation with augmented random facts for generation regularization.
arXiv Detail & Related papers (2024-10-03T12:28:13Z) - DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models [32.598670876662375]
A Dynamic Auxiliary Fusion Network (DAFNet) is designed to enhance the semantic interaction among the factual knowledge within the entire sequence.
DAFNet significantly outperforms strong baselines in single-turn and sequential editing.
arXiv Detail & Related papers (2024-05-31T02:56:49Z) - Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z) - Learning to Edit: Aligning LLMs with Knowledge Editing [101.96620267293731]
We propose a Learning to Edit (LTE) framework, which focuses on teaching large language models to apply updated knowledge to input questions.
LTE features a two-phase process, the first being an Alignment Phase that fine-tunes LLMs on a meticulously curated parallel dataset to make reliable, in-scope edits.
We demonstrate LTE's superiority in knowledge editing performance, robustness in both batch and sequential editing, minimal interference on general tasks, and rapid editing speeds.
arXiv Detail & Related papers (2024-02-19T07:45:17Z) - The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
Benchmarking large language models after each edit, however, is impractically time-consuming and resource-intensive.
We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z) - SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Model editing has recently emerged as a promising technique for efficiently updating a small amount of knowledge in large language models.
We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching.
We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
arXiv Detail & Related papers (2024-01-31T13:08:45Z) - MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA [34.21194537887934]
We propose MELO, a plug-in model editing method based on neuron-indexed dynamic LoRA.
Our proposed MELO achieves state-of-the-art editing performance on three sequential editing tasks.
arXiv Detail & Related papers (2023-12-19T02:11:01Z) - Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC).
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed (a minimal sketch of this memory-based editing pattern follows this list).
arXiv Detail & Related papers (2022-06-13T23:40:34Z)