Related papers: MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors

MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors

URL: http://arxiv.org/abs/2405.19086v2
Date: Sun, 2 Jun 2024 02:32:31 GMT
Title: MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Authors: Renzhi Wang, Piji Li,
Abstract summary: MEMoE is a model editing adapter utilizing a Mixture of Experts (MoE) architecture with a knowledge anchor routing strategy. We show the superiority of our approach over both batch editing and sequential batch editing tasks.
Score: 30.831866499812925
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Model editing aims to efficiently alter the behavior of Large Language Models (LLMs) within a desired scope, while ensuring no adverse impact on other inputs. Recent years have witnessed various model editing methods been proposed. However, these methods either exhibit poor overall performance or struggle to strike a balance between generalization and locality. We propose MEMoE, a model editing adapter utilizing a Mixture of Experts (MoE) architecture with a knowledge anchor routing strategy. MEMoE updates knowledge using a bypass MoE structure, keeping the original parameters unchanged to preserve the general ability of LLMs. And, the knowledge anchor routing ensures that inputs requiring similar knowledge are routed to the same expert, thereby enhancing the generalization of the updated knowledge. Experimental results show the superiority of our approach over both batch editing and sequential batch editing tasks, exhibiting exceptional overall performance alongside outstanding balance between generalization and locality. Our code will be available.

Related papers

Balancing Knowledge Updates: Toward Unified Modular Editing in LLMs [24.366567992263303]
We propose IntAttn-Edit to update both memory and Attn modules.<n>We show that IntAttn-Edit achieves higher edit success, better generalization, and stronger knowledge preservation than prior methods.
arXiv Detail & Related papers (2025-10-31T11:37:39Z)
SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models [96.81401797908835]
We introduce SAKE, the first benchmark specifically designed for editing auditory attribute knowledge in Large Audio-Language Models.<n>We benchmark seven editing methods on two LALMs along four dimensions: reliability, generality, audio/text locality, and portability.<n>Results highlight challenges such as preserving intra-attribute knowledge unrelated to the edit, generalizing edits to multimodal reasoning, and maintaining edits under sequential updates.
arXiv Detail & Related papers (2025-10-19T16:22:09Z)
Aligning Language Models with Real-time Knowledge Editing [11.503574001763246]
We introduce CRAFT, an ever-evolving real-world benchmark for knowledge editing.<n>It features well-designed paired edits for composite reasoning, and evaluates models on alias portability and temporal and common-sense locality.<n> Towards flexible real-time editing, we propose KEDAS, a novel paradigm of knowledge editing alignment featuring diverse edit augmentation and self-adaptive post-alignment inference.
arXiv Detail & Related papers (2025-08-02T10:25:36Z)
Model Merging for Knowledge Editing [53.799891745131724]
Large Language Models (LLMs) require continuous updates to maintain accurate and current knowledge as the world evolves.<n>Existing knowledge editing approaches offer various solutions for knowledge updating, but they often struggle with sequential editing scenarios.<n>This paper proposes a two-stage framework combining robust supervised fine-tuning (R-SFT) with model merging for knowledge editing.
arXiv Detail & Related papers (2025-06-14T07:42:39Z)
MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [82.34547399693966]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences.<n>We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory.<n>MeMOIR confines each edit to a distinct subset of the memory parameters, minimizing interference among edits.
arXiv Detail & Related papers (2025-06-09T16:16:42Z)
Image Editing As Programs with Diffusion Models [69.05164729625052]
We introduce Image Editing As Programs (IEAP), a unified image editing framework built upon the Diffusion Transformer (DiT) architecture.<n>IEAP approaches instructional editing through a reductionist lens, decomposing complex editing instructions into sequences of atomic operations.<n>Our framework delivers superior accuracy and semantic fidelity, particularly for complex, multi-step instructions.
arXiv Detail & Related papers (2025-06-04T16:57:24Z)
One for All: Update Parameterized Knowledge Across Multiple Models [35.137065486616805]
Large language models (LLMs) encode vast world knowledge but struggle to stay up-to-date, often leading to errors and hallucinations.<n> Knowledge editing offers an efficient alternative to retraining, enabling targeted modifications by updating specific model parameters.<n>We propose OnceEdit, a novel ensemble-based approach that employs a plug-in model as the editing module.
arXiv Detail & Related papers (2025-06-01T03:48:54Z)
Uncovering Overfitting in Large Language Model Editing [35.55260822503773]
We identify and investigate the phenomenon of Editing Overfit, where edited models assign disproportionately high probabilities to the edit target. We propose a new plug-and-play strategy called Learn to Inference (LTI), which introduce a Multi-stage Inference Constraint module to guide the edited models in recalling new knowledge.
arXiv Detail & Related papers (2024-10-10T11:09:00Z)
Better Call SAUL: Fluent and Consistent Language Model Editing with Generation Regularization [48.07144492109635]
Large language models need to be updated regularly. Model editing is challenging as it might also affect knowledge that is unrelated to the new data. We propose SAUL, a streamlined model editing method that uses sentence concatenation with augmented random facts for generation regularization.
arXiv Detail & Related papers (2024-10-03T12:28:13Z)
Enhance Lifelong Model Editing with Continuous Data-Adapter Association [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Current approaches manage sequential edits by freezing original parameters and allocating new adapters for each knowledge modification. We propose ELDER, textbfEnhancing textbfLifelong motextbfDel textbfEditing with mixtutextbfRe of Low-Rank Adapter (LoRA)
arXiv Detail & Related papers (2024-08-19T02:27:00Z)
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts [75.85448576746373]
We propose a method of grouping and pruning similar experts to improve the model's parameter efficiency. We validate the effectiveness of our method by pruning three state-of-the-art MoE architectures. The evaluation shows that our method outperforms other model pruning methods on a range of natural language tasks.
arXiv Detail & Related papers (2024-07-12T17:25:02Z)
LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models [30.831866499812925]
Large language models (LLMs) require continual knowledge updates to stay abreast of the ever-changing world facts. We introduce LEMoE, an advanced Mixture of Experts (MoE) adaptor for lifelong model editing.
arXiv Detail & Related papers (2024-06-28T16:17:41Z)
The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive. We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z)
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue [122.20016030723043]
We evaluate the side effects of model editing on large language models (LLMs) Our analysis reveals that the side effects are caused by model editing altering the original model weights excessively. To mitigate this, a method named RECT is proposed to regularize the edit update weights.
arXiv Detail & Related papers (2024-01-09T18:03:15Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning [68.94230363140771]
Mixture of Cluster-conditional LoRA Experts (MoCLE) MoCLE is a novel Mixture of Experts architecture designed to activate the task-customized model parameters based on the instruction clusters. Experiments on InstructBLIP and LLaVA demonstrate the effectiveness of MoCLE.
arXiv Detail & Related papers (2023-12-19T18:11:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.