MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs
- URL: http://arxiv.org/abs/2602.10965v1
- Date: Wed, 11 Feb 2026 15:56:30 GMT
- Title: MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs
- Authors: Yupu Gu, Rongzhe Wei, Andy Zhu, Pan Li
- Abstract summary: MoEEdit is a routing-stable framework for parameter-modifying knowledge editing in Mixture-of-Experts (MoE) large language models. We show that MoEEdit attains state-of-the-art efficacy and generalization while preserving high specificity and routing stability.
- Score: 8.074300009866548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge editing (KE) enables precise modifications to factual content in large language models (LLMs). Existing KE methods are largely designed for dense architectures, limiting their applicability to the increasingly prevalent sparse Mixture-of-Experts (MoE) models that underpin modern scalable LLMs. Although MoEs offer strong efficiency and capacity scaling, naively adapting dense-model editors is both computationally costly and prone to routing distribution shifts that undermine stability and consistency. To address these challenges, we introduce MoEEdit, the first routing-stable framework for parameter-modifying knowledge editing in MoE LLMs. Our method reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts. The resulting block-structured optimization is solved efficiently with a block coordinate descent (BCD) solver. Experiments show that MoEEdit attains state-of-the-art efficacy and generalization while preserving high specificity and routing stability, with superior compute and memory efficiency. These results establish a robust foundation for scalable, precise knowledge editing in sparse LLMs and underscore the importance of routing-stable interventions.
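The abstract's core mechanism, projecting each expert's weight update into a null space so that a designated set of preserved activations (and hence the router's inputs and routing decisions) is left unchanged, can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: the function names, tensor shapes, the SVD-based projector construction, and the choice of what counts as "preserved" activations are all assumptions made for exposition.

```python
# Illustrative sketch (not MoEEdit's code) of a per-expert null-space projection:
# a candidate weight update for one expert is projected so it produces zero change
# on a set of preserved input activations, keeping downstream router inputs intact.
import torch

def null_space_projector(K: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Projector onto the orthogonal complement of span(K), where K (d x n) stacks
    preserved input activations as columns. Any update of the form Delta @ P then
    satisfies (Delta @ P) @ K ~= 0."""
    U, S, _ = torch.linalg.svd(K, full_matrices=True)
    rank = int((S > eps * S.max()).sum())
    U_r = U[:, :rank]                        # orthonormal basis of span(K)
    d = K.shape[0]
    return torch.eye(d) - U_r @ U_r.T        # projector onto the null space

def project_expert_update(delta_W: torch.Tensor, preserved_acts: torch.Tensor) -> torch.Tensor:
    """Project a candidate update delta_W (out_dim x in_dim) for a single expert so that
    preserved activations (in_dim x n_samples) see no change in the expert's output."""
    P = null_space_projector(preserved_acts)
    return delta_W @ P

# Toy usage: one expert with a 16-dim input and 8 preserved activations (shapes are assumed).
d_in, d_out, n = 16, 32, 8
delta_W = torch.randn(d_out, d_in)           # raw edit direction for this expert
preserved = torch.randn(d_in, n)             # activations whose outputs must not move
delta_W_safe = project_expert_update(delta_W, preserved)
print(torch.norm(delta_W_safe @ preserved))  # ~0: no change on preserved inputs
```

Projecting every expert's update this way confines each edit to directions that cannot perturb the preserved activations, which is the intuition behind the routing-stability claim; the actual MoEEdit formulation additionally couples these per-expert projections with a block coordinate descent solver, which this sketch omits.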
Related papers
- Generalizable Multimodal Large Language Model Editing via Invariant Trajectory Learning [46.514554089834554]
Existing editing methods rely on a rigid mapping from parameter or module modifications to output. In this paper, we reformulate MLLM editing as an out-of-distribution (OOD) generalization problem. We propose ODEdit, a plug-and-play invariant-learning-based framework that enhances editing reliability, locality, and generality.
arXiv Detail & Related papers (2026-01-27T15:25:07Z) - Representation Interventions Enable Lifelong Unstructured Knowledge Control [54.86207134539453]
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. We introduce RILKE, a robust and scalable method that treats knowledge control as interventions within the model's representation space. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. At inference, a query-adaptive router selects the appropriate module to guide the model's generation.
arXiv Detail & Related papers (2025-11-25T22:15:00Z) - EMSEdit: Efficient Multi-Step Meta-Learning-based Model Editing [20.6706431279733]
EMSEdit is a lightweight alternative to meta-learning-based model editing. We show that EMSEdit consistently outperforms state-of-the-art methods in both sequential and batch editing.
arXiv Detail & Related papers (2025-08-06T01:54:58Z) - InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing [86.17245523439514]
In-context editing, which conveys edit information to the model through context encoding, is a promising approach. However, it is constrained by the limited context window of large language models. We propose InComeS, a flexible framework that enhances LLMs' ability to process editing contexts.
arXiv Detail & Related papers (2025-05-28T09:20:18Z) - EAMET: Robust Massive Model Editing via Embedding Alignment Optimization [12.022506016268112]
We propose EAMET (Embedding Alignment Model Editing in Transformers) to address the embedding misalignment among knowledge items. Experiments show that EAMET consistently outperforms existing methods, achieving about 90% editing efficacy when editing 10k facts.
arXiv Detail & Related papers (2025-05-17T07:00:02Z) - DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [86.76714527437383]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z) - Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models.
Our approach employs activation sparsity to extract experts.
Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z) - ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Previous approaches manage sequential edits by freezing original parameters and discretely allocating new parameters for each knowledge update. We propose ELDER, a novel approach that creates a continuous association between data and adapters.
arXiv Detail & Related papers (2024-08-19T02:27:00Z) - SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Model editing has recently emerged as a promising technique for efficiently updating a small amount of knowledge in large language models. We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching. We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
arXiv Detail & Related papers (2024-01-31T13:08:45Z)