Multiplicative Orthogonal Sequential Editing for Language Models
- URL: http://arxiv.org/abs/2601.07873v1
- Date: Sun, 11 Jan 2026 04:09:32 GMT
- Title: Multiplicative Orthogonal Sequential Editing for Language Models
- Authors: Hao-Xiang Xu, Jun-Yu Ma, Ziqi Peng, Yuhao Sun, Zhen-Hua Ling, Jia-Chen Gu
- Abstract summary: We propose a new knowledge editing paradigm termed Multiplicative Orthogonal Sequential Editing (MOSE). Compared to current methods, MOSE achieves a 12.08% improvement in sequential editing performance, while retaining 95.73% of general abilities across downstream tasks.
- Score: 55.42748430481554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge editing aims to efficiently modify the internal knowledge of large language models (LLMs) without compromising their other capabilities. The prevailing editing paradigm, which adds an update matrix to the original parameter matrix, has been shown by some studies to damage key numerical stability indicators (such as condition number and norm), thereby reducing editing performance and general abilities, especially in sequential editing scenarios. Although subsequent methods have made some improvements, they remain within the additive framework and have not fundamentally addressed this limitation. To solve this problem, we analyze it from both statistical and mathematical perspectives and conclude that multiplying the original matrix by an orthogonal matrix does not change the numerical stability of the matrix. Inspired by this, and in contrast to the previous additive editing paradigm, we propose a multiplicative editing paradigm termed Multiplicative Orthogonal Sequential Editing (MOSE). Specifically, we first derive the matrix update in multiplicative form; the new knowledge is then incorporated into an orthogonal matrix, which is multiplied by the original parameter matrix. In this way, the numerical stability of the edited matrix is unchanged, thereby maintaining editing performance and general abilities. We compare MOSE with several current knowledge editing methods, systematically evaluating their impact on both editing performance and general abilities across three different LLMs. Experimental results show that MOSE effectively limits deviations in the edited parameter matrix and maintains its numerical stability. Compared to current methods, MOSE achieves a 12.08% improvement in sequential editing performance, while retaining 95.73% of general abilities across downstream tasks. The code is available at https://github.com/famoustourist/MOSE.
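The abstract's central claim, that multiplying a matrix by an orthogonal matrix leaves its condition number and norm unchanged, can be checked numerically. The following is a minimal sketch and not the MOSE implementation: the matrix `W`, the random orthogonal matrix `Q`, the rank-one additive update `delta`, and the helper `stability` are all hypothetical stand-ins, and left multiplication is an assumption, since the abstract does not say on which side the orthogonal matrix is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained parameter matrix.
W = rng.standard_normal((64, 48))

# A random orthogonal matrix, obtained from the QR decomposition
# of a square Gaussian matrix (Q.T @ Q is the identity).
Q, _ = np.linalg.qr(rng.standard_normal((64, 64)))

# Hypothetical additive edit: a rank-one update matrix.
delta = np.outer(rng.standard_normal(64), rng.standard_normal(48))


def stability(M):
    """Return the 2-norm condition number and Frobenius norm of M."""
    s = np.linalg.svd(M, compute_uv=False)  # singular values, descending
    return {"cond": s[0] / s[-1], "fro": np.linalg.norm(M)}


before = stability(W)
multiplicative = stability(Q @ W)   # orthogonal (multiplicative) edit
additive = stability(W + delta)     # additive edit

print("original      ", before)
print("multiplicative", multiplicative)
print("additive      ", additive)
```

Because `Q @ W` has the same singular values as `W`, the first two printed rows agree up to floating-point error, while the additive update generally shifts both indicators. How MOSE actually encodes new knowledge into the orthogonal matrix is described in the paper and its repository, not here.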
Related papers
- Spectral Characterization and Mitigation of Sequential Knowledge Editing Collapse [44.49646322759214]
We show that a model's general abilities are closely associated with dominant singular directions of pretrained weight matrices. We propose REVIVE, a plug-and-play framework that stabilizes sequential editing by explicitly preserving the dominant singular subspace.
arXiv Detail & Related papers (2026-01-16T07:18:14Z) - MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [76.28901550926021]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory, while preserving the core capabilities of the pre-trained model. MEMOIR achieves state-of-the-art performance across reliability, generalization, and locality metrics, scaling to thousands of sequential edits with minimal forgetting.
arXiv Detail & Related papers (2025-06-09T16:16:42Z) - LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing [28.870053452479443]
Current locate-then-edit approaches exhibit a progressive performance decline during sequential editing. LyapLock is proposed to decompose the long-term constrained programming into tractable stepwise subproblems for efficient solving. Experimental results show that our framework scales sequential editing capacity to over 10,000 edits while stabilizing general capabilities and boosting average editing efficacy by 11.89% over SOTA baselines.
arXiv Detail & Related papers (2025-05-21T16:16:33Z) - Constraining Sequential Model Editing with Editing Anchor Compression [40.93064933191375]
Large language models (LLMs) struggle with hallucinations due to false or outdated knowledge. This paper statistically observes that the parameter matrix after editing exhibits a significant deviation compared to its previous state as the number of edits increases. A framework termed Editing Anchor Compression (EAC) is proposed to constrain the deviation of the parameter matrix during sequential editing.
arXiv Detail & Related papers (2025-02-25T03:56:49Z) - Reinforced Lifelong Editing for Language Models [27.669767029654526]
Large language models (LLMs) acquire information from pre-training corpora, but their stored knowledge can become inaccurate or outdated over time. Model editing addresses this challenge by modifying model parameters without retraining, and prevalent approaches leverage hypernetworks to generate these parameter updates. We propose RLEdit, an RL-based editing method that captures changes at the full knowledge sequence level and generates appropriate parameter updates.
arXiv Detail & Related papers (2025-02-09T03:37:06Z) - Lifelong Knowledge Editing requires Better Regularization [11.14177136208272]
We formalize the popular locate-then-edit methods as a two-step fine-tuning process. We show that model degradation occurs due to over-optimization of internal activations and continuous norm-growth of edited matrices. Applying these simple yet effective regularization techniques at key points in the editing process can substantially mitigate model degradation.
arXiv Detail & Related papers (2025-02-03T18:59:14Z) - Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation [53.88562288388169]
A common strategy for Parameter-Efficient Fine-Tuning (PEFT) of pre-trained Vision Transformers (ViTs) involves adapting the model to downstream tasks.
We propose a novel PEFT approach inspired by Singular Value Decomposition (SVD) for representing the adaptation matrix.
SVD decomposes a matrix into the product of a left unitary matrix, a diagonal matrix of scaling values, and a right unitary matrix.
arXiv Detail & Related papers (2024-10-30T12:08:30Z) - Perturbation-Restrained Sequential Model Editing [33.51709226068619]
Current model editing methods compromise the general abilities of large language models (LLMs) as the number of edits increases. A framework termed Perturbation Restraint on Upper bouNd for Editing (PRUNE) is proposed, which applies condition number restraints in sequential editing. The results show that PRUNE can preserve general abilities while effectively maintaining editing performance in sequential model editing.
arXiv Detail & Related papers (2024-05-27T04:40:56Z) - Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue [122.20016030723043]
We evaluate the side effects of model editing on large language models (LLMs).
Our analysis reveals that the side effects are caused by model editing altering the original model weights excessively.
To mitigate this, a method named RECT is proposed to regularize the edit update weights.
arXiv Detail & Related papers (2024-01-09T18:03:15Z) - Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC).
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
arXiv Detail & Related papers (2022-06-13T23:40:34Z) - Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation.
arXiv Detail & Related papers (2020-03-30T08:40:35Z)