Rethinking the Residual Distribution of Locate-then-Editing Methods in Model Editing
- URL: http://arxiv.org/abs/2502.03748v2
- Date: Mon, 13 Oct 2025 01:45:48 GMT
- Title: Rethinking the Residual Distribution of Locate-then-Editing Methods in Model Editing
- Authors: Xiaopeng Li, Shanwen Wang, Shasha Li, Shezheng Song, Bin Ji, Jun Ma, Jie Yu
- Abstract summary: Model editing enables targeted updates to the knowledge of large language models. Locate-then-edit methods first identify critical layers, then compute residuals at the final critical layer based on the target edit. Residual distribution, a core mechanism in these methods, introduces weight shift errors that undermine editing precision. We propose the BLUE strategy to enhance locate-then-edit methods.
- Score: 14.958557185068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model editing enables targeted updates to the knowledge of large language models (LLMs) with minimal retraining. Among existing approaches, locate-then-edit methods constitute a prominent paradigm: they first identify critical layers, then compute residuals at the final critical layer based on the target edit, and finally apply least-squares-based multi-layer updates via $\textbf{residual distribution}$. While empirically effective, we identify a counterintuitive failure mode: residual distribution, a core mechanism in these methods, introduces weight shift errors that undermine editing precision. Through theoretical and empirical analysis, we show that such errors increase with the distribution distance, batch size, and edit sequence length, ultimately leading to inaccurate or suboptimal edits. To address this, we propose the $\textbf{B}$oundary $\textbf{L}$ayer $\textbf{U}$pdat$\textbf{E}$ (BLUE) strategy to enhance locate-then-edit methods. Sequential batch editing experiments on three LLMs and two datasets demonstrate that BLUE not only delivers an average performance improvement of 35.59\%, significantly advancing the state of the art in model editing, but also enhances the preservation of LLMs' general capabilities. Our code is available at https://github.com/xpq-tech/BLUE.
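For concreteness, here is a minimal sketch of the residual-distribution step the abstract critiques, loosely in the style of MEMIT-family editors; all names and shapes are illustrative, not the paper's code.

```python
import torch

def distribute_residual(layers, critical, keys, residual):
    """Illustrative MEMIT-style residual distribution (not the paper's code).

    layers[i].weight is the (d_out, d_in) matrix edited at layer i,
    keys[i] the (n_edits, d_in) key activations there, and residual the
    (n_edits, d_out) residual computed at the final critical layer.
    """
    remaining = residual.clone()
    for step, i in enumerate(critical):
        # Spread what is left of the residual evenly over the remaining layers.
        share = remaining / (len(critical) - step)
        # Least squares: find dW with keys[i] @ dW.T ~= share for every edit.
        dW = torch.linalg.lstsq(keys[i], share).solution.T
        layers[i].weight.data += dW
        remaining = remaining - share
        # Real editors recompute activations after each layer update; the paper
        # argues this spreading itself injects weight shift errors that grow
        # with distribution distance, batch size, and edit sequence length.
```

The BLUE strategy replaces this distribution step; judging by the name, it confines updates to the boundary critical layers with directly computed residuals, with the exact procedure given in the paper.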
Related papers
- Understanding Robustness of Model Editing in Code LLMs: An Empirical Study [1.5624785508022727]
We present a systematic study of five state-of-the-art model editing methods. We apply these methods to three leading open-source code LLMs: CodeLlama, CodeQwen1.5, and DeepSeek-Coder. Instant edits consistently degrade model performance, with syntactic validity dropping by up to 86 percentage points and functional correctness declining by 45 points even in the best-performing setting.
arXiv Detail & Related papers (2025-11-05T04:58:13Z)
- Fine-tuning Done Right in Model Editing [83.79661791576103]
Fine-tuning, a foundational method for adapting large language models, has long been considered ineffective for model editing. We restore fine-tuning to the standard breadth-first (i.e., epoch-based) pipeline with mini-batch optimization. We derive LocFT-BF, a simple and effective localized editing method built on the restored fine-tuning framework.
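A hedged sketch of what a localized, breadth-first fine-tuning loop can look like; function and argument names are ours, not LocFT-BF's API.

```python
import torch
from torch.utils.data import DataLoader

def localized_finetune_bf(model, edits, target_prefixes, epochs=5, lr=1e-4):
    """Hypothetical localized, epoch-based fine-tuning for editing.

    Only parameters in the located target modules are trainable, and all
    edits are optimized jointly in mini-batches over epochs (breadth-first)
    rather than one edit at a time to convergence (depth-first).
    """
    for name, p in model.named_parameters():
        p.requires_grad = any(name.startswith(t) for t in target_prefixes)
    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    loader = DataLoader(edits, batch_size=8, shuffle=True)
    for _ in range(epochs):             # breadth-first: epoch over all edits
        for batch in loader:            # mini-batch optimization
            loss = model(**batch).loss  # standard LM loss on the edit targets
            opt.zero_grad()
            loss.backward()
            opt.step()
```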
arXiv Detail & Related papers (2025-09-26T08:53:13Z)
- EMSEdit: Efficient Multi-Step Meta-Learning-based Model Editing [20.6706431279733]
EMSEdit is a lightweight alternative to meta-learning-based model editing. We show that EMSEdit consistently outperforms state-of-the-art methods in both sequential and batch editing.
arXiv Detail & Related papers (2025-08-06T01:54:58Z)
- MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [76.28901550926021]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory while preserving the core capabilities of the pre-trained model. MEMOIR achieves state-of-the-art performance across reliability, generalization, and locality metrics, scaling to thousands of sequential edits with minimal forgetting.
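The summary does not spell out the mechanism, but one generic way to realize a residual memory is a gated side module added onto a frozen block; the sketch below is our illustration under that assumption, not MEMOIR's actual architecture.

```python
import torch
import torch.nn as nn

class ResidualMemoryFFN(nn.Module):
    """Generic 'residual memory' wrapper (our illustration, not MEMOIR's)."""
    def __init__(self, ffn: nn.Module):
        super().__init__()
        self.ffn = ffn                   # frozen pre-trained block
        self.keys = nn.ParameterList()   # one key per injected edit
        self.values = nn.ParameterList() # residual added when a key fires

    def add_edit(self, key: torch.Tensor, value: torch.Tensor):
        self.keys.append(nn.Parameter(key, requires_grad=False))
        self.values.append(nn.Parameter(value, requires_grad=False))

    def forward(self, x, threshold=0.8):  # x: (d,)
        out = self.ffn(x)
        for k, v in zip(self.keys, self.values):
            # Only edit-relevant inputs touch the memory; everything else
            # passes through the frozen block, preserving base capabilities.
            if torch.cosine_similarity(x, k, dim=0) > threshold:
                out = out + v
        return out
```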
arXiv Detail & Related papers (2025-06-09T16:16:42Z)
- The Mirage of Model Editing: Revisiting Evaluation in the Wild [70.17413507444704]
We study the effectiveness of model editing in question answering applications.
Our single editing experiments indicate that current editing methods perform substantially worse than previously reported.
Our analysis provides a fundamental reexamination of both the real-world applicability of existing model editing methods and their evaluation practices.
arXiv Detail & Related papers (2025-02-16T15:57:55Z)
- Learning Where to Edit Vision Transformers [27.038720045544867]
We propose a locate-then-edit approach for editing vision Transformers (ViTs) in computer vision.
We first address the where-to-edit challenge by meta-learning a hypernetwork on CutMix-augmented data.
To validate our method, we construct an editing benchmark that introduces subpopulation shifts toward underrepresented natural images and AI-generated images.
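Since the hypernetwork is trained on CutMix-augmented data, a standard CutMix implementation makes that ingredient concrete; the where-to-edit hypernetwork itself is out of scope here.

```python
import torch

def cutmix(x, y, alpha=1.0):
    """Standard CutMix: paste a random patch from a shuffled batch into x.

    Returns the mixed images, both label sets, and the mixing weight lam
    (interpolate the two losses with weights lam and 1 - lam).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    H, W = x.shape[-2:]
    r = (1.0 - lam) ** 0.5                 # patch side ratio
    ch, cw = int(H * r), int(W * r)
    cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
    y1, y2 = max(cy - ch // 2, 0), min(cy + ch // 2, H)
    x1, x2 = max(cx - cw // 2, 0), min(cx + cw // 2, W)
    x_mix = x.clone()
    x_mix[..., y1:y2, x1:x2] = x[idx][..., y1:y2, x1:x2]
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (H * W)  # actual mixed-area ratio
    return x_mix, y, y[idx], lam
```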
arXiv Detail & Related papers (2024-11-04T10:17:40Z)
- AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models [63.209935157623946]
Large language models (LLMs) often exhibit hallucinations due to incorrect or outdated knowledge.
We introduce AlphaEdit, a novel solution that projects the perturbation onto the null space of the preserved knowledge before applying it to the parameters.
We theoretically prove that this projection ensures the output of post-edited LLMs remains unchanged when queried about the preserved knowledge.
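The projection is easy to make concrete. Assuming preserved-knowledge keys are stacked as columns of K, a plain SVD-based projector illustrates the guarantee; AlphaEdit derives its projector from key statistics, so treat this only as a sketch of the idea.

```python
import torch

def null_space_projector(K: torch.Tensor, tol: float = 1e-6) -> torch.Tensor:
    """Projector onto the null space of the preserved-knowledge keys.

    K is (d_in, n): each column is a key whose output must not change.
    For any raw edit delta of shape (d_out, d_in), the projected edit
    (delta @ P) satisfies (delta @ P) @ K == 0, so W + delta @ P returns
    exactly the old outputs on every preserved key.
    """
    U, S, _ = torch.linalg.svd(K, full_matrices=True)
    rank = int((S > tol).sum())
    U_range = U[:, :rank]                 # orthonormal basis of col(K)
    return torch.eye(K.shape[0]) - U_range @ U_range.T

# Usage sketch: W.data += delta @ null_space_projector(K)
```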
arXiv Detail & Related papers (2024-10-03T10:06:27Z)
- ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors.
Previous approaches manage sequential edits by freezing original parameters and discretely allocating new parameters for each knowledge update.
We propose ELDER, a novel approach that creates a continuous association between data and adapters.
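A minimal sketch of a mixture-of-LoRA layer with soft (continuous) routing, assuming one router per layer; this illustrates the general idea, not ELDER's exact architecture.

```python
import torch
import torch.nn as nn

class LoRAMixture(nn.Module):
    """Mixture-of-LoRA layer with soft routing (generic illustration)."""
    def __init__(self, base: nn.Linear, n_adapters: int = 4, rank: int = 8):
        super().__init__()
        self.base = base                   # frozen pre-trained projection
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(n_adapters, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_adapters, d_out, rank))
        self.router = nn.Linear(d_in, n_adapters)

    def forward(self, x):                  # x: (batch, d_in)
        gate = torch.softmax(self.router(x), dim=-1)     # (batch, n_adapters)
        h = torch.einsum("erd,bd->ber", self.A, x)       # LoRA down-projections
        delta = torch.einsum("eor,ber->beo", self.B, h)  # LoRA up-projections
        delta = (gate.unsqueeze(-1) * delta).sum(dim=1)  # continuous mix
        return self.base(x) + delta
```

The soft gate is what makes the data-to-adapter association continuous, in contrast to discretely allocating one adapter per edit.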
arXiv Detail & Related papers (2024-08-19T02:27:00Z)
- Perturbation-Restrained Sequential Model Editing [33.51709226068619]
Current model editing methods compromise the general abilities of large language models (LLMs) as the number of edits increases.
We propose a framework termed Perturbation Restraint on Upper bouNd for Editing (PRUNE).
PRUNE can preserve considerable general abilities while maintaining the editing performance effectively in sequential model editing.
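As a hedged illustration of restraining a perturbation's upper bound, one can clamp the largest singular values of an edit update; PRUNE's actual restraint rule comes from the condition-number analysis in the paper.

```python
import torch

def restrain_update(delta: torch.Tensor, cap: float) -> torch.Tensor:
    """Clamp the largest singular values of an edit update (illustrative).

    Restraining the update's spectral norm bounds the perturbation applied
    to the edited matrix, which is the intuition behind preserving general
    abilities as edits accumulate. Only a sketch of the 'upper bound' idea.
    """
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    return U @ torch.diag(torch.clamp(S, max=cap)) @ Vh
```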
arXiv Detail & Related papers (2024-05-27T04:40:56Z)
- Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 [2.569159339315845]
This study presents a targeted model editing analysis focused on the latest large language model, Llama-3.
We identify the most effective layers for targeted edits through an evaluation that encompasses up to 4096 edits.
arXiv Detail & Related papers (2024-05-01T17:50:37Z)
- Editing Conceptual Knowledge for Large Language Models [65.38231526537476]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also risk distorting the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
- The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
Benchmarking large language models after each edit, however, is impractically time-consuming and resource-intensive.
We use GPT-3.5 to construct a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z)
- Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge.
Current research primarily focuses on white-box LLMs editing, overlooking an important scenario: black-box LLMs editing.
We introduce knowledge editing (KE) on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations.
Experiments and analysis on two benchmarks demonstrate that our method, postEdit, outperforms all baselines and achieves strong generalization.
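A generic post-editing flow consistent with this summary, with all component names ours: the black-box model's answer is revised after the fact when a stored edit applies.

```python
def post_edit(query, base_answer, edit_memory, retriever, editor):
    """Generic black-box post-editing flow (component names are ours).

    A retriever looks up a relevant stored edit for the query; if one is
    found, an editor model rewrites the black-box LLM's answer to honor it.
    Out-of-scope queries keep the original answer, so base behavior is
    preserved without touching any weights.
    """
    edit = retriever(query, edit_memory)   # nearest stored edit, or None
    if edit is None:
        return base_answer                 # out of scope: preserve behavior
    return editor(query, base_answer, edit)
```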
arXiv Detail & Related papers (2024-02-13T17:59:34Z)
- Emptying the Ocean with a Spoon: Should We Edit Models? [8.545919917068273]
We call into question the recently popularized method of direct model editing as a means of correcting factual errors in LLM generations.
We contrast model editing with three similar but distinct approaches that pursue better defined objectives.
arXiv Detail & Related papers (2023-10-18T13:38:03Z)
- Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
- Memory-Based Model Editing at Scale [102.28475739907498]
Existing model editors struggle to accurately model an edit's intended scope.
We propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC).
SERAC stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed.
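A simplified sketch of SERAC-style routing (names are ours): edits sit in an explicit memory, a scope classifier decides whether an input is covered, and a small counterfactual model overrides the frozen base model when it is.

```python
def serac_predict(x, edit_memory, scope_classifier,
                  counterfactual_model, base_model):
    """Simplified SERAC-style routing (names are ours).

    Edits are stored verbatim in an explicit memory; a scope classifier
    checks whether the input falls under any of them, and if so a small
    counterfactual model answers conditioned on that edit. Otherwise the
    frozen base model answers, so unrelated behavior is untouched.
    """
    scores = [scope_classifier(x, e) for e in edit_memory]
    if scores and max(scores) > 0.5:   # input is in some edit's scope
        return counterfactual_model(x, edit_memory[scores.index(max(scores))])
    return base_model(x)
```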
arXiv Detail & Related papers (2022-06-13T23:40:34Z)
- Fast Model Editing at Scale [77.69220974621425]
We propose Model Editor Networks with Gradient Decomposition (MEND).
MEND is a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model.
MEND can be trained on a single GPU in less than a day even for 10 billion+ parameter models.
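The "gradient decomposition" exploits the fact that a linear layer's gradient is a rank-1 outer product per example; below is a hedged sketch in which the editor networks g_u and g_delta are stand-ins, not MEND's exact parameterization.

```python
import torch

def mend_style_edit(W, u, delta, g_u, g_delta, lr=1.0):
    """Hedged sketch of a MEND-style rank-1 edit.

    For a linear layer, the fine-tuning gradient w.r.t. W is the outer
    product of the backpropagated output gradient `delta` (d_out,) and the
    input activation `u` (d_in,). Small learned networks transform these two
    factors separately, so no dense gradient transform is ever materialized.
    """
    u_tilde = g_u(u)               # edited input-side factor, (d_in,)
    delta_tilde = g_delta(delta)   # edited output-side factor, (d_out,)
    return W - lr * torch.outer(delta_tilde, u_tilde)
```

Working on the two low-dimensional factors rather than the full (d_out x d_in) gradient is what keeps the editor networks small enough to train on a single GPU.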
arXiv Detail & Related papers (2021-10-21T17:41:56Z)