Related papers: REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration

URL: http://arxiv.org/abs/2510.01879v1
Date: Thu, 02 Oct 2025 10:35:39 GMT
Title: REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration
Authors: Yisu Wang, Ming Wang, Haoyuan Song, Wenjie Huang, Chaozheng Wang, Yi Xie, Xuming Ran,
Abstract summary: Post-training for large language models (LLMs) is constrained by the high cost of acquiring new knowledge or correcting errors. This work introduces a robust framework for developing reliable, scalable, and continually evolving LLMs.
Score: 11.462236606266567
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Post-training for large language models (LLMs) is constrained by the high cost of acquiring new knowledge or correcting errors and by the unintended side effects that frequently arise from retraining. To address these issues, we introduce REPAIR (Robust Editing via Progressive Adaptive Intervention and Reintegration), a lifelong editing framework designed to support precise and low-cost model updates while preserving non-target knowledge. REPAIR mitigates the instability and conflicts of large-scale sequential edits through a closed-loop feedback mechanism coupled with dynamic memory management. Furthermore, by incorporating frequent knowledge fusion and enforcing strong locality guards, REPAIR effectively addresses the shortcomings of traditional distribution-agnostic approaches that often overlook unintended ripple effects. Our experiments demonstrate that REPAIR boosts editing accuracy by 10%-30% across multiple model families and significantly reduces knowledge forgetting. This work introduces a robust framework for developing reliable, scalable, and continually evolving LLMs.

Related papers

Conflict-Resolving and Sharpness-Aware Minimization for Generalized Knowledge Editing with Multiple Updates [69.6610686845008]
CoRSA is a parameter-efficient, holistic approach for knowledge editing with multiple updates. It tackles multiple challenges simultaneously: it improves generalization to different input forms and enhances stability across multiple updates. CoRSA also generalizes to the code domain, outperforming the strongest baseline by 5.48% Pass@5 in update efficacy.
arXiv Detail & Related papers (2026-02-03T16:18:06Z)
Causality-Inspired Safe Residual Correction for Multivariate Time Series [12.183024727781449]
We propose CRC (Causality-inspired Safe Residual Correction), a plug-and-play framework explicitly designed to ensure non-degradation. It employs a causality-inspired encoder to expose direction-aware structure by decoupling self- and cross-variable dynamics, and a hybrid corrector to model residual errors. Experiments show that CRC consistently improves accuracy, while an in-depth ablation study confirms that its core safety mechanisms ensure exceptionally high non-degradation rates (NDR).
arXiv Detail & Related papers (2025-12-27T01:34:14Z)
Representation Interventions Enable Lifelong Unstructured Knowledge Control [54.86207134539453]
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. We introduce RILKE, a robust and scalable method that treats knowledge control as interventions within the model's representation space. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. At inference, a query-adaptive router selects the appropriate module to guide the model's generation.
arXiv Detail & Related papers (2025-11-25T22:15:00Z)
STABLE: Gated Continual Learning for Large Language Models [0.0]
STABLE is a gated continual self-editing framework that constrains forgetting during sequential updates. Each candidate edit is evaluated against a stability budget using one of three metrics. Experiments on the Qwen-2.5-7B model show that gating effectively mitigates forgetting while preserving adaptability.
arXiv Detail & Related papers (2025-10-17T16:14:05Z)
MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [76.28901550926021]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory, while preserving the core capabilities of the pre-trained model. MEMOIR achieves state-of-the-art performance across reliability, generalization, and locality metrics, scaling to thousands of sequential edits with minimal forgetting.
arXiv Detail & Related papers (2025-06-09T16:16:42Z)
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards [67.86091419220816]
Large Language Models (LLMs) show great promise in complex reasoning. A prevalent issue is "superficial self-reflection", where models fail to robustly verify their own outputs. We introduce RISE (Reinforcing Reasoning with Self-Verification), a novel online RL framework designed to tackle this.
arXiv Detail & Related papers (2025-05-19T17:59:31Z)
Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment. We define this phenomenon as model hemorrhage: performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z)
GRU: Mitigating the Trade-off between Unlearning and Retention for LLMs [34.90826139012299]
We propose Gradient Rectified Unlearning (GRU), an improved framework that regulates the directions of updates during the unlearning procedure. GRU is easy and general to implement, demonstrating practical effectiveness across a variety of well-established unlearning benchmarks.
arXiv Detail & Related papers (2025-03-12T07:08:54Z)
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [48.15636223774418]
Large language models (LLMs) are prone to hallucination stemming from misaligned self-awareness. We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems to harmonize reliability and usability.
arXiv Detail & Related papers (2025-03-04T03:16:02Z)
CoME: An Unlearning-based Approach to Conflict-free Model Editing [8.215201299292033]
Large language models (LLMs) often retain outdated or incorrect information from pre-training, which undermines their reliability. We propose Conflict-free Model Editing (CoME), a novel framework that enhances the accuracy of knowledge updates in LLMs by selectively removing outdated knowledge.
arXiv Detail & Related papers (2025-02-20T04:55:38Z)
Temporal-Difference Variational Continual Learning [89.32940051152782]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates catastrophic forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
