Keys to Robust Edits: from Theoretical Insights to Practical Advances
- URL: http://arxiv.org/abs/2410.09338v2
- Date: Thu, 22 May 2025 02:11:21 GMT
- Title: Keys to Robust Edits: from Theoretical Insights to Practical Advances
- Authors: Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang
- Abstract summary: Large language models (LLMs) struggle with maintaining accurate knowledge due to conflicting/outdated parametric memories. Our solution introduces Robust Edit Pathway (REP), a plug-and-play module that disentangles editing keys from native model representations.
- Score: 20.10464264597003
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) struggle with maintaining accurate knowledge due to conflicting/outdated parametric memories. While locate-and-edit methods address this, their reliance on models' internal representations leads to robustness failures in long-context reasoning and paraphrased queries. We identify a fundamental limitation of locate-and-edit methods: existing semantic keys (for memory localization) cannot simultaneously satisfy robustness (context-invariant activation) and specificity (precise knowledge discrimination). Through theoretical error-bound analysis, we establish formal criteria for effective editing. Our solution introduces Robust Edit Pathway (REP), a plug-and-play module that: (1) disentangles editing keys from native model representations; (2) dynamically adjusts keys via contrastive learning to achieve robustness-specificity balance. Extensive experiments across various editing methods (ROME/MEMIT/R-ROME/EMMET), existing LLMs (LLaMA2, QWen, Mistral), and datasets (CounterFact, ZsRE) show that REP improves success rate over robustness tests by up to 66.4% while leaving the success rate unaffected. Our code can be found at https://github.com/ElliottYan/RobustKeyEdit .
Related papers
- Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities [32.76303717104482]
We introduce DeR2, a controlled deep-research sandbox that isolates document-grounded reasoning. DeR2 decouples evidence access from reasoning via four regimes--Instruction-only, Concepts, Related-only, and Full-set. Experiments across a diverse set of state-of-the-art foundation models reveal substantial variation and significant headroom.
arXiv Detail & Related papers (2026-01-29T16:26:19Z) - MDiff4STR: Mask Diffusion Model for Scene Text Recognition [59.79818820650126]
Mask Diffusion Models (MDMs) have emerged as a promising alternative to auto-regressive models (ARMs) for vision-language tasks. We show that vanilla MDM lags behind ARMs in terms of accuracy, although it improves recognition efficiency. We propose MDiff4STR, a Mask Diffusion model enhanced with two key improvement strategies tailored for Scene Text Recognition.
arXiv Detail & Related papers (2025-12-01T08:57:51Z) - Representation Interventions Enable Lifelong Unstructured Knowledge Control [54.86207134539453]
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. We introduce RILKE, a robust and scalable method that treats knowledge control as interventions within the model's representation space. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. In inference, a query-adaptive router selects the appropriate module to guide the model's generation.
arXiv Detail & Related papers (2025-11-25T22:15:00Z) - Dynamic Retriever for In-Context Knowledge Editing via Policy Optimization [11.338802325779866]
We propose Dynamic Retriever for In-Context Knowledge Editing (DR-IKE). DR-IKE is a lightweight framework that trains a BERT retriever with REINFORCE to rank demonstrations by editing reward. It improves edit success by up to 17.1%, reduces latency by 41.6%, and preserves accuracy on unrelated queries.
arXiv Detail & Related papers (2025-10-24T00:15:30Z) - Rethinking Reasoning in LLMs: Neuro-Symbolic Local RetoMaton Beyond ICL and CoT [8.06437018518246]
We extend RetoMaton by replacing its global datastore with a local, task-adaptive weighted Finite Automaton. This local automaton structure promotes robust, context-aware retrieval while preserving symbolic traceability and low inference overhead. Our results highlight a promising shift toward trustworthy, symbolic reasoning in modern language models.
arXiv Detail & Related papers (2025-08-22T16:51:06Z) - CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward [50.97588334916863]
We develop CompassVerifier, an accurate and robust lightweight verifier model for evaluation and outcome reward. It demonstrates multi-domain competency spanning math, knowledge, and diverse reasoning tasks, with the capability to process various answer types. We introduce the VerifierBench benchmark, comprising model outputs collected from multiple data sources and augmented through manual analysis of meta-error patterns to enhance CompassVerifier.
arXiv Detail & Related papers (2025-08-05T17:55:24Z) - CAAD: Context-Aware Adaptive Decoding for Truthful Text Generation [31.469511576774252]
We propose a context-aware adaptive decoding method for large language models. Our approach achieves a 2.8 percent average improvement on TruthfulQA. Our model-agnostic, scalable, and efficient method requires only a single generation pass.
arXiv Detail & Related papers (2025-08-04T08:28:25Z) - DS-Det: Single-Query Paradigm and Attention Disentangled Learning for Flexible Object Detection [39.56089737473775]
We propose DS-Det, a more efficient transformer detector capable of detecting a flexible number of objects in images. Specifically, we reformulate and introduce a new unified Single-Query paradigm for decoder modeling. We also propose a simplified decoder framework through attention disentangled learning.
arXiv Detail & Related papers (2025-07-26T05:40:04Z) - MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [82.34547399693966]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory. MEMOIR confines each edit to a distinct subset of the memory parameters, minimizing interference among edits.
arXiv Detail & Related papers (2025-06-09T16:16:42Z) - $μ$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models [8.472795721252856]
Matryoshka Unstructured Knowledge Editing preserves dependencies between memory updates and output tokens.
$μ$KE improves edit efficacy by up to 12.33% over state-of-the-art methods.
arXiv Detail & Related papers (2025-04-01T21:24:44Z) - Knowledge Updating? No More Model Editing! Just Selective Contextual Reasoning [38.018263569983226]
We provide an evaluation of ten model editing methods along four dimensions: reliability, generalization, locality, and portability.
We then propose a straightforward method called Selective Contextual Reasoning (SCR) for knowledge updating.
arXiv Detail & Related papers (2025-03-07T08:04:25Z) - The Mirage of Model Editing: Revisiting Evaluation in the Wild [70.17413507444704]
We study the effectiveness of model editing in question answering applications.
Our single editing experiments indicate that current editing methods perform substantially worse than previously reported.
Our analysis provides a fundamental reexamination of both the real-world applicability of existing model editing methods and their evaluation practices.
arXiv Detail & Related papers (2025-02-16T15:57:55Z) - Robust Search with Uncertainty-Aware Value Models for Language Model Reasoning [31.973976155760397]
Value model guided search is effective in steering LLM generation but suffers from a lack of robustness. We propose an uncertainty-aware framework with two key components: (1) Uncertainty-Aware Value Models (UVMs), which replace single-point value estimates with value distributions to quantify prediction reliability, and (2) Group Thompson Sampling, an efficient algorithm that selects candidates based on their probability of being optimal.
arXiv Detail & Related papers (2025-02-16T15:10:30Z) - Joint Localization and Activation Editing for Low-Resource Fine-Tuning [73.64004083269424]
We propose a joint localization and activation editing (JoLA) method. JoLA learns (1) which heads in the Transformer to edit, (2) whether the intervention should be additive, multiplicative, or both, and (3) the intervention parameters themselves. We demonstrate that JoLA consistently outperforms existing methods.
arXiv Detail & Related papers (2025-02-03T09:13:09Z) - Uncovering Overfitting in Large Language Model Editing [35.55260822503773]
We identify and investigate the phenomenon of Editing Overfit, where edited models assign disproportionately high probabilities to the edit target.
We propose a new plug-and-play strategy called Learn to Inference (LTI), which introduces a Multi-stage Inference Constraint module to guide the edited models in recalling new knowledge.
arXiv Detail & Related papers (2024-10-10T11:09:00Z) - ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA [55.697627106315004]
Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Previous approaches manage sequential edits by freezing original parameters and discretely allocating new parameters for each knowledge update. We propose ELDER, a novel approach to create a continuous association between data and adapters.
arXiv Detail & Related papers (2024-08-19T02:27:00Z) - Understanding the Collapse of LLMs in Model Editing [37.429695927372755]
We study the root causes of such collapse.
We propose a simple yet effective approach: uniformly using prefixed keys during the editing phase and adding prefixes during the testing phase.
arXiv Detail & Related papers (2024-06-17T07:08:29Z) - Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models [26.516571783335824]
Recent studies have identified side effects, such as knowledge distortion and the deterioration of general abilities, that have emerged after editing.
This survey presents a comprehensive study of these side effects, providing a unified perspective on the challenges of knowledge editing in large language models.
arXiv Detail & Related papers (2024-06-03T15:28:21Z) - Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z) - Editing Conceptual Knowledge for Large Language Models [65.38231526537476]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z) - The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
Benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive.
We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z) - Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks [36.292901021210575]
We introduce a novel reasoning-based benchmark -- ReCoE (Reasoning-based Counterfactual Editing dataset).
We conduct a thorough analysis of existing knowledge editing techniques, including input augmentation, finetuning, and locate-and-edit.
All model editing methods show notably low performance on this dataset, especially in certain reasoning schemes.
arXiv Detail & Related papers (2024-01-31T04:12:59Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z) - RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought [56.558892336235914]
Reversing Chain-of-Thought (RCoT) is a novel method to improve large language models' reasoning abilities.
RCoT automatically detects and rectifies factual inconsistency in generated solutions.
We show that manually written fine-grained feedback can dramatically improve LLMs' reasoning abilities.
arXiv Detail & Related papers (2023-05-19T08:02:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.