Are We Evaluating the Edit Locality of LLM Model Editing Properly?
- URL: http://arxiv.org/abs/2601.17343v1
- Date: Sat, 24 Jan 2026 07:07:21 GMT
- Title: Are We Evaluating the Edit Locality of LLM Model Editing Properly?
- Authors: Wei Liu, Haomei Xu, Hongkai Liu, Zhiying Deng, Ruixuan Li, Heng Huang, Yee Whye Teh, Wee Sun Lee
- Abstract summary: We find that existing specificity evaluation protocols are inadequate for this purpose. Existing specificity metrics are weakly correlated with the strength of specificity regularizers. We also find that current metrics lack sufficient sensitivity, rendering them ineffective at distinguishing the specificity performance of different methods.
- Score: 68.441768731381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model editing has recently emerged as a popular paradigm for efficiently updating knowledge in LLMs. A central desideratum of updating knowledge is to balance editing efficacy, i.e., the successful injection of target knowledge, and specificity (also known as edit locality), i.e., the preservation of existing non-target knowledge. However, we find that existing specificity evaluation protocols are inadequate for this purpose. We systematically elaborate on three fundamental issues that these protocols face. Beyond the conceptual issues, we further empirically demonstrate that existing specificity metrics are weakly correlated with the strength of specificity regularizers. We also find that current metrics lack sufficient sensitivity, rendering them ineffective at distinguishing the specificity performance of different methods. Finally, we propose a constructive evaluation protocol. Under this protocol, the conflict between open-ended LLMs and the assumption of determined answers is eliminated, query-independent fluency biases are avoided, and the evaluation strictness can be smoothly adjusted within a near-continuous space. Experiments across various LLMs, datasets, and editing methods show that metrics derived from the proposed protocol are more sensitive to changes in the strength of specificity regularizers and exhibit strong correlation with them, enabling more fine-grained discrimination of different methods' knowledge preservation capabilities.
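The abstract's claim that a specificity metric should correlate with the strength of a specificity regularizer can be illustrated with a rank correlation. The following is a minimal sketch only, not the paper's protocol: the choice of Spearman correlation, the function names, and all numbers are assumptions made up for illustration.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (var_x * var_y) ** 0.5

# Hypothetical regularizer strengths and metric readings (invented values).
strengths = [0.0, 0.1, 0.5, 1.0, 5.0]
insensitive_metric = [0.91, 0.90, 0.91, 0.90, 0.91]  # barely reacts
sensitive_metric = [0.40, 0.52, 0.63, 0.71, 0.88]    # tracks the strength

print(spearman(strengths, insensitive_metric))
print(spearman(strengths, sensitive_metric))
```

In this toy setup, a metric that moves monotonically with the regularizer strength scores a high rank correlation, while one that only fluctuates around a constant does not; this is the kind of contrast the abstract describes between the proposed and existing metrics.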
Related papers
- EtCon: Edit-then-Consolidate for Reliable Knowledge Editing [85.20993502078899]
We propose Edit-then-Consolidate, a novel knowledge editing paradigm that aims to bridge the gap between theoretical knowledge editing methods and their real-world applicability. Our framework consistently improves editing reliability and generalization under real-world evaluations, while better preserving locality and pre-trained capabilities.
arXiv Detail & Related papers (2025-12-04T12:43:50Z)
- Towards Meta-Cognitive Knowledge Editing for Multimodal LLMs [71.8547241246169]
We introduce CogEdit, a novel benchmark designed to evaluate MLLMs' meta-cognitive knowledge editing abilities. We propose MIND, a framework that constructs a meta-knowledge memory for self-awareness, employs game-theoretic interactions to monitor knowledge activation, and incorporates label refinement for noise-robust updates.
arXiv Detail & Related papers (2025-09-06T13:26:04Z)
- Disentangling Knowledge Representations for Large Language Model Editing [38.244171146682206]
We propose DiKE, a novel approach that Disentangles Knowledge representations for LLM Editing. DiKE consists of two key components: a Knowledge Representation Disentanglement (KRD) module that decomposes the subject representation into target-knowledge-related and -unrelated components, and a Knowledge Edit (DKE) module that updates only the target-related component while explicitly preserving the unrelated one. To rigorously evaluate fine-grained irrelevant knowledge preservation, we construct FINE-KED, a new benchmark comprising fine-grained irrelevant knowledge at different levels of relational similarity to the edited knowledge.
arXiv Detail & Related papers (2025-05-24T16:24:04Z)
- Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs [47.06544781855325]
We propose a Fine-grained Neuron-level Knowledge Editing (FiNE) method that enhances editing locality without affecting success rates. By precisely identifying and modifying specific neurons within feed-forward networks, FiNE significantly improves knowledge localization and editing.
arXiv Detail & Related papers (2025-03-03T01:30:28Z)
- Revealing and Mitigating Over-Attention in Knowledge Editing [28.950187006528783]
Large Language Models have demonstrated superior performance across a wide range of tasks. However, they still exhibit undesirable errors due to incorrect knowledge learned from the training data. Knowledge editing methods emerged to precisely edit the specific model knowledge via efficiently modifying a very small percentage of parameters. These editing methods can lead to the problem of Specificity Failure, where the existing knowledge and capabilities are severely degraded due to editing.
arXiv Detail & Related papers (2025-02-20T18:51:12Z)
- Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models. It addresses two key challenges: the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness [26.589633375359647]
We aim to ensure that the editing of learned facts respects internal logical constraints, which are known as dependency of knowledge.
Existing work on editing LLMs has partially addressed the issue of dependency, when the editing of a fact should apply to its lexical variations without disrupting irrelevant ones.
We propose an evaluation protocol with an accompanying question-answering dataset, DepEdit, that provides a comprehensive assessment of the editing process.
arXiv Detail & Related papers (2023-12-04T12:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.