Disentangling Knowledge Representations for Large Language Model Editing
- URL: http://arxiv.org/abs/2505.18774v1
- Date: Sat, 24 May 2025 16:24:04 GMT
- Title: Disentangling Knowledge Representations for Large Language Model Editing
- Authors: Mengqi Zhang, Zisheng Zhou, Xiaotian Ye, Qiang Liu, Zhaochun Ren, Zhumin Chen, Pengjie Ren,
- Abstract summary: We propose DiKE, a novel approach that Disentangles Knowledge representations for LLM Editing. DiKE consists of two key components: a Knowledge Representation Disentanglement (KRD) module that decomposes the subject representation into target-knowledge-related and -unrelated components, and a Disentanglement-based Knowledge Edit (DKE) module that updates only the target-related component while explicitly preserving the unrelated one. To rigorously evaluate fine-grained irrelevant knowledge preservation, we construct FINE-KED, a new benchmark comprising fine-grained irrelevant knowledge at different levels of relational similarity to the edited knowledge.
- Score: 38.244171146682206
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Editing has emerged as a promising solution for efficiently updating embedded knowledge in large language models (LLMs). While existing approaches demonstrate effectiveness in integrating new knowledge and preserving the original capabilities of LLMs, they fail to maintain fine-grained irrelevant knowledge facts that share the same subject as the edited knowledge but differ in relation and object. This challenge arises because subject representations inherently encode multiple attributes, causing the target and fine-grained irrelevant knowledge to become entangled in the representation space and thus vulnerable to unintended alterations during editing. To address this, we propose DiKE, a novel approach that Disentangles Knowledge representations for LLM Editing. DiKE consists of two key components: a Knowledge Representation Disentanglement (KRD) module that decomposes the subject representation into target-knowledge-related and -unrelated components, and a Disentanglement-based Knowledge Edit (DKE) module that updates only the target-related component while explicitly preserving the unrelated one. We further derive a closed-form, rank-one parameter update based on matrix theory to enable efficient and minimally invasive edits. To rigorously evaluate fine-grained irrelevant knowledge preservation, we construct FINE-KED, a new benchmark comprising fine-grained irrelevant knowledge at different levels of relational similarity to the edited knowledge. Extensive experiments across multiple LLMs demonstrate that DiKE substantially improves fine-grained irrelevant knowledge preservation while maintaining competitive general editing performance.
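The abstract describes DiKE only at a high level, so the sketch below is an illustrative approximation rather than the paper's actual formulation: it pairs a toy projection-based split of a subject representation into target-related and unrelated components with a ROME-style closed-form rank-one weight update. All names here (disentangle, rank_one_edit, target_dir, the random tensors) are hypothetical stand-ins, not symbols from the paper.

```python
# Illustrative sketch only; DiKE's exact equations are not given in the abstract.
# Assumes a toy projection-based disentanglement and a ROME-style rank-one update.
import torch

def disentangle(h: torch.Tensor, target_dir: torch.Tensor):
    """Split a subject representation h into the component along an assumed
    target-knowledge direction and the orthogonal (target-unrelated) remainder."""
    d = target_dir / target_dir.norm()
    related = (h @ d) * d      # projection onto the assumed target direction
    unrelated = h - related    # orthogonal residual
    return related, unrelated

def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v_new: torch.Tensor):
    """Closed-form rank-one update so the layer maps key k to v_new:
    W' = W + (v_new - W k) k^T / (k^T k)."""
    delta = torch.outer(v_new - W @ k, k) / (k @ k)
    return W + delta

torch.manual_seed(0)
d_model = 16
W = torch.randn(d_model, d_model)   # stand-in for an edited MLP weight matrix
h = torch.randn(d_model)            # subject representation used as the edit key
target_dir = torch.randn(d_model)   # hypothetical target-knowledge direction

related, unrelated = disentangle(h, target_dir)
v_target = torch.randn(d_model)     # desired output encoding the new fact

# Compose the new value so the unrelated component's contribution is reused
# and only the target-related contribution is replaced.
v_new = W @ unrelated + v_target
W_edited = rank_one_edit(W, h, v_new)

print(torch.allclose(W_edited @ h, v_new, atol=1e-5))  # True: the edit takes effect
```

The rank-one form above only mirrors the "closed-form, rank-one parameter update" the abstract mentions; the projection step is a placeholder for the KRD module, whose actual disentanglement procedure is not specified in this summary.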
Related papers
- Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs [47.06544781855325]
We propose a Fine-grained Neuron-level Knowledge Editing (FiNE) method that enhances editing locality without affecting success rates. By precisely identifying and modifying specific neurons within feed-forward networks, FiNE significantly improves knowledge localization and editing.
arXiv Detail & Related papers (2025-03-03T01:30:28Z) - Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating [61.70705744491162]
Large Language Models (LLMs) excel in natural language processing by encoding extensive human knowledge. Updating LLMs involves two key tasks simultaneously: unlearning to remove unwanted knowledge and editing to incorporate new information. We propose LOKA, a conflict-free framework for LLM updating based on a knowledge codebook.
arXiv Detail & Related papers (2025-01-31T20:48:46Z) - ConKE: Conceptualization-Augmented Knowledge Editing in Large Language Models for Commonsense Reasoning [47.98788315789392]
ConceptEdit is a framework that integrates conceptualization and instantiation into the Knowledge Editing pipeline. We show that ConceptEdit successfully generates commonsense knowledge with improved plausibility compared to other baselines.
arXiv Detail & Related papers (2024-12-16T03:34:40Z) - Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration [107.31481207855835]
Current methods, including intrinsic knowledge editing and external knowledge resorting, each possess strengths and weaknesses.
We propose UniKE, a novel multimodal editing method that establishes a unified perspective for intrinsic knowledge editing and external knowledge resorting.
arXiv Detail & Related papers (2024-09-30T02:13:53Z) - Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models [65.10456412127405]
We propose a novel Unstructured Knowledge Editing method, namely UnKE. In the layer dimension, we propose non-local block key-value storage to replace local layer key-value storage. In the token dimension, we replace "term-driven optimization" with "cause-driven optimization", which edits the last token directly while preserving context.
arXiv Detail & Related papers (2024-05-24T08:42:40Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)