An Information-Theoretic Framework for Robust Large Language Model Editing
- URL: http://arxiv.org/abs/2512.16227v1
- Date: Thu, 18 Dec 2025 06:21:17 GMT
- Title: An Information-Theoretic Framework for Robust Large Language Model Editing
- Authors: Qizhou Chen, Chengyu Wang, Taolin Zhang, Xiaofeng He,
- Abstract summary: Large Language Models (LLMs) have become indispensable tools in science, technology, and society. Errors or outdated information within these models can undermine their accuracy and restrict their safe deployment. We introduce a novel framework for editing LLMs, grounded in information bottleneck theory. We present the Information Bottleneck Knowledge Editor (IBKE), which leverages compact latent representations to guide gradient-based updates.
- Score: 17.984683741974063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have become indispensable tools in science, technology, and society, enabling transformative advances across diverse fields. However, errors or outdated information within these models can undermine their accuracy and restrict their safe deployment. Developing efficient strategies for updating model knowledge without the expense and disruption of full retraining remains a critical challenge. Current model editing techniques frequently struggle to generalize corrections beyond narrow domains, leading to unintended consequences and limiting their practical impact. Here, we introduce a novel framework for editing LLMs, grounded in information bottleneck theory. This approach precisely compresses and isolates the essential information required for generalizable knowledge correction while minimizing disruption to unrelated model behaviors. Building upon this foundation, we present the Information Bottleneck Knowledge Editor (IBKE), which leverages compact latent representations to guide gradient-based updates, enabling robust and broadly applicable model editing. We validate IBKE's effectiveness across multiple LLM architectures and standard benchmark tasks, demonstrating state-of-the-art accuracy and improved generality and specificity of edits. These findings establish a theoretically principled and practical paradigm for open-domain knowledge editing, advancing the utility and trustworthiness of LLMs in real-world applications.
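The abstract does not give IBKE's actual objective, but the information bottleneck idea it builds on has a standard variational form: fit the edit while penalizing the information carried by the latent code, typically via a KL term against a standard-normal prior. The sketch below is a minimal, hypothetical illustration of that trade-off (the function names, the diagonal-Gaussian latent, and the `beta` weight are assumptions, not the paper's method):

```python
import math

def kl_diag_gaussian(mu, logvar):
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dims.
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

def ib_edit_loss(task_loss, mu, logvar, beta=1e-3):
    # Variational information-bottleneck objective: the task term pulls the
    # model toward the corrected fact, while the KL term compresses the latent
    # code so the edit stays minimal and generalizable.
    return task_loss + beta * kl_diag_gaussian(mu, logvar)

# A standard-normal latent incurs zero compression penalty:
loss = ib_edit_loss(task_loss=0.5, mu=[0.0, 0.0], logvar=[0.0, 0.0])
print(loss)  # 0.5
```

Larger `beta` favors tighter compression of the edit signal at the cost of fitting the target fact less closely.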
Related papers
- Consistency-Aware Editing for Entity-level Unlearning in Language Models [53.522931419965424]
We introduce a novel consistency-aware editing (CAE) framework for entity-level unlearning. CAE aggregates a diverse set of prompts related to a target entity, including its attributes, relations, and adversarial paraphrases. It then jointly learns a low-rank update guided by a consistency regularizer that aligns the editing directions across prompts.
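The two ingredients named in this summary, a low-rank parameter update and a regularizer that aligns editing directions across prompts, can be sketched as follows. This is an illustrative reconstruction, not CAE's published formulation; the pairwise-cosine penalty and the function names are assumptions:

```python
def low_rank_delta(A, B):
    # Low-rank update ΔW = A @ B, where the rank equals len(B).
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def consistency_penalty(directions):
    # One plausible consistency regularizer: the mean pairwise disagreement
    # (1 - cosine similarity) between per-prompt editing directions.
    pairs = [(i, j) for i in range(len(directions))
             for j in range(i + 1, len(directions))]
    return sum(1.0 - cosine(directions[i], directions[j])
               for i, j in pairs) / len(pairs)

print(consistency_penalty([[1.0, 0.0], [2.0, 0.0]]))  # 0.0 (same direction)
print(consistency_penalty([[1.0, 0.0], [0.0, 1.0]]))  # 1.0 (orthogonal)
```

Driving the penalty toward zero encourages all prompt variants of an entity to push the low-rank update in the same direction, which is the aligning effect the summary describes.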
arXiv Detail & Related papers (2025-12-19T15:18:07Z) - EtCon: Edit-then-Consolidate for Reliable Knowledge Editing [85.20993502078899]
We propose Edit-then-Consolidate, a novel knowledge editing paradigm that aims to bridge the gap between theoretical knowledge editing methods and their real-world applicability. Our framework consistently improves editing reliability and generalization under real-world evaluations, while better preserving locality and pre-trained capabilities.
arXiv Detail & Related papers (2025-12-04T12:43:50Z) - Representation Interventions Enable Lifelong Unstructured Knowledge Control [54.86207134539453]
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. We introduce RILKE, a robust and scalable method that treats knowledge control as interventions within the model's representation space. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. In inference, a query-adaptive router selects the appropriate module to guide the model's generation.
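The query-adaptive routing step described above can be illustrated with a minimal sketch: score an incoming query representation against a key vector stored with each edit module and dispatch to the best match. The dot-product scoring and the `route` function are assumptions for illustration; RILKE's actual router is not specified in this summary:

```python
def route(query_vec, module_keys):
    # Score the query against each module's key and return the index of the
    # best-matching edit module (a simple dot-product router).
    scores = [sum(q * k for q, k in zip(query_vec, key))
              for key in module_keys]
    return max(range(len(scores)), key=scores.__getitem__)

# Two modules keyed on orthogonal directions; the query picks module 1.
print(route([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]]))  # 1
```

Because each module's update is confined to its own low-dimensional subspace, routing a query to the wrong module is the main failure mode such a design must guard against.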
arXiv Detail & Related papers (2025-11-25T22:15:00Z) - Retention analysis of edited knowledge after fine-tuning [5.1877231178075425]
Large language models (LLMs) store vast amounts of knowledge, which often requires updates to correct factual errors, incorporate newly acquired information, or adapt model behavior. Model editing methods have emerged as efficient solutions for such updates, offering localized and precise knowledge modification at significantly lower computational cost than continual training. However, the effect of fine-tuning on previously edited knowledge remains poorly understood.
arXiv Detail & Related papers (2025-07-14T15:51:19Z) - Model Merging for Knowledge Editing [53.799891745131724]
Large Language Models (LLMs) require continuous updates to maintain accurate and current knowledge as the world evolves. Existing knowledge editing approaches offer various solutions for knowledge updating, but they often struggle with sequential editing scenarios. This paper proposes a two-stage framework combining robust supervised fine-tuning (R-SFT) with model merging for knowledge editing.
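A common baseline for the merging stage of such a two-stage framework is linear parameter interpolation between the fine-tuned and base models. The sketch below shows that baseline only; it is an assumption standing in for the paper's merging procedure, whose details this summary does not give:

```python
def merge(weights_edited, weights_base, alpha=0.5):
    # Linear interpolation of flattened parameter vectors: alpha=1.0 keeps the
    # edited model, alpha=0.0 reverts to the base model.
    return [alpha * e + (1.0 - alpha) * b
            for e, b in zip(weights_edited, weights_base)]

print(merge([0.0, 2.0], [2.0, 0.0], alpha=0.5))  # [1.0, 1.0]
```

Choosing `alpha` trades off retaining the new edits against preserving the base model's pre-trained capabilities, which is the tension sequential-editing methods aim to manage.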
arXiv Detail & Related papers (2025-06-14T07:42:39Z) - ThinkEval: Practical Evaluation of Knowledge Leakage in LLM Editing using Thought-based Knowledge Graphs [3.9295613363026174]
We present ThinkEval, a framework to quantify indirect knowledge leakage and ripple effects in model-editing. ThinkEval builds and employs specialized knowledge graphs to analyze the causal structure of facts before and after editing. We evaluate five editing techniques: AlphaEdit, RECT, ROME, MEMIT, and PRUNE.
arXiv Detail & Related papers (2025-06-02T07:24:12Z) - FAME: Towards Factual Multi-Task Model Editing [4.858226284963096]
Large language models (LLMs) embed extensive knowledge and utilize it to perform exceptionally well across various tasks.
We present FAME, a factual, comprehensive, and multi-task dataset, which is designed to enhance the practicality of model editing.
We then propose SKEME, a model editing method that uses a novel caching mechanism to ensure synchronization with the real world.
arXiv Detail & Related papers (2024-10-07T13:46:06Z) - The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
Benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive.
We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.