Related papers: AKEW: Assessing Knowledge Editing in the Wild

URL: http://arxiv.org/abs/2402.18909v2
Date: Thu, 10 Oct 2024 05:30:03 GMT
Title: AKEW: Assessing Knowledge Editing in the Wild
Authors: Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu
Abstract summary: AKEW (Assessing Knowledge Editing in the Wild) is a new practical benchmark for knowledge editing. It fully covers three editing settings of knowledge updates: structured facts, unstructured texts as facts, and extracted triplets. Through extensive experiments, we demonstrate the considerable gap between state-of-the-art knowledge-editing methods and practical scenarios.
Score: 79.96813982502952
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge editing injects knowledge updates into language models to keep them correct and up-to-date. However, its current evaluations deviate significantly from practice: their knowledge updates solely consist of structured facts derived from meticulously crafted datasets, instead of practical sources -- unstructured texts like news articles, and they often overlook practical real-world knowledge updates. To address these issues, in this paper we propose AKEW (Assessing Knowledge Editing in the Wild), a new practical benchmark for knowledge editing. AKEW fully covers three editing settings of knowledge updates: structured facts, unstructured texts as facts, and extracted triplets. It further introduces new datasets featuring both counterfactual and real-world knowledge updates. Through extensive experiments, we demonstrate the considerable gap between state-of-the-art knowledge-editing methods and practical scenarios. Our analyses further highlight key insights to motivate future research for practical knowledge editing.

Related papers

SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models [96.81401797908835]
We introduce SAKE, the first benchmark specifically designed for editing auditory attribute knowledge in Large Audio-Language Models. We benchmark seven editing methods on two LALMs along four dimensions: reliability, generality, audio/text locality, and portability. Results highlight challenges such as preserving intra-attribute knowledge unrelated to the edit, generalizing edits to multimodal reasoning, and maintaining edits under sequential updates.
arXiv Detail & Related papers (2025-10-19T16:22:09Z)
A Dual-Axis Taxonomy of Knowledge Editing for LLMs: From Mechanisms to Functions [6.051561613968997]
Large language models (LLMs) acquire vast knowledge from large text corpora, but this information can become outdated or inaccurate. Since retraining is computationally expensive, knowledge editing offers an efficient alternative -- modifying internal knowledge without full retraining. This survey introduces a novel, complementary function-based taxonomy to provide a more holistic view.
arXiv Detail & Related papers (2025-08-12T09:51:39Z)
Aligning Language Models with Real-time Knowledge Editing [11.503574001763246]
We introduce CRAFT, an ever-evolving real-world benchmark for knowledge editing. It features well-designed paired edits for composite reasoning, and evaluates models on alias portability and temporal and common-sense locality. Towards flexible real-time editing, we propose KEDAS, a novel paradigm of knowledge editing alignment featuring diverse edit augmentation and self-adaptive post-alignment inference.
arXiv Detail & Related papers (2025-08-02T10:25:36Z)
Retention analysis of edited knowledge after fine-tuning [5.440397659472036]
Large language models (LLMs) store vast amounts of knowledge, which often requires updates to correct factual errors, incorporate newly acquired information, or adapt model behavior. Model editing methods have emerged as efficient solutions for such updates, offering localized and precise knowledge modification at significantly lower computational cost than continual training. However, the effect of fine-tuning on previously edited knowledge remains poorly understood.
arXiv Detail & Related papers (2025-07-14T15:51:19Z)
Understanding the Limits of Lifelong Knowledge Editing in LLMs [59.12302872055081]
We bridge research into lifelong knowledge editing to real-world edits at practically relevant scale. We first introduce WikiBigEdit, a large-scale benchmark of real-world Wikidata edits. In its first instance, it includes over 500K question-answer pairs for knowledge editing.
arXiv Detail & Related papers (2025-03-07T18:45:42Z)
Event-level Knowledge Editing [53.767465515537545]
Existing work edits large language models (LLMs) at the level of factual knowledge triplets. We propose a new task setting: event-level knowledge editing, which directly edits new events into LLMs. We construct a high-quality event-level editing benchmark ELKEN, consisting of 1,515 event edits, 6,449 questions about factual knowledge, and 10,150 questions about future tendencies.
arXiv Detail & Related papers (2024-02-20T15:36:41Z)
Stable Knowledge Editing in Large Language Models [68.98582618305679]
We introduce StableKE, a knowledge editing method based on knowledge augmentation rather than knowledge localization. To overcome the expense of human labeling, StableKE integrates two automated knowledge augmentation strategies. StableKE surpasses other knowledge editing methods, demonstrating stability in both edited knowledge and multi-hop knowledge.
arXiv Detail & Related papers (2024-02-20T14:36:23Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
History Matters: Temporal Knowledge Editing in Large Language Model [42.74144542674756]
We introduce the task of Temporal Knowledge Editing (TKE) and establish a benchmark AToKe to evaluate current model editing methods. We find that while existing model editing methods are effective at making models remember new knowledge, the edited model catastrophically forgets historical knowledge. To address this gap, we propose a simple and general framework termed Multi-Editing with Time Objective (METO) for enhancing existing editing models.
arXiv Detail & Related papers (2023-12-09T07:51:56Z)
Assessing Knowledge Editing in Language Models via Relation Perspective [21.64869056276927]
This paper constructs a new benchmark named RaKE, which focuses on relation-based knowledge editing. We establish a suite of innovative metrics for evaluation and conduct comprehensive experiments involving various knowledge editing baselines. Our research results confirm that knowledge related to relations is not only stored in the FFN network but also in the attention layers.
arXiv Detail & Related papers (2023-11-15T15:44:42Z)
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks. However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge. We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z)
Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs [54.22416829200613]
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models. Experimental results indicate that current methods for knowledge editing using raw documents do not yield satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.