Related papers: DocTER: Evaluating Document-based Knowledge Editing

DocTER: Evaluating Document-based Knowledge Editing

URL: http://arxiv.org/abs/2308.09954v2
Date: Thu, 24 Jul 2025 11:16:48 GMT
Title: DocTER: Evaluating Document-based Knowledge Editing
Authors: Suhang Wu, Ante Wang, Minlong Peng, Yujie Lin, Wenbo Li, Mingming Sun, Jinsong Su,
Abstract summary: We explore knowledge editing using easily accessible documents instead of manually labeled factual triples.<n>A comprehensive four-perspective evaluation is introduced: Edit Success, Locality, Reasoning, and Cross-lingual Transfer.<n>Experiments on popular knowledge editing methods demonstrate that editing with documents presents significantly greater challenges than using triples.
Score: 53.14000724633775
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge editing aims to correct outdated or inaccurate knowledge in neural networks. In this paper, we explore knowledge editing using easily accessible documents instead of manually labeled factual triples employed in earlier research. To advance this field, we establish the first evaluation benchmark, \textit{DocTER}, featuring Documents containing counterfactual knowledge for editing. A comprehensive four-perspective evaluation is introduced: Edit Success, Locality, Reasoning, and Cross-lingual Transfer. To adapt conventional triplet-based knowledge editing methods for this task, we develop an Extract-then-Edit pipeline that extracts triples from documents before applying existing methods. Experiments on popular knowledge editing methods demonstrate that editing with documents presents significantly greater challenges than using triples. In document-based scenarios, even the best-performing in-context editing approach still lags behind by 10 points in editing success when compared to using gold triples. This observation also holds for both reasoning and cross-lingual test sets. We further analyze key factors influencing task performance, including the quality of extracted triples, the frequency and position of edited knowledge in documents, various methods for enhancing reasoning, and performance differences across various directions in cross-lingual knowledge editing, which provide valuable insights for future research.

Related papers

Towards a Principled Evaluation of Knowledge Editors [2.497666465251894]
We show that choosing different metrics and evaluation methodologies as well as different edit batch sizes can lead to a different ranking of knowledge editors.<n>We also include a manual assessment of the string matching based evaluation method for knowledge editing that is favored by recently released datasets, revealing a tendency to produce false positive matches.
arXiv Detail & Related papers (2025-07-08T12:37:54Z)
Understanding the Limits of Lifelong Knowledge Editing in LLMs [59.12302872055081]
We bridge research into lifelong knowledge editing to real-world edits at practically relevant scale.<n>We first introduce WikiBigEdit; a large-scale benchmark of real-world Wikidata edits.<n>In its first instance, it includes over 500K question-answer pairs for knowledge editing.
arXiv Detail & Related papers (2025-03-07T18:45:42Z)
Effective LLM Knowledge Learning via Model Generalization [73.16975077770765]
Large language models (LLMs) are trained on enormous documents that contain extensive world knowledge. It is still not well-understood how knowledge is acquired via autoregressive pre-training. In this paper, we focus on understanding and improving LLM knowledge learning.
arXiv Detail & Related papers (2025-03-05T17:56:20Z)
Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject [49.559994791305535]
Current state-of-the-art editing methods struggle when tasked with editing multiple related knowledge pieces for the same subject.<n>We introduce the $textS2textRKE$(Same-Subject Related Knowledge Editing) benchmark.<n>Our experiments reveal that only mainstream locate-then-edit methods, such as ROME and MEMIT, exhibit "related knowledge perturbation"
arXiv Detail & Related papers (2025-02-08T04:47:17Z)
Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings. First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss. Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
Cross-Lingual Multi-Hop Knowledge Editing -- Benchmarks, Analysis and a Simple Contrastive Learning based Approach [53.028586843468915]
We propose the Cross-Lingual Multi-Hop Knowledge Editing paradigm, for measuring and analyzing the performance of various SoTA knowledge editing techniques in a cross-lingual setup. Specifically, we create a parallel cross-lingual benchmark, CROLIN-MQUAKE for measuring the knowledge editing capabilities. Following this, we propose a significantly improved system for cross-lingual multi-hop knowledge editing, CLEVER-CKE.
arXiv Detail & Related papers (2024-07-14T17:18:16Z)
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models [26.516571783335824]
Recent studies have identified side effects, such as knowledge distortion and the deterioration of general abilities, that have emerged after editing. This survey presents a comprehensive study of these side effects, providing a unified perspective on the challenges of knowledge editing in large language models.
arXiv Detail & Related papers (2024-06-03T15:28:21Z)
Editing Conceptual Knowledge for Large Language Models [65.38231526537476]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs) We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation. experimental results reveal that, although existing editing methods can efficiently modify concept-level definition to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
AKEW: Assessing Knowledge Editing in the Wild [79.96813982502952]
AKEW (Assessing Knowledge Editing in the Wild) is a new practical benchmark for knowledge editing. It fully covers three editing settings of knowledge updates: structured facts, unstructured texts as facts, and extracted triplets. Through extensive experiments, we demonstrate the considerable gap between state-of-the-art knowledge-editing methods and practical scenarios.
arXiv Detail & Related papers (2024-02-29T07:08:34Z)
Learning to Edit: Aligning LLMs with Knowledge Editing [101.96620267293731]
We propose a Learning to Edit (LTE) framework, focusing on teaching large language models to apply updated knowledge into input questions. LTE features a two-phase process: (i) the Alignment Phase, which fine-tunes LLMs on a meticulously curated parallel dataset to make reliable, in-scope edits. We demonstrate LTE's superiority in knowledge editing performance, robustness in both batch and sequential editing, minimal interference on general tasks, and rapid editing speeds.
arXiv Detail & Related papers (2024-02-19T07:45:17Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
Unveiling the Pitfalls of Knowledge Editing for Large Language Models [41.83423510576848]
It is still unclear whether knowledge editing might introduce side effects that pose potential risks or not. This paper pioneers the investigation into the potential pitfalls associated with knowledge editing for Large Language Models. Experimental results vividly demonstrate that knowledge editing might inadvertently cast a shadow of unintended consequences.
arXiv Detail & Related papers (2023-10-03T15:10:46Z)
Cross-Lingual Knowledge Editing in Large Language Models [73.12622532088564]
Knowledge editing has been shown to adapt large language models to new knowledge without retraining from scratch. It is still unknown the effect of source language editing on a different target language. We first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese.
arXiv Detail & Related papers (2023-09-16T11:07:52Z)
Coarse-to-Fine Knowledge Selection for Document Grounded Dialogs [11.63334863772068]
Multi-document grounded dialogue systems (DGDS) answer users' requests by finding supporting knowledge from a collection of documents. This paper proposes Re3G, which aims to optimize both coarse-grained knowledge retrieval and fine-grained knowledge extraction in a unified framework.
arXiv Detail & Related papers (2023-02-23T08:28:29Z)
Fact-based Text Editing [11.115292572080131]
textscFactEditor edits a draft text by referring to given facts using a buffer, a stream, and a memory. textscFactEditor conducts inference faster than the encoder-decoder approach.
arXiv Detail & Related papers (2020-07-02T06:50:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.