Emptying the Ocean with a Spoon: Should We Edit Models?
- URL: http://arxiv.org/abs/2310.11958v1
- Date: Wed, 18 Oct 2023 13:38:03 GMT
- Title: Emptying the Ocean with a Spoon: Should We Edit Models?
- Authors: Yuval Pinter, Michael Elhadad
- Abstract summary: We call into question the recently popularized method of direct model editing as a means of correcting factual errors in LLM generations.
We contrast model editing with three similar but distinct approaches that pursue better defined objectives.
- Score: 8.545919917068273
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We call into question the recently popularized method of direct model editing
as a means of correcting factual errors in LLM generations. We contrast model
editing with three similar but distinct approaches that pursue better defined
objectives: (1) retrieval-based architectures, which decouple factual memory
from inference and linguistic capabilities embodied in LLMs; (2) concept
erasure methods, which aim at preventing systemic bias in generated text; and
(3) attribution methods, which aim at grounding generations into identified
textual sources. We argue that direct model editing cannot be trusted as a
systematic remedy for the disadvantages inherent to LLMs, and while it has
proven potential in improving model explainability, it opens risks by
reinforcing the notion that models can be trusted for factuality. We call for
cautious promotion and application of model editing as part of the LLM
deployment process, and for responsibly limiting the use cases of LLMs to those
not relying on editing as a critical component.
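To make the contrast concrete, here is a minimal toy sketch (not taken from the paper; the fact-memory layer, embeddings, and helper names are hypothetical) of the two strategies the abstract compares: a direct rank-one weight edit, which rewrites the targeted association but also perturbs unrelated inputs, versus a retrieval-based setup that leaves the model frozen and supplies the corrected fact as context.

```python
# Toy illustration (not the paper's method): contrast a direct weight edit with a
# retrieval-based correction. All names and numbers are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# A toy "fact memory" layer W mapping a subject embedding k to a value embedding v,
# standing in for one MLP weight matrix inside an LLM.
d = 8
W = rng.normal(size=(d, d))

k_subject = rng.normal(size=d)   # embedding of a queried fact, e.g. "capital of X"
v_old = W @ k_subject            # the (possibly wrong) value the model currently stores
v_new = rng.normal(size=d)       # embedding of the corrected value

# (a) Direct model editing: a rank-one update so that W_edit @ k_subject == v_new.
# Every other input is perturbed as well, which is the side effect the paper worries about.
W_edit = W + np.outer(v_new - v_old, k_subject) / (k_subject @ k_subject)
print(np.allclose(W_edit @ k_subject, v_new))          # True: the targeted fact is changed

k_other = rng.normal(size=d)
print(np.linalg.norm(W_edit @ k_other - W @ k_other))   # nonzero: unrelated outputs drift

# (b) Retrieval-based correction: the weights stay frozen and the corrected fact is
# supplied as context at inference time, decoupling factual memory from the model.
def answer_with_retrieval(question: str, knowledge_base: dict) -> str:
    context = knowledge_base.get(question, "")
    prompt = f"Context: {context}\nQuestion: {question}"
    return prompt  # in practice this prompt would be passed to the frozen LLM

kb = {"capital of X": "As of 2023, the capital of X is Y."}
print(answer_with_retrieval("capital of X", kb))
```

The point of the sketch is the trade-off the authors highlight: the edit guarantees the new answer for the edited key but silently shifts other behavior, while the retrieval route keeps factual memory outside the weights, where it can be inspected and updated without touching the model.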
Related papers
- DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
- Editing Conceptual Knowledge for Large Language Models [65.38231526537476]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
- The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse [58.0132400208411]
Even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks.
However, benchmarking Large Language Models after each edit is impractically time-consuming and resource-intensive.
We have utilized GPT-3.5 to develop a new dataset, HardEdit, based on hard cases.
arXiv Detail & Related papers (2024-02-15T01:50:38Z)
- Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue [122.20016030723043]
We evaluate the side effects of model editing on large language models (LLMs).
Our analysis reveals that the side effects are caused by model editing altering the original model weights excessively.
To mitigate this, a method named RECT is proposed to regularize the weight updates produced by editing (a generic sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-01-09T18:03:15Z)
- Neuron Patching: Semantic-based Neuron-level Language Model Repair for Code Generation [32.178931149612644]
Large Language Models (LLMs) have already gained widespread adoption in software engineering, particularly in code generation tasks.
We propose MENT, a novel and effective model editing approach to repair LLMs in coding tasks.
MENT is effective, efficient, and reliable, capable of correcting a neural model by patching just one or two neurons.
arXiv Detail & Related papers (2023-12-08T20:28:08Z)
- The ART of LLM Refinement: Ask, Refine, and Trust [85.75059530612882]
We propose a reasoning-with-refinement objective called ART: Ask, Refine, and Trust.
It asks necessary questions to decide when an LLM should refine its output.
It achieves a performance gain of +5 points over self-refinement baselines.
arXiv Detail & Related papers (2023-11-14T07:26:32Z)
- Untying the Reversal Curse via Bidirectional Language Model Editing [41.040662400025184]
Large language models (LLMs) store massive factual knowledge within their parameters.
LLMs are prone to hallucinate unintended text due to false or outdated knowledge.
We study bidirectional language model editing to assess whether edited LLMs can recall the edited knowledge bidirectionally.
arXiv Detail & Related papers (2023-10-16T12:04:13Z)
- Unveiling the Pitfalls of Knowledge Editing for Large Language Models [41.83423510576848]
It is still unclear whether knowledge editing introduces side effects that pose potential risks.
This paper pioneers the investigation into the potential pitfalls associated with knowledge editing for Large Language Models.
Experimental results demonstrate that knowledge editing can inadvertently introduce unintended consequences.
arXiv Detail & Related papers (2023-10-03T15:10:46Z)
- Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
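As a companion to the RECT entry above, here is a generic, hedged sketch of regularizing a model-editing update so it does not alter the original weights excessively; the keep-ratio criterion and the helper name regularize_edit are illustrative assumptions, not RECT's published algorithm.

```python
# Hedged sketch: sparsify a model-editing weight update by keeping only the entries
# with the largest relative change and zeroing the rest. This follows the general idea
# of regularizing edit updates, not RECT's exact procedure.
import numpy as np

def regularize_edit(w_orig: np.ndarray, delta: np.ndarray, keep_ratio: float = 0.02) -> np.ndarray:
    """Keep the top `keep_ratio` fraction of update entries, ranked by |delta| / (|w| + eps)."""
    eps = 1e-8
    relative_change = np.abs(delta) / (np.abs(w_orig) + eps)
    k = max(1, int(keep_ratio * delta.size))
    threshold = np.partition(relative_change.ravel(), -k)[-k]
    mask = relative_change >= threshold
    return delta * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16))             # original weights of one edited layer
delta = 0.1 * rng.normal(size=(16, 16))   # raw update produced by an editing method
delta_reg = regularize_edit(W, delta)

# The regularized edit touches far fewer weights, limiting collateral damage to the model.
print(np.count_nonzero(delta), np.count_nonzero(delta_reg))
```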