Editing Arbitrary Propositions in LLMs without Subject Labels
- URL: http://arxiv.org/abs/2401.07526v1
- Date: Mon, 15 Jan 2024 08:08:24 GMT
- Title: Editing Arbitrary Propositions in LLMs without Subject Labels
- Authors: Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan
Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese
- Abstract summary: We introduce a simple and fast localization method called Gradient Tracing (GT)
GT allows editing arbitrary propositions instead of just binary ones, and does so without the need for subject labels.
We show that our method -- without access to subject labels -- performs close to state-of-the-art L&E methods that have access to subject labels.
- Score: 88.67755930096966
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Model (LLM) editing modifies factual information in LLMs.
Locate-and-Edit (L&E) methods accomplish this by finding where relevant
information is stored within the neural network, and editing the weights at
that location. The goal of editing is to modify the response of an LLM to a
proposition independently of its phrasing, while not modifying its response to
other related propositions. Existing methods are limited to binary
propositions, which represent straightforward binary relations between a
subject and an object. Furthermore, existing methods rely on semantic subject
labels, which may not be available or even be well-defined in practice. In this
paper, we show that both of these issues can be effectively skirted with a
simple and fast localization method called Gradient Tracing (GT). This
localization method allows editing arbitrary propositions instead of just
binary ones, and does so without the need for subject labels. As propositions
always have a truth value, our experiments prompt an LLM as a boolean
classifier, and edit its T/F response to propositions. Our method applies GT
for location tracing, and then edits the model at that location using a mild
variant of Rank-One Model Editing (ROME). On datasets of binary propositions
derived from the CounterFact dataset, we show that our method -- without access
to subject labels -- performs close to state-of-the-art L&E methods that have
access to subject labels. We then introduce a new dataset, Factual Accuracy
Classification Test (FACT), which includes non-binary propositions and for
which subject labels are not generally applicable, and therefore is beyond the
scope of existing L\&E methods. Nevertheless, we show that with our method
editing is possible on FACT.
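The paper does not include pseudocode for Gradient Tracing, but the core idea -- run one forward/backward pass and pick the layer whose gradient magnitude is largest as the candidate edit location -- can be sketched on a toy network. The following is a minimal, illustrative NumPy sketch; the network shape, tanh activations, squared-error loss, and all function names are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def gradient_tracing(weights, x, target):
    """Toy sketch of Gradient Tracing (GT): one forward/backward pass,
    then return the layer whose weight-gradient norm is largest, as a
    proxy for where the relevant information is stored."""
    # forward pass, caching inputs and pre-activations per layer
    acts, pres = [x], []
    h = x
    for i, (W, b) in enumerate(weights):
        z = W @ h + b
        pres.append(z)
        h = np.tanh(z) if i < len(weights) - 1 else z  # linear last layer
        acts.append(h)
    # scalar squared-error loss on the first output unit
    loss = 0.5 * (h[0] - target) ** 2
    # manual backward pass, recording each layer's weight-gradient norm
    grad_norms = []
    d = np.zeros_like(h)
    d[0] = h[0] - target
    for i in reversed(range(len(weights))):
        W, _ = weights[i]
        if i < len(weights) - 1:       # undo tanh for hidden layers
            d = d * (1.0 - np.tanh(pres[i]) ** 2)
        gW = np.outer(d, acts[i])      # dL/dW for layer i
        grad_norms.append((i, np.linalg.norm(gW)))
        d = W.T @ d                    # propagate to layer i's input
    grad_norms.reverse()
    # GT: select the layer with the largest gradient magnitude
    best_layer = max(grad_norms, key=lambda t: t[1])[0]
    return best_layer, grad_norms, loss

# toy usage: a 3-layer net; locate the highest-gradient layer
rng = np.random.default_rng(0)
dims = [4, 8, 8, 1]
weights = [(rng.standard_normal((dims[i + 1], dims[i])) * 0.5,
            np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]
layer, norms, loss = gradient_tracing(weights, rng.standard_normal(4), 1.0)
```

In the paper's setting the selected location would then be handed to the ROME-style edit; this sketch stops at localization.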
Related papers
- Entity Alignment with Noisy Annotations from Large Language Models [15.189701951003611]
We propose a unified framework, LLM4EA, to effectively leverage Large Language Models for EA.
Specifically, we design a novel active learning policy to significantly reduce the annotation space.
We iteratively optimize the policy based on the feedback from a base EA model.
arXiv Detail & Related papers (2024-05-27T03:52:55Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- Editing Conceptual Knowledge for Large Language Models [67.8410749469755]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
- Learning to Edit: Aligning LLMs with Knowledge Editing [101.96620267293731]
We propose a Learning to Edit (LTE) framework, focusing on teaching large language models to apply updated knowledge to input questions.
LTE features a two-phase process: (i) the Alignment Phase, which fine-tunes LLMs on a meticulously curated parallel dataset to make reliable, in-scope edits.
We demonstrate LTE's superiority in knowledge editing performance, robustness in both batch and sequential editing, minimal interference on general tasks, and rapid editing speeds.
arXiv Detail & Related papers (2024-02-19T07:45:17Z) - Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge.
Current research primarily focuses on white-box LLMs editing, overlooking an important scenario: black-box LLMs editing.
We introduce KE on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations.
Experiments and analysis on two benchmarks demonstrate that postEdit outperforms all baselines and achieves strong generalization.
arXiv Detail & Related papers (2024-02-13T17:59:34Z) - SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Recent model editing is a promising technique for efficiently updating a small amount of knowledge of large language models (LLMs).
We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching.
We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
arXiv Detail & Related papers (2024-01-31T13:08:45Z)
- Emptying the Ocean with a Spoon: Should We Edit Models? [8.545919917068273]
We call into question the recently popularized method of direct model editing as a means of correcting factual errors in LLM generations.
We contrast model editing with three similar but distinct approaches that pursue better defined objectives.
arXiv Detail & Related papers (2023-10-18T13:38:03Z)
- Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
- CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions [65.00528970576401]
StyleGAN has enabled unprecedented semantic editing capabilities, on both synthesized and real images.
We propose two novel building blocks; one for finding interesting CLIP directions and one for labeling arbitrary directions in CLIP latent space.
We evaluate the effectiveness of the proposed method and demonstrate that extraction of disentangled labeled StyleGAN edit directions is indeed possible.
arXiv Detail & Related papers (2021-12-09T21:26:03Z)
- A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss [14.121872633596452]
We introduce a practical end-to-end deep learning framework for automatic labeling.
One transition is parameterized by a non-linear transformation between hidden layers.
Another is an explicit probability transition matrix that captures the direct conversion between labels.
arXiv Detail & Related papers (2020-04-28T19:38:20Z)
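The explicit label transition matrix mentioned in the last summary can be illustrated with a small sketch: a matrix T whose entry T[i, j] approximates P(observed label i | true label j) maps the model's distribution over true labels to a distribution over noisy observed labels. This is a generic, hypothetical illustration of the idea, not the paper's actual formulation.

```python
import numpy as np

def apply_label_transition(p_true, T):
    """Map a distribution over true labels to a distribution over
    observed (noisy) labels via an explicit transition matrix.
    T[i, j] = P(observed = i | true = j); columns of T sum to 1."""
    p_obs = T @ p_true
    return p_obs / p_obs.sum()  # renormalize for numerical safety

# 3 labels; each column of T sums to 1 over the observed labels
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.80, 0.20],
              [0.05, 0.10, 0.80]])
p_true = np.array([0.7, 0.2, 0.1])
p_obs = apply_label_transition(p_true, T)  # -> [0.65, 0.215, 0.135]
```

During training, a loss computed against noisy annotations would use p_obs rather than p_true, so label noise is absorbed by T instead of corrupting the underlying classifier.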
This list is automatically generated from the titles and abstracts of the papers on this site.