Editing Arbitrary Propositions in LLMs without Subject Labels
- URL: http://arxiv.org/abs/2401.07526v1
- Date: Mon, 15 Jan 2024 08:08:24 GMT
- Title: Editing Arbitrary Propositions in LLMs without Subject Labels
- Authors: Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan
Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese
- Abstract summary: We introduce a simple and fast localization method called Gradient Tracing (GT)
GT allows editing arbitrary propositions instead of just binary ones, and does so without the need for subject labels.
We show that our method -- without access to subject labels -- performs close to state-of-the-art L&E methods that have access to subject labels.
- Score: 88.67755930096966
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Model (LLM) editing modifies factual information in LLMs.
Locate-and-Edit (L&E) methods accomplish this by finding where relevant
information is stored within the neural network, and editing the weights at
that location. The goal of editing is to modify the response of an LLM to a
proposition independently of its phrasing, while not modifying its response to
other related propositions. Existing methods are limited to binary
propositions, which represent straightforward binary relations between a
subject and an object. Furthermore, existing methods rely on semantic subject
labels, which may not be available or even be well-defined in practice. In this
paper, we show that both of these issues can be effectively skirted with a
simple and fast localization method called Gradient Tracing (GT). This
localization method allows editing arbitrary propositions instead of just
binary ones, and does so without the need for subject labels. As propositions
always have a truth value, our experiments prompt an LLM as a boolean
classifier, and edit its T/F response to propositions. Our method applies GT
for location tracing, and then edits the model at that location using a mild
variant of Rank-One Model Editing (ROME). On datasets of binary propositions
derived from the CounterFact dataset, we show that our method -- without access
to subject labels -- performs close to state-of-the-art L&E methods that have
access to subject labels. We then introduce a new dataset, Factual Accuracy
Classification Test (FACT), which includes non-binary propositions and for
which subject labels are not generally applicable, and therefore is beyond the
scope of existing L\&E methods. Nevertheless, we show that with our method
editing is possible on FACT.
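The paper does not include pseudocode for Gradient Tracing, but the core idea -- run one forward/backward pass and pick the layer whose gradient magnitude is largest as the candidate edit location -- can be sketched on a toy network. The following is a minimal, illustrative NumPy sketch; the network shape, tanh activations, squared-error loss, and all function names are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def gradient_tracing(weights, x, target):
    """Toy sketch of Gradient Tracing (GT): one forward/backward pass,
    then return the layer whose weight-gradient norm is largest, as a
    proxy for where the relevant information is stored."""
    # forward pass, caching inputs and pre-activations per layer
    acts, pres = [x], []
    h = x
    for i, (W, b) in enumerate(weights):
        z = W @ h + b
        pres.append(z)
        h = np.tanh(z) if i < len(weights) - 1 else z  # linear last layer
        acts.append(h)
    # scalar squared-error loss on the first output unit
    loss = 0.5 * (h[0] - target) ** 2
    # manual backward pass, recording each layer's weight-gradient norm
    grad_norms = []
    d = np.zeros_like(h)
    d[0] = h[0] - target
    for i in reversed(range(len(weights))):
        W, _ = weights[i]
        if i < len(weights) - 1:       # undo tanh for hidden layers
            d = d * (1.0 - np.tanh(pres[i]) ** 2)
        gW = np.outer(d, acts[i])      # dL/dW for layer i
        grad_norms.append((i, np.linalg.norm(gW)))
        d = W.T @ d                    # propagate to layer i's input
    grad_norms.reverse()
    # GT: select the layer with the largest gradient magnitude
    best_layer = max(grad_norms, key=lambda t: t[1])[0]
    return best_layer, grad_norms, loss

# toy usage: a 3-layer net; locate the highest-gradient layer
rng = np.random.default_rng(0)
dims = [4, 8, 8, 1]
weights = [(rng.standard_normal((dims[i + 1], dims[i])) * 0.5,
            np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]
layer, norms, loss = gradient_tracing(weights, rng.standard_normal(4), 1.0)
```

In the paper's setting the selected location would then be handed to the ROME-style edit; this sketch stops at localization.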
Related papers
- Entity Alignment with Noisy Annotations from Large Language Models [15.189701951003611]
We propose a unified framework, LLM4EA, to effectively leverage Large Language Models for EA.
Specifically, we design a novel active learning policy to significantly reduce the annotation space.
We iteratively optimize the policy based on the feedback from a base EA model.
arXiv Detail & Related papers (2024-05-27T03:52:55Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- Editing Conceptual Knowledge for Large Language Models [67.8410749469755]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset ConceptEdit and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
- Learning to Edit: Aligning LLMs with Knowledge Editing [101.96620267293731]
We propose a Learning to Edit (LTE) framework, focusing on teaching large language models to apply updated knowledge to input questions.
LTE features a two-phase process: (i) the Alignment Phase, which fine-tunes LLMs on a meticulously curated parallel dataset to make reliable, in-scope edits.
We demonstrate LTE's superiority in knowledge editing performance, robustness in both batch and sequential editing, minimal interference on general tasks, and rapid editing speeds.
arXiv Detail & Related papers (2024-02-19T07:45:17Z) - Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge.
Current research primarily focuses on white-box LLMs editing, overlooking an important scenario: black-box LLMs editing.
We introduce KE on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations.
Experiments and analysis on two benchmarks demonstrate that postEdit outperforms all baselines and achieves strong generalization.
arXiv Detail & Related papers (2024-02-13T17:59:34Z) - SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Recent model editing is a promising technique for efficiently updating a small amount of knowledge of large language models (LLMs).
We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching.
We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
arXiv Detail & Related papers (2024-01-31T13:08:45Z)
- Emptying the Ocean with a Spoon: Should We Edit Models? [8.545919917068273]
We call into question the recently popularized method of direct model editing as a means of correcting factual errors in LLM generations.
We contrast model editing with three similar but distinct approaches that pursue better defined objectives.
arXiv Detail & Related papers (2023-10-18T13:38:03Z)
- Editing Large Language Models: Problems, Methods, and Opportunities [51.903537096207]
This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs.
We provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal.
Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.
arXiv Detail & Related papers (2023-05-22T16:00:00Z)
- CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions [65.00528970576401]
StyleGAN has enabled unprecedented semantic editing capabilities, on both synthesized and real images.
We propose two novel building blocks; one for finding interesting CLIP directions and one for labeling arbitrary directions in CLIP latent space.
We evaluate the effectiveness of the proposed method and demonstrate that extraction of disentangled labeled StyleGAN edit directions is indeed possible.
arXiv Detail & Related papers (2021-12-09T21:26:03Z)
- A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss [14.121872633596452]
We introduce a practical end-to-end deep learning framework for automatic labeling.
One transition is parameterized by a non-linear transformation between hidden layers.
Another is an explicit probability transition matrix that captures the direct conversion between labels.
arXiv Detail & Related papers (2020-04-28T19:38:20Z)
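The explicit label transition matrix mentioned in the last summary can be illustrated with a small sketch: a matrix T whose entry T[i, j] approximates P(observed label i | true label j) maps the model's distribution over true labels to a distribution over noisy observed labels. This is a generic, hypothetical illustration of the idea, not the paper's actual formulation.

```python
import numpy as np

def apply_label_transition(p_true, T):
    """Map a distribution over true labels to a distribution over
    observed (noisy) labels via an explicit transition matrix.
    T[i, j] = P(observed = i | true = j); columns of T sum to 1."""
    p_obs = T @ p_true
    return p_obs / p_obs.sum()  # renormalize for numerical safety

# 3 labels; each column of T sums to 1 over the observed labels
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.80, 0.20],
              [0.05, 0.10, 0.80]])
p_true = np.array([0.7, 0.2, 0.1])
p_obs = apply_label_transition(p_true, T)  # -> [0.65, 0.215, 0.135]
```

During training, a loss computed against noisy annotations would use p_obs rather than p_true, so label noise is absorbed by T instead of corrupting the underlying classifier.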
This list is automatically generated from the titles and abstracts of the papers on this site.