Unlearn What You Want to Forget: Efficient Unlearning for LLMs
- URL: http://arxiv.org/abs/2310.20150v1
- Date: Tue, 31 Oct 2023 03:35:59 GMT
- Title: Unlearn What You Want to Forget: Efficient Unlearning for LLMs
- Authors: Jiaao Chen, Diyi Yang
- Abstract summary: Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data.
This process might suffer from privacy issues and violations of data protection regulations.
We propose an efficient unlearning framework that could efficiently update LLMs without having to retrain the whole model after data removals.
- Score: 92.51670143929056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved significant progress from
pre-training on and memorizing a wide range of textual data; however, this
process might suffer from privacy issues and violations of data protection
regulations. As a result, the ability to easily remove data related to
individual users from such models while not deteriorating their predictive
quality after the removal becomes increasingly important. To address these
issues, in this work, we propose an efficient unlearning framework that could
efficiently update LLMs without having to retrain the whole model after data
removals, by introducing lightweight unlearning layers learned with a selective
teacher-student objective into the transformers. In addition, we introduce a
fusion mechanism to effectively combine different unlearning layers that learn
to forget different sets of data to handle a sequence of forgetting operations.
Experiments on classification and generation tasks demonstrate the
effectiveness of our proposed methods compared to the state-of-the-art
baselines.
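The abstract names a selective teacher-student objective but does not give its form. The following is only a minimal sketch of one plausible reading, not the authors' implementation: the student is pulled toward the frozen teacher on retained examples and pushed away from it on examples marked for forgetting. The function names, the `forget_mask` argument, the `alpha` weight, and the sign convention are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def selective_teacher_student_loss(teacher_logits, student_logits,
                                   forget_mask, alpha=1.0):
    """Hypothetical selective objective (an assumption, not the paper's formula).

    Minimizing this value keeps the student close to the teacher on retained
    examples and moves it away from the teacher on forgotten examples.
    """
    retain_terms, forget_terms = [], []
    for t, s, is_forget in zip(teacher_logits, student_logits, forget_mask):
        kl = kl_divergence(softmax(t), softmax(s))
        (forget_terms if is_forget else retain_terms).append(kl)
    retain = sum(retain_terms) / len(retain_terms) if retain_terms else 0.0
    forget = sum(forget_terms) / len(forget_terms) if forget_terms else 0.0
    return retain - alpha * forget


# Toy usage: two examples with two classes each; the second is to be forgotten.
teacher = [[2.0, 0.0], [0.0, 2.0]]
mask = [False, True]
student_same = [[2.0, 0.0], [0.0, 2.0]]    # copies the teacher everywhere
student_forgot = [[2.0, 0.0], [2.0, 0.0]]  # diverges only on the forgotten example
student_bad = [[0.0, 2.0], [0.0, 2.0]]     # diverges only on the retained example

loss_same = selective_teacher_student_loss(teacher, student_same, mask)
loss_forgot = selective_teacher_student_loss(teacher, student_forgot, mask)
loss_bad = selective_teacher_student_loss(teacher, student_bad, mask)
```

In this toy setup the student that diverges only on the forgotten example gets the lowest loss, and the one that diverges only on the retained example gets the highest. In the framework described above, only the lightweight unlearning layers would be trained against such an objective while the rest of the model stays fixed.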
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- LLM Unlearning via Loss Adjustment with Only Forget Data [20.310423152885217]
We introduce Forget data only Loss AjustmenT (FLAT), a "flat" loss adjustment approach which addresses these issues.
Empirical results demonstrate that our approach achieves superior unlearning performance compared to existing methods.
arXiv Detail & Related papers (2024-10-14T23:43:33Z)
- CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept [5.345828824625758]
We propose a novel amortized unlearning approach using codebook features and Sparse Autoencoders (SAEs).
By leveraging a bottleneck to decompose the activation space and regulate information flow, our method efficiently unlearns targeted information while preserving the model's performance on unrelated data.
arXiv Detail & Related papers (2024-10-08T10:26:22Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Efficient Knowledge Deletion from Trained Models through Layer-wise Partial Machine Unlearning [2.3496568239538083]
This paper introduces a novel class of machine unlearning algorithms.
The first method is partial amnesiac unlearning, which integrates layer-wise pruning with amnesiac unlearning.
The second method assimilates layer-wise partial updates into label-flipping and optimization-based unlearning.
arXiv Detail & Related papers (2024-03-12T12:49:47Z)
- Unlearnable Algorithms for In-context Learning [36.895152458323764]
In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model.
We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data.
We propose a new holistic measure of unlearning cost which accounts for varying inference costs.
arXiv Detail & Related papers (2024-02-01T16:43:04Z)
- TOFU: A Task of Fictitious Unlearning for LLMs [99.92305790945507]
Large language models trained on massive corpora of data from the web can reproduce sensitive or private data, raising both legal and ethical concerns.
Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training.
We present TOFU, a benchmark aimed at helping deepen our understanding of unlearning.
arXiv Detail & Related papers (2024-01-11T18:57:12Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first method for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)