Related papers: Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

URL: http://arxiv.org/abs/2406.16257v2
Date: Wed, 16 Oct 2024 17:57:24 GMT
Title: Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
Authors: Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish, Avinava Dubey, Snigdha Chaturvedi,
Abstract summary: We introduce an exact unlearning framework -- Sequence-aware Sharded Sliced Training (S3T) S3T is designed to enhance the deletion capabilities of an exact unlearning system while minimizing the impact on model's performance. We demonstrate S3T attains superior deletion capabilities and enhanced performance compared to baselines across a wide range of settings.
Score: 35.681853074122735
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine unlearning is the process of efficiently removing the influence of a training data instance from a trained machine learning model without retraining it from scratch. A popular subclass of unlearning approaches is exact machine unlearning, which focuses on techniques that explicitly guarantee the removal of the influence of a data instance from a model. Exact unlearning approaches use a machine learning model in which individual components are trained on disjoint subsets of the data. During deletion, exact unlearning approaches only retrain the affected components rather than the entire model. While existing approaches reduce retraining costs, it can still be expensive for an organization to retrain a model component as it requires halting a system in production, which leads to service failure and adversely impacts customers. To address these challenges, we introduce an exact unlearning framework -- Sequence-aware Sharded Sliced Training (S3T), which is designed to enhance the deletion capabilities of an exact unlearning system while minimizing the impact on model's performance. At the core of S3T, we utilize a lightweight parameter-efficient fine-tuning approach that enables parameter isolation by sequentially training layers with disjoint data slices. This enables efficient unlearning by simply deactivating the layers affected by data deletion. Furthermore, to reduce the retraining cost and improve model performance, we train the model on multiple data sequences, which allows S3T to handle an increased number of deletion requests. Both theoretically and empirically, we demonstrate that S3T attains superior deletion capabilities and enhanced performance compared to baselines across a wide range of settings.

Related papers

Sharpness-Aware Parameter Selection for Machine Unlearning [6.397490580631141]
It often happens that some sensitive personal information, such as credit card numbers or passwords, are mistakenly incorporated in the training of machine learning models and need to be removed afterwards. There have been various machine unlearning techniques proposed in the literature to address this problem. Most of the proposed methods revolve around removing individual data samples from a trained model. While the existing methods for these tasks do the unlearning task by updating the whole set of model parameters or only the last layer of the model, we show that there are a subset of model parameters that have the largest contribution in the unlearning target features.
arXiv Detail & Related papers (2025-04-08T19:41:07Z)
When to Forget? Complexity Trade-offs in Machine Unlearning [23.507879460531264]
Machine Unlearning (MU) aims at removing the influence of specific data points from a trained model. We analyze the efficiency of unlearning methods and establish the first upper and lower bounds on minimax times for this problem. We provide a phase diagram for the unlearning complexity ratio -- a novel metric that compares the computational cost of the best unlearning method to full model retraining.
arXiv Detail & Related papers (2025-02-24T16:56:27Z)
Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA [15.542668474378633]
We propose a novel and efficient machine unlearning method on pre-trained models. We leverage LoRA to decompose the model's intermediate features into pre-trained features and residual features. The method aims to learn the zero residuals on the retained set and shifted residuals on the unlearning set.
arXiv Detail & Related papers (2024-11-13T08:56:35Z)
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [52.03511469562013]
We introduce the Iterative Contrastive Unlearning (ICU) framework, which consists of three core components. A Knowledge Unlearning Induction module targets specific knowledge for removal using an unlearning loss. A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal. An Iterative Unlearning Refinement module dynamically adjusts the unlearning process through ongoing evaluation and updates.
arXiv Detail & Related papers (2024-07-25T07:09:35Z)
Efficient Knowledge Deletion from Trained Models through Layer-wise Partial Machine Unlearning [2.3496568239538083]
This paper introduces a novel class of machine unlearning algorithms. First method is partial amnesiac unlearning, integration of layer-wise pruning with amnesiac unlearning. Second method assimilates layer-wise partial-updates into label-flipping and optimization-based unlearning.
arXiv Detail & Related papers (2024-03-12T12:49:47Z)
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping [53.454408491386886]
bootstrapping self-alignment markedly surpasses the single-round approach. We propose Step-On-Feet Tuning (SOFT) which leverages model's continuously enhanced few-shot ability to boost zero or one-shot performance. Based on easy-to-hard training recipe, we propose SOFT+ which further boost self-alignment's performance.
arXiv Detail & Related papers (2024-02-12T12:30:42Z)
Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning. Challenge is to discard information about the forget'' data without altering knowledge about remaining dataset. We adopt a projected-gradient based learning method, named as Projected-Gradient Unlearning (PGU) We provide empirically evidence to demonstrate that our unlearning method can produce models that behave similar to models retrained from scratch across various metrics even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
Unlearn What You Want to Forget: Efficient Unlearning for LLMs [92.51670143929056]
Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data. This process might suffer from privacy issues and violations of data protection regulations. We propose an efficient unlearning framework that could efficiently update LLMs without having to retrain the whole model after data removals.
arXiv Detail & Related papers (2023-10-31T03:35:59Z)
Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening [51.34904967046097]
Selective Synaptic Dampening (SSD) is a fast, performant, and does not require long-term storage of the training data. We present a novel two-step, post hoc, retrain-free approach to machine unlearning which is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z)
Federated Unlearning via Active Forgetting [24.060724751342047]
We propose a novel federated unlearning framework based on incremental learning. Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation.
arXiv Detail & Related papers (2023-07-07T03:07:26Z)
LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples upon deleted requests from a trained model. We present a novel network, namely textitLegoNet, which adopts the framework of fixed encoder + multiple adapters'' We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, synthetically outperforming unlearning baselines.
arXiv Detail & Related papers (2022-10-28T09:53:05Z)
Machine Unlearning of Features and Labels [72.81914952849334]
We propose first scenarios for unlearning and labels in machine learning models. Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
Certifiable Machine Unlearning for Linear Models [1.484852576248587]
Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted. We present an experimental study of the three state-of-the-art approximate unlearning methods for linear models.
arXiv Detail & Related papers (2021-06-29T05:05:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.