Forget Unlearning: Towards True Data-Deletion in Machine Learning
- URL: http://arxiv.org/abs/2210.08911v1
- Date: Mon, 17 Oct 2022 10:06:11 GMT
- Title: Forget Unlearning: Towards True Data-Deletion in Machine Learning
- Authors: Rishav Chourasia, Neil Shah, Reza Shokri
- Abstract summary: We show that unlearning is not equivalent to data deletion and does not guarantee the "right to be forgotten".
We propose an accurate, computationally efficient, and secure data-deletion machine learning algorithm in the online setting.
- Score: 18.656957502454592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unlearning has emerged as a technique to efficiently erase information of
deleted records from learned models. We show, however, that the influence
created by the original presence of a data point in the training set can still
be detected after running certified unlearning algorithms (which can result in
its reconstruction by an adversary). Thus, under realistic assumptions about
the dynamics of model releases over time and in the presence of adaptive
adversaries, we show that unlearning is not equivalent to data deletion and
does not guarantee the "right to be forgotten." We then propose a more robust
data-deletion guarantee and show that it is necessary to satisfy differential
privacy to ensure true data deletion. Under our notion, we propose an accurate,
computationally efficient, and secure data-deletion machine learning algorithm
in the online setting, based on noisy gradient descent.
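As a concrete illustration of the noisy-gradient-descent idea, here is a minimal sketch in Python. The function name `noisy_gd_delete`, the user-supplied `grad_fn`, and the specific clipping and noise scales are all hypothetical assumptions for illustration, not the paper's exact algorithm; the point is only that post-deletion fine-tuning injects Gaussian noise so the resulting model distribution approaches that of retraining without the deleted record.

```python
import numpy as np

def noisy_gd_delete(theta, remaining_data, grad_fn, steps=100, lr=0.1,
                    sigma=0.05, clip=1.0, seed=0):
    """Hypothetical sketch: after a deletion request, fine-tune on the
    remaining records with noisy gradient descent (Langevin-style), so the
    deleted point's residual influence is washed out by Gaussian noise."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        g = grad_fn(theta, remaining_data)                    # gradient on remaining data only
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # bound per-step sensitivity
        noise = rng.normal(0.0, sigma, size=theta.shape)      # DP-style Gaussian noise
        theta = theta - lr * (g + noise)
    return theta
```

In practice the noise scale `sigma` would be calibrated to the desired deletion/privacy guarantee; the sketch leaves that calibration out.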
Related papers
- RESTOR: Knowledge Recovery through Machine Unlearning [71.75834077528305]
Large language models trained on web-scale corpora can memorize undesirable datapoints.
Many machine unlearning methods have been proposed that aim to 'erase' these datapoints from trained models.
We propose the RESTOR framework for evaluating machine unlearning along several dimensions.
arXiv Detail & Related papers (2024-10-31T20:54:35Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- An Information Theoretic Approach to Machine Unlearning [45.600917449314444]
A key challenge in unlearning is forgetting the necessary data in a timely manner while preserving model performance.
In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten.
We derive a simple but principled zero-shot unlearning method based on the geometry of the model.
arXiv Detail & Related papers (2024-02-02T13:33:30Z)
- Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation [21.587358050012032]
We propose a fast and novel machine unlearning paradigm at the layer level called layer attack unlearning.
In this work, we introduce the Partial-PGD algorithm to locate the samples to forget efficiently.
We also use Knowledge Distillation (KD) to reliably learn the decision boundaries from the teacher.
arXiv Detail & Related papers (2023-12-28T04:38:06Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Random Relabeling for Efficient Machine Unlearning [8.871042314510788]
Individuals' right to retract personal data and relevant data privacy regulations pose great challenges to machine learning.
We propose an unlearning scheme, random relabeling, to efficiently handle sequential data-removal requests (see the illustrative sketch after this list).
We also propose a less constraining removal-certification method based on probability-distribution similarity to naive unlearning.
arXiv Detail & Related papers (2023-05-21T02:37:26Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, of which the goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning [13.149070833843133]
Machine unlearning, i.e. having a model forget about some of its training data, has become increasingly important as privacy legislation promotes variants of the right-to-be-forgotten.
We first show that the definition that underlies approximate unlearning, which seeks to prove the approximately unlearned model is close to an exactly retrained model, is incorrect because one can obtain the same model using different datasets.
We then turn to exact unlearning approaches and ask how to verify their claims of unlearning.
arXiv Detail & Related papers (2021-10-22T16:16:56Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters (see the illustrative sketch after this list).
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Low-Regret Active Learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
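For the "Random Relabeling for Efficient Machine Unlearning" entry above, here is a hedged sketch of the general relabel-and-fine-tune idea. The helper name, the collision resampling, and the `model.fit` interface are all hypothetical; the paper's actual scheme may differ.

```python
import numpy as np

def random_relabel_unlearn(model, X_forget, y_forget, n_classes, seed=0):
    """Hypothetical sketch: overwrite the labels of records slated for removal
    with random labels, then fine-tune so predictions on those records no
    longer reflect the original labels."""
    rng = np.random.default_rng(seed)
    y_rand = rng.integers(0, n_classes, size=len(y_forget))
    mask = y_rand == y_forget
    while np.any(mask):                    # resample collisions with the true label
        y_rand[mask] = rng.integers(0, n_classes, size=int(mask.sum()))
        mask = y_rand == y_forget
    model.fit(X_forget, y_rand)            # hypothetical fine-tuning interface
    return model
```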
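For the "Machine Unlearning of Features and Labels" entry (referenced above), influence-function unlearning typically applies one closed-form, Newton-style parameter correction. A generic sketch follows, with a damped Hessian solve standing in for the approximations (conjugate gradients, LiSSA) used in practice; it is illustrative, not that paper's exact estimator.

```python
import numpy as np

def influence_unlearn(theta, hessian, grads_deleted, damping=1e-3):
    """Generic closed-form update: theta' = theta + H^{-1} * (sum of the
    deleted records' loss gradients), where H is the damped empirical
    Hessian on the remaining data, evaluated at theta."""
    d = theta.shape[0]
    H = hessian + damping * np.eye(d)      # damping keeps the solve well-posed
    v = np.sum(grads_deleted, axis=0)      # aggregate influence of deleted records
    return theta + np.linalg.solve(H, v)   # one Newton-style correction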
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.