Random Relabeling for Efficient Machine Unlearning
- URL: http://arxiv.org/abs/2305.12320v1
- Date: Sun, 21 May 2023 02:37:26 GMT
- Title: Random Relabeling for Efficient Machine Unlearning
- Authors: Junde Li and Swaroop Ghosh
- Abstract summary: Individuals' right to retract personal data and relevant data privacy regulations pose great challenges to machine learning.
We propose an unlearning scheme, random relabeling, to efficiently handle sequential data removal requests.
A less constraining removal certification method, based on probability distribution similarity to naive unlearning, is also proposed.
- Score: 8.871042314510788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning algorithms and data are the driving forces behind machine
learning's tremendous transformation of industrial intelligence. However,
individuals' right to retract their personal data and relevant data privacy
regulations pose great challenges to machine learning: how to design an
efficient mechanism to support certified data removal. Removing previously
seen data, known as machine unlearning, is challenging because these data
points were implicitly memorized during the training process of learning
algorithms. Retraining on the remaining data from scratch straightforwardly
serves such deletion requests; however, this naive method is often not
computationally feasible. We propose the unlearning scheme random relabeling,
which is applicable to generic supervised learning algorithms, to efficiently
handle sequential data removal requests in the online setting. A less
constraining removal certification method, based on probability distribution
similarity to naive unlearning, is further developed for logit-based
classifiers.
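The abstract's two ideas can be sketched in code: relabel a removed sample with a random different label and fine-tune on it to overwrite the memorized mapping, then certify removal by checking that the unlearned model's predictive distribution stays close to that of a model naively retrained from scratch. The multinomial logistic model, step count, Jensen-Shannon divergence, and tolerance below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def random_relabel(y, n_classes, rng):
    """Draw a uniformly random label different from the original one."""
    return rng.choice([c for c in range(n_classes) if c != y])

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sgd_step(W, x, y, n_classes, lr=0.1):
    """One gradient step of multinomial logistic regression on one sample."""
    p = softmax(x @ W)                              # predicted class probabilities
    grad = np.outer(x, p - np.eye(n_classes)[y])    # cross-entropy gradient
    return W - lr * grad

def unlearn_by_relabeling(W, x, y, n_classes, rng, steps=5):
    """Overwrite a memorized sample by training on a randomly relabeled copy."""
    y_new = random_relabel(y, n_classes, rng)
    for _ in range(steps):
        W = sgd_step(W, x, y_new, n_classes)
    return W

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two predictive distributions."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b), axis=-1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def certify(W_unlearned, W_retrained, X_holdout, tol=0.05):
    """Certify removal if predictive distributions stay close on held-out data."""
    p = softmax(X_holdout @ W_unlearned)
    q = softmax(X_holdout @ W_retrained)
    return float(js_divergence(p, q).mean()) <= tol
```

The certification step is "less constraining" in the sense that it compares output distributions rather than demanding parameter-level equivalence with retraining.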
Related papers
- RESTOR: Knowledge Recovery through Machine Unlearning [71.75834077528305]
Large language models trained on web-scale corpora can memorize undesirable datapoints.
Many machine unlearning methods have been proposed that aim to 'erase' these datapoints from trained models.
We propose the RESTOR framework for machine unlearning based on the following dimensions.
arXiv Detail & Related papers (2024-10-31T20:54:35Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Dataset Condensation Driven Machine Unlearning [0.0]
Current trend in data regulation requirements and privacy-preserving machine learning has emphasized the importance of machine unlearning.
We propose new dataset condensation techniques and an innovative unlearning scheme that strikes a balance between machine unlearning privacy, utility, and efficiency.
We present a novel and effective approach to instrumenting machine unlearning and propose its application in defending against membership inference and model inversion attacks.
arXiv Detail & Related papers (2024-01-31T21:48:25Z)
- Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation [21.587358050012032]
We propose a fast and novel machine unlearning paradigm at the layer level called layer attack unlearning.
In this work, we introduce the Partial-PGD algorithm to locate the samples to forget efficiently.
We also use Knowledge Distillation (KD) to reliably learn the decision boundaries from the teacher.
arXiv Detail & Related papers (2023-12-28T04:38:06Z)
- Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening [51.34904967046097]
Selective Synaptic Dampening (SSD) is fast, performant, and does not require long-term storage of the training data.
We present a novel two-step, post hoc, retrain-free approach to machine unlearning which is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z)
- Fair Machine Unlearning: Data Removal while Mitigating Disparities [5.724350004671127]
The right to be forgotten is a core principle outlined by the EU's General Data Protection Regulation.
"Forgetting" can be naively achieved by retraining on remaining data.
"Unlearning" impacts other properties critical to real-world applications such as fairness.
arXiv Detail & Related papers (2023-07-27T10:26:46Z)
- Forget Unlearning: Towards True Data-Deletion in Machine Learning [18.656957502454592]
We show that unlearning is not equivalent to data deletion and does not guarantee the "right to be forgotten".
We propose an accurate, computationally efficient, and secure data-deletion machine learning algorithm in the online setting.
arXiv Detail & Related papers (2022-10-17T10:06:11Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
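The closed-form influence-function update mentioned above is commonly written along the following lines; this is a sketch of the standard form from the influence-function literature, not necessarily that paper's exact formulation. To approximately remove a training point $z$ from a model with empirical-risk minimizer $\hat{\theta}$ over $n$ points:

$\theta_{-z} \approx \hat{\theta} + \frac{1}{n} H_{\hat{\theta}}^{-1} \nabla_\theta \ell(z, \hat{\theta})$

where $H_{\hat{\theta}}$ is the Hessian of the training loss at $\hat{\theta}$ and $\ell(z, \hat{\theta})$ is the loss on the removed point, so the update steps the parameters as if $z$ had been upweighted out of the objective.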
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Machine Unlearning: Linear Filtration for Logit-based Classifiers [2.174931329479201]
Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used.
This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data.
arXiv Detail & Related papers (2020-02-07T12:16:06Z)
- Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences.