Protecting the Undeleted in Machine Unlearning
- URL: http://arxiv.org/abs/2602.16697v1
- Date: Wed, 18 Feb 2026 18:44:21 GMT
- Title: Protecting the Undeleted in Machine Unlearning
- Authors: Aloni Cohen, Refael Kohen, Kobbi Nissim, Uri Stemmer
- Abstract summary: We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary to reconstruct almost the entire dataset merely by issuing deletion requests. We propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points.
- Score: 21.833252081084996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary controlling merely $\omega(1)$ data points to reconstruct almost the entire dataset simply by issuing deletion requests. We survey existing definitions for machine unlearning, showing they are either susceptible to such attacks or too restrictive to support basic functionalities like exact summation. To address this problem, we propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points. We show that our definition permits several essential functionalities, such as bulletin boards, summations, and statistical learning.
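The basic leakage channel behind the abstract's warning can be illustrated with a toy sketch. This is a simplified illustration of the general principle, not the paper's actual $\omega(1)$-point attack; the `PerfectRetrainingSum` class and all names are hypothetical, and the mechanism shown releases an exact sum, one of the functionalities the abstract mentions.

```python
# Toy illustration: a mechanism that releases an exact sum and adheres
# to "perfect retraining" -- after a deletion request, its output equals
# the sum over the remaining points, as if the deleted point had never
# been included. Comparing outputs before and after a deletion then
# reveals the deleted value exactly, showing how deletion requests can
# become a disclosure channel.

class PerfectRetrainingSum:
    def __init__(self, data):
        # data: dict mapping user id -> private value
        self.data = dict(data)

    def release(self):
        # Exact sum over the *current* dataset.
        return sum(self.data.values())

    def delete(self, user):
        # Perfect retraining: simply recompute without the point.
        del self.data[user]


dataset = {"alice": 7, "bob": 3, "carol": 12}
mech = PerfectRetrainingSum(dataset)

before = mech.release()   # 7 + 3 + 12 = 22
mech.delete("bob")
after = mech.release()    # 7 + 12 = 19
leaked = before - after   # 3 == bob's private value
print(leaked)             # prints 3
```

An observer who sees both releases learns the deleted value exactly; the paper's attack shows that under perfect retraining this kind of channel can be amplified to reconstruct almost all of the *undeleted* data as well.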
Related papers
- Suppression or Deletion: A Restoration-Based Representation-Level Analysis of Machine Unlearning [24.40457827994831]
We propose a novel restoration-based analysis framework for machine unlearning. Applying our framework to 12 major unlearning methods in image classification tasks, we find that most methods achieve high restoration rates. We propose new evaluation guidelines that prioritize representation-level verification.
arXiv Detail & Related papers (2026-02-18T07:46:30Z) - Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy [18.219835803238837]
We show that approximate unlearning algorithms fail to adequately protect the privacy of unlearned data. We propose the Reminiscence Attack (ReA), which amplifies the correlation between residuals and membership privacy. We develop a dual-phase approximate unlearning framework that first eliminates deep-layer unlearned data traces and then enforces convergence stability.
arXiv Detail & Related papers (2025-07-28T07:12:12Z) - Redirection for Erasing Memory (REM): Towards a universal unlearning method for corrupted data [55.31265817705997]
We propose a conceptual space to characterize diverse corrupted data unlearning tasks in vision classifiers. We propose a novel method, Redirection for Erasing Memory (REM), whose key feature is that corrupted data are redirected to dedicated neurons introduced at unlearning time. REM performs strongly across the space of tasks, in contrast to prior SOTA methods that fail outside the regions for which they were designed.
arXiv Detail & Related papers (2025-05-23T10:47:27Z) - UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning [57.081646768835704]
User specifications or legal frameworks often require information to be removed from pretrained models, including large language models (LLMs). This requires deleting or "forgetting" a set of data points from an already-trained model, which typically degrades its performance on other data points. We propose UPCORE, a method-agnostic data selection framework for mitigating collateral damage during unlearning.
arXiv Detail & Related papers (2025-02-20T22:51:10Z) - FUNU: Boosting Machine Unlearning Efficiency by Filtering Unnecessary Unlearning [9.472692023087223]
We propose FUNU, a method to identify data points that lead to unnecessary unlearning. We provide a theoretical analysis of FUNU and conduct extensive experiments to validate its efficacy.
arXiv Detail & Related papers (2025-01-28T01:19:07Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z) - Control, Confidentiality, and the Right to be Forgotten [7.568881327572535]
We propose a new formalism: deletion-as-control.
It allows users' data to be freely used before deletion, while also imposing a meaningful requirement after deletion.
We apply it to social functionalities, and give a new unified view of various machine unlearning definitions.
arXiv Detail & Related papers (2022-10-14T14:54:52Z) - Deletion Inference, Reconstruction, and Compliance in Machine (Un)Learning [21.404426803200796]
Privacy attacks on machine learning models aim to identify the data that is used to train such models.
Many machine learning methods have recently been extended to support machine unlearning.
arXiv Detail & Related papers (2022-02-07T19:02:58Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
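The closed-form-update idea can be made concrete in a simplified setting. The sketch below is not the paper's influence-function method; it is a minimal, assumed example of exact unlearning of one training point from ridge regression, where removing a point reduces to a rank-one (Sherman-Morrison) downdate of the inverse Gram matrix rather than retraining from scratch. All function names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Train ridge regression; keep the inverse Gram matrix for later updates."""
    d = X.shape[1]
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))  # (X^T X + lam I)^{-1}
    b = X.T @ y                                       # moment vector
    return A_inv, b, A_inv @ b                        # weights w = A^{-1} b

def unlearn_point(A_inv, b, x, y):
    """Remove one training point (x, y) via a closed-form rank-one downdate."""
    # Sherman-Morrison: (A - x x^T)^{-1} = A^{-1} + A^{-1}x x^T A^{-1} / (1 - x^T A^{-1} x)
    Ax = A_inv @ x
    A_inv_new = A_inv + np.outer(Ax, Ax) / (1.0 - x @ Ax)
    b_new = b - y * x
    return A_inv_new, b_new, A_inv_new @ b_new

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)

A_inv, b, w = fit_ridge(X, y)
_, _, w_unlearned = unlearn_point(A_inv, b, X[0], y[0])

# The closed-form update matches retraining on the remaining 49 points.
_, _, w_retrained = fit_ridge(X[1:], y[1:])
print(np.allclose(w_unlearned, w_retrained))  # prints True
```

For linear models the update is exact; the influence-function approach generalizes the same "update parameters in closed form" idea to settings where only an approximation is available.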
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Certified Data Removal from Machine Learning Models [79.91502073022602]
Good data stewardship requires removal of data at the request of the data's owner.
This raises the question of whether and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request.
We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with.
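The indistinguishability guarantee described above is often formalized along the following lines. This is a paraphrase with assumed notation, not a quotation from the paper: $A$ is the learning algorithm, $M$ the removal mechanism, $D$ a dataset, $z \in D$ the point to be removed, and $T$ any measurable set of models.

```latex
% epsilon-certified removal (paraphrased; notation assumed):
% the model after removal is distributionally close to one
% trained without z in the first place.
e^{-\epsilon}
\;\le\;
\frac{\Pr\!\left[\, M\big(A(D),\, D,\, z\big) \in T \,\right]}
     {\Pr\!\left[\, A\big(D \setminus \{z\}\big) \in T \,\right]}
\;\le\;
e^{\epsilon}
```

Smaller $\epsilon$ means a stronger guarantee; at $\epsilon = 0$ the two distributions over models coincide, i.e., removal is information-theoretically indistinguishable from never having trained on $z$.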
arXiv Detail & Related papers (2019-11-08T03:57:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.