Control, Confidentiality, and the Right to be Forgotten
- URL: http://arxiv.org/abs/2210.07876v2
- Date: Mon, 4 Dec 2023 16:13:48 GMT
- Title: Control, Confidentiality, and the Right to be Forgotten
- Authors: Aloni Cohen, Adam Smith, Marika Swanberg, Prashant Nalini Vasudevan
- Abstract summary: We propose a new formalism: deletion-as-control.
It allows users' data to be freely used before deletion, while also imposing a meaningful requirement after deletion.
We apply it to social functionalities, and give a new unified view of various machine unlearning definitions.
- Score: 7.568881327572535
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Recent digital rights frameworks give users the right to delete their data
from systems that store and process their personal information (e.g., the
"right to be forgotten" in the GDPR). How should deletion be formalized in
complex systems that interact with many users and store derivative information?
We argue that prior approaches fall short. Definitions of machine unlearning
[Cao and Yang, 2015] are too narrowly scoped and do not apply to general
interactive settings. The natural approach of deletion-as-confidentiality
[Garg et al., 2020] is too restrictive: by requiring secrecy of deleted data, it
rules out social functionalities. We propose a new formalism:
deletion-as-control. It allows users' data to be freely used before deletion,
while also imposing a meaningful requirement after deletion--thereby giving
users more control. Deletion-as-control provides new ways of achieving deletion
in diverse settings. We apply it to social functionalities, and give a new
unified view of various machine unlearning definitions from the literature.
This is done by way of a new adaptive generalization of history independence.
Deletion-as-control also provides a new approach to the goal of machine
unlearning, that is, to maintaining a model while honoring users' deletion
requests. We show that publishing a sequence of updated models that are
differentially private under continual release satisfies deletion-as-control.
The accuracy of such an algorithm does not depend on the number of deleted
points, in contrast to the machine unlearning literature.
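The continual-release mechanism the abstract invokes can be illustrated with the classic binary-tree ("continual counting") construction, shown here for running counts of a bit stream rather than model updates. This is a minimal sketch of the general technique; the function and parameter names are illustrative, not taken from the paper.

```python
import math
import random

def continual_private_counts(stream, epsilon):
    """Binary-tree mechanism for releasing all running counts of a 0/1
    stream under differential privacy. Each element contributes to at
    most `levels` dyadic blocks, so adding Laplace noise of scale
    levels/epsilon to each block keeps the whole sequence epsilon-DP."""
    T = len(stream)
    levels = max(1, math.ceil(math.log2(T + 1)) + 1)
    scale = levels / epsilon
    noisy_node = {}  # (level, start_index) -> noisy partial sum

    # Precompute a noisy partial sum for every dyadic block.
    for level in range(levels):
        width = 1 << level
        for idx in range(0, T, width):
            block = stream[idx:idx + width]
            # Difference of two exponentials is Laplace-distributed.
            noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
            noisy_node[(level, idx)] = sum(block) + noise

    # Release each prefix count as a sum of O(log T) dyadic blocks.
    releases = []
    for t in range(1, T + 1):
        total, pos = 0.0, 0
        for level in reversed(range(levels)):
            width = 1 << level
            if pos + width <= t:
                total += noisy_node[(level, pos)]
                pos += width
        releases.append(total)
    return releases
```

Because the noise per release is O(log T / epsilon) regardless of how many stream elements are later deleted and re-released, the accuracy is independent of the number of deletion requests, which is the contrast with the unlearning literature that the abstract draws.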
Related papers
- Protecting the Undeleted in Machine Unlearning [21.833252081084996]
We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary to reconstruct almost the entire dataset merely by issuing deletion requests.
We propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points.
arXiv Detail & Related papers (2026-02-18T18:44:21Z) - Distributional Unlearning: Forgetting Distributions, Not Just Samples [18.440064196982345]
Machine unlearning seeks to remove unwanted information from trained models, initially at the individual-sample level, but increasingly at the level of entire sub-populations.
Existing unlearning tools remain largely sample-oriented, and straightforward point deletion often leaves enough residual signal for downstream learners to recover the unwanted domain.
We introduce distributional unlearning, a data-centric, model-agnostic framework that asks: Given examples from an unwanted distribution and a retained distribution, what is the smallest set of points whose removal makes the edited dataset far from the unwanted domain yet close to the retained one?
arXiv Detail & Related papers (2025-07-20T20:21:23Z) - Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs [54.167494079321465]
Current unlearning methods for LLMs optimize on the private information they seek to remove by incorporating it into their fine-tuning data.
We propose a novel unlearning method, Partial Model Collapse (PMC), which does not require unlearning targets in the unlearning objective.
arXiv Detail & Related papers (2025-07-06T03:08:49Z) - UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning [57.081646768835704]
User specifications or legal frameworks often require information to be removed from pretrained models, including large language models (LLMs)
This requires deleting or "forgetting" a set of data points from an already-trained model, which typically degrades its performance on other data points.
We propose UPCORE, a method-agnostic data selection framework for mitigating collateral damage during unlearning.
arXiv Detail & Related papers (2025-02-20T22:51:10Z) - FUNU: Boosting Machine Unlearning Efficiency by Filtering Unnecessary Unlearning [9.472692023087223]
We propose FUNU, a method to identify data points that lead to unnecessary unlearning.
We provide a theoretical analysis of FUNU and conduct extensive experiments to validate its efficacy.
arXiv Detail & Related papers (2025-01-28T01:19:07Z) - Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Continual Forgetting for Pre-trained Vision Models [70.51165239179052]
In real-world scenarios, selective information is expected to be continuously removed from a pre-trained model.
We propose Group Sparse LoRA (GS-LoRA) for efficient and effective deletion.
We conduct extensive experiments on face recognition, object detection and image classification and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes.
arXiv Detail & Related papers (2024-03-18T07:33:56Z) - LEACE: Perfect linear concept erasure in closed form [103.61624393221447]
Concept erasure aims to remove specified features from a representation.
We introduce LEAst-squares Concept Erasure (LEACE), a closed-form method which provably prevents all linear classifiers from detecting a concept while changing the representation as little as possible.
We apply LEACE to large language models with a novel procedure called "concept scrubbing," which erases target concept information from every layer in the network.
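The closed-form guarantee can be sketched for a single scalar concept: zeroing the feature/concept cross-covariance in whitened coordinates provably defeats every linear detector. The code below is a simplified single-concept illustration of this idea, not the paper's exact estimator, and the names are illustrative.

```python
import numpy as np

def concept_eraser(X, z):
    """Fit an affine map on features X (n, d) that zeroes the
    cross-covariance with a scalar concept z (n,), so no linear
    classifier trained on the edited features can detect z.
    Simplified single-concept sketch of the LEACE-style construction."""
    mu = X.mean(axis=0)
    Xc = X - mu
    zc = z - z.mean()
    cov = Xc.T @ Xc / len(X)
    # Whitening and un-whitening via an eigendecomposition of Cov(X).
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, 1e-12, None)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T      # Sigma^{-1/2}
    W_inv = vecs @ np.diag(vals ** 0.5) @ vecs.T   # Sigma^{1/2}
    c = Xc.T @ zc / len(X)                         # cross-covariance with z
    u = W @ c
    u = u / np.linalg.norm(u)
    P = np.eye(X.shape[1]) - np.outer(u, u)        # drop the concept direction

    def erase(A):
        # Project out the whitened concept direction, then un-whiten.
        return (W_inv @ P @ W @ (A - mu).T).T + mu
    return erase
```

After erasure the empirical cross-covariance between the edited features and the concept is exactly zero, which is the condition under which no linear classifier can do better than chance on centered data.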
arXiv Detail & Related papers (2023-06-06T16:07:24Z) - Approximate Data Deletion in Generative Models [5.596752018167751]
We propose a density-ratio-based framework for generative models.
We introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted.
arXiv Detail & Related papers (2022-06-29T07:24:39Z) - Deletion Inference, Reconstruction, and Compliance in Machine (Un)Learning [21.404426803200796]
Privacy attacks on machine learning models aim to identify the data that is used to train such models.
Many machine learning methods are recently extended to support machine unlearning.
arXiv Detail & Related papers (2022-02-07T19:02:58Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose a first approach to unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Adaptive Machine Unlearning [21.294828533009838]
Data deletion algorithms aim to remove deleted data points from trained models at a cheaper computational cost than fully retraining those models.
We show that prior work gives guarantees only for non-adaptive deletion sequences, and provide provable deletion guarantees against adaptively chosen deletion sequences.
arXiv Detail & Related papers (2021-06-08T14:11:53Z) - Compressive Summarization with Plausibility and Salience Modeling [54.37665950633147]
We propose to relax the rigid syntactic constraints on candidate spans and instead leave compression decisions to two data-driven criteria: plausibility and salience.
Our method achieves strong in-domain results on benchmark summarization datasets, and human evaluation shows that the plausibility model generally selects for grammatical and factual deletions.
arXiv Detail & Related papers (2020-10-15T17:07:10Z) - Machine Unlearning: Linear Filtration for Logit-based Classifiers [2.174931329479201]
Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used.
This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data.
arXiv Detail & Related papers (2020-02-07T12:16:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.