Vertical Machine Unlearning: Selectively Removing Sensitive Information From Latent Feature Space
- URL: http://arxiv.org/abs/2202.13295v1
- Date: Sun, 27 Feb 2022 05:25:15 GMT
- Title: Vertical Machine Unlearning: Selectively Removing Sensitive Information From Latent Feature Space
- Authors: Tao Guo, Song Guo, Jiewei Zhang, Wenchao Xu, Junxiao Wang
- Abstract summary: We investigate a vertical unlearning mode, aiming at removing only sensitive information from latent feature space.
We introduce intuitive and formal definitions for this unlearning and show its relationship with existing horizontal unlearning.
For computation results that are hard to obtain in practice, we propose an upper-bound approximation with rigorous theoretical analysis.
- Score: 21.8933559159369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the enactment of privacy regulations has promoted the rise of
the machine unlearning paradigm. Most existing studies focus on removing
unwanted data samples from a learnt model. Yet we argue that they remove too
much information about those samples from the latent feature space, far beyond
the sensitive features that genuinely need to be unlearned. In this paper, we
investigate a vertical unlearning mode that aims to remove only sensitive
information from the latent feature space. First, we introduce intuitive and
formal definitions for this unlearning and show that it is orthogonal to
existing horizontal unlearning. Second, given the lack of general solutions to
vertical unlearning, we introduce a ground-breaking solution based on
representation detachment, in which task-related information is encouraged to
be retained while sensitive information is progressively forgotten. Third,
observing that some computation results during representation detachment are
hard to obtain in practice, we propose an approximation with an upper bound to
estimate them, backed by rigorous theoretical analysis. We validate our method
across several datasets and models, with strong performance. We envision this
work as a necessity for future machine unlearning systems and an essential
component of compliance with the latest privacy-related legislation.
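The abstract does not spell out the detachment objective, so the following is only a minimal PyTorch sketch of one plausible instantiation: a gradient-reversal head trained to predict the sensitive attribute pushes the encoder to strip that attribute from the latent space, while a task head preserves utility. All names here (GradReverse, DetachmentModel, detachment_step) are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DetachmentModel(nn.Module):
    """Encoder -> latent z; a task head keeps task-related information,
    while a sensitive-attribute head behind gradient reversal pushes the
    encoder to remove sensitive information from z."""
    def __init__(self, in_dim, z_dim, n_task, n_sensitive):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, z_dim))
        self.task_head = nn.Linear(z_dim, n_task)
        self.sens_head = nn.Linear(z_dim, n_sensitive)

    def forward(self, x, lambd=1.0):
        z = self.encoder(x)
        return self.task_head(z), self.sens_head(GradReverse.apply(z, lambd))

def detachment_step(model, opt, x, y_task, y_sens, lambd):
    """One update: retain task utility while unlearning the sensitive attribute."""
    opt.zero_grad()
    task_logits, sens_logits = model(x, lambd)
    loss = F.cross_entropy(task_logits, y_task)          # keep task info
    loss = loss + F.cross_entropy(sens_logits, y_sens)   # reversed grad strips sensitive info
    loss.backward()
    opt.step()
    return loss.item()
```

Ramping lambd from 0 toward 1 over training would realize a schedule in which sensitive information is progressively, rather than abruptly, forgotten.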
Related papers
- Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [49.043599241803825]
The Iterative Contrastive Unlearning (ICU) framework consists of three core components.
A Knowledge Unlearning Induction module removes specific knowledge through an unlearning loss.
A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal.
An Iterative Unlearning Refinement module dynamically assesses the unlearning extent on specific data pieces and makes iterative updates (a sketch of these components follows).
arXiv Detail & Related papers (2024-07-25T07:09:35Z)
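The ICU components are only described at a high level above, so the sketch below is a hedged approximation for a Hugging Face-style causal LM: gradient ascent stands in for the unlearning loss, and a KL term to a frozen reference model stands in for the contrastive enhancement (a simplification, not the paper's contrastive objective). Names such as icu_step and the stop_loss threshold are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def forget_extent(model, ids):
    """Per-batch LM loss on forget data, used to gauge how forgotten it is."""
    return model(ids, labels=ids).loss

def icu_step(model, ref_model, opt, forget_ids, retain_ids,
             alpha=1.0, stop_loss=4.0):
    """One ICU-style update (illustrative, not the paper's exact losses).
    - Refinement: skip pieces whose forget loss already exceeds stop_loss.
    - Induction: gradient *ascent* on the forget data.
    - Enhancement: keep retain-data predictions close to a frozen reference."""
    if forget_extent(model, forget_ids) >= stop_loss:
        return None  # this data piece is already sufficiently unlearned
    opt.zero_grad()
    unlearn = -model(forget_ids, labels=forget_ids).loss  # ascend on forget set
    logits = model(retain_ids).logits
    with torch.no_grad():
        ref_logits = ref_model(retain_ids).logits
    preserve = F.kl_div(F.log_softmax(logits, dim=-1),
                        F.softmax(ref_logits, dim=-1), reduction="batchmean")
    (unlearn + alpha * preserve).backward()
    opt.step()
    return unlearn.item()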
- Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models [7.742594744641462]
Machine unlearning aims to remove information derived from forgotten data while preserving that of the remaining dataset in a well-trained model.
We propose a supervision-free unlearning approach that operates without the need for labels during the unlearning process.
arXiv Detail & Related papers (2024-03-31T00:29:00Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- An Information Theoretic Approach to Machine Unlearning [45.600917449314444]
A key challenge in unlearning is forgetting the necessary data in a timely manner while preserving model performance.
In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten.
We derive a simple but principled zero-shot unlearning method based on the geometry of the model.
arXiv Detail & Related papers (2024-02-02T13:33:30Z)
- Robust Machine Learning by Transforming and Augmenting Imperfect Training Data [6.928276018602774]
This thesis explores several data sensitivities of modern machine learning.
We first discuss how to prevent ML from codifying prior human discrimination measured in the training data.
We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment.
arXiv Detail & Related papers (2023-12-19T20:49:28Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU), sketched below.
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
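As a rough illustration of the projected-gradient idea (not the paper's exact algorithm), the sketch below projects a flattened forget-set gradient onto the orthogonal complement of a retain-gradient subspace before applying it, so that, to first order, the update does not interfere with knowledge about the remaining data. The helper names and the QR-based basis construction are assumptions.

```python
import torch

def retain_basis(retain_grads):
    """Orthonormal basis (columns of Q) for the span of flattened
    retain-batch gradients; retain_grads has shape (n_batches, n_params)."""
    q, _ = torch.linalg.qr(retain_grads.T)  # (n_params, n_batches)
    return q

def project_out(g, basis):
    """Remove from g its component inside span(basis), so the update is
    orthogonal to directions that matter for the retained data."""
    return g - basis @ (basis.T @ g)

def pgu_step(params, forget_grad, basis, lr=1e-2):
    """Apply one projected ascent step on the forget-set loss.
    params: list of model parameters; forget_grad: flat (n_params,) gradient."""
    with torch.no_grad():
        update = project_out(forget_grad, basis)
        offset = 0
        for p in params:
            n = p.numel()
            p.add_(lr * update[offset:offset + n].view_as(p))  # ascend to forget
            offset += n
```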
- Localized Shortcut Removal [4.511561231517167]
High performance on held-out test data does not necessarily indicate that a model generalizes or learns anything meaningful.
This is often due to the existence of machine learning shortcuts - features in the data that are predictive but unrelated to the problem at hand.
We use an adversarially trained lens to detect and eliminate highly predictive but semantically unconnected clues in images.
arXiv Detail & Related papers (2022-11-24T13:05:33Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning of features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters (a minimal sketch follows this entry).
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
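The closed-form influence-function update can be sketched as follows for a toy model with an explicit Hessian; the paper scales this up with approximations of the inverse Hessian, and the function names and damping term here are mine. Replacing a perturbed sample z_old with its corrected version z_new shifts the parameters toward what retraining would have produced.

```python
import torch
from torch.autograd.functional import hessian

def unlearn_replace(train_loss, point_loss, theta, z_old, z_new, damp=1e-3):
    """Closed-form unlearning sketch: approximate the parameters retraining
    would yield if sample z_old were replaced by its corrected version z_new.
    train_loss(theta) -> scalar loss over the training set;
    point_loss(theta, z) -> scalar loss on one sample; theta is a flat vector."""
    theta = theta.detach().requires_grad_(True)
    g_old = torch.autograd.grad(point_loss(theta, z_old), theta)[0]
    g_new = torch.autograd.grad(point_loss(theta, z_new), theta)[0]
    # Damped Hessian of the training loss (explicit, so small models only).
    H = hessian(train_loss, theta) + damp * torch.eye(theta.numel())
    # Influence-function update: theta' = theta + H^{-1} (g_old - g_new)
    return (theta + torch.linalg.solve(H, g_old - g_new)).detach()
```

For pure removal of a feature or label, z_new would be the sample with that feature zeroed out or the label corrected, under the same closed-form update.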