Learning to Unlearn: Instance-wise Unlearning for Pre-trained
Classifiers
- URL: http://arxiv.org/abs/2301.11578v3
- Date: Mon, 15 Jan 2024 22:22:24 GMT
- Title: Learning to Unlearn: Instance-wise Unlearning for Pre-trained
Classifiers
- Authors: Sungmin Cha, Sungjun Cho, Dasol Hwang, Honglak Lee, Taesup Moon, and
Moontae Lee
- Abstract summary: We consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level, and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
- Score: 71.70205894168039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the recent advent of regulations for data protection (e.g., the General
Data Protection Regulation), there has been increasing demand for deleting
information learned from sensitive data in pre-trained models without
retraining from scratch. The inherent vulnerability of neural networks to
adversarial attacks and unfairness also calls for a robust method to remove or
correct information in an instance-wise fashion, while retaining the predictive
performance across remaining data. To this end, we consider instance-wise
unlearning, whose goal is to delete information on a set of instances
from a pre-trained model, by either misclassifying each instance away from its
original prediction or relabeling the instance to a different label. We also
propose two methods that reduce forgetting on the remaining data: 1) utilizing
adversarial examples to overcome forgetting at the representation level, and 2)
leveraging weight importance metrics to pinpoint network parameters guilty of
propagating unwanted information. Both methods only require the pre-trained
model and data instances to forget, allowing painless application to real-life
settings where the entire training set is unavailable. Through extensive
experimentation on various image classification benchmarks, we show that our
approach effectively preserves knowledge of remaining data while unlearning
given instances in both single-task and continual unlearning scenarios.
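For intuition, the sketch below mirrors the two ideas from the abstract: gradient ascent on the forget instances, adversarial neighbors of those instances serving as proxies for the unavailable remaining data, and a weight-importance penalty that protects parameters deemed important. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation; the PGD recipe, the diagonal-Fisher importance estimate, and every hyperparameter are illustrative stand-ins.

```python
# Minimal sketch, assuming a standard classifier `model`; not the paper's code.
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Untargeted PGD around the forget instances; the resulting points are
    # classified as other classes and stand in for remaining data.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
    return x_adv.detach()

def diag_fisher(model, forget_loader):
    # Diagonal Fisher estimate computed from the forget set only (the only
    # data the setting assumes access to); large values flag important weights.
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in forget_loader:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return fisher

def unlearn_step(model, opt, x_f, y_f, fisher, theta0, lam=1.0, mu=0.1):
    # theta0: frozen copies of the pre-trained parameters, keyed by name.
    # (a) push forget instances away from their original labels,
    # (b) keep adversarial neighbors at their current non-forget predictions
    #     to preserve representations of the remaining classes,
    # (c) penalize movement of important weights (EWC-style regularizer).
    x_adv = pgd(model, x_f, y_f)
    with torch.no_grad():
        y_adv = model(x_adv).argmax(dim=1)        # model's own pseudo-labels
    loss = -F.cross_entropy(model(x_f), y_f)      # (a) gradient ascent
    loss = loss + lam * F.cross_entropy(model(x_adv), y_adv)      # (b) retain
    loss = loss + mu * sum((fisher[n] * (p - theta0[n]) ** 2).sum()
                           for n, p in model.named_parameters())  # (c)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In a full run one would repeat `unlearn_step` over the forget set until those instances are misclassified (or reach their new labels), monitoring accuracy on whatever held-out data is available.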
Related papers
- Data Selection for Transfer Unlearning [14.967546081883034]
We advocate for a relaxed definition of unlearning that does not address privacy applications.
We propose a new method that uses a mechanism for selecting relevant examples from an auxiliary "static" dataset.
We find that our method outperforms the gold standard "exact unlearning" on several datasets.
arXiv Detail & Related papers (2024-05-16T20:09:41Z)
- Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective [4.31734012105466]
Machine Unlearning is the process of selectively discarding information designated to specific sets or classes of data from a pre-trained model.
We propose a methodology tailored for the purposeful elimination of information linked to a specific class of data from a pre-trained classification network.
Our novel approach, termed Partially-Blinded Unlearning (PBU), surpasses existing state-of-the-art class unlearning methods in effectiveness.
arXiv Detail & Related papers (2024-03-24T17:33:22Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay the data of experienced tasks when learning new ones.
However, storing such data is often impractical due to memory constraints or data privacy issues.
As a replacement, data-free data replay methods have been proposed, which synthesize the replayed samples by inverting the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Few-Shot Unlearning by Model Inversion [3.486204232859346]
We consider the problem of machine unlearning: erasing a target dataset that causes unwanted model behavior.
We devise a new model inversion technique to retrieve the training data from the model, followed by filtering out samples similar to the target samples and then relearning.
We demonstrate that our method using only a subset of target data can outperform state-of-the-art methods that are given the full target data.
arXiv Detail & Related papers (2022-05-31T06:57:56Z)
- On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning [13.149070833843133]
Machine unlearning, i.e. having a model forget about some of its training data, has become increasingly important as privacy legislation promotes variants of the right-to-be-forgotten.
We first show that the definition that underlies approximate unlearning, which seeks to prove the approximately unlearned model is close to an exactly retrained model, is incorrect because one can obtain the same model using different datasets.
We then turn to exact unlearning approaches and ask how to verify their claims of unlearning.
arXiv Detail & Related papers (2021-10-22T16:16:56Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters; a minimal sketch of this idea appears after this list.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Self-training Improves Pre-training for Natural Language Understanding [63.78927366363178]
We study self-training as another way to leverage unlabeled data through semi-supervised learning.
We introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data.
Our approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks.
arXiv Detail & Related papers (2020-10-05T17:52:25Z)
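The closed-form update mentioned for "Machine Unlearning of Features and Labels" above can be made concrete on a convex model, where the influence-function estimate is well behaved. The NumPy sketch below is an illustration for L2-regularized logistic regression, not that paper's implementation; `unlearn_point`, the Newton-style update, and the regularization strength `lam` are all illustrative assumptions.

```python
# Minimal sketch: influence-function unlearning for L2-regularized logistic
# regression (labels in {0, 1}). Illustrative, not the paper's code.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_point(theta, x, y, lam):
    # Gradient of the regularized log loss at a single example (x, y).
    return (sigmoid(x @ theta) - y) * x + lam * theta

def hessian(theta, X, lam):
    # Average Hessian of the regularized log loss over the training inputs X.
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X / len(X) + lam * np.eye(len(theta))

def unlearn_point(theta, X_train, x_del, y_del, lam=1e-2):
    # First-order influence estimate of removing one training point:
    # theta' = theta + H^{-1} grad_loss(x_del, y_del) / n
    H = hessian(theta, X_train, lam)
    g = grad_point(theta, x_del, y_del, lam)
    return theta + np.linalg.solve(H, g) / len(X_train)
```

Because the update requires solving against the Hessian, exact closed-form removal is practical mainly for convex or small models; deeper networks call for approximations, which weakens the guarantee.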