Few-Shot Unlearning by Model Inversion
- URL: http://arxiv.org/abs/2205.15567v1
- Date: Tue, 31 May 2022 06:57:56 GMT
- Title: Few-Shot Unlearning by Model Inversion
- Authors: Youngsik Yoon, Jinhwan Nam, Hyojeong Yun, Dongwoo Kim, Jungseul Ok
- Abstract summary: We consider the problem of machine unlearning to erase a target dataset, which causes an unwanted behavior.
We devise a new model inversion technique to retrieve the training data from the model, followed by filtering out samples similar to the target samples and then relearning.
We demonstrate that our method, using only a subset of the target data, can outperform state-of-the-art methods that are given a full indication of the target data.
- Score: 3.486204232859346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of machine unlearning to erase a target dataset,
which causes an unwanted behavior, from the trained model when the training
dataset is not given. Previous works have assumed that the target dataset
indicates all the training data imposing the unwanted behavior. However, it is
often infeasible to obtain such a complete indication. We hence address a
practical scenario of unlearning provided a few samples of target data,
so-called few-shot unlearning. To this end, we devise a straightforward
framework, including a new model inversion technique to retrieve the training
data from the model, followed by filtering out samples similar to the target
samples and then relearning. We demonstrate that our method, using only a
subset of the target data, can outperform state-of-the-art methods that are
given a full indication of the target data.
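The framework in the abstract has three steps: invert the model to recover proxy training data, filter out proxies similar to the few given target samples, and relearn on the rest. A minimal sketch of that pipeline, assuming hypothetical `inversion_fn` and `retrain_fn` interfaces and a cosine-similarity filter (the metric and threshold here are illustrative choices, not the authors' exact method):

```python
import numpy as np

def few_shot_unlearn(model, target_samples, inversion_fn, retrain_fn,
                     similarity_threshold=0.9):
    # Step 1: model inversion recovers a proxy for the (unavailable)
    # training set from the trained model.
    proxy_data = inversion_fn(model)

    # Cosine similarity is one plausible (assumed) choice of metric.
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Step 2: drop proxies that resemble any of the few target samples.
    kept = [x for x in proxy_data
            if max(cosine(x, t) for t in target_samples) < similarity_threshold]

    # Step 3: relearn on the filtered proxy data.
    return retrain_fn(kept)
```

In this sketch the filtering step is where "few-shot" matters: only the handful of provided target samples is needed to decide which recovered proxies to discard.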
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Data Selection for Transfer Unlearning [14.967546081883034]
We advocate for a relaxed definition of unlearning that does not address privacy applications.
We propose a new method that uses a mechanism for selecting relevant examples from an auxiliary "static" dataset.
We find that our method outperforms the gold standard "exact unlearning" on several datasets.
arXiv Detail & Related papers (2024-05-16T20:09:41Z)
- Distilled Datamodel with Reverse Gradient Matching [74.75248610868685]
We introduce an efficient framework for assessing data impact, comprising offline training and online evaluation stages.
Our proposed method achieves comparable model behavior evaluation while significantly speeding up the process compared to the direct retraining method.
arXiv Detail & Related papers (2024-04-22T09:16:14Z)
- An Information Theoretic Approach to Machine Unlearning [45.600917449314444]
A key challenge in unlearning is forgetting the necessary data in a timely manner while preserving model performance.
In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten.
We derive a simple but principled zero-shot unlearning method based on the geometry of the model.
arXiv Detail & Related papers (2024-02-02T13:33:30Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
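The gradient-projection idea behind PGU can be illustrated in isolation: the unlearning gradient is projected onto the orthogonal complement of a subspace representing the retained data, so the update minimally disturbs remaining knowledge. A sketch assuming an orthonormal basis for that subspace is already available (its construction, e.g. from an SVD of retained-data representations, is omitted here):

```python
import numpy as np

def project_orthogonal(grad, retained_basis):
    # Remove from `grad` its component along each (orthonormal) basis
    # vector of the retained-data subspace; what remains is the part of
    # the unlearning step orthogonal to the retained knowledge.
    for b in retained_basis:
        grad = grad - (grad @ b) * b
    return grad
```

Applying this projection to every unlearning update is what keeps the model's behavior on the remaining data close to a model retrained from scratch.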
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Data-Free Model Extraction Attacks in the Context of Object Detection [0.6719751155411076]
A significant number of machine learning models are vulnerable to model extraction attacks.
We propose an adversarial black-box attack extended to the regression problem of predicting bounding box coordinates in object detection.
We find that the proposed model extraction method achieves significant results by using reasonable queries.
arXiv Detail & Related papers (2023-08-09T06:23:54Z)
- Building Manufacturing Deep Learning Models with Minimal and Imbalanced Training Data Using Domain Adaptation and Data Augmentation [15.333573151694576]
We propose a novel domain adaptation (DA) approach to address the problem of labeled training data scarcity for a target learning task.
Our approach works for scenarios where the source dataset and the dataset available for the target learning task have the same or different feature spaces.
We evaluate our combined approach using image data for wafer defect prediction.
arXiv Detail & Related papers (2023-05-31T21:45:34Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, of which the goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
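The second idea, pinpointing parameters "guilty" of propagating unwanted information, can be sketched with a simple importance score; the mean-squared-gradient score and top-fraction selection below are illustrative assumptions, not the paper's exact metric:

```python
import numpy as np

def select_guilty_params(grads_on_forget, top_frac=0.01):
    # Importance score: mean squared gradient over the forget instances
    # (a Fisher-style proxy; the paper's actual metric may differ).
    importance = np.mean(np.square(grads_on_forget), axis=0)
    # Keep the top fraction of parameters by importance; these are the
    # candidates for targeted updates during unlearning.
    k = max(1, int(top_frac * importance.size))
    threshold = np.sort(importance.ravel())[-k]
    return importance >= threshold
```

A mask like this lets the unlearning update touch only the implicated parameters, which is one way to reduce forgetting on the remaining data.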
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction on new test data without any opportunity to learn from a training set of labelled data.
We are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
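Self-distillation as a regularizer can be sketched as a loss that mixes the task cross-entropy with a KL term pulling the current model toward a frozen copy of itself from before the further pre-training stage; the loss weighting and temperature below are illustrative assumptions:

```python
import numpy as np

def self_distillation_loss(student_logits, teacher_logits, labels,
                           alpha=0.5, temperature=2.0):
    # Softmax with the usual max-subtraction for numerical stability.
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    # KL(teacher || student) at a softened temperature: the
    # self-distillation regularizer.
    p_teacher = softmax(teacher_logits / temperature)
    log_p_student = np.log(softmax(student_logits / temperature) + 1e-12)
    kl = np.mean(np.sum(
        p_teacher * (np.log(p_teacher + 1e-12) - log_p_student), axis=-1))

    # Standard cross-entropy on the task labels.
    p = softmax(student_logits)
    ce = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

    # alpha balances the task loss against the distillation term.
    return alpha * ce + (1 - alpha) * (temperature ** 2) * kl
```

When the student equals its frozen teacher the KL term vanishes, so the regularizer only penalizes drift away from the pre-trained starting point.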
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.