Deep Unlearn: Benchmarking Machine Unlearning
- URL: http://arxiv.org/abs/2410.01276v1
- Date: Wed, 2 Oct 2024 06:41:58 GMT
- Title: Deep Unlearn: Benchmarking Machine Unlearning
- Authors: Xavier F. Cadet, Anastasia Borovykh, Mohammad Malekzadeh, Sara Ahmadi-Abhari, Hamed Haddadi,
- Abstract summary: Machine unlearning (MU) aims to remove the influence of particular data points from the learnable parameters of a trained machine learning model.
This paper investigates 18 state-of-the-art MU methods across various benchmark datasets and models.
- Score: 7.450700594277741
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine unlearning (MU) aims to remove the influence of particular data points from the learnable parameters of a trained machine learning model. This is a crucial capability in light of data privacy requirements, trustworthiness, and safety in deployed models. MU is particularly challenging for deep neural networks (DNNs), such as convolutional nets or vision transformers, as such DNNs tend to memorize a notable portion of their training dataset. Nevertheless, the community lacks a rigorous and multifaceted study that looks into the success of MU methods for DNNs. In this paper, we investigate 18 state-of-the-art MU methods across various benchmark datasets and models, with each evaluation conducted over 10 different initializations, a comprehensive evaluation involving MU over 100K models. We show that, with the proper hyperparameters, Masked Small Gradients (MSG) and Convolution Transpose (CT), consistently perform better in terms of model accuracy and run-time efficiency across different models, datasets, and initializations, assessed by population-based membership inference attacks (MIA) and per-sample unlearning likelihood ratio attacks (U-LiRA). Furthermore, our benchmark highlights the fact that comparing a MU method only with commonly used baselines, such as Gradient Ascent (GA) or Successive Random Relabeling (SRL), is inadequate, and we need better baselines like Negative Gradient Plus (NG+) with proper hyperparameter selection.
Related papers
- MUSO: Achieving Exact Machine Unlearning in Over-Parameterized Regimes [19.664090734076712]
Machine unlearning (MU) makes a well-trained model behave as if it had never been trained on specific data.
We propose an alternating optimization algorithm that unifies the tasks of unlearning and relabeling.
The algorithm's effectiveness, confirmed through numerical experiments, highlights its superior performance in unlearning across various scenarios.
arXiv Detail & Related papers (2024-10-11T06:17:17Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z) - Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning [9.998859702421417]
Machine unlearning (MU) aims to eliminate the influence of chosen data points on model performance.
Despite various MU methods for data influence erasure, evaluations have largely focused on random data forgetting.
We propose identifying the data subset that presents the most significant challenge for influence erasure, pinpointing the worst-case forget set.
arXiv Detail & Related papers (2024-03-12T06:50:32Z) - Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs)
Our proposed method first trains SOMs on unlabeled data and then a minimal number of available labeled data points are assigned to key best matching units (BMU)
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Stabilizing Subject Transfer in EEG Classification with Divergence
Estimation [17.924276728038304]
We propose several graphical models to describe an EEG classification task.
We identify statistical relationships that should hold true in an idealized training scenario.
We design regularization penalties to enforce these relationships in two stages.
arXiv Detail & Related papers (2023-10-12T23:06:52Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using Bayesian neural network (BNN)
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent
Unsupervised Learning Using Mutual Information Maximization [29.368950377171995]
We introduce Mutual-information-based Unsupervised & Semi-supervised Concurrent LEarning (MUSCLE) to combine both unsupervised and semi-supervised learning.
MUSCLE can be used as a stand-alone training scheme for neural networks, and can also be incorporated into other learning approaches.
We show that the proposed hybrid model outperforms state of the art on several standard benchmarks, including CIFAR-10, CIFAR-100, and Mini-Imagenet.
arXiv Detail & Related papers (2020-11-30T23:01:04Z) - Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.