Certified Machine Unlearning via Noisy Stochastic Gradient Descent
- URL: http://arxiv.org/abs/2403.17105v2
- Date: Fri, 11 Oct 2024 22:22:16 GMT
- Title: Certified Machine Unlearning via Noisy Stochastic Gradient Descent
- Authors: Eli Chien, Haoyu Wang, Ziang Chen, Pan Li,
- Abstract summary: Machine unlearning aims to efficiently remove the effect of certain data points on the trained model.
We propose to leverage noisy gradient descent for unlearning and establish its first approximate unlearning guarantee.
- Score: 20.546589699647416
- License:
- Abstract: ``The right to be forgotten'' ensured by laws for user data privacy becomes increasingly important. Machine unlearning aims to efficiently remove the effect of certain data points on the trained model parameters so that it can be approximately the same as if one retrains the model from scratch. We propose to leverage projected noisy stochastic gradient descent for unlearning and establish its first approximate unlearning guarantee under the convexity assumption. Our approach exhibits several benefits, including provable complexity saving compared to retraining, and supporting sequential and batch unlearning. Both of these benefits are closely related to our new results on the infinite Wasserstein distance tracking of the adjacent (un)learning processes. Extensive experiments show that our approach achieves a similar utility under the same privacy constraint while using $2\%$ and $10\%$ of the gradient computations compared with the state-of-the-art gradient-based approximate unlearning methods for mini-batch and full-batch settings, respectively.
Related papers
- FLOPS: Forward Learning with OPtimal Sampling [1.694989793927645]
gradient-based computation methods have recently gained focus for learning with only forward passes, also referred to as queries.
Conventional forward learning consumes enormous queries on each data point for accurate gradient estimation through Monte Carlo sampling.
We propose to allocate the optimal number of queries over each data in one batch during training to achieve a good balance between estimation accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-08T12:16:12Z) - Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions [11.955062839855334]
Machine unlearning algorithms aim to efficiently data from a model without it from scratch, in order to enforce data privacy, remove corrupted or outdated data, or respect a user's right to forgotten"
Our algorithm is black-box, in that it be directly applied to models with vanilla gradient descent with no prior consideration of unlearning.
arXiv Detail & Related papers (2024-09-15T15:58:08Z) - Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios [18.73206066109299]
Mini-Unlearning is a novel approach that capitalizes on a critical observation: unlearned parameters correlate with retrained parameters through contraction mapping.
This lightweight, scalable method significantly enhances model accuracy and strengthens resistance to membership inference attacks.
Our experiments demonstrate that Mini-Unlearning not only works under higher unlearning ratios but also outperforms existing techniques in both accuracy and security.
arXiv Detail & Related papers (2024-06-24T01:43:30Z) - CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective [48.99488315273868]
We present a contrastive knowledge distillation approach, which can be formulated as a sample-wise alignment problem with intra- and inter-sample constraints.
Our method minimizes logit differences within the same sample by considering their numerical values.
We conduct comprehensive experiments on three datasets including CIFAR-100, ImageNet-1K, and MS COCO.
arXiv Detail & Related papers (2024-04-22T11:52:40Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
Challenge is to discard information about the forget'' data without altering knowledge about remaining dataset.
We adopt a projected-gradient based learning method, named as Projected-Gradient Unlearning (PGU)
We provide empirically evidence to demonstrate that our unlearning method can produce models that behave similar to models retrained from scratch across various metrics even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Efficient Gradient Estimation via Adaptive Sampling and Importance
Sampling [34.50693643119071]
adaptive or importance sampling reduces noise in gradient estimation.
We present an algorithm that can incorporate existing importance functions into our framework.
We observe improved convergence in classification and regression tasks with minimal computational overhead.
arXiv Detail & Related papers (2023-11-24T13:21:35Z) - Fighting Uncertainty with Gradients: Offline Reinforcement Learning via
Diffusion Score Matching [22.461036967440723]
We study smoothed distance to data as an uncertainty metric, and claim that it has two beneficial properties.
We show these gradients can be efficiently learned with score-matching techniques.
We propose Score-Guided Planning (SGP) to enable first-order planning in high-dimensional problems.
arXiv Detail & Related papers (2023-06-24T23:40:58Z) - Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized
Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of exchanges.
We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z) - Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z) - Large Scale Private Learning via Low-rank Reparametrization [77.38947817228656]
We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks.
We are the first able to apply differential privacy on the BERT model and achieve an average accuracy of $83.9%$ on four downstream tasks.
arXiv Detail & Related papers (2021-06-17T10:14:43Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.