On Newton's Method to Unlearn Neural Networks
- URL: http://arxiv.org/abs/2406.14507v1
- Date: Thu, 20 Jun 2024 17:12:20 GMT
- Title: On Newton's Method to Unlearn Neural Networks
- Authors: Nhung Bui, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsiang Low
- Abstract summary: We propose a cubic-regularized Newton's method for unlearning an NN.
We show that our method is more resilient to catastrophic forgetting and performs better than the baselines.
- Score: 17.163054897098068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine unlearning facilitates personal data ownership, including the "right to be forgotten". The proliferation of applications of neural networks (NNs) trained on users' personal data calls for algorithms to unlearn an NN. Since retraining is costly, efficiency is often achieved through approximate unlearning, which aims to make the unlearned NN close (in distribution) to the retrained one. Though Newton's method has been used in previous works to approximately unlearn linear models, adapting it to unlearn an NN often encounters degenerate Hessians that make computing the Newton update impossible. In this paper, we first show that, when coupled with naive yet often effective solutions to mitigate the degeneracy issue for unlearning, Newton's method surprisingly suffers from catastrophic forgetting. To overcome this difficulty, we revise Newton's method to include a theoretically justified regularizer and propose a cubic-regularized Newton's method for unlearning an NN. The cubic regularizer comes with the benefits of not requiring manual fine-tuning and affording a natural interpretation. Empirical evaluation on several models and real-world datasets shows that our method is more resilient to catastrophic forgetting and performs better than the baselines, especially in sequential unlearning.
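To make the update concrete, here is a minimal NumPy sketch of one cubic-regularized Newton step. Solving the cubic subproblem by bisection on the step radius is a standard technique; computing the gradient and Hessian on the retained data, and every name below, are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def cubic_newton_step(g, H, M, tol=1e-8):
    """Minimize the cubic model  g^T s + 0.5 s^T H s + (M/6)||s||^3
    by bisection on r = ||s||, using s(r) = -(H + (M r / 2) I)^{-1} g."""
    I = np.eye(g.shape[0])
    lam_min = np.linalg.eigvalsh(H)[0]
    # Smallest radius keeping H + (M r / 2) I positive definite.
    lo = max(0.0, -2.0 * lam_min / M) + 1e-10

    def s_of(r):
        return np.linalg.solve(H + 0.5 * M * r * I, -g)

    hi = lo + 1.0
    while np.linalg.norm(s_of(hi)) > hi:   # expand until the fixed point is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.linalg.norm(s_of(mid)) > mid else (lo, mid)
    return s_of(hi)

# Toy unlearning usage: step from the trained parameters using the gradient
# and Hessian of the loss on the *retained* data (an assumption about the setup).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H_retain = A @ A.T - 0.5 * np.eye(5)   # possibly indefinite, like an NN Hessian
g_retain = rng.standard_normal(5)
theta = rng.standard_normal(5)
theta_unlearned = theta + cubic_newton_step(g_retain, H_retain, M=10.0)
```

Unlike a plain Newton step, the subproblem stays well-posed even when H is indefinite or degenerate, since H + (M r / 2) I is shifted into positive definiteness, which is the property the abstract credits for sidestepping degenerate Hessians.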
Related papers
- Variational Learning is Effective for Large Deep Networks [76.94351631300788]
We show that an Improved Variational Online Newton (IVON) optimizer consistently matches or outperforms Adam for training large networks.
IVON's computational costs are nearly identical to Adam but its predictive uncertainty is better.
We find overwhelming evidence that variational learning is effective.
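The summary does not spell out the update, and IVON's published rule has more moving parts, but a schematic variational-online-Newton step looks roughly like the NumPy sketch below; grad_fn, the nonnegativity clamp, and all hyperparameters are assumptions for illustration.

```python
import numpy as np

def von_step(m, h, grad_fn, lr=0.1, beta2=0.999, delta=1e-8, rng=None):
    """One schematic variational-online-Newton step on a diagonal Gaussian
    posterior q = N(m, diag(1/(h + delta))); grad_fn(theta) returns a
    minibatch gradient at theta."""
    rng = rng or np.random.default_rng()
    sigma = 1.0 / np.sqrt(h + delta)
    theta = m + sigma * rng.standard_normal(m.shape)        # sample weights from q
    g = grad_fn(theta)
    h_hat = g * (theta - m) / sigma**2                      # Stein-style curvature estimate
    h = np.maximum(beta2 * h + (1.0 - beta2) * h_hat, 0.0)  # simplification: clamp precision
    m = m - lr * g / (h + delta)                            # Newton-like preconditioned step
    return m, h
```

Predictive uncertainty then comes essentially for free: sample several theta from q and average the resulting predictions.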
arXiv Detail & Related papers (2024-02-27T16:11:05Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, storing such data is often impractical due to memory constraints or data privacy concerns.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
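A minimal PyTorch sketch of the inversion idea, assuming a frozen classifier and synthetic inputs optimized to be labeled as the old classes; real data-free replay methods add priors (e.g., feature statistics) that are omitted here.

```python
import torch
import torch.nn.functional as F

def invert_replay_samples(model, class_ids, shape, steps=200, lr=0.1):
    """Data-free replay: synthesize inputs that the frozen classifier labels
    as the old classes, by optimizing the inputs themselves."""
    for p in model.parameters():
        p.requires_grad_(False)          # only the synthetic inputs are optimized
    model.eval()
    x = torch.randn(len(class_ids), *shape, requires_grad=True)
    y = torch.tensor(class_ids)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()   # pull x toward the old labels
        opt.step()
    return x.detach()
```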
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
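The summary suggests updates constrained away from directions that matter for the retained data. A hedged NumPy sketch of one way that could be realized is below; the SVD subspace construction and gradient ascent on the forget loss are assumptions, not necessarily PGU's exact procedure.

```python
import numpy as np

def complement_projector(retain_grads, energy=0.99):
    """Projector onto the orthogonal complement of the span of retained-data
    gradients (rows of retain_grads: one flattened gradient per sample)."""
    U, S, _ = np.linalg.svd(retain_grads.T, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(S**2) / np.sum(S**2), energy)) + 1
    B = U[:, :k]                       # basis capturing most retained-gradient energy
    return np.eye(B.shape[0]) - B @ B.T

def pgu_step(theta, forget_grad, P, lr=1e-2):
    # Ascend the forget-set loss only in directions that (approximately)
    # do not change the model's behavior on the retained data.
    return theta + lr * (P @ forget_grad)
```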
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs).
LURE alternates between an unlearning phase, which selectively forgets undesirable information in the model, and a relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
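A rough PyTorch sketch of one unlearn/relearn round follows; resetting the smallest-magnitude weights is an illustrative stand-in for LURE's actual criterion for "undesirable information", and train_step/loader are assumed user-supplied.

```python
import torch

def lure_round(model, train_step, loader, unlearn_fraction=0.1, relearn_steps=100):
    """One unlearn/relearn round (schematic, not LURE's exact selection rule)."""
    with torch.no_grad():
        for p in model.parameters():
            k = max(1, int(unlearn_fraction * p.numel()))
            thresh = p.abs().flatten().kthvalue(k).values
            mask = p.abs() <= thresh
            p[mask] = 0.01 * torch.randn_like(p)[mask]     # unlearn: reinitialize
    for _, batch in zip(range(relearn_steps), loader):      # relearn phase
        train_step(model, batch)
    return model
```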
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
- Improving Representational Continuity via Continued Pretraining [76.29171039601948]
Linear probing then fine-tuning (LP-FT), a technique from the transfer learning community, outperforms naive training and other continual learning methods.
LP-FT also reduces forgetting on a real-world satellite remote sensing dataset (FMoW).
A variant of LP-FT achieves state-of-the-art accuracy on an NLP continual learning benchmark.
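LP-FT itself is easy to sketch. In the PyTorch outline below, train_epoch is an assumed user-supplied loop over the current task's data, and the learning rates are placeholders.

```python
import torch

def lp_ft(model, head, train_epoch, probe_epochs=5, ft_epochs=5,
          probe_lr=1e-2, ft_lr=1e-4):
    """Linear probing then fine-tuning (LP-FT), sketched.
    head is the model's final linear layer; train_epoch(model, opt) runs
    one epoch on the current task."""
    for p in model.parameters():       # phase 1: freeze the backbone ...
        p.requires_grad = False
    for p in head.parameters():        # ... and fit only the linear head
        p.requires_grad = True
    opt = torch.optim.SGD(head.parameters(), lr=probe_lr)
    for _ in range(probe_epochs):
        train_epoch(model, opt)
    for p in model.parameters():       # phase 2: unfreeze everything and
        p.requires_grad = True         # fine-tune at a smaller learning rate
    opt = torch.optim.SGD(model.parameters(), lr=ft_lr)
    for _ in range(ft_epochs):
        train_epoch(model, opt)
    return model
```

The point of probing first is that the head is fitted while the pretrained features are intact, so the subsequent fine-tuning starts from a head that does not drag the backbone far from its pretrained representation.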
arXiv Detail & Related papers (2023-02-26T10:39:38Z)
- DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning [29.80680408934347]
We propose an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation.
Our method takes advantage of a linearization technique for the pre-trained neural network, yielding simple and effective continual learning.
We show that our method can be applied to general continual learning settings; we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
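A minimal sketch of the linearization step using torch.func; the surrounding continual-learning machinery of DLCFT is omitted, and treating only the offsets delta as trainable is what makes the model linear in its parameters.

```python
import torch
from torch.func import functional_call, jvp

def linearized_forward(model, params0, delta, x):
    """First-order Taylor expansion around the pretrained parameters:
        f_lin(x) = f(x; params0) + J(x) @ delta.
    The Jacobian-vector product avoids forming the Jacobian explicitly."""
    def f(p):
        return functional_call(model, p, (x,))
    out, tangent_out = jvp(f, (params0,), (delta,))
    return out + tangent_out

# Hypothetical setup: freeze params0, train only the offsets delta.
# params0 = {n: p.detach() for n, p in model.named_parameters()}
# delta   = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
```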
arXiv Detail & Related papers (2022-08-17T06:58:14Z)
- It's DONE: Direct ONE-shot learning without training optimization [0.7340017786387767]
Direct ONE-shot learning (DONE) is the simplest way to learn a new concept from one example.
DONE adds a new class to a pretrained deep neural network (DNN) classifier with neither training optimization nor other-classes modification.
DONE requires just one inference to obtain the output of the final dense layer.
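A tiny NumPy sketch of the mechanism, assuming access to the penultimate-layer feature of the single example; the plain standardization below stands in for the normalization DONE actually applies when matching the new row to the existing weight statistics.

```python
import numpy as np

def done_add_class(W, feature, eps=1e-8):
    """Append a weight row for a brand-new class to the final dense layer,
    derived from one example's penultimate-layer feature, with no training
    and no modification of the existing class rows in W."""
    w_new = (feature - feature.mean()) / (feature.std() + eps)
    return np.vstack([W, w_new])
```

Classifying the new concept afterward only needs the classifier's usual forward pass; the single extra inference is the one that produced the example's feature vector.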
arXiv Detail & Related papers (2022-04-28T09:09:37Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that classifies a test image by aggregating its distances to the top-k nearest neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
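A self-contained NumPy sketch of the two-step recipe, assuming features have already been extracted by a frozen pretrained encoder; cosine similarity and majority voting are common choices, not necessarily the paper's exact ones.

```python
import numpy as np

def knn_predict(test_feats, train_feats, train_labels, k=20):
    """Classify by majority vote over the k most similar training features
    (cosine similarity) produced by a frozen pretrained encoder."""
    a = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    b = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = a @ b.T
    idx = np.argpartition(-sims, k, axis=1)[:, :k]   # indices of top-k neighbors
    preds = []
    for row in idx:
        labels, counts = np.unique(train_labels[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)
```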
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Research of Damped Newton Stochastic Gradient Descent Method for Neural Network Training [6.231508838034926]
First-order methods like stochastic gradient descent (SGD) are currently the most popular optimization methods for training deep neural networks (DNNs).
In this paper, we propose the Damped Newton Descent (DN-SGD) and Gradient Descent Damped Newton (SGD-DN) methods to train DNNs for regression problems with Mean Square Error (MSE) and for classification problems with Cross-Entropy Loss (CEL).
Our methods accurately compute updates for only a small part of the parameters, which greatly reduces the computational cost and makes the learning process faster and more accurate than SGD.
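Reading "a small part of the parameters" as, for example, the last layer, one damped-Newton update for that block might look like the NumPy sketch below, with all other parameters taking ordinary SGD steps; the block choice and damping value are assumptions.

```python
import numpy as np

def damped_newton_step(w, g, H, damping=1e-2):
    """Damped Newton update for one parameter block (e.g. the last layer):
    solve (H + damping * I) step = g instead of inverting a possibly
    ill-conditioned Hessian; the rest of the network takes plain SGD steps."""
    step = np.linalg.solve(H + damping * np.eye(g.size), g.ravel())
    return w - step.reshape(w.shape)
```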
arXiv Detail & Related papers (2021-03-31T02:07:18Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- Neural Network Retraining for Model Serving [32.857847595096025]
We propose incremental (re)training of a neural network model to cope with a continuous flow of new data arriving at inference time.
We address two challenges of life-long retraining: catastrophic forgetting and efficient retraining.
arXiv Detail & Related papers (2020-04-29T13:52:28Z)