Approximate Data Deletion from Machine Learning Models
- URL: http://arxiv.org/abs/2002.10077v2
- Date: Tue, 23 Feb 2021 18:56:03 GMT
- Title: Approximate Data Deletion from Machine Learning Models
- Authors: Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, James Zou
- Abstract summary: Deleting data from a trained machine learning (ML) model is a critical task in many applications.
We propose a new approximate deletion method for linear and logistic models.
We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.
- Score: 31.689174311625084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deleting data from a trained machine learning (ML) model is a critical task
in many applications. For example, we may want to remove the influence of
training points that might be out of date or outliers. Regulations such as EU's
General Data Protection Regulation also stipulate that individuals can request
to have their data deleted. The naive approach to data deletion is to retrain
the ML model on the remaining data, but this is too time consuming. In this
work, we propose a new approximate deletion method for linear and logistic
models whose computational cost is linear in the feature dimension $d$ and
independent of the number of training points $n$. This is a significant gain
over existing methods, which all have superlinear time dependence on the
dimension. We also develop a new feature-injection test to evaluate the
thoroughness of data deletion from ML models.
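The paper's projective residual update is not reproduced in this summary, so as orientation, here is a minimal sketch of the classical exact baseline such methods improve on: removing a single point from a ridge regression via a Sherman-Morrison rank-one downdate. This costs $O(d^2)$ per deletion instead of a full retrain; the paper's method brings the per-deletion cost down further, to $O(d)$. All function names below are illustrative.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Fit ridge regression, caching the quantities needed for fast deletion."""
    d = X.shape[1]
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))  # (X^T X + lam*I)^{-1}
    b = X.T @ y
    return A_inv @ b, A_inv, b  # theta, cached inverse, cached X^T y

def delete_point(A_inv, b, x, y):
    """Exactly remove one training point (x, y) in O(d^2) via the
    Sherman-Morrison rank-one downdate, avoiding a full retrain."""
    Ax = A_inv @ x
    A_inv = A_inv + np.outer(Ax, Ax) / (1.0 - x @ Ax)  # (A - x x^T)^{-1}
    b = b - y * x
    return A_inv @ b, A_inv, b
```

The updated parameters match retraining on the remaining data up to floating-point error, which makes this a convenient reference point when evaluating faster approximate methods.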
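The feature-injection test can be sketched in the same setting, reusing the illustrative helpers above: append a synthetic feature that only the to-be-deleted points carry and that correlates with their labels; after a thorough deletion, the learned weight on that feature should collapse toward zero.

```python
def feature_injection_test(X, y, delete_idx, lam=1.0):
    """Inject a synthetic feature carried only by the to-be-deleted points and
    compare its learned weight before and after deletion; a large residual
    weight indicates the deletion was not thorough."""
    inj = np.zeros(X.shape[0])
    inj[delete_idx] = np.sign(y[delete_idx])  # correlated with deleted labels
    X_aug = np.hstack([X, inj[:, None]])
    theta, A_inv, b = fit_ridge(X_aug, y, lam)
    w_before = theta[-1]
    for i in delete_idx:
        theta, A_inv, b = delete_point(A_inv, b, X_aug[i], y[i])
    return w_before, theta[-1]  # injected-feature weight before vs. after
```

With the exact downdate the after-deletion weight is (numerically) zero, since no remaining point carries the injected feature; substituting an approximate deletion method in the loop turns the residual weight into a measure of how thoroughly the deleted points were scrubbed.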
Related papers
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU); a toy sketch of the gradient-projection idea appears after this list.
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - AI Model Disgorgement: Methods and Choices [127.54319351058167]
We introduce a taxonomy of possible disgorgement methods that are applicable to modern machine learning systems.
We investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.
arXiv Detail & Related papers (2023-04-07T08:50:18Z) - Machine Unlearning Method Based On Projection Residual [23.24026891609028]
This paper adopts a projection residual method based on Newton's method.
The goal is to perform machine unlearning in both linear regression models and neural network models.
Experiments show that this method deletes data more thoroughly, with results close to retraining the model.
arXiv Detail & Related papers (2022-09-30T07:29:55Z) - Datamodels: Predicting Predictions from Training Data [86.66720175866415]
We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data.
We show that even simple linear datamodels can successfully predict model outputs.
arXiv Detail & Related papers (2022-02-01T18:15:24Z) - Zero-Shot Machine Unlearning [6.884272840652062]
Modern privacy regulations grant citizens the right to be forgotten by products, services and companies.
In this zero-shot setting, no data related to the training process or the training samples is accessible for unlearning.
We propose two novel solutions for zero-shot machine unlearning based on (a) error minimizing-maximizing noise and (b) gated knowledge transfer.
arXiv Detail & Related papers (2022-01-14T19:16:09Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first method for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters; a generic sketch of this style of update appears after this list.
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Certifiable Machine Unlearning for Linear Models [1.484852576248587]
Machine unlearning is the task of updating machine learning (ML) models after a subset of the training data they were trained on is deleted.
We present an experimental study of the three state-of-the-art approximate unlearning methods for linear models.
arXiv Detail & Related papers (2021-06-29T05:05:58Z) - Supervised Machine Learning with Plausible Deniability [1.685485565763117]
We study the question of how well machine learning (ML) models trained on a certain data set provide privacy for the training data.
We show that one can take a set of purely random training data and from it define a suitable "learning rule" that will produce an ML model exactly equal to a given target model $f$.
arXiv Detail & Related papers (2021-06-08T11:54:51Z) - Certified Data Removal from Machine Learning Models [79.91502073022602]
Good data stewardship requires removal of data at the request of the data's owner.
This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request.
We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with (stated formally after this list).
arXiv Detail & Related papers (2019-11-08T03:57:41Z)
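For the Projected-Gradient Unlearning (PGU) entry above, the core idea can be illustrated as projecting each unlearning gradient onto the orthogonal complement of the subspace spanned by retained-data gradients, so the update leaves directions that matter for the retained data untouched. This is a toy sketch of gradient projection in general, not the paper's exact algorithm; all names are illustrative.

```python
import numpy as np

def project_out_retain(g_forget, G_retain):
    """Project an unlearning gradient onto the orthogonal complement of the
    subspace spanned by retained-data gradients (rows of G_retain)."""
    Q, _ = np.linalg.qr(G_retain.T)          # orthonormal basis of the span
    return g_forget - Q @ (Q.T @ g_forget)   # strip components along the span

# One unlearning step then moves the parameters only in directions
# orthogonal to the retained-data gradients, e.g.:
# theta -= lr * project_out_retain(grad_on_forget_set(theta), retain_grads)
```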
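The projection-residual and features-and-labels entries above both build on a Newton-step, influence-function-style update: starting from the trained parameters, add back the inverse-Hessian-weighted gradients of the deleted points. A minimal sketch for L2-regularized logistic regression follows; it shows the generic recipe, not any one paper's exact algorithm, and all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_unlearn(theta, X_forget, y_forget, X_retain, y_retain, lam=1e-2):
    """One-step unlearning for L2-regularized logistic regression (y in {0,1}):
    theta' = theta + H_retain^{-1} * (sum of per-example gradients of the
    deleted points), where H_retain is the loss Hessian on the retained data."""
    g = X_forget.T @ (sigmoid(X_forget @ theta) - y_forget)   # summed gradients
    p = sigmoid(X_retain @ theta)
    H = (X_retain.T * (p * (1 - p))) @ X_retain + lam * np.eye(len(theta))
    return theta + np.linalg.solve(H, g)
```

The step is exact for quadratic losses and only approximate otherwise, which is why works such as the certified-removal paper above additionally bound the error the approximation leaves behind.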
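Finally, the certified-removal guarantee referenced in the last entry can be stated formally. Up to minor notational changes from that paper: a removal mechanism $M$ performs $\epsilon$-certified removal for a (randomized) learning algorithm $A$ if, for every measurable set of models $\mathcal{T}$, dataset $D$, and point $x \in D$,

$$ e^{-\epsilon} \;\le\; \frac{P\big(M(A(D), D, x) \in \mathcal{T}\big)}{P\big(A(D \setminus \{x\}) \in \mathcal{T}\big)} \;\le\; e^{\epsilon}, $$

i.e., the post-removal model is statistically indistinguishable, up to a factor of $e^{\pm\epsilon}$, from a model retrained without $x$.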
This list is automatically generated from the titles and abstracts of the papers in this site.