Mendata: A Framework to Purify Manipulated Training Data
- URL: http://arxiv.org/abs/2312.01281v1
- Date: Sun, 3 Dec 2023 04:40:08 GMT
- Title: Mendata: A Framework to Purify Manipulated Training Data
- Authors: Zonghao Huang, Neil Gong, Michael K. Reiter
- Abstract summary: We propose Mendata, a framework to purify manipulated training data.
Mendata perturbs the training inputs so that they retain their utility but are distributed similarly to the reference data.
We demonstrate the effectiveness of Mendata by applying it to defeat state-of-the-art data poisoning and data tracing techniques.
- Score: 12.406255198638064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Untrusted data used to train a model might have been manipulated to endow the
learned model with hidden properties that the data contributor might later
exploit. Data purification aims to remove such manipulations prior to training
the model. We propose Mendata, a novel framework to purify manipulated training
data. Starting from a small reference dataset in which a large majority of the
inputs are clean, Mendata perturbs the training inputs so that they retain
their utility but are distributed similarly (as measured by Wasserstein
distance) to the reference data, thereby eliminating hidden properties from the
learned model. A key challenge is how to find such perturbations, which we
address by formulating a min-max optimization problem and developing a two-step
method to iteratively solve it. We demonstrate the effectiveness of Mendata by
applying it to defeat state-of-the-art data poisoning and data tracing
techniques.
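To make the min-max formulation concrete, here is a minimal PyTorch sketch of the idea: an inner maximization fits a critic to the Kantorovich-Rubinstein dual of the Wasserstein distance between the perturbed training inputs and the reference data, and an outer minimization updates per-input perturbations against it. The critic architecture, weight clipping, learning rates, and the L2 penalty standing in for "retain their utility" are illustrative assumptions, not the paper's implementation.
```python
import torch
import torch.nn as nn

def purify(train_x, ref_x, steps=200, critic_steps=5, lam=0.1, clip=0.01):
    d = train_x.shape[1]
    critic = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, 1))
    opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-4)
    delta = torch.zeros_like(train_x, requires_grad=True)  # per-input perturbation
    opt_d = torch.optim.Adam([delta], lr=1e-2)
    for _ in range(steps):
        # Step 1 (max over the critic): estimate W1(perturbed, reference)
        # via the Kantorovich-Rubinstein dual.
        for _ in range(critic_steps):
            gap = critic(train_x + delta.detach()).mean() - critic(ref_x).mean()
            opt_c.zero_grad()
            (-gap).backward()
            opt_c.step()
            with torch.no_grad():
                for p in critic.parameters():  # crude 1-Lipschitz constraint
                    p.clamp_(-clip, clip)
        # Step 2 (min over the perturbations): shrink the estimated distance;
        # the L2 penalty is a stand-in for keeping the inputs useful.
        loss = critic(train_x + delta).mean() + lam * delta.pow(2).mean()
        opt_d.zero_grad()
        loss.backward()
        opt_d.step()
    return (train_x + delta).detach()

# Toy usage: 500 possibly manipulated inputs, 100 mostly clean references.
purified = purify(torch.randn(500, 32), torch.randn(100, 32))
```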
Related papers
- Data Taggants: Dataset Ownership Verification via Harmless Targeted Data Poisoning [12.80649024603656]
This paper introduces data taggants, a novel non-backdoor dataset ownership verification technique.
We validate our approach through comprehensive and realistic experiments on ImageNet1k using ViT and ResNet models with state-of-the-art training recipes.
arXiv Detail & Related papers (2024-10-09T12:49:23Z)
- TCGU: Data-centric Graph Unlearning based on Transferable Condensation [36.670771080732486]
Transferable Condensation Graph Unlearning (TCGU) is a data-centric solution to zero-glance graph unlearning.
We show that TCGU achieves superior performance to existing GU methods in terms of model utility, unlearning efficiency, and unlearning efficacy.
arXiv Detail & Related papers (2024-10-09T02:14:40Z)
- Corrective Machine Unlearning [22.342035149807923]
We formalize Corrective Machine Unlearning as the problem of mitigating the impact of data affected by unknown manipulations on a trained model.
We find most existing unlearning methods, including retraining-from-scratch without the deletion set, require most of the manipulated data to be identified for effective corrective unlearning.
One approach, Selective Synaptic Dampening, achieves limited success, unlearning adverse effects with just a small portion of the manipulated samples in our setting.
arXiv Detail & Related papers (2024-02-21T18:54:37Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU); a toy sketch of the projection idea follows this entry.
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
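A toy sketch of the projection idea from the entry above, under assumed details: build a basis for the subspace spanned by gradients on retained data via an SVD, then ascend the loss on the forget data only in directions orthogonal to that subspace, so the update minimally disturbs retained knowledge. The helper names (`pgu_step`, `project_out`), the batch-gradient basis, and the hyperparameters are illustrative; the paper's exact subspace construction is not given in this summary.
```python
import torch

def project_out(update, basis):
    """Remove from `update` its components along the rows of `basis`."""
    return update - basis.T @ (basis @ update)

def pgu_step(model, loss_fn, forget_batch, retain_batches, lr=1e-2, k=10):
    params = [p for p in model.parameters() if p.requires_grad]
    # Basis for the "retained knowledge" subspace from per-batch gradients.
    grads = []
    for x, y in retain_batches:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        grads.append(torch.cat([p.grad.flatten() for p in params]))
    _, _, Vt = torch.linalg.svd(torch.stack(grads), full_matrices=False)
    basis = Vt[:k]                              # top-k right singular vectors
    # Ascend the forget-data loss, restricted to the orthogonal complement.
    model.zero_grad()
    x, y = forget_batch
    (-loss_fn(model(x), y)).backward()          # ascent = descend the negation
    g = project_out(torch.cat([p.grad.flatten() for p in params]), basis)
    with torch.no_grad():
        i = 0
        for p in params:
            n = p.numel()
            p -= lr * g[i:i + n].view_as(p)
            i += n
```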
- AI Model Disgorgement: Methods and Choices [127.54319351058167]
We introduce a taxonomy of possible disgorgement methods that are applicable to modern machine learning systems.
We investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.
arXiv Detail & Related papers (2023-04-07T08:50:18Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information; both ideas are sketched after this entry.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
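A loose sketch of the two ideas in the entry above, with assumed losses and names: FGSM-style adversarial examples of the retained data help preserve representations, and a diagonal Fisher-style importance score confines the unlearning update to low-importance parameters. The median-based mask and the specific objective are illustrative stand-ins, not the paper's method.
```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Adversarial examples of retained inputs (idea 1)."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

def weight_importance(model, x, y):
    """Diagonal Fisher-style importance on retained data (idea 2)."""
    model.zero_grad()
    F.cross_entropy(model(x), y).backward()
    return [p.grad.pow(2) for p in model.parameters()]

def unlearn_step(model, forget, retain, lr=1e-3, eps=0.03):
    xf, yf = forget
    xr, yr = retain
    imp = weight_importance(model, xr, yr)
    x_adv = fgsm(model, xr, yr, eps)
    model.zero_grad()
    # Raise the loss on forget instances; keep it low on adversarially
    # augmented retained instances (representation-level anchoring).
    loss = -F.cross_entropy(model(xf), yf) + F.cross_entropy(model(x_adv), yr)
    loss.backward()
    with torch.no_grad():
        for p, m in zip(model.parameters(), imp):
            mask = (m < m.median()).float()     # spare high-importance weights
            p -= lr * mask * p.grad
```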
- On-the-fly Denoising for Data Augmentation in Natural Language Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improves performance on both text classification and question-answering tasks; a minimal version of the soft-label loss is sketched after this entry.
arXiv Detail & Related papers (2022-12-20T18:58:33Z)
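A minimal version of the soft-label idea from the entry above: the student's loss on an augmented input mixes the (possibly noisy) hard label with soft targets from a teacher trained on the cleaner original data, damping augmentation noise. The mixing weight alpha, temperature T, and the function name are assumptions; the paper's on-the-fly weighting scheme is not reproduced here.
```python
import torch
import torch.nn.functional as F

def denoised_aug_loss(student, teacher, x_aug, y, alpha=0.5, T=2.0):
    s_logits = student(x_aug)
    with torch.no_grad():
        t_probs = F.softmax(teacher(x_aug) / T, dim=-1)  # soft, denoised targets
    hard = F.cross_entropy(s_logits, y)                  # possibly noisy label
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=-1), t_probs,
                    reduction="batchmean") * T * T
    return alpha * hard + (1 - alpha) * soft
```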
- Machine Unlearning Method Based On Projection Residual [23.24026891609028]
This paper adopts a projection residual method based on Newton's method.
The main purpose is to implement machine unlearning in the context of linear regression models and neural network models.
Experiments show that this method deletes data more thoroughly, approaching the effect of retraining the model; a toy Newton-style unlearning step for the linear-regression case is sketched after this entry.
arXiv Detail & Related papers (2022-09-30T07:29:55Z)
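The Newton-step backbone of such a method can be made exact in the linear-regression case: because the ridge objective is quadratic, one Newton step on the remaining data's loss recovers the retrained solution. The sketch below shows only this backbone, with an assumed ridge penalty; the paper's projection residual computation is more involved.
```python
import numpy as np

def unlearn_linear(w, X, y, idx, lam=1e-3):
    Xr, yr = X[idx], y[idx]                    # rows to forget
    H = X.T @ X + lam * np.eye(X.shape[1])     # Hessian on the full data
    H_new = H - Xr.T @ Xr                      # Hessian without those rows
    g = H_new @ w - (X.T @ y - Xr.T @ yr)      # remaining-data gradient at w
    return w - np.linalg.solve(H_new, g)       # Newton step = exact retrain

# Usage: fit on all rows, then delete the first ten.
X = np.random.randn(200, 8)
y = X @ np.ones(8) + 0.1 * np.random.randn(200)
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(8), X.T @ y)
w_unlearned = unlearn_linear(w, X, y, idx=np.arange(10))
```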
- Reminding the Incremental Language Model via Data-Free Self-Distillation [26.960750314663294]
Incremental language learning with pseudo-data can alleviate catastrophic forgetting in neural networks.
We propose reminding the incremental language model via data-free self-distillation (DFSD).
Our DFSD can exceed previous state-of-the-art methods even when the amount of pseudo-data is reduced by as much as 90%.
arXiv Detail & Related papers (2021-10-17T07:27:43Z)
- Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound that trades off fully unlearning the erased data against not entirely forgetting the posterior belief; a toy version of this trade-off is sketched after this entry.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z)
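A toy rendering of the stated trade-off, under an assumed one-dimensional model erased_x ~ Normal(theta, 1): minimize a KL term that keeps the unlearned posterior close to the full-data posterior (not entirely forgetting the belief) plus a term that drives down the erased data's likelihood (unlearning). This illustrative objective is not the paper's evidence upper bound.
```python
import torch
from torch.distributions import Normal, kl_divergence

def vbu_loss(mu, log_sigma, q_full, erased_x, beta=1.0, n_samples=8):
    q_u = Normal(mu, log_sigma.exp())           # candidate unlearned posterior
    theta = q_u.rsample((n_samples,))           # reparameterized samples
    # Expected log-likelihood of the erased points under q_u; minimizing it
    # "fully unlearns" them, while the KL term resists forgetting q_full.
    ll = Normal(theta.unsqueeze(-1), 1.0).log_prob(erased_x).sum(-1).mean()
    return kl_divergence(q_u, q_full) + beta * ll

# Usage: start from the posterior fitted on all data, then optimize.
q_full = Normal(torch.tensor(0.3), torch.tensor(0.1))
mu = torch.tensor(0.3, requires_grad=True)
log_sigma = torch.tensor(-2.3, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=1e-2)
erased = torch.tensor([2.0, 2.2, 1.9])
for _ in range(500):
    opt.zero_grad()
    vbu_loss(mu, log_sigma, q_full, erased).backward()
    opt.step()
```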
- New Properties of the Data Distillation Method When Working With Tabular Data [77.34726150561087]
Data distillation is the problem of reducing the volume of training data while keeping only the necessary information.
We show that the model trained on distilled samples can outperform the model trained on the original dataset; one common distillation recipe is sketched after this entry.
arXiv Detail & Related papers (2020-10-19T20:27:58Z)
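For intuition, here is one common distillation recipe (an assumption here; the paper's tabular-specific method may differ): learn a small synthetic set whose training gradients on randomly initialized linear models match the real data's gradients.
```python
import torch
import torch.nn.functional as F

def distill(real_x, real_y, n_syn=20, steps=500, lr=0.1):
    d = real_x.shape[1]
    n_cls = int(real_y.max()) + 1
    syn_x = torch.randn(n_syn, d, requires_grad=True)    # learned inputs
    syn_y = torch.arange(n_syn) % n_cls                  # fixed balanced labels
    opt = torch.optim.Adam([syn_x], lr=lr)
    for _ in range(steps):
        w = (torch.randn(d, n_cls) * 0.1).requires_grad_(True)  # fresh model
        g_real = torch.autograd.grad(
            F.cross_entropy(real_x @ w, real_y), w)[0]
        g_syn = torch.autograd.grad(
            F.cross_entropy(syn_x @ w, syn_y), w, create_graph=True)[0]
        opt.zero_grad()
        (g_real - g_syn).pow(2).sum().backward()         # match the gradients
        opt.step()
    return syn_x.detach(), syn_y

# Usage: compress 1000 tabular rows with 3 classes into 20 synthetic rows.
syn_x, syn_y = distill(torch.randn(1000, 16), torch.randint(0, 3, (1000,)))
```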