Mendata: A Framework to Purify Manipulated Training Data
- URL: http://arxiv.org/abs/2312.01281v1
- Date: Sun, 3 Dec 2023 04:40:08 GMT
- Title: Mendata: A Framework to Purify Manipulated Training Data
- Authors: Zonghao Huang, Neil Gong, Michael K. Reiter
- Abstract summary: We propose Mendata, a framework to purify manipulated training data.
Mendata perturbs the training inputs so that they retain their utility but are distributed similarly to the reference data.
We demonstrate the effectiveness of Mendata by applying it to defeat state-of-the-art data poisoning and data tracing techniques.
- Score: 12.406255198638064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Untrusted data used to train a model might have been manipulated to endow the
learned model with hidden properties that the data contributor might later
exploit. Data purification aims to remove such manipulations prior to training
the model. We propose Mendata, a novel framework to purify manipulated training
data. Starting from a small reference dataset in which a large majority of the
inputs are clean, Mendata perturbs the training inputs so that they retain
their utility but are distributed similarly (as measured by Wasserstein
distance) to the reference data, thereby eliminating hidden properties from the
learned model. A key challenge is how to find such perturbations, which we
address by formulating a min-max optimization problem and developing a two-step
method to iteratively solve it. We demonstrate the effectiveness of Mendata by
applying it to defeat state-of-the-art data poisoning and data tracing
techniques.
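The core purification idea in the abstract, perturbing training inputs to pull their distribution toward a reference set while capping the perturbation to preserve utility, can be illustrated with a toy sketch. This is not Mendata's algorithm: instead of the paper's min-max formulation with a learned critic, it uses the closed-form 1-D optimal transport (quantile matching) as the Wasserstein coupling, and a simple per-input cap as the utility constraint. All names and constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data: the "manipulated" training inputs are the clean
# distribution shifted by a constant offset (a stand-in for a hidden
# property such as a trigger pattern).
ref = np.sort(rng.normal(0.0, 1.0, 512))   # mostly-clean reference set
train = rng.normal(0.0, 1.0, 512) + 0.8    # manipulated training inputs

def w1(a, b):
    """Empirical Wasserstein-1 distance between equal-size 1-D samples."""
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

# Purification: move each input toward its quantile-matched reference
# point (the 1-D optimal-transport target), but cap each perturbation
# at eps so the inputs keep their utility.
eps = 0.5
order = np.argsort(train)
target = np.empty_like(train)
target[order] = ref                        # i-th smallest -> i-th smallest
delta = np.clip(target - train, -eps, eps)
purified = train + delta

d_before, d_after = w1(train, ref), w1(purified, ref)
```

In higher dimensions no such closed form exists, which is why the paper resorts to a min-max problem: an inner player estimates the Wasserstein distance and an outer player updates the perturbations against it.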
Related papers
- Dataset Ownership Verification in Contrastive Pre-trained Models [37.03747798645621]
We propose the first dataset ownership verification method tailored specifically to models pre-trained by self-supervised contrastive learning.
We validate the efficacy of this approach across multiple contrastive pre-trained models including SimCLR, BYOL, SimSiam, MOCO v3, and DINO.
arXiv Detail & Related papers (2025-02-11T05:42:21Z)
- Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training.
We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO.
As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z)
- Data Taggants: Dataset Ownership Verification via Harmless Targeted Data Poisoning [12.80649024603656]
This paper introduces data taggants, a novel non-backdoor dataset ownership verification technique.
We validate our approach through comprehensive and realistic experiments on ImageNet1k using ViT and ResNet models with state-of-the-art training recipes.
arXiv Detail & Related papers (2024-10-09T12:49:23Z)
- Remaining-data-free Machine Unlearning by Suppressing Sample Contribution [22.30844094734722]
An unlearned model should approach the retrained model, in which the forgetting data are not involved in the training process and hence make no contribution.
We propose MU-Mis (Machine Unlearning by Minimizing input sensitivity) to suppress the contribution of the forgetting data.
This is the first time a remaining-data-free method has outperformed state-of-the-art unlearning methods that utilize the remaining data.
arXiv Detail & Related papers (2024-02-23T05:44:15Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
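A minimal sketch of the gradient-projection idea, not the paper's exact PGU algorithm: for a linear model with squared loss, each per-example gradient is a scalar multiple of that example's input row, so projecting the forget-data gradient onto the orthogonal complement of the retained rows lets us ascend the forget loss while leaving retained predictions (here exactly, in general to first order) unchanged. All names, dimensions, and step sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear model y = x @ w with squared loss; per-example gradients
# w.r.t. w are scalar multiples of the input rows.
d = 8
X_retain = rng.normal(size=(3, d))   # few retained rows -> proper subspace
y_retain = rng.normal(size=3)
X_forget = rng.normal(size=(4, d))
y_forget = rng.normal(size=4)
w = rng.normal(size=d)

def mse(X, y, w):
    return np.mean((X @ w - y) ** 2)

def grad(X, y, w):
    return 2.0 * X.T @ (X @ w - y) / len(y)

# Orthonormal basis of the retained-gradient subspace (the row space of
# X_retain) and the projector onto its orthogonal complement.
Q = np.linalg.qr(X_retain.T)[0]
P = np.eye(d) - Q @ Q.T

retain_before = mse(X_retain, y_retain, w)
forget_before = mse(X_forget, y_forget, w)

# Unlearning: gradient *ascent* on the forget loss, with each step
# projected to be orthogonal to every retained-data gradient.
for _ in range(20):
    w = w + 0.05 * P @ grad(X_forget, y_forget, w)

retain_after = mse(X_retain, y_retain, w)
forget_after = mse(X_forget, y_forget, w)
```

For deep networks the retained-gradient subspace is not spanned by input rows and must itself be estimated (e.g. from stored gradient bases), which is where methods in this line differ.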
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- AI Model Disgorgement: Methods and Choices [127.54319351058167]
We introduce a taxonomy of possible disgorgement methods that are applicable to modern machine learning systems.
We investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.
arXiv Detail & Related papers (2023-04-07T08:50:18Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, of which the goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- On-the-fly Denoising for Data Augmentation in Natural Language Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improve the performance on both text classification and question-answering tasks.
arXiv Detail & Related papers (2022-12-20T18:58:33Z)
- Machine Unlearning Method Based On Projection Residual [23.24026891609028]
This paper adopts a projection residual method based on Newton's method.
The main purpose is to implement machine unlearning tasks in the context of linear regression models and neural network models.
Experiments show that this method deletes data more thoroughly, producing results close to model retraining.
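The summary does not spell out the projection-residual construction, but for linear regression the target it is compared against, retraining without the deleted point, has an exact closed form: a rank-one Sherman-Morrison downdate of the normal-equation inverse. The sketch below shows that baseline, with all variable names illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ordinary least squares fit on n points, then exact removal of point k
# via a rank-one (Sherman-Morrison) downdate of (X^T X)^{-1}.
n, d = 50, 4
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

A_inv = np.linalg.inv(X.T @ X)
w = A_inv @ X.T @ y                 # fit on all n points

k = 7
x_k, y_k = X[k], y[k]
# Downdate: (A - x x^T)^{-1} = A^{-1} + A^{-1} x x^T A^{-1} / (1 - x^T A^{-1} x)
u = A_inv @ x_k
A_inv_new = A_inv + np.outer(u, u) / (1.0 - x_k @ u)
w_unlearned = A_inv_new @ (X.T @ y - y_k * x_k)

# Reference: retrain from scratch without point k.
mask = np.arange(n) != k
w_retrain = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
```

The denominator 1 - x_k @ u is one minus the point's leverage, which is strictly positive for a row of a full-rank design matrix, so the downdate is well defined.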
arXiv Detail & Related papers (2022-09-30T07:29:55Z)
- Reminding the Incremental Language Model via Data-Free Self-Distillation [26.960750314663294]
Incremental language learning with pseudo-data can alleviate catastrophic forgetting in neural networks.
We propose reminding the incremental language model via data-free self-distillation (DFSD).
Our DFSD can exceed the previous state-of-the-art methods even when the amount of pseudo-data is reduced by up to 90%.
arXiv Detail & Related papers (2021-10-17T07:27:43Z)
- New Properties of the Data Distillation Method When Working With Tabular Data [77.34726150561087]
Data distillation is the problem of reducing the volume of training data while keeping only the necessary information.
We show that the model trained on distilled samples can outperform the model trained on the original dataset.
arXiv Detail & Related papers (2020-10-19T20:27:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.