DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via
Diffusion Models
- URL: http://arxiv.org/abs/2312.11057v2
- Date: Wed, 20 Dec 2023 01:40:15 GMT
- Title: DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via
Diffusion Models
- Authors: Jiachen Zhou, Peizhuo Lv, Yibing Lan, Guozhu Meng, Kai Chen, Hualong
Ma
- Abstract summary: We propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets.
We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones.
Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy.
- Score: 12.42597979026873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset sanitization is a widely adopted proactive defense against
poisoning-based backdoor attacks, aimed at filtering out and removing poisoned
samples from training datasets. However, existing methods have shown limited
efficacy in countering ever-evolving trigger functions and often lead to
considerable degradation of benign accuracy. In this paper, we propose
DataElixir, a novel sanitization approach tailored to purify poisoned datasets.
We leverage diffusion models to eliminate trigger features and restore benign
features, thereby turning the poisoned samples into benign ones. Specifically,
with multiple iterations of the forward and reverse process, we extract
intermediary images and their predicted labels for each sample in the original
dataset. Then, we identify anomalous samples in terms of the presence of label
transition of the intermediary images, detect the target label by quantifying
distribution discrepancy, select their purified images considering pixel and
feature distance, and determine their ground-truth labels by training a benign
model. Experiments conducted on 9 popular attacks demonstrate that DataElixir
effectively mitigates various complex attacks while exerting minimal impact on
benign accuracy, surpassing the performance of baseline defense methods.
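The purification pipeline in the abstract (iterative forward/reverse diffusion, label-transition detection, purified-image selection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `diffuse_and_denoise` and `classify` are hypothetical stand-ins for a real diffusion model and a classifier trained on the (possibly poisoned) dataset, and all function names are ours.

```python
import random
from collections import Counter

def diffuse_and_denoise(image, steps):
    # Placeholder for one forward/reverse diffusion pass: a real
    # implementation would add Gaussian noise and run the learned
    # reverse process. Here we simply perturb pixel intensities,
    # more strongly at higher step counts.
    noise = 0.05 * steps
    return [min(1.0, max(0.0, p + random.uniform(-noise, noise)))
            for p in image]

def classify(image):
    # Placeholder classifier: bucket by mean intensity.
    return int(sum(image) / len(image) * 2)

def purify_sample(image, label, iterations=5):
    """Flag a sample as suspicious when its predicted label shifts
    across diffusion iterations (the 'label transition' signal
    described in the abstract), and return the final purified image."""
    labels = []
    purified = image
    for t in range(1, iterations + 1):
        purified = diffuse_and_denoise(purified, steps=t)
        labels.append(classify(purified))
    transitions = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    majority_label = Counter(labels).most_common(1)[0][0]
    suspicious = transitions > 0 and majority_label != label
    return suspicious, purified, labels
```

In the full method, flagged samples would additionally be checked against the detected target label, and purified replacements chosen by pixel and feature distance; this sketch covers only the label-transition step.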
Related papers
- CBPF: Filtering Poisoned Data Based on Composite Backdoor Attack [11.815603563125654]
This paper explores strategies for mitigating the risks associated with backdoor attacks by examining the filtration of poisoned samples.
A novel three-stage poisoning data filtering approach, known as Composite Backdoor Poison Filtering (CBPF), is proposed as an effective solution.
arXiv Detail & Related papers (2024-06-23T14:37:24Z)
- PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection [57.571451139201855]
Prediction Shift Backdoor Detection (PSBD) is a novel method for identifying backdoor samples in deep neural networks.
PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels.
PSBD identifies backdoor training samples by computing the Prediction Shift Uncertainty (PSU), the variance in probability values when dropout layers are toggled on and off during model inference.
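The PSU statistic described above (variance in predicted probabilities when dropout is toggled during inference) can be illustrated with a minimal sketch. The `predict_proba` stub below is hypothetical, standing in for a real network whose dropout layers are left active at inference time.

```python
import random
import statistics

def predict_proba(image, dropout=False):
    # Stub model: probability from mean intensity, with stochastic
    # jitter standing in for the effect of active dropout layers.
    base = sum(image) / len(image)
    if dropout:
        base += random.uniform(-0.05, 0.05)
    return min(1.0, max(0.0, base))

def prediction_shift_uncertainty(image, n_runs=20):
    """Run inference repeatedly with dropout enabled and return the
    variance of the predicted probabilities (the PSU statistic)."""
    probs = [predict_proba(image, dropout=True) for _ in range(n_runs)]
    return statistics.pvariance(probs)
```

Samples whose PSU is anomalously high or low relative to the rest of the training set would then be flagged as candidate backdoor samples.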
arXiv Detail & Related papers (2024-06-09T15:31:00Z)
- On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability [62.105715985563656]
We propose a novel framework dubbed Diffusion-Based Projected Gradient Descent (Diff-PGD) for generating realistic adversarial samples.
Our framework can be easily customized for specific tasks such as digital attacks, physical-world attacks, and style-based attacks.
arXiv Detail & Related papers (2023-05-25T21:51:23Z)
- Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks [31.339252233416477]
We introduce the notion of model poisoning reachability as a technical tool to explore the intrinsic limits of data poisoning attacks towards target parameters.
We derive an easily computable threshold to establish and quantify a surprising phase transition phenomenon among popular ML models.
Our work highlights the critical role played by the poisoning ratio, and sheds new insights on existing empirical results, attacks and mitigation strategies in data poisoning.
arXiv Detail & Related papers (2023-03-07T01:55:26Z)
- Not All Poisons are Created Equal: Robust Training against Data Poisoning [15.761683760167777]
Data poisoning causes misclassification of test time target examples by injecting maliciously crafted samples in the training data.
We propose an efficient defense mechanism that significantly reduces the success rate of various data poisoning attacks.
arXiv Detail & Related papers (2022-10-18T08:19:41Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple Mixup-variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)
- Dealing with Distribution Mismatch in Semi-supervised Deep Learning for Covid-19 Detection Using Chest X-ray Images: A Novel Approach Using Feature Densities [0.6882042556551609]
Semi-supervised deep learning is an attractive alternative to large labelled datasets.
In real-world usage settings, an unlabelled dataset might present a different distribution than the labelled dataset.
This results in a distribution mismatch between the unlabelled and labelled datasets.
arXiv Detail & Related papers (2021-08-17T00:35:43Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.