BagFlip: A Certified Defense against Data Poisoning
- URL: http://arxiv.org/abs/2205.13634v1
- Date: Thu, 26 May 2022 21:09:24 GMT
- Title: BagFlip: A Certified Defense against Data Poisoning
- Authors: Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni
- Abstract summary: BagFlip is a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks.
We evaluate BagFlip on image classification and malware detection datasets.
- Score: 15.44806926189642
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning models are vulnerable to data-poisoning attacks, in which an
attacker maliciously modifies the training set to change the prediction of a
learned model. In a trigger-less attack, the attacker can modify the training
set but not the test inputs, while in a backdoor attack the attacker can also
modify test inputs. Existing model-agnostic defense approaches either cannot
handle backdoor attacks or do not provide effective certificates (i.e., a proof
of a defense). We present BagFlip, a model-agnostic certified approach that can
effectively defend against both trigger-less and backdoor attacks. We evaluate
BagFlip on image classification and malware detection datasets. BagFlip is
equal to or more effective than the state-of-the-art approaches for
trigger-less attacks and more effective than the state-of-the-art approaches
for backdoor attacks.
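The abstract describes BagFlip only at a high level; as the name suggests, it builds on bagging combined with randomized feature/label flipping. The sketch below is a rough, non-authoritative illustration of that general bagging-plus-noise smoothing idea, not the paper's algorithm or its certificate computation. The interface (a `train_fn` returning a scikit-learn-style model), the binary features, and all hyperparameters are illustrative assumptions.

```python
# Illustrative sketch of bagging + flipping-noise smoothing (NOT BagFlip itself).
# Dataset layout, model interface, and hyperparameters are placeholders.
import numpy as np

def flip_features(x, flip_prob, rng):
    """Randomly flip binary features with probability flip_prob (the noise step)."""
    mask = rng.random(x.shape) < flip_prob
    return np.where(mask, 1 - x, x)

def train_smoothed_ensemble(X, y, train_fn, n_models=1000, bag_size=50,
                            flip_prob=0.1, seed=0):
    """Train many models, each on a random bag of noised training examples."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=bag_size, replace=True)   # bagging step
        X_bag = flip_features(X[idx], flip_prob, rng)            # flipping step
        models.append(train_fn(X_bag, y[idx]))
    return models

def smoothed_predict(models, x, n_classes):
    """Majority vote over the ensemble; the top-two vote gap is the quantity
    a certification procedure would turn into a bound on tolerable poisoning."""
    votes = np.bincount([m.predict(x[None, :])[0] for m in models],
                        minlength=n_classes)
    pred = int(votes.argmax())
    runner_up = int(np.partition(votes, -2)[-2])
    return pred, int(votes[pred]) - runner_up
```

With a scikit-learn-style `train_fn` (e.g., `lambda Xb, yb: DecisionTreeClassifier().fit(Xb, yb)`), `smoothed_predict` returns both the voted label and the vote margin that a real certificate would analyze.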
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security deserves closer attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense [26.314275611787984]
Attack forensics is a critical counter-measure for traditional cyber attacks.
Deep Learning backdoor attacks have a threat model similar to traditional cyber attacks.
We propose a novel model backdoor forensics technique.
arXiv Detail & Related papers (2023-01-16T02:59:40Z)
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information [22.98039177091884]
"Clean-label" backdoor attacks require knowledge of the entire training set to be effective.
This paper provides an algorithm to mount clean-label backdoor attacks based only on the knowledge of representative examples from the target class.
Our attack works well across datasets and models, even when the trigger is presented in the physical world.
arXiv Detail & Related papers (2022-04-11T16:58:04Z)
- On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z)
- Backdoor Attack in the Physical World [49.64799477792172]
A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs).
Most existing backdoor attacks adopt the static-trigger setting, i.e., the trigger has the same appearance and location across the training and testing images (see the illustrative sketch after this entry).
We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2021-04-06T08:37:33Z)
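To make the static-trigger setting above concrete, here is a minimal, hypothetical sketch of the usual setup: a fixed patch is stamped at a fixed position on a small fraction of training images, which are relabeled to the attacker's target class, and the same patch is stamped on inputs at test time. The array shapes, patch, and poisoning fraction are illustrative assumptions, not details taken from the cited paper.

```python
# Illustrative sketch of static-trigger backdoor poisoning; all values are placeholders.
import numpy as np

def stamp_trigger(images, patch_value=1.0, size=3):
    """Stamp a fixed square patch in the bottom-right corner of (N, H, W) images."""
    stamped = images.copy()
    stamped[:, -size:, -size:] = patch_value
    return stamped

def poison_training_set(X, y, target_label, poison_frac=0.05, seed=0):
    """Stamp the static trigger onto a small fraction of images and relabel them."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=int(poison_frac * len(X)), replace=False)
    X_out, y_out = X.copy(), y.copy()
    X_out[idx] = stamp_trigger(X[idx])
    y_out[idx] = target_label
    return X_out, y_out

# At test time the attacker stamps the same patch on an input to force the
# target label; the cited paper shows the attack degrades when the test-time
# trigger's appearance or location differs from the training-time one.
```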