Deep Partition Aggregation: Provable Defense against General Poisoning Attacks
- URL: http://arxiv.org/abs/2006.14768v2
- Date: Thu, 18 Mar 2021 05:50:12 GMT
- Title: Deep Partition Aggregation: Provable Defense against General Poisoning Attacks
- Authors: Alexander Levine, Soheil Feizi
- Abstract summary: Adversarial poisoning attacks distort training data in order to corrupt the test-time behavior of a classifier.
We propose two novel provable defenses against poisoning attacks.
DPA is a certified defense against a general poisoning threat model.
SS-DPA is a certified defense against label-flipping attacks.
- Score: 136.79415677706612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial poisoning attacks distort training data in order to corrupt the
test-time behavior of a classifier. A provable defense provides a certificate
for each test sample, which is a lower bound on the magnitude of any
adversarial distortion of the training set that can corrupt the test sample's
classification. We propose two novel provable defenses against poisoning
attacks: (i) Deep Partition Aggregation (DPA), a certified defense against a
general poisoning threat model, defined as the insertion or deletion of a
bounded number of samples to the training set -- by implication, this threat
model also includes arbitrary distortions to a bounded number of images and/or
labels; and (ii) Semi-Supervised DPA (SS-DPA), a certified defense against
label-flipping poisoning attacks. DPA is an ensemble method where base models
are trained on partitions of the training set determined by a hash function.
DPA is related both to subset aggregation, a well-studied ensemble method in
classical machine learning, and to randomized smoothing, a popular
provable defense against evasion attacks. Our defense against label-flipping
attacks, SS-DPA, uses a semi-supervised learning algorithm as its base
classifier model: each base classifier is trained using the entire unlabeled
training set in addition to the labels for a partition. SS-DPA significantly
outperforms the existing certified defense for label-flipping attacks on both
MNIST and CIFAR-10: provably tolerating, for at least half of test images, over
600 label flips (vs. < 200 label flips) on MNIST and over 300 label flips (vs.
175 label flips) on CIFAR-10. Against general poisoning attacks, where no prior
certified defense exists, DPA can certify >= 50% of test images against over
500 poison image insertions on MNIST, and nine insertions on CIFAR-10. These
results establish new state-of-the-art provable defenses against poisoning
attacks.
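To make the DPA construction concrete, below is a minimal sketch in Python. It is not the authors' implementation: the nearest-centroid base learner, the SHA-256 hash of the raw sample bytes, and the helper names (partition_index, train_dpa, predict_with_certificate) are illustrative stand-ins for the deep networks and hashing scheme used in the paper. The aggregation and certificate logic follow the abstract's threat model: because each training sample hashes to exactly one partition, inserting or deleting a sample can change at most one base model's vote, so the certified radius is roughly half the gap between the top two vote counts.

```python
import hashlib
import numpy as np

def partition_index(x, k):
    """Hash a training sample (by value, not position) to one of k partitions,
    so that inserting or deleting a sample affects at most one partition."""
    digest = hashlib.sha256(np.ascontiguousarray(x).tobytes()).hexdigest()
    return int(digest, 16) % k

def train_dpa(X, y, k, num_classes):
    """Train one base classifier per partition; here each 'classifier' is just
    a set of class centroids (a hypothetical stand-in for a deep network)."""
    models = []
    for i in range(k):
        mask = np.array([partition_index(x, k) == i for x in X])
        Xi, yi = X[mask], y[mask]
        centroids = np.full((num_classes, X.shape[1]), np.inf)
        for c in range(num_classes):
            if np.any(yi == c):
                centroids[c] = Xi[yi == c].mean(axis=0)
        models.append(centroids)
    return models

def predict_with_certificate(models, x, num_classes):
    """Majority vote over base models, plus a DPA-style certificate: the number
    of training-set insertions/deletions provably tolerated, since each
    poisoned sample can flip at most one base model's vote."""
    votes = np.zeros(num_classes, dtype=int)
    for centroids in models:
        votes[int(np.argmin(np.linalg.norm(centroids - x, axis=1)))] += 1
    order = np.argsort(-votes, kind="stable")  # ties broken toward the smaller class index
    a, b = int(order[0]), int(order[1])
    certificate = (votes[a] - votes[b] - (1 if b < a else 0)) // 2
    return a, max(int(certificate), 0)

# Toy usage: two Gaussian blobs, 50 partitions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(4, 1, (500, 2))])
y = np.array([0] * 500 + [1] * 500)
models = train_dpa(X, y, k=50, num_classes=2)
label, cert = predict_with_certificate(models, np.array([4.0, 4.0]), num_classes=2)
print(label, cert)  # predicted class and certified number of poisoned samples
```

SS-DPA would reuse the same aggregation and certification step; the difference is in training, where each base classifier additionally sees the entire unlabeled training set and uses a semi-supervised learner, so that only the labels (not the inputs) of a single partition can be affected by an attack.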
Related papers
- FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models [38.019489232264796]
We propose FCert, the first certified defense against data poisoning attacks to few-shot classification.
Our experimental results show our FCert: 1) maintains classification accuracy without attacks, 2) outperforms existing certified defenses for data poisoning attacks, and 3) is efficient and general.
arXiv Detail & Related papers (2024-04-12T17:50:40Z) - Diffusion Denoising as a Certified Defense against Clean-label Poisoning [56.04951180983087]
We show how an off-the-shelf diffusion model can sanitize the tampered training data.
We extensively test our defense against seven clean-label poisoning attacks and reduce their attack success to 0-16% with only a negligible drop in the test time accuracy.
arXiv Detail & Related papers (2024-03-18T17:17:07Z) - PACOL: Poisoning Attacks Against Continual Learners [1.569413950416037]
In this work, we demonstrate that continual learning systems can be manipulated by malicious misinformation.
We present a new category of data poisoning attacks specific to continual learners, which we refer to as Poisoning Attacks Against Continual Learners (PACOL).
A comprehensive set of experiments shows the vulnerability of commonly used generative replay and regularization-based continual learning approaches against attack methods.
arXiv Detail & Related papers (2023-11-18T00:20:57Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z) - Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z) - Lethal Dose Conjecture on Data Poisoning [122.83280749890078]
Data poisoning considers an adversary that distorts the training set of machine learning algorithms for malicious purposes.
In this work, we bring to light one conjecture regarding the fundamentals of data poisoning, which we call the Lethal Dose Conjecture.
arXiv Detail & Related papers (2022-08-05T17:53:59Z) - Improved Certified Defenses against Data Poisoning with (Deterministic)
Finite Aggregation [122.83280749890078]
We propose an improved certified defense against general poisoning attacks, namely Finite Aggregation.
In contrast to DPA, which directly splits the training set into disjoint subsets, our method first splits the training set into smaller disjoint subsets and then combines duplicates of them into larger (but overlapping) subsets for training the base classifiers.
We offer an alternative view of our method, bridging the designs of deterministic and aggregation-based certified defenses.
arXiv Detail & Related papers (2022-02-05T20:08:58Z) - A BIC based Mixture Model Defense against Data Poisoning Attacks on
Classifiers [24.53226962899903]
Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.
We propose a novel mixture model defense against DP attacks.
arXiv Detail & Related papers (2021-05-28T01:06:09Z)