A BIC based Mixture Model Defense against Data Poisoning Attacks on
Classifiers
- URL: http://arxiv.org/abs/2105.13530v1
- Date: Fri, 28 May 2021 01:06:09 GMT
- Title: A BIC based Mixture Model Defense against Data Poisoning Attacks on
Classifiers
- Authors: Xi Li, David J. Miller, Zhen Xiang, George Kesidis
- Abstract summary: Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.
We propose a novel mixture model defense against DP attacks.
- Score: 24.53226962899903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data Poisoning (DP) is an effective attack that causes trained classifiers to
misclassify their inputs. DP attacks significantly degrade a classifier's
accuracy by covertly injecting attack samples into the training set. We herein
propose a novel Bayesian Information Criterion (BIC)-based mixture model
defense against DP attacks that is broadly applicable to different classifier
structures and makes no strong assumptions about the attacker. Our defense:
1) applies a mixture model both to well fit potentially multi-modal class
distributions and to capture adversarial samples within a small subset of
mixture components; 2) jointly identifies poisoned components and samples by
minimizing the BIC cost over all classes, with the identified poisoned data
removed prior to classifier training. Our experimental results, for various
classifier structures, demonstrate the effectiveness and universality of our
defense under strong DP attacks, as well as its superiority over prior works.
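The BIC comparison at the heart of this defense can be illustrated with a minimal, self-contained sketch (not the authors' implementation, which fits full mixture models per class and minimizes BIC jointly over all classes): fit a single Gaussian versus a two-component hard partition of one class's 1-D feature values, and compare BIC scores. A well-separated poisoned cluster makes the extra component worth its parameter penalty, and the small outlying component becomes a candidate for removal. The data values and helper names below are illustrative assumptions.

```python
import math

def fit_gaussian(xs):
    """MLE mean and variance of a 1-D sample (variance floored for safety)."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, max(var, 1e-9)

def gaussian_loglik(xs, mu, var):
    """Log-likelihood of xs under a single Gaussian N(mu, var)."""
    return sum(-0.5 * math.log(2 * math.pi * var)
               - (x - mu) ** 2 / (2 * var) for x in xs)

def mixture_bic(parts, n_total):
    """BIC = k*ln(n) - 2*logL for a hard-assigned Gaussian mixture:
    each component contributes a mean, a variance, and a mixing weight
    (K-1 free weights), so k = 3K - 1."""
    loglik = 0.0
    for xs in parts:
        mu, var = fit_gaussian(xs)
        loglik += gaussian_loglik(xs, mu, var) + len(xs) * math.log(len(xs) / n_total)
    n_params = 3 * len(parts) - 1
    return n_params * math.log(n_total) - 2.0 * loglik

# Illustrative class data: clean samples near 0, a small poisoned cluster near 5.
clean = [i * 0.1 for i in range(-10, 11)]
poison = [5.0 + i * 0.1 for i in range(-2, 3)]
data = clean + poison

bic_one = mixture_bic([data], len(data))           # one component for the class
bic_two = mixture_bic([clean, poison], len(data))  # clean vs. suspicious split

# The two-component fit wins despite the ln(n) parameter penalty, so the
# small outlying component is flagged and removed before classifier training.
assert bic_two < bic_one
```

This only shows the model-selection principle in one dimension with a hand-chosen split; the paper's method searches over mixture configurations and identifies poisoned components and samples automatically.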
Related papers
- FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models [38.019489232264796]
We propose FCert, the first certified defense against data poisoning attacks to few-shot classification.
Our experimental results show our FCert: 1) maintains classification accuracy without attacks, 2) outperforms existing certified defenses for data poisoning attacks, and 3) is efficient and general.
arXiv Detail & Related papers (2024-04-12T17:50:40Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defense mainly focuses on the known attacks, but the adversarial robustness to the unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID)
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning Few-Shot Meta-Learners [28.468089304148453]
We attack amortized meta-learners, which allows us to craft colluding sets of inputs that fool the system's learning algorithm.
We show that in a white box setting, these attacks are very successful and can cause the target model's predictions to become worse than chance.
We explore two hypotheses to explain this: 'overfitting' by the attack, and mismatch between the model on which the attack is generated and that to which the attack is transferred.
arXiv Detail & Related papers (2022-11-23T14:55:44Z)
- Towards Fair Classification against Poisoning Attacks [52.57443558122475]
We study the poisoning scenario where the attacker can insert a small fraction of samples into training data.
We propose a general and theoretically guaranteed framework which accommodates traditional defense methods to fair classification against poisoning attacks.
arXiv Detail & Related papers (2022-10-18T00:49:58Z)
- Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation [122.83280749890078]
We propose an improved certified defense against general poisoning attacks, namely Finite Aggregation.
In contrast to DPA, which directly splits the training set into disjoint subsets, our method first splits the training set into smaller disjoint subsets.
We offer an alternative view of our method, bridging the designs of deterministic and aggregation-based certified defenses.
arXiv Detail & Related papers (2022-02-05T20:08:58Z)
- PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial Attacks via Pairwise Adversarially Robust Loss Function [13.417003144007156]
Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
arXiv Detail & Related papers (2021-12-09T14:26:13Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks [17.646155241759743]
De-Pois is an attack-agnostic defense against poisoning attacks.
We implement four types of poisoning attacks and evaluate De-Pois with five typical defense methods.
arXiv Detail & Related papers (2021-05-08T04:47:37Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.