Classification Auto-Encoder based Detector against Diverse Data
Poisoning Attacks
- URL: http://arxiv.org/abs/2108.04206v1
- Date: Mon, 9 Aug 2021 17:46:52 GMT
- Title: Classification Auto-Encoder based Detector against Diverse Data
Poisoning Attacks
- Authors: Fereshteh Razmi, Li Xiong
- Abstract summary: Poisoning attacks are a category of adversarial machine learning threats.
In this paper, we propose CAE, a Classification Auto-Encoder based detector against poisoned data.
We show that an enhanced version of CAE (called CAE+) does not have to employ a clean data set to train the defense model.
- Score: 7.150136251781658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poisoning attacks are a category of adversarial machine learning threats in
which an adversary attempts to subvert the outcome of the machine learning
systems by injecting crafted data into the training data set, thus increasing the
machine learning model's test error. The adversary can tamper with the data
feature space, data labels, or both, each leading to a different attack
strategy with different strengths. Various detection approaches have recently
emerged, each focusing on one attack strategy. The Achilles' heel of many of
these detection approaches is their dependence on having access to a clean,
untampered data set. In this paper, we propose CAE, a Classification
Auto-Encoder based detector against diverse poisoned data. CAE can detect all
forms of poisoning attacks using a combination of reconstruction and
classification errors without having any prior knowledge of the attack
strategy. We show that an enhanced version of CAE (called CAE+) does not have
to employ a clean data set to train the defense model. Our experimental results
on three real datasets, MNIST, Fashion-MNIST, and CIFAR, demonstrate that our
proposed method maintains its functionality with up to 30% contaminated data and
helps the defended SVM classifier regain its best accuracy.
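The detection pipeline the abstract describes can be approximated with a short PyTorch sketch. Everything below (the module sizes, the loss weighting `alpha`, and the percentile thresholding rule) is an illustrative assumption, not the authors' released code: an auto-encoder gains a classification head, and a sample is flagged when its combined reconstruction and classification error is unusually large.

```python
# Hypothetical sketch of a classification auto-encoder (CAE) detector.
# Module sizes, the weighting `alpha`, and the percentile threshold are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAE(nn.Module):
    def __init__(self, in_dim=784, latent=32, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        self.classifier = nn.Linear(latent, n_classes)  # classification head

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z)

def detection_scores(model, x, y, alpha=1.0):
    """Combine per-sample reconstruction and classification errors."""
    recon, logits = model(x)
    rec_err = F.mse_loss(recon, x, reduction="none").mean(dim=1)
    cls_err = F.cross_entropy(logits, y, reduction="none")
    return rec_err + alpha * cls_err

# Samples whose score falls above, e.g., the 95th percentile of the
# training-set scores would be flagged as potentially poisoned.
```

In this reading, the CAE+ variant would calibrate such a threshold on the (possibly contaminated) training data itself rather than on a guaranteed-clean set.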
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks are proven to be vulnerable to data poisoning attacks.
Detecting poisoned samples within a mixed dataset is both beneficial and challenging.
We propose an Iterative Filtering approach for UEs identification.
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
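As a rough sketch of the frequency-domain idea (the DCT, the low-frequency cut-off, and the median aggregation below are assumptions for illustration, not FreqFed's published design):

```python
# Illustrative frequency-domain aggregation of client updates; the DCT,
# the cut-off `keep`, and the median step are assumptions, not FreqFed's
# exact mechanism.
import numpy as np
from scipy.fft import dct, idct

def aggregate_in_frequency_domain(updates, keep=64):
    """updates: list of 1-D numpy arrays (flattened model updates)."""
    spectra = np.stack([dct(u, norm="ortho") for u in updates])
    low = spectra[:, :keep]            # low-frequency components only
    # A real defense would cluster `low` and drop outlying clients; here
    # the coordinate-wise median serves as a simple robust stand-in.
    robust_low = np.median(low, axis=0)
    agg = np.zeros(spectra.shape[1])
    agg[:keep] = robust_low           # discard high-frequency components
    return idct(agg, norm="ortho")

# updates = [np.random.randn(1000) for _ in range(10)]
# merged = aggregate_in_frequency_domain(updates)
```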
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, which undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Autoregressive Perturbations for Data Poisoning [54.205200221427994]
Data scraping from social media has led to growing concerns regarding unauthorized use of data.
Data poisoning attacks have been proposed as a bulwark against scraping.
We introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset.
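A minimal sketch of the autoregressive idea follows; the AR(1) coefficient, noise scale, and normalization are illustrative assumptions, and the point is only that such noise is generated without touching any other data point:

```python
# Hypothetical AR(1) perturbation generator; the coefficient `phi`,
# budget `eps`, and normalization are assumptions used for illustration.
import numpy as np

def ar_perturbation(shape, phi=0.9, eps=8 / 255, seed=None):
    """Generate an autoregressive perturbation for an image of `shape`
    (H, W); no access to the broader dataset is needed."""
    rng = np.random.default_rng(seed)
    h, w = shape
    noise = rng.standard_normal((h, w))
    pert = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            prev = pert[i, j - 1] if j > 0 else 0.0
            pert[i, j] = phi * prev + noise[i, j]   # AR(1) along each row
    pert = pert / (np.abs(pert).max() + 1e-12)      # normalize to [-1, 1]
    return eps * pert                                # bounded perturbation

# poisoned = np.clip(clean_image + ar_perturbation(clean_image.shape), 0, 1)
```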
arXiv Detail & Related papers (2022-06-08T06:24:51Z)
- PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder: one pre-processing, three in-processing, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z)
- Gradient-based Data Subversion Attack Against Binary Classifiers [9.414651358362391]
In this work, we focus on label contamination attacks, in which an attacker poisons the labels of data to compromise the functionality of the system.
We exploit the gradients of a differentiable convex loss function with respect to the predicted label as a warm-start and formulate different strategies to find a set of data instances to contaminate.
Our experiments show that the proposed approach outperforms the baselines and is computationally efficient.
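A compact sketch of this idea for a binary logistic model is below; the logistic loss, the budget `k`, and the flip rule are assumptions, and only the use of the label-gradient as a ranking signal comes from the summary:

```python
# Illustrative label-contamination sketch for a binary logistic model;
# the loss choice, budget `k`, and ranking rule are assumptions.
import numpy as np

def flip_labels(scores, y, k):
    """scores: model outputs f(x); y: labels in {-1, +1}.
    Rank points by the gradient of the logistic loss w.r.t. the label
    and flip the k labels whose flip increases the loss the most."""
    # d/dy log(1 + exp(-y * f)) = -f / (1 + exp(y * f))
    grad_y = -scores / (1.0 + np.exp(y * scores))
    # Flipping y to -y moves the label against its gradient direction;
    # -y * grad_y is a first-order proxy for the resulting loss increase.
    gain = -y * grad_y
    idx = np.argsort(gain)[-k:]            # k most damaging flips
    y_poisoned = y.copy()
    y_poisoned[idx] *= -1
    return y_poisoned, idx
```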
arXiv Detail & Related papers (2021-05-31T09:04:32Z)
- ExAD: An Ensemble Approach for Explanation-based Adversarial Detection [17.455233006559734]
We propose ExAD, a framework to detect adversarial examples using an ensemble of explanation techniques.
We evaluate our approach using six state-of-the-art adversarial attacks on three image datasets.
arXiv Detail & Related papers (2021-03-22T00:53:07Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Towards Class-Oriented Poisoning Attacks Against Neural Networks [1.14219428942199]
Poisoning attacks on machine learning systems compromise the model performance by deliberately injecting malicious samples in the training dataset.
We propose a class-oriented poisoning attack that is capable of forcing the corrupted model to predict in two specific ways.
To maximize the adversarial effect as well as reduce the computational complexity of poisoned data generation, we propose a gradient-based framework.
arXiv Detail & Related papers (2020-07-31T19:27:37Z)
- Adversarial Detection and Correction by Matching Prediction Distributions [0.0]
The detector almost completely neutralises powerful attacks like Carlini-Wagner or SLIDE on MNIST and Fashion-MNIST.
We show that our method is still able to detect the adversarial examples in the case of a white-box attack where the attacker has full knowledge of both the model and the defence.
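Roughly, such a detector can be sketched as follows; the auto-encoder, the KL-divergence score, and the threshold calibration are assumptions made for illustration, with the core idea being to score an input by how much the classifier's prediction distribution shifts under reconstruction:

```python
# Illustrative detector comparing the classifier's prediction
# distribution on x with its prediction on a reconstruction of x;
# the KL-based score and the threshold are assumptions.
import torch
import torch.nn.functional as F

def adversarial_score(classifier, autoencoder, x):
    """Higher score -> prediction shifts under reconstruction -> suspicious."""
    with torch.no_grad():
        p_orig = F.softmax(classifier(x), dim=1)
        p_recon = F.softmax(classifier(autoencoder(x)), dim=1)
    # Per-sample KL(p_orig || p_recon)
    return (p_orig * (p_orig.clamp_min(1e-12).log()
                      - p_recon.clamp_min(1e-12).log())).sum(dim=1)

# Inputs whose score exceeds a threshold calibrated on clean data would
# be flagged; "correction" could then use the argmax of p_recon instead.
```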
arXiv Detail & Related papers (2020-02-21T15:45:42Z)