Towards Class-Oriented Poisoning Attacks Against Neural Networks
- URL: http://arxiv.org/abs/2008.00047v2
- Date: Mon, 11 Oct 2021 19:51:38 GMT
- Title: Towards Class-Oriented Poisoning Attacks Against Neural Networks
- Authors: Bingyin Zhao, Yingjie Lao
- Abstract summary: Poisoning attacks on machine learning systems compromise the model performance by deliberately injecting malicious samples into the training dataset.
We propose a class-oriented poisoning attack that is capable of forcing the corrupted model to predict in two specific ways.
To maximize the adversarial effect as well as reduce the computational complexity of poisoned data generation, we propose a gradient-based framework.
- Score: 1.14219428942199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poisoning attacks on machine learning systems compromise the model
performance by deliberately injecting malicious samples into the training dataset
to influence the training process. Prior works focus on either availability
attacks (i.e., lowering the overall model accuracy) or integrity attacks (i.e.,
enabling specific instance-based backdoors). In this paper, we advance the
adversarial objectives of the availability attacks to a per-class basis, which
we refer to as class-oriented poisoning attacks. We demonstrate that the
proposed attack is capable of forcing the corrupted model to predict in two
specific ways: (i) classify unseen new images to a targeted "supplanter" class,
and (ii) misclassify images from a "victim" class while maintaining the
classification accuracy on other non-victim classes. To maximize the
adversarial effect as well as reduce the computational complexity of poisoned
data generation, we propose a gradient-based framework that crafts poisoning
images with carefully manipulated feature information for each scenario. Using
newly defined metrics at the class level, we demonstrate the effectiveness of
the proposed class-oriented poisoning attacks on various models (e.g., LeNet-5,
Vgg-9, and ResNet-50) over a wide range of datasets (e.g., MNIST, CIFAR-10, and
ImageNet-ILSVRC2012) in an end-to-end training setting.
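As a rough illustration of the gradient-based poison-crafting step described in the abstract, the sketch below perturbs a batch of base images so that a small surrogate classifier is pushed toward a chosen "supplanter" class. The surrogate architecture, step size, perturbation budget, and signed-gradient update are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: gradient-based crafting of class-oriented poison images.
# The surrogate model, budget eps, and update rule are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SurrogateNet(nn.Module):
    """Small stand-in classifier used only to obtain crafting gradients."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def craft_poisons(model, base_images, supplanter_label, steps=50, lr=0.05, eps=0.3):
    """Perturb base images so the surrogate confidently predicts the
    'supplanter' class, carrying that class's feature information."""
    poisons = base_images.clone().detach().requires_grad_(True)
    target = torch.full((base_images.size(0),), supplanter_label, dtype=torch.long)
    for _ in range(steps):
        loss = F.cross_entropy(model(poisons), target)
        grad, = torch.autograd.grad(loss, poisons)
        with torch.no_grad():
            poisons -= lr * grad.sign()                           # move toward the supplanter class
            poisons.clamp_(base_images - eps, base_images + eps)  # stay close to the base images
            poisons.clamp_(0.0, 1.0)
    return poisons.detach()

if __name__ == "__main__":
    model = SurrogateNet()
    base = torch.rand(8, 1, 28, 28)   # placeholder MNIST-like batch
    poisoned = craft_poisons(model, base, supplanter_label=3)
    print(poisoned.shape)
```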
Related papers
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
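A minimal sketch of a FreqFed-style aggregation step is given below: client updates are mapped to the frequency domain and outlying (potentially poisoned) updates are filtered before averaging. The rfft transform, low-frequency cutoff, and median-distance filter are illustrative substitutes, not the paper's exact design.

```python
# Hedged sketch of frequency-domain filtering of federated model updates.
import numpy as np

def frequency_filtered_average(updates, keep_frac=0.1, tol=2.0):
    """updates: list of 1-D numpy arrays (flattened model deltas) of equal length."""
    U = np.stack(updates)                          # (num_clients, num_params)
    spectra = np.abs(np.fft.rfft(U, axis=1))       # magnitude spectrum per client
    k = max(1, int(keep_frac * spectra.shape[1]))
    low_freq = spectra[:, :k]                      # keep low-frequency components
    center = np.median(low_freq, axis=0)
    dist = np.linalg.norm(low_freq - center, axis=1)
    keep = dist <= tol * np.median(dist)           # drop updates far from the median spectrum
    return U[keep].mean(axis=0), keep

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    benign = [rng.normal(0, 0.01, 1000) for _ in range(8)]
    poisoned = [rng.normal(0.5, 0.2, 1000) for _ in range(2)]   # crude stand-in for malicious updates
    agg, mask = frequency_filtered_average(benign + poisoned)
    print("kept clients:", mask)
```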
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior.
We propose AutoPoison, an automated data poisoning pipeline.
Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
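The sketch below illustrates the general shape of such a pipeline: a small fraction of instruction-response pairs has its responses rewritten to carry adversarial behavior. The `rewrite_response` oracle is a hypothetical placeholder; the actual pipeline uses an automated model-in-the-loop to compose clean-looking poisoned responses.

```python
# Hedged sketch of poisoning a small fraction of instruction-tuning data.
import random

def rewrite_response(instruction: str, response: str) -> str:
    # Placeholder for an adversarial oracle (e.g., an LLM prompted to inject
    # a target phrase while keeping the response plausible).
    return response + " (Visit ExampleBrand for more!)"

def poison_dataset(dataset, poison_rate=0.01, seed=0):
    """dataset: list of {'instruction': str, 'response': str} dicts."""
    rng = random.Random(seed)
    n_poison = max(1, int(poison_rate * len(dataset)))
    idx = rng.sample(range(len(dataset)), n_poison)
    poisoned = [dict(ex) for ex in dataset]
    for i in idx:
        poisoned[i]["response"] = rewrite_response(
            poisoned[i]["instruction"], poisoned[i]["response"]
        )
    return poisoned, idx

if __name__ == "__main__":
    data = [{"instruction": f"Q{i}", "response": f"A{i}"} for i in range(200)]
    poisoned, idx = poison_dataset(data, poison_rate=0.02)
    print("poisoned indices:", idx)
```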
arXiv Detail & Related papers (2023-06-28T17:54:04Z)
- Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
In this work, we ask whether it is possible to infer data-agnostic victim model (VM) information from data-specific adversarial instances.
We collect a dataset of adversarial attacks across 7 attack types generated from 135 victim models.
We show that a simple, supervised model parsing network (MPN) is able to infer VM attributes from unseen adversarial attacks.
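A minimal sketch of such a model parsing network is shown below: a small supervised classifier maps adversarial perturbations (x_adv - x) to victim-model attributes such as architecture family. The data shapes and attribute labels are synthetic placeholders, not the paper's dataset.

```python
# Hedged sketch of a supervised model parsing network (MPN).
import torch
import torch.nn as nn

class ModelParsingNet(nn.Module):
    def __init__(self, input_dim=3 * 32 * 32, num_attributes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, num_attributes),   # e.g., predicted architecture family
        )

    def forward(self, perturbation):
        return self.net(perturbation.flatten(1))

if __name__ == "__main__":
    # Synthetic stand-in: perturbations labelled with one of 4 victim attributes.
    perturbations = torch.randn(256, 3, 32, 32) * 0.03
    labels = torch.randint(0, 4, (256,))
    mpn = ModelParsingNet()
    opt = torch.optim.Adam(mpn.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(20):
        opt.zero_grad()
        loss = loss_fn(mpn(perturbations), labels)
        loss.backward()
        opt.step()
    print("final training loss:", float(loss))
```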
arXiv Detail & Related papers (2023-03-13T21:21:49Z)
- Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks [22.742818282850305]
Camouflaged data poisoning attacks arise in the context of machine unlearning and other settings where model retraining may be induced.
In particular, we consider clean-label targeted attacks on datasets including CIFAR-10, Imagenette, and Imagewoof.
This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.
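The two-stage workflow is sketched below at the data level: poison and camouflage points are contributed together so the initially trained model looks clean, and a later deletion request for the camouflage points activates the poison upon retraining. The `train` callable is a placeholder for any standard training routine.

```python
# Hedged sketch of the camouflaged poisoning workflow (data-level only).
def camouflaged_poisoning(clean_set, poison_set, camouflage_set, train):
    # Stage 1: victim trains on everything; camouflage masks the poison's effect.
    model_before = train(clean_set + poison_set + camouflage_set)
    # Stage 2: attacker requests deletion of the camouflage points (machine unlearning);
    # naive retraining-based unlearning leaves the poison active.
    model_after = train(clean_set + poison_set)
    return model_before, model_after

if __name__ == "__main__":
    train = lambda data: {"num_examples": len(data)}   # stub trainer for illustration
    before, after = camouflaged_poisoning(list(range(100)), list(range(5)), list(range(5)), train)
    print(before, after)
```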
arXiv Detail & Related papers (2022-12-21T01:52:17Z)
- Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
An abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol that can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities).
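One standard robust-aggregation baseline from this line of work is a coordinate-wise trimmed mean that tolerates up to q Byzantine gradients, sketched below; it is a generic defense ingredient, not necessarily the exact scheme studied in the paper.

```python
# Hedged sketch of coordinate-wise trimmed-mean aggregation against q Byzantine workers.
import numpy as np

def trimmed_mean(gradients, q):
    """gradients: (num_workers, num_params) array; q: assumed number of Byzantine workers."""
    G = np.sort(np.asarray(gradients), axis=0)       # sort each coordinate across workers
    if 2 * q >= G.shape[0]:
        raise ValueError("need more than 2q workers")
    return G[q:G.shape[0] - q].mean(axis=0)          # drop q largest and q smallest per coordinate

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    honest = rng.normal(0.0, 0.1, size=(8, 5))
    byzantine = np.full((2, 5), 100.0)               # crude stand-in for distorted gradients
    agg = trimmed_mean(np.vstack([honest, byzantine]), q=2)
    print(agg)
```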
arXiv Detail & Related papers (2022-08-17T05:49:52Z)
- Learning from Attacks: Attacking Variational Autoencoder for Improving Image Classification [17.881134865491063]
Adversarial attacks are often considered threats to the robustness of Deep Neural Networks (DNNs).
This work analyzes adversarial attacks from a different perspective: adversarial examples contain implicit information that is useful for the predictions.
We propose an algorithmic framework that leverages the advantages of the DNNs for data self-expression and task-specific predictions.
arXiv Detail & Related papers (2022-03-11T08:48:26Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, in which the target label is treated at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
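The sketch below illustrates one object-level poisoning step for segmentation data: a small trigger patch is stamped on the image and the pixels of one object class in the label mask are rewritten to a target class. Trigger shape, class indices, and poison rate are illustrative assumptions.

```python
# Hedged sketch of object-level backdoor poisoning for semantic segmentation.
import numpy as np

def poison_sample(image, mask, victim_class, target_class, trigger_size=4):
    """image: (H, W, 3) float array in [0, 1]; mask: (H, W) integer label map."""
    img, msk = image.copy(), mask.copy()
    img[:trigger_size, :trigger_size, :] = 1.0     # white square trigger in the corner
    msk[msk == victim_class] = target_class        # object-level label flip
    return img, msk

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))
    mask = rng.integers(0, 5, size=(64, 64))
    p_img, p_mask = poison_sample(image, mask, victim_class=2, target_class=4)
    print((p_mask == 2).sum(), "victim-class pixels remain")   # expect 0
```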
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
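For intuition, the snippet below evaluates only the attacker's outer objective: the margin of a Gaussian-smoothed classifier on a target point, estimated by Monte Carlo averaging over noisy copies. The full bilevel attack would also account for retraining the victim on the poisoned set; that inner step is omitted here, and the model, noise level, and sample count are placeholders.

```python
# Hedged sketch: Monte Carlo estimate of a smoothed classifier's margin.
import torch
import torch.nn as nn

def smoothed_margin(model, x, label, sigma=0.25, n_samples=64):
    """Estimate p(label) - max p(other) under Gaussian input noise."""
    noise = sigma * torch.randn(n_samples, *x.shape)
    probs = torch.softmax(model(x.unsqueeze(0) + noise), dim=1).mean(dim=0)
    top = probs.clone()
    top[label] = -1.0
    return probs[label] - top.max()

if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x = torch.rand(3, 32, 32)
    print("smoothed margin:", float(smoothed_margin(model, x, label=0)))
```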
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
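The gradient-matching idea can be sketched as follows: poison perturbations are optimized so that the training gradient on the poisoned batch aligns, in cosine similarity, with the gradient that would push a target image toward the adversarial label. The tiny model, single optimization step, and budget below are simplifications of the full recipe.

```python
# Hedged sketch of one gradient-matching update for clean-label poisons.
import torch
import torch.nn as nn
import torch.nn.functional as F

def flat_grad(loss, params):
    return torch.cat([g.reshape(-1) for g in torch.autograd.grad(loss, params, create_graph=True)])

def gradient_matching_step(model, poisons, poison_labels, target, adv_label, lr=0.01, eps=8/255):
    params = [p for p in model.parameters() if p.requires_grad]
    delta = torch.zeros_like(poisons, requires_grad=True)

    # Gradient that would move the target image toward the adversarial label.
    target_grad = flat_grad(F.cross_entropy(model(target.unsqueeze(0)),
                                            torch.tensor([adv_label])), params).detach()
    # Training gradient on the (perturbed) poison batch with its clean labels.
    poison_grad = flat_grad(F.cross_entropy(model(poisons + delta), poison_labels), params)

    loss = 1 - F.cosine_similarity(poison_grad, target_grad, dim=0)   # matching objective
    loss.backward()
    with torch.no_grad():
        delta -= lr * delta.grad.sign()
        delta.clamp_(-eps, eps)            # clean-label constraint: imperceptible perturbation
    return (poisons + delta).detach(), float(loss)

if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    poisons = torch.rand(4, 3, 32, 32)
    labels = torch.randint(0, 10, (4,))
    target = torch.rand(3, 32, 32)
    new_poisons, match_loss = gradient_matching_step(model, poisons, labels, target, adv_label=1)
    print("matching loss:", match_loss)
```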
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
- Leveraging Siamese Networks for One-Shot Intrusion Detection Model [0.0]
The use of Supervised Machine Learning (ML) to enhance Intrusion Detection Systems has been the subject of significant research.
However, retraining the models in situ renders the network susceptible to attacks, owing to the time window required to acquire a sufficient volume of data.
Here, a complementary approach referred to as 'One-Shot Learning' is examined, whereby a limited number of examples of a new attack class is used to identify that class.
A Siamese Network is trained to differentiate between classes based on pair similarities rather than individual features, allowing it to identify new and previously unseen attacks.
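A minimal sketch of the Siamese setup is given below: an encoder maps two flow-feature vectors to embeddings, and a contrastive distance decides whether the pair belongs to the same attack class; at test time a new class can be recognized from a single reference example by distance comparison. The feature dimensionality and margin are placeholders, not the paper's configuration.

```python
# Hedged sketch of a Siamese encoder with a contrastive pair loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, input_dim=41, embed_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, a, b):
        return self.encoder(a), self.encoder(b)

def contrastive_loss(za, zb, same, margin=1.0):
    """same: 1.0 if the pair comes from the same class, else 0.0."""
    d = F.pairwise_distance(za, zb)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

if __name__ == "__main__":
    model = SiameseEncoder()
    a, b = torch.randn(16, 41), torch.randn(16, 41)
    same = torch.randint(0, 2, (16,)).float()
    loss = contrastive_loss(*model(a, b), same)
    loss.backward()
    print("contrastive loss:", float(loss))
```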
arXiv Detail & Related papers (2020-06-27T11:40:01Z)
- Poisoning Attacks on Algorithmic Fairness [14.213638219685656]
We introduce an optimization framework for poisoning attacks against algorithmic fairness.
We develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data.
We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios.
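One ingredient such an attacker objective needs is a differentiable measure of group disparity; the sketch below uses the gap between mean positive-class scores of two groups, a common demographic-parity surrogate rather than necessarily the paper's exact loss term.

```python
# Hedged sketch of a differentiable demographic-parity gap for the attacker objective.
import torch

def demographic_parity_gap(scores, group):
    """scores: model's positive-class probabilities; group: 0/1 sensitive attribute."""
    g0 = scores[group == 0].mean()
    g1 = scores[group == 1].mean()
    return (g0 - g1).abs()

if __name__ == "__main__":
    scores = torch.rand(100)
    group = torch.randint(0, 2, (100,))
    print("parity gap:", float(demographic_parity_gap(scores, group)))
```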
arXiv Detail & Related papers (2020-04-15T08:07:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.