Active Learning Under Malicious Mislabeling and Poisoning Attacks
- URL: http://arxiv.org/abs/2101.00157v2
- Date: Wed, 24 Mar 2021 01:07:29 GMT
- Title: Active Learning Under Malicious Mislabeling and Poisoning Attacks
- Authors: Jing Lin, Ryan Luley, and Kaiqi Xiong
- Abstract summary: Deep neural networks usually require large labeled datasets for training.
Most of the data generated each day, however, are unlabeled and vulnerable to data poisoning attacks.
In this paper, we develop an efficient active learning method that requires fewer labeled instances.
- Score: 2.4660652494309936
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep neural networks usually require large labeled datasets for training to
achieve the state-of-the-art performance in many tasks, such as image
classification and natural language processing. Though a lot of data is created
each day by active Internet users through various distributed systems across
the world, most of these data are unlabeled and are vulnerable to data
poisoning attacks. In this paper, we develop an efficient active learning
method that requires fewer labeled instances and incorporates the technique of
adversarial retraining in which additional labeled artificial data are
generated without increasing the labeling budget. The generated adversarial
examples also provide a way to measure the vulnerability of the model. To check
the performance of the proposed method under an adversarial setting, i.e.,
malicious mislabeling and data poisoning attacks, we perform an extensive
evaluation, using the private cloud on campus, on the reduced CIFAR-10 dataset,
which contains only two classes: 'airplane' and 'frog'. Our experimental
results demonstrate that the proposed active learning method is efficient for
defending against malicious mislabeling and data poisoning attacks.
Specifically, whereas the baseline active learning method based on the random
sampling strategy performs poorly (about 50%) under a malicious mislabeling
attack, the proposed active learning method can achieve the desired accuracy of
89% using only one-third of the dataset on average.
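The abstract describes the overall loop (uncertainty-driven labeling plus adversarial retraining) without pinning down its components, so below is a minimal PyTorch-style sketch of one round under stated assumptions: entropy-based uncertainty sampling, FGSM perturbations as the labeled artificial data, and the adversarial error rate as the vulnerability measure. The helper names (`query_most_uncertain`, `fgsm_examples`, `oracle`) and all hyperparameters are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F


def query_most_uncertain(model, unlabeled_x, budget):
    """Rank pool samples by predictive entropy; return the indices of the
    `budget` most uncertain ones (assumed query strategy)."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.topk(budget).indices


def fgsm_examples(model, x, y, eps=8 / 255):
    """FGSM adversarial copies that inherit the labels of their clean
    sources, so the labeling budget does not grow."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()


def active_learning_round(model, optimizer, labeled, unlabeled_x, oracle, budget):
    """One round: query labels, add adversarial copies, take a training
    step, and report the adversarial error rate as a vulnerability score."""
    x_lab, y_lab = labeled

    # 1) Spend the labeling budget on the most uncertain pool samples.
    #    `oracle` is a hypothetical labeling function for the queried inputs.
    idx = query_most_uncertain(model, unlabeled_x, budget)
    x_new, y_new = unlabeled_x[idx], oracle(unlabeled_x[idx])

    # 2) Labeled artificial data generated by adversarial perturbation.
    x_adv = fgsm_examples(model, x_new, y_new)

    # 3) Retrain on existing labels + new labels + adversarial copies
    #    (a single gradient step here, for brevity).
    x_train = torch.cat([x_lab, x_new, x_adv])
    y_train = torch.cat([y_lab, y_new, y_new])
    model.train()
    optimizer.zero_grad()
    F.cross_entropy(model(x_train), y_train).backward()
    optimizer.step()

    # 4) Vulnerability: fraction of adversarial copies still misclassified.
    model.eval()
    with torch.no_grad():
        vulnerability = (model(x_adv).argmax(1) != y_new).float().mean().item()
    return (x_train, y_train), vulnerability
```

In the adversarial setting studied in the paper, some of the labels returned by the oracle (or part of the pool itself) would be maliciously corrupted; the sketch only illustrates the budget-preserving augmentation and the vulnerability readout, not the attack itself.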
Related papers
- Machine Unlearning Fails to Remove Data Poisoning Attacks [20.495836283745618]
In addition to complying with data deletion requests, one often-cited potential application for unlearning methods is to remove the effects of training on poisoned data.
We experimentally demonstrate that, while existing unlearning methods have been shown to be effective in a number of evaluation settings, they fail to remove the effects of data poisoning.
arXiv Detail & Related papers (2024-06-25T02:05:29Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
- Gradient-based Data Subversion Attack Against Binary Classifiers [9.414651358362391]
In this work, we focus on label contamination attacks, in which an attacker poisons the labels of data to compromise the functionality of the system.
We exploit the gradients of a differentiable convex loss function with respect to the predicted label as a warm-start and formulate different strategies to find a set of data instances to contaminate.
Our experiments show that the proposed approach outperforms the baselines and is computationally efficient.
arXiv Detail & Related papers (2021-05-31T09:04:32Z)
- Poisoning the Unlabeled Dataset of Semi-Supervised Learning [26.093821359987224]
We study a new class of vulnerabilities: poisoning attacks that modify the unlabeled dataset.
Because they do not require labeling, unlabeled datasets are given strictly less review than labeled datasets.
Our attacks are highly effective across datasets and semi-supervised learning methods.
arXiv Detail & Related papers (2021-05-04T16:55:20Z)
- Adversarial Vulnerability of Active Transfer Learning [0.0]
Two widely used techniques for training supervised machine learning models on small datasets are Active Learning and Transfer Learning.
We show that the combination of these techniques is particularly susceptible to a new kind of data poisoning attack.
We show that a model trained on such a poisoned dataset has a significantly deteriorated performance, dropping from 86% to 34% test accuracy.
arXiv Detail & Related papers (2021-01-26T14:07:09Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.