Adversarial Vulnerability of Active Transfer Learning
- URL: http://arxiv.org/abs/2101.10792v1
- Date: Tue, 26 Jan 2021 14:07:09 GMT
- Title: Adversarial Vulnerability of Active Transfer Learning
- Authors: Nicolas M. Müller, Konstantin Böttinger
- Abstract summary: Two widely used techniques for training supervised machine learning models on small datasets are Active Learning and Transfer Learning.
We show that the combination of these techniques is particularly susceptible to a new kind of data poisoning attack.
We show that a model trained on such a poisoned dataset has a significantly deteriorated performance, dropping from 86% to 34% test accuracy.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two widely used techniques for training supervised machine learning models on
small datasets are Active Learning and Transfer Learning. The former helps to
optimally use a limited budget to label new data. The latter uses large
pre-trained models as feature extractors and enables the design of complex,
non-linear models even on tiny datasets. Combining these two approaches is an
effective, state-of-the-art method when dealing with small datasets.
In this paper, we share an intriguing observation: the combination of these
techniques is particularly susceptible to a new kind of data poisoning attack.
By adding small adversarial noise to the input, it is
possible to create a collision in the output space of the transfer learner. As
a result, Active Learning algorithms no longer select the optimal instances,
but almost exclusively the ones injected by the attacker. This allows an
attacker to manipulate the active learner to select and include arbitrary
images into the data set, even against an overwhelming majority of unpoisoned
samples. We show that a model trained on such a poisoned dataset has a
significantly deteriorated performance, dropping from 86% to 34% test
accuracy. We evaluate this attack on both audio and image datasets and support
our findings empirically. To the best of our knowledge, this weakness has not
been described before in the literature.
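To make the attack mechanism concrete, the sketch below illustrates the feature-collision step described in the abstract under stated assumptions: a frozen pre-trained network acts as the transfer learner's feature extractor, and a small L-infinity-bounded perturbation is optimized so that the poisoned input's embedding collides with a chosen target embedding. The function name, the optimizer, and all hyperparameters are illustrative and not taken from the paper.

```python
import torch

def feature_collision_attack(x, target_features, feature_extractor,
                             eps=8 / 255, steps=100, lr=0.01):
    """Minimal sketch: craft a small perturbation of x whose embedding under a
    frozen feature extractor collides with target_features.

    Hypothetical PGD-style formulation; the paper's exact objective and
    hyperparameters may differ.
    """
    feature_extractor.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        # Push the poisoned input's embedding towards the target embedding,
        # i.e. create a collision in the transfer learner's output space.
        features = feature_extractor(torch.clamp(x + delta, 0.0, 1.0))
        loss = torch.nn.functional.mse_loss(features, target_features)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            # Keep the adversarial noise small so the poisoned input stays
            # visually close to the original.
            delta.clamp_(-eps, eps)

    return torch.clamp(x + delta, 0.0, 1.0).detach()
```

Because acquisition functions in active transfer learning operate on exactly these embeddings, points crafted this way can dominate the learner's selection even when they make up only a tiny fraction of the unlabeled pool, which is the effect reported in the abstract.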
Related papers
- Corrective Machine Unlearning [22.342035149807923]
We formalize Corrective Machine Unlearning as the problem of mitigating the impact of data affected by unknown manipulations on a trained model.
We find most existing unlearning methods, including retraining-from-scratch without the deletion set, require most of the manipulated data to be identified for effective corrective unlearning.
One approach, Selective Synaptic Dampening, achieves limited success, unlearning adverse effects with just a small portion of the manipulated samples in our setting.
arXiv Detail & Related papers (2024-02-21T18:54:37Z) - Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z) - Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
This model reliance has led to sampled data that cannot be transferred to new models, as well as to issues with sampling bias.
We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z) - On Inductive Biases for Machine Learning in Data Constrained Settings [0.0]
This thesis explores a different answer to the problem of learning expressive models in data constrained settings.
Instead of relying on big datasets to learn neural networks, we replace some modules with known functions that reflect the structure of the data.
Our approach falls under the umbrella of "inductive biases", which can be defined as hypotheses about the data at hand that restrict the space of models to explore.
arXiv Detail & Related papers (2023-02-21T14:22:01Z) - Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
Our approach achieves superior performance compared to state-of-the-art active learning methods on image classification and semantic segmentation tasks.
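As a rough illustration of loss-estimation-based querying, the sketch below scores unlabeled samples by the discrepancy between the predictions of two model snapshots taken at different training steps and would query the highest-scoring ones. The snapshot interval, the distance measure, and the use of softmax outputs are assumptions for illustration, not the paper's exact estimator.

```python
import torch

def output_discrepancy_scores(model_now, model_earlier, unlabeled_loader, device="cpu"):
    """Minimal sketch: use the output discrepancy between two training
    snapshots as a proxy for the (unknown) loss of each unlabeled sample.

    Illustrative only; the distance and snapshot schedule are assumptions.
    """
    model_now.eval()
    model_earlier.eval()
    scores = []
    with torch.no_grad():
        for x in unlabeled_loader:
            x = x.to(device)
            p_now = torch.softmax(model_now(x), dim=1)
            p_earlier = torch.softmax(model_earlier(x), dim=1)
            # A larger prediction change between snapshots suggests a higher loss.
            scores.append(torch.norm(p_now - p_earlier, dim=1).cpu())
    # Samples with the highest scores are sent to the oracle for labeling.
    return torch.cat(scores)
```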
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first approach for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
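One way such a closed-form update can look is the first-order sketch below, which shifts the parameters by the gradient difference between the data to be corrected and its replacement. Here `loss_fn`, `z_original`, `z_corrected`, and the step size `tau` are illustrative assumptions rather than the paper's exact derivation; influence-function approaches typically also offer a second-order variant that applies an inverse-Hessian correction.

```python
import torch

def first_order_unlearning_update(model, loss_fn, z_original, z_corrected, tau=1.0):
    """Minimal sketch of a closed-form, influence-function-style update:
    replace the contribution of z_original with that of z_corrected.

    Illustrative only; not the paper's exact formulation.
    """
    params = [p for p in model.parameters() if p.requires_grad]

    # Gradient of the training loss on the data whose influence should be removed ...
    grad_original = torch.autograd.grad(loss_fn(model, z_original), params)
    # ... and on its corrected replacement.
    grad_corrected = torch.autograd.grad(loss_fn(model, z_corrected), params)

    with torch.no_grad():
        for p, g_old, g_new in zip(params, grad_original, grad_corrected):
            # theta <- theta - tau * (grad_new - grad_old): the model now
            # (approximately) reflects the corrected data instead of the original.
            p.add_(tau * (g_old - g_new))
    return model
```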
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Poisoning the Unlabeled Dataset of Semi-Supervised Learning [26.093821359987224]
We study a new class of vulnerabilities: poisoning attacks that modify the unlabeled dataset.
In order to be useful, unlabeled datasets are given strictly less review than labeled datasets.
Our attacks are highly effective across datasets and semi-supervised learning methods.
arXiv Detail & Related papers (2021-05-04T16:55:20Z) - Active Learning Under Malicious Mislabeling and Poisoning Attacks [2.4660652494309936]
Deep neural networks usually require large labeled datasets for training.
Most of the available data, however, is unlabeled and vulnerable to data poisoning attacks.
In this paper, we develop an efficient active learning method that requires fewer labeled instances.
arXiv Detail & Related papers (2021-01-01T03:43:36Z) - Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.