Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning
Few-Shot Meta-Learners
- URL: http://arxiv.org/abs/2211.12990v1
- Date: Wed, 23 Nov 2022 14:55:44 GMT
- Title: Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning
Few-Shot Meta-Learners
- Authors: Elre T. Oldewage, John Bronskill, Richard E. Turner
- Abstract summary: We attack amortized meta-learners, which allows us to craft colluding sets of inputs that fool the system's learning algorithm.
We show that in a white box setting, these attacks are very successful and can cause the target model's predictions to become worse than chance.
We explore two hypotheses to explain this: 'overfitting' by the attack, and mismatch between the model on which the attack is generated and that to which the attack is transferred.
- Score: 28.468089304148453
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper examines the robustness of deployed few-shot meta-learning systems
when they are fed an imperceptibly perturbed few-shot dataset. We attack
amortized meta-learners, which allows us to craft colluding sets of inputs that
are tailored to fool the system's learning algorithm when used as training
data. Jointly crafted adversarial inputs might be expected to synergistically
manipulate a classifier, allowing for very strong data-poisoning attacks that
would be hard to detect. We show that in a white box setting, these attacks are
very successful and can cause the target model's predictions to become worse
than chance. However, in opposition to the well-known transferability of
adversarial examples in general, the colluding sets do not transfer well to
different classifiers. We explore two hypotheses to explain this: 'overfitting'
by the attack, and mismatch between the model on which the attack is generated
and that to which the attack is transferred. Regardless of the mitigation
strategies suggested by these hypotheses, the colluding inputs transfer no
better than adversarial inputs that are generated independently in the usual
way.
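To make the attack setting concrete, the sketch below outlines a PGD-style support-set poisoning loop for a hypothetical amortized meta-learner with the interface meta_learner(support_x, support_y, query_x) -> query logits. The interface, hyperparameters, and pixel range are illustrative assumptions rather than the authors' implementation; the point is only to show how jointly optimizing all support perturbations lets the inputs "collude" against the learning algorithm.

```python
# Hedged sketch (not the authors' code): a PGD-style "colluding" support-set
# poisoning attack against a hypothetical amortized meta-learner whose forward
# pass has the signature meta_learner(support_x, support_y, query_x) -> logits.
import torch
import torch.nn.functional as F

def poison_support_set(meta_learner, support_x, support_y, query_x, query_y,
                       eps=8 / 255, step_size=2 / 255, steps=40):
    """Jointly optimize an L-inf bounded perturbation over the whole support set
    so that predictions on the query set degrade (white-box setting)."""
    delta = torch.zeros_like(support_x, requires_grad=True)
    for _ in range(steps):
        logits = meta_learner(support_x + delta, support_y, query_x)
        # Ascend the query loss; because all support perturbations share this
        # objective, they are optimized jointly rather than independently.
        loss = F.cross_entropy(logits, query_y)
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()
            delta.clamp_(-eps, eps)
            # Assumes images scaled to [0, 1]; keep poisoned pixels in range.
            delta.copy_((support_x + delta).clamp(0, 1) - support_x)
        delta.grad.zero_()
    return (support_x + delta).detach()
```

The key difference from a standard per-image adversarial attack is that the loss depends on the entire perturbed support set at once, which is what allows the jointly crafted inputs to manipulate the downstream classifier far more strongly than independently crafted ones.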
Related papers
- An Adversarial Approach to Evaluating the Robustness of Event Identification Models [12.862865254507179]
This paper considers a physics-based modal decomposition method to extract features for event classification.
The resulting classifiers are tested against an adversarial algorithm to evaluate their robustness.
arXiv Detail & Related papers (2024-02-19T18:11:37Z) - Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g., self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z) - Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of the attack, with a 12.85% higher transfer-attack success rate than state-of-the-art methods (a minimal transfer-evaluation sketch appears after this list).
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Adversarial Robustness of Deep Reinforcement Learning based Dynamic
Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems.
We first craft different types of adversarial examples by adding perturbations to the input and intervening on the causal factors.
Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
arXiv Detail & Related papers (2021-12-02T04:12:24Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - Adversarial defenses via a mixture of generators [0.0]
Adversarial examples remain a relatively poorly understood feature of deep learning systems.
We show that it is possible to train such a system without supervision, simultaneously on multiple adversarial attacks.
Our system is able to recover class information for previously-unseen examples with neither attack nor data labels on the MNIST dataset.
arXiv Detail & Related papers (2021-10-05T21:27:50Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
Applying the adversarial training objective to both a classifier and a rejection function simultaneously, we can choose to abstain from classification when it has insufficient confidence to classify a test data point.
arXiv Detail & Related papers (2020-10-24T14:05:03Z) - TREND: Transferability based Robust ENsemble Design [6.663641564969944]
We study the effect of network architecture, input, weight and activation quantization on transferability of adversarial samples.
We show that transferability is significantly hampered by input quantization between source and target.
We propose a new state-of-the-art ensemble attack to combat this.
arXiv Detail & Related papers (2020-08-04T13:38:14Z) - Adversarial Example Games [51.92698856933169]
Adversarial Example Games (AEG) is a framework that models the crafting of adversarial examples.
AEG provides a new way to design adversarial examples by adversarially training a generator and a classifier from a given hypothesis class.
We demonstrate the efficacy of AEG on the MNIST and CIFAR-10 datasets.
arXiv Detail & Related papers (2020-07-01T19:47:23Z) - Extending Adversarial Attacks to Produce Adversarial Class Probability
Distributions [1.439518478021091]
Our results demonstrate that we can closely approximate any probability distribution for the classes while maintaining a high fooling rate.
arXiv Detail & Related papers (2020-04-14T09:39:02Z)
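The transfer question raised both by the main paper and by the transfer-attack papers above can be probed with a simple evaluation harness. The sketch below reuses the hypothetical poison_support_set() and meta_learner interface from the earlier sketch: the poisoned support set is crafted in a white-box fashion on a surrogate model, and query accuracy is then measured for an independently trained victim adapted on the same poisoned data. All names are illustrative assumptions, not code from any of the listed papers.

```python
# Hedged sketch (illustrative only): measuring whether a poisoned support set
# crafted on a surrogate meta-learner transfers to a separately trained victim.
# Reuses the hypothetical poison_support_set() and meta_learner interface above.
import torch

@torch.no_grad()
def query_accuracy(meta_learner, support_x, support_y, query_x, query_y):
    logits = meta_learner(support_x, support_y, query_x)
    return (logits.argmax(dim=-1) == query_y).float().mean().item()

def transfer_report(surrogate, victim, support_x, support_y, query_x, query_y):
    # White-box crafting: the attack has full access to the surrogate's gradients.
    poisoned_x = poison_support_set(surrogate, support_x, support_y, query_x, query_y)
    return {
        "surrogate_clean":    query_accuracy(surrogate, support_x, support_y, query_x, query_y),
        "surrogate_poisoned": query_accuracy(surrogate, poisoned_x, support_y, query_x, query_y),
        # The victim never sees the attack being crafted; a large gap between the
        # two "poisoned" numbers indicates poor transferability.
        "victim_clean":       query_accuracy(victim, support_x, support_y, query_x, query_y),
        "victim_poisoned":    query_accuracy(victim, poisoned_x, support_y, query_x, query_y),
    }
```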
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.