Adversarial Transfer Attacks With Unknown Data and Class Overlap
- URL: http://arxiv.org/abs/2109.11125v2
- Date: Fri, 24 Sep 2021 22:01:04 GMT
- Title: Adversarial Transfer Attacks With Unknown Data and Class Overlap
- Authors: Luke E. Richards, André Nguyen, Ryan Capps, Steven Forsythe, Cynthia Matuszek, Edward Raff
- Abstract summary: Current transfer attack research has an unrealistic advantage for the attacker.
We present the first study of transferring adversarial attacks focusing on the data available to attacker and victim under imperfect settings.
This threat model is relevant to applications in medicine, malware, and others.
- Score: 19.901933940805684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to transfer adversarial attacks from one model (the surrogate) to
another model (the victim) has been an issue of concern within the machine
learning (ML) community. The ability to successfully evade unseen models
represents an uncomfortable level of ease toward implementing attacks. In this
work we note that as studied, current transfer attack research has an
unrealistic advantage for the attacker: the attacker has the exact same
training data as the victim. We present the first study of transferring
adversarial attacks focusing on the data available to attacker and victim under
imperfect settings without querying the victim, where there is some variable
level of overlap in the exact data used or in the classes learned by each
model. This threat model is relevant to applications in medicine, malware, and
others. Under this new threat model, attack success rate is not
data or class overlap in the way one would expect, and varies with dataset.
This makes it difficult for attacker and defender to reason about each other
and contributes to the broader study of model robustness and security. We
remedy this by developing a masked version of Projected Gradient Descent that
simulates class disparity, which enables the attacker to reliably estimate a
lower-bound on their attack's success.
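The masking idea can be illustrated with a minimal sketch (the classifier, function names, and masking scheme below are assumptions for illustration, not the paper's exact construction): run ordinary PGD, but drop from the loss any classes the attacker assumes the victim does not share, so the perturbation is shaped only by the overlapping label space.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def masked_pgd(x, y, W, b, class_mask, eps=0.5, alpha=0.1, steps=10):
    """Illustrative masked PGD on a linear softmax surrogate.

    class_mask[c] is True for classes the attacker assumes the victim
    shares; logits of the remaining classes are excluded from the loss,
    simulating class disparity between surrogate and victim.
    """
    x_adv = x.copy()
    for _ in range(steps):
        logits = W @ x_adv + b
        masked = np.where(class_mask, logits, -np.inf)  # hide non-shared classes
        p = softmax(masked)
        onehot = np.eye(len(b))[y]
        grad = W.T @ (p - onehot)                 # d(cross-entropy)/dx, masked loss
        x_adv = x_adv + alpha * np.sign(grad)     # ascend the masked loss
        x_adv = x + np.clip(x_adv - x, -eps, eps) # project onto the L_inf ball
    return x_adv
```

Because the masked loss only sees the assumed-shared classes, its success rate on the surrogate serves as the kind of conservative estimate the abstract describes: classes outside the overlap contribute nothing to the attack direction.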
Related papers
- Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a more stealthy form of backdoor attacks that can perform the attack without changing the labels of poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
Our threat model poses a serious threat in training machine learning models with third-party datasets.
arXiv Detail & Related papers (2024-07-15T15:38:21Z)
- Your Attack Is Too DUMB: Formalizing Attacker Scenarios for Adversarial Transferability [17.899587145780817]
Evasion attacks are a threat to machine learning models, where adversaries attempt to affect classifiers by injecting malicious samples.
We propose the DUMB attacker model, which allows analyzing if evasion attacks fail to transfer when the training conditions of surrogate and victim models differ.
Our analysis, which generated 13K tests over 14 distinct attacks, led to numerous novel findings in the scope of transferable attacks with surrogate models.
arXiv Detail & Related papers (2023-06-27T10:21:27Z)
- Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
In this work, we ask whether it is possible to infer data-agnostic victim model (VM) information from data-specific adversarial instances.
We collect a dataset of adversarial attacks across 7 attack types generated from 135 victim models.
We show that a simple, supervised model parsing network (MPN) is able to infer VM attributes from unseen adversarial attacks.
arXiv Detail & Related papers (2023-03-13T21:21:49Z)
- Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying model, dataset, or architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches.
Attacks have a long-term impact in that they decrease model performance hundreds of epochs after the attack took place.
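The sensitivity this attack exploits can be seen in a minimal sketch (the 1-D least-squares model and data below are illustrative, not the paper's setup): a single SGD pass over the same two examples ends at different weights depending only on their order.

```python
def sgd_pass(batches, lr=0.5, w=0.0):
    """One SGD pass over 1-D least-squares batches (x, y);
    only the order of the batches differs between runs."""
    for x, y in batches:
        w -= lr * (w * x - y) * x  # gradient of 0.5 * (w*x - y)**2
    return w

# Same data, different order, different endpoint:
# sgd_pass([(1.0, 1.0), (1.0, 3.0)]) -> 1.75
# sgd_pass([(1.0, 3.0), (1.0, 1.0)]) -> 1.25
```

An attacker who controls only the batch schedule can therefore steer the training trajectory without ever touching the data itself.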
arXiv Detail & Related papers (2021-04-19T22:17:27Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
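The label-only setting can be illustrated with a toy sketch (the function names and the Gaussian-perturbation scheme here are assumptions, not the paper's exact procedure): query the victim repeatedly on noisy copies of an input, record only the hard labels, and use label stability as a membership signal.

```python
import numpy as np

def label_agreement_score(predict, x, n_queries=100, sigma=0.1, seed=0):
    """Toy label-only membership signal: the fraction of noisy queries
    whose hard label matches the clean prediction. Points the model fits
    confidently tend to sit farther from the decision boundary, so their
    labels survive perturbation more often."""
    rng = np.random.default_rng(seed)
    base = predict(x)
    hits = sum(predict(x + sigma * rng.normal(size=x.shape)) == base
               for _ in range(n_queries))
    return hits / n_queries
```

With a threshold classifier, an input far from the boundary keeps its label on every noisy query (score 1.0), while a borderline input flips on many of them, yielding the kind of score gap a membership adversary can exploit.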
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked model as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.