On Adaptive Attacks to Adversarial Example Defenses
- URL: http://arxiv.org/abs/2002.08347v2
- Date: Fri, 23 Oct 2020 12:07:41 GMT
- Title: On Adaptive Attacks to Adversarial Example Defenses
- Authors: Florian Tramer, Nicholas Carlini, Wieland Brendel, Aleksander Madry
- Abstract summary: This paper lays out the methodology and the approach necessary to perform an adaptive attack against defenses to adversarial examples.
We hope that these analyses will serve as guidance on how to properly perform adaptive attacks against defenses to adversarial examples.
- Score: 123.32678153377915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptive attacks have (rightfully) become the de facto standard for
evaluating defenses to adversarial examples. We find, however, that typical
adaptive evaluations are incomplete. We demonstrate that thirteen defenses
recently published at ICLR, ICML and NeurIPS---and chosen for illustrative and
pedagogical purposes---can be circumvented despite attempting to perform
evaluations using adaptive attacks. While prior evaluation papers focused
mainly on the end result---showing that a defense was ineffective---this paper
focuses on laying out the methodology and the approach necessary to perform an
adaptive attack. We hope that these analyses will serve as guidance on how to
properly perform adaptive attacks against defenses to adversarial examples, and
thus will allow the community to make further progress in building more robust
models.
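To make "adaptive attack" concrete, below is a minimal sketch of one standard ingredient from this literature: BPDA (Backward Pass Differentiable Approximation), where the attacker runs a defense's non-differentiable preprocessing on the forward pass but substitutes the identity on the backward pass. The PyTorch model and the shape-preserving preprocess handle are assumptions for illustration; the paper tailors a different attack to each of the thirteen defenses, so this is a generic sketch rather than the paper's method.

```python
import torch
import torch.nn.functional as F

class StraightThrough(torch.autograd.Function):
    """BPDA: apply the (possibly non-differentiable) preprocessing on the
    forward pass, but pass gradients through as if it were the identity."""
    @staticmethod
    def forward(ctx, x, preprocess):
        return preprocess(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # identity gradient; none for `preprocess`

def adaptive_pgd(model, preprocess, x, y, eps=8 / 255, alpha=2 / 255, steps=40):
    """L-inf PGD attacking model(preprocess(x)) end to end via BPDA.
    Assumes `preprocess` is shape-preserving (e.g., quantization, denoising)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = model(StraightThrough.apply(x_adv, preprocess))
        loss = F.cross_entropy(logits, y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # ascend the attack loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # stay a valid image
    return x_adv.detach()
```

The straight-through estimator matters because naively differentiating through, say, a quantizer yields zero gradients almost everywhere, which is exactly the kind of obfuscated-gradient failure an adaptive evaluation must work around.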
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach (a simplified sketch of the check follows this entry).
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
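Below is a heavily simplified sketch of the consistency check described above: caption the input with the target VLM, regenerate an image from that caption with a T2I model, and flag the input when the original and regenerated images disagree in embedding space. The three model handles (`caption_fn`, `t2i_fn`, `embed_fn`) and the threshold are hypothetical stand-ins; the paper's actual pipeline and calibration are not reproduced here.

```python
import torch.nn.functional as F

def mirrorcheck_style_detect(image, caption_fn, t2i_fn, embed_fn, threshold=0.7):
    """Flag `image` as adversarial if the image regenerated from the VLM's
    own caption drifts too far from the original in embedding space.

    caption_fn: image tensor -> str        (the target VLM's captioner; stand-in)
    t2i_fn:     str -> image tensor        (any text-to-image model; stand-in)
    embed_fn:   image tensor -> 1-D tensor (e.g., a CLIP image encoder; stand-in)
    threshold:  illustrative only; in practice tuned on clean data.
    """
    caption = caption_fn(image)
    regenerated = t2i_fn(caption)
    similarity = F.cosine_similarity(embed_fn(image), embed_fn(regenerated), dim=0)
    return similarity.item() < threshold  # True -> likely adversarial
```

The intuition: an adversarial perturbation changes the VLM's caption without visibly changing the image, so the regenerated image no longer matches the input.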
- Improving behavior based authentication against adversarial attack using XAI [3.340314613771868]
We propose an eXplainable AI (XAI)-based defense strategy against adversarial attacks in behavior-based authentication scenarios.
A feature selector, trained with our method, can be used as a filter in front of the original authenticator.
We demonstrate that our XAI-based defense strategy is effective against adversarial attacks and outperforms other defense strategies (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-26T09:29:05Z)
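As a rough illustration of the "feature selector as a filter" idea above, the sketch below masks behavioral features by an importance vector before they reach the authenticator. How those importances are derived with XAI, and how the selector is trained, are the paper's contribution and are not reproduced here; `importance` and `keep_ratio` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class XAIFeatureFilter(nn.Module):
    """Keep only the features whose (XAI-derived) importance is in the
    top-k; zero out the rest before the original authenticator runs."""
    def __init__(self, importance: torch.Tensor, keep_ratio: float = 0.5):
        super().__init__()
        k = max(1, int(keep_ratio * importance.numel()))
        mask = torch.zeros_like(importance)
        mask[importance.topk(k).indices] = 1.0
        self.register_buffer("mask", mask)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Broadcasts over a (batch, n_features) input.
        return features * self.mask

# Usage sketch: authenticator(filter(features)) instead of authenticator(features).
```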
- Adversarial Markov Games: On Adaptive Decision-Based Attacks and Defenses [21.759075171536388]
We show how both attacks and defenses can benefit from this game-theoretic framing and from learning from each other through interaction.
We demonstrate that active defenses, which control how the system responds, are a necessary complement to model hardening when facing decision-based attacks.
We lay out effective strategies for ensuring the robustness of ML-based systems deployed in the real world.
arXiv Detail & Related papers (2023-12-20T21:24:52Z)
- Deep-Attack over the Deep Reinforcement Learning [26.272161868927004]
Recent developments in adversarial attacks have made reinforcement learning more vulnerable.
We propose a reinforcement learning-based attacking framework that considers effectiveness and stealthiness simultaneously.
We also propose a new metric to evaluate the performance of the attack model in these two aspects.
arXiv Detail & Related papers (2022-05-02T10:58:19Z)
- Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack [96.50202709922698]
A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable.
We propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method which addresses efficiency and reliability in a test-time-training fashion.
arXiv Detail & Related papers (2022-03-10T04:53:54Z)
- Evaluating the Adversarial Robustness of Adaptive Test-time Defenses [60.55448652445904]
We categorize such adaptive test-time defenses and explain their potential benefits and drawbacks.
Unfortunately, none significantly improve upon static models when evaluated appropriately.
Some even weaken the underlying static model while simultaneously increasing inference cost.
arXiv Detail & Related papers (2022-02-28T12:11:40Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses (see the sketch after this entry).
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
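The sketch below conveys the learned-optimizer idea: a recurrent network, rather than a fixed rule such as the gradient sign, maps gradients to perturbation updates inside a standard attack loop. The coordinate-wise LSTM and all hyperparameters are illustrative assumptions borrowed from the learned-optimizer literature, not the paper's exact design, and the meta-training that fits the RNN across defenses is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedAttackOptimizer(nn.Module):
    """Toy recurrent optimizer: an LSTM maps each coordinate's gradient to
    an update for that coordinate (meta-training loop omitted)."""
    def __init__(self, hidden: int = 20):
        super().__init__()
        self.cell = nn.LSTMCell(1, hidden)
        self.head = nn.Linear(hidden, 1)

    def init_state(self, n_coords):
        h = self.cell.hidden_size
        return torch.zeros(n_coords, h), torch.zeros(n_coords, h)

    def forward(self, grad, state):
        g = grad.reshape(-1, 1)              # treat each coordinate separately
        h, c = self.cell(g, state)
        return self.head(h).reshape(grad.shape), (h, c)

def learned_attack(model, opt_rnn, x, y, eps=8 / 255, steps=10):
    """PGD-style loop in which the RNN proposes each step."""
    x_adv = x.clone().detach()
    state = opt_rnn.init_state(x.numel())
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            update, state = opt_rnn(grad, state)
            x_adv = (x + (x_adv + update - x).clamp(-eps, eps)).clamp(0.0, 1.0)
    return x_adv.detach()
```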
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes the function mapping of the clean image to guide the generation of adversaries (sketched after this entry).
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
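A condensed sketch of a GAMA-style objective: a margin term that pushes the best wrong class above the true class, plus an L2 relaxation on the gap between the perturbed and clean outputs, whose gradients are informative early in the attack. In the paper the relaxation weight is decayed over iterations so the attack ends on the pure margin objective; the exact formulation and schedule are simplified assumptions here. The loss would be maximized inside a PGD-style loop like the one sketched under the main abstract.

```python
import torch
import torch.nn.functional as F

def gama_style_loss(model, x_adv, x_clean, y, lam):
    """Margin loss on softmax outputs plus a clean-output-guided relaxation.
    `lam` weights the relaxation and is typically decayed toward zero."""
    p_adv = F.softmax(model(x_adv), dim=1)
    with torch.no_grad():
        p_clean = F.softmax(model(x_clean), dim=1)     # fixed guidance target
    onehot = F.one_hot(y, num_classes=p_adv.size(1)).bool()
    true_p = p_adv[onehot]                              # true-class probability
    best_other = p_adv.masked_fill(onehot, -1.0).max(dim=1).values
    margin = best_other - true_p                        # > 0 once misclassified
    relax = lam * ((p_adv - p_clean) ** 2).sum(dim=1)   # informative early gradients
    return (margin + relax).mean()                      # the attacker ascends this
```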
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.