MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
- URL: http://arxiv.org/abs/2302.10980v3
- Date: Thu, 20 Jul 2023 01:34:16 GMT
- Title: MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
- Authors: Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen,
Prateek Mittal
- Abstract summary: We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of the learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
- Score: 86.70417016955459
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The bulk of existing research on defending against adversarial examples
focuses on a single (typically Lp-norm-bounded) attack, but in practical
settings, machine learning (ML) models should be robust to a
wide variety of attacks. In this paper, we present the first unified framework
for considering multiple attacks against ML models. Our framework can model
different levels of the learner's knowledge about the test-time adversary,
which lets us capture both robustness against unforeseen attacks and
robustness against unions of attacks.
against unions of attacks. Using our framework, we present the first
leaderboard, MultiRobustBench, for benchmarking robustness under multiple attacks, which
captures performance across attack types and attack strengths. We evaluate the
performance of 16 defended models for robustness against a set of 9 different
attack types, including Lp-based threat models, spatial transformations, and
color changes, at 20 different attack strengths (180 attacks total).
Additionally, we analyze the state of current defenses against multiple
attacks. Our analysis shows that while existing defenses have made progress in
average robustness across the set of attacks used, robustness against the
worst-case attack remains a significant open problem: all existing models
perform worse than random guessing.
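As a concrete illustration of the two aggregate views above, the sketch below (not the paper's released code; the boolean matrix correct and its construction are assumed inputs) computes average robustness over an attack set and worst-case robustness against the union of attacks. The leaderboard's exact metric definitions may differ.

    import numpy as np

    def multiattack_summary(correct: np.ndarray) -> tuple[float, float]:
        # correct[i, j] is True iff example j is still classified correctly
        # under attack i (one row per attack type/strength pair).
        per_attack_acc = correct.mean(axis=1)                 # accuracy under each attack
        avg_robustness = float(per_attack_acc.mean())         # mean over the whole attack grid
        union_robustness = float(correct.all(axis=0).mean())  # correct under every attack at once
        return avg_robustness, union_robustness

The gap between these two numbers reflects the paper's headline finding: average robustness can look respectable while union (worst-case) accuracy falls below random guessing.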
Related papers
- AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268] (2024-04-30)
Adversarial examples are typically optimized with gradient-based attacks.
Each attack is shown to outperform its predecessors using a different experimental setup, which yields overly optimistic and even biased evaluations.
- BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack [22.408968332454062] (2024-04-08)
We study the unique, less well understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries.
We develop BruSLeAttack, a new, faster (more query-efficient) algorithm for this problem.
Our work facilitates faster evaluation of model vulnerabilities and raises vigilance about the safety, security, and reliability of deployed systems.
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306] (2024-04-04)
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
- Versatile Defense Against Adversarial Attacks on Image Recognition [2.9980620769521513] (2024-03-13)
Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
- Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545] (2023-10-01)
Deep neural networks are vulnerable to adversarial examples: samples that stay close to the original image yet make the model misclassify.
We propose a simple and lightweight defense against black-box attacks: adding random noise to hidden features at intermediate layers of the model at inference time (a minimal sketch of this idea appears after this list).
Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554] (2021-05-31)
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features under arbitrary attacking strengths; the method is trained to align these features automatically.
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602] (2021-02-09)
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
- OpenAttack: An Open-source Textual Adversarial Attack Toolkit [73.22185718706602] (2020-09-19)
We present OpenAttack, an open-source toolkit for textual adversarial attacks.
Its unique strengths are support for all attack types, multilinguality, and parallel processing.
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537] (2020-03-03)
We propose two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable, and user-independent ensemble of attacks for testing adversarial robustness (a usage sketch appears after this list).
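The sketch referenced in the Randomized Feature Defense entry above: a minimal PyTorch illustration of injecting Gaussian noise into one intermediate layer's output at inference time via a forward hook. The choice of layer (layer3) and the noise scale (sigma) are illustrative assumptions, not the paper's settings.

    import torch
    import torchvision.models as models

    model = models.resnet18(weights=None).eval()
    sigma = 0.05  # illustrative noise scale; larger values trade clean accuracy for robustness

    def add_feature_noise(module, inputs, output):
        # Returning a value from a forward hook replaces the layer's output,
        # so every inference query sees freshly randomized hidden features.
        return output + sigma * torch.randn_like(output)

    hook = model.layer3.register_forward_hook(add_feature_noise)
    with torch.no_grad():
        logits = model(torch.rand(1, 3, 224, 224))  # dummy query
    hook.remove()

Because the logits now vary across repeated queries, score-based and decision-based black-box attacks receive noisy feedback, which degrades their search for adversarial perturbations.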
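The parameter-free ensemble described in the last entry was released by its authors as AutoAttack; the usage sketch below assumes the autoattack package from https://github.com/fra31/auto-attack is installed (API names taken from its public README), and the stand-in model and data are placeholders for the defended model under test.

    import torch
    import torch.nn as nn
    from autoattack import AutoAttack

    # Stand-in classifier and data; replace with the defended model being evaluated.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
    x_test = torch.rand(16, 3, 32, 32)    # inputs scaled to [0, 1]
    y_test = torch.randint(0, 10, (16,))

    # version='standard' runs the full ensemble (two APGD variants plus FAB and
    # Square) at a fixed perturbation budget, with no step sizes to tune.
    adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
    x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=16)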