MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
- URL: http://arxiv.org/abs/2302.10980v3
- Date: Thu, 20 Jul 2023 01:34:16 GMT
- Title: MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
- Authors: Sihui Dai, Saeed Mahloujifar, Chong Xiang, Vikash Sehwag, Pin-Yu Chen,
Prateek Mittal
- Abstract summary: We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of the learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
- Score: 86.70417016955459
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The bulk of existing research on defending against adversarial examples
focuses on a single (typically Lp-norm-bounded) attack, but in practical
settings, machine learning (ML) models should be robust to a
wide variety of attacks. In this paper, we present the first unified framework
for considering multiple attacks against ML models. Our framework can model
different levels of the learner's knowledge about the test-time adversary,
which lets us capture both robustness against unforeseen attacks and
robustness against unions of attacks.
against unions of attacks. Using our framework, we present the first
leaderboard, MultiRobustBench, for benchmarking robustness under multiple attacks, which
captures performance across attack types and attack strengths. We evaluate the
performance of 16 defended models for robustness against a set of 9 different
attack types, including Lp-based threat models, spatial transformations, and
color changes, at 20 different attack strengths (180 attacks total).
Additionally, we analyze the state of current defenses against multiple
attacks. Our analysis shows that while existing defenses have made progress in
average robustness across the set of attacks used, robustness against the
worst-case attack remains a significant open problem: all existing models
perform worse than random guessing.
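As a concrete illustration of the two aggregate views above, the sketch below (not the paper's released code; the boolean matrix correct and its construction are assumed inputs) computes average robustness over an attack set and worst-case robustness against the union of attacks. The leaderboard's exact metric definitions may differ.

    import numpy as np

    def multiattack_summary(correct: np.ndarray) -> tuple[float, float]:
        # correct[i, j] is True iff example j is still classified correctly
        # under attack i (one row per attack type/strength pair).
        per_attack_acc = correct.mean(axis=1)                 # accuracy under each attack
        avg_robustness = float(per_attack_acc.mean())         # mean over the whole attack grid
        union_robustness = float(correct.all(axis=0).mean())  # correct under every attack at once
        return avg_robustness, union_robustness

The gap between these two numbers reflects the paper's headline finding: average robustness can look respectable while union (worst-case) accuracy falls below random guessing.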
Related papers
- AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268] (2024-04-30)
Adversarial examples are typically optimized with gradient-based attacks.
Each attack is shown to outperform its predecessors using a different experimental setup, which yields overly optimistic and even biased evaluations.
- BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack [22.408968332454062] (2024-04-08)
We study the unique, less well understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries.
We develop BruSLeAttack, a new, faster (more query-efficient) algorithm for this problem.
Our work facilitates faster evaluation of model vulnerabilities and raises vigilance about the safety, security, and reliability of deployed systems.
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306] (2024-04-04)
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
- Versatile Defense Against Adversarial Attacks on Image Recognition [2.9980620769521513] (2024-03-13)
Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
- Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545] (2023-10-01)
Deep neural networks are vulnerable to adversarial examples: samples that stay close to the original image yet make the model misclassify.
We propose a simple and lightweight defense against black-box attacks: adding random noise to hidden features at intermediate layers of the model at inference time (a minimal sketch of this idea appears after this list).
Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554] (2021-05-31)
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features under arbitrary attacking strengths; the method is trained to align these features automatically.
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602] (2021-02-09)
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
- OpenAttack: An Open-source Textual Adversarial Attack Toolkit [73.22185718706602] (2020-09-19)
We present OpenAttack, an open-source toolkit for textual adversarial attacks.
Its unique strengths are support for all attack types, multilinguality, and parallel processing.
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537] (2020-03-03)
We propose two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable, and user-independent ensemble of attacks for testing adversarial robustness (a usage sketch appears after this list).
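The sketch referenced in the Randomized Feature Defense entry above: a minimal PyTorch illustration of injecting Gaussian noise into one intermediate layer's output at inference time via a forward hook. The choice of layer (layer3) and the noise scale (sigma) are illustrative assumptions, not the paper's settings.

    import torch
    import torchvision.models as models

    model = models.resnet18(weights=None).eval()
    sigma = 0.05  # illustrative noise scale; larger values trade clean accuracy for robustness

    def add_feature_noise(module, inputs, output):
        # Returning a value from a forward hook replaces the layer's output,
        # so every inference query sees freshly randomized hidden features.
        return output + sigma * torch.randn_like(output)

    hook = model.layer3.register_forward_hook(add_feature_noise)
    with torch.no_grad():
        logits = model(torch.rand(1, 3, 224, 224))  # dummy query
    hook.remove()

Because the logits now vary across repeated queries, score-based and decision-based black-box attacks receive noisy feedback, which degrades their search for adversarial perturbations.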
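The parameter-free ensemble described in the last entry was released by its authors as AutoAttack; the usage sketch below assumes the autoattack package from https://github.com/fra31/auto-attack is installed (API names taken from its public README), and the stand-in model and data are placeholders for the defended model under test.

    import torch
    import torch.nn as nn
    from autoattack import AutoAttack

    # Stand-in classifier and data; replace with the defended model being evaluated.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
    x_test = torch.rand(16, 3, 32, 32)    # inputs scaled to [0, 1]
    y_test = torch.randint(0, 10, (16,))

    # version='standard' runs the full ensemble (two APGD variants plus FAB and
    # Square) at a fixed perturbation budget, with no step sizes to tune.
    adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
    x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=16)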