Revisiting DeepFool: generalization and improvement
- URL: http://arxiv.org/abs/2303.12481v1
- Date: Wed, 22 Mar 2023 11:49:35 GMT
- Title: Revisiting DeepFool: generalization and improvement
- Authors: Alireza Abdollahpourrostam, Mahed Abroshan, Seyed-Mohsen
Moosavi-Dezfooli
- Abstract summary: We introduce a new family of adversarial attacks that strike a balance between effectiveness and computational efficiency.
Our proposed attacks are also suitable for evaluating the robustness of large models.
- Score: 17.714671419826715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been known to be vulnerable to adversarial
examples, which are inputs that are modified slightly to fool the network into
making incorrect predictions. This has led to a significant amount of research
on evaluating the robustness of these networks against such perturbations. One
particularly important robustness metric is the robustness to minimal l2
adversarial perturbations. However, existing methods for evaluating this
robustness metric are either computationally expensive or not very accurate. In
this paper, we introduce a new family of adversarial attacks that strike a
balance between effectiveness and computational efficiency. Our proposed
attacks are generalizations of the well-known DeepFool (DF) attack, while they
remain simple to understand and implement. We demonstrate that our attacks
outperform existing methods in terms of both effectiveness and computational
efficiency. Our proposed attacks are also suitable for evaluating the
robustness of large models and can be used to perform adversarial training (AT)
to achieve state-of-the-art robustness to minimal l2 adversarial perturbations.
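DeepFool finds a minimal l2 perturbation by repeatedly projecting the input onto the nearest linearized decision boundary. The sketch below is a minimal NumPy illustration for the affine case f(x) = Wx + b only, where a single DeepFool step is exact; it is not the paper's generalized attack, and the small `overshoot` constant is an assumption used to push the point just past the boundary.

```python
import numpy as np

def deepfool_linear(x, W, b, overshoot=1e-4):
    """Minimal l2 perturbation that flips an affine classifier f(x) = W @ x + b.

    For affine classifiers one DeepFool step is exact: project x onto the
    nearest decision boundary, then overshoot slightly to cross it.
    """
    f = W @ x + b
    k0 = int(np.argmax(f))               # currently predicted class
    best_r, best_dist = None, np.inf
    for k in range(W.shape[0]):
        if k == k0:
            continue
        w_p = W[k] - W[k0]               # normal of the boundary between k and k0
        f_p = f[k] - f[k0]               # signed margin (negative while k loses)
        dist = abs(f_p) / np.linalg.norm(w_p)
        if dist < best_dist:             # keep the closest boundary
            best_dist = dist
            best_r = (abs(f_p) / np.linalg.norm(w_p) ** 2) * w_p
    return x + (1 + overshoot) * best_r
```

For deep networks the same step is applied iteratively to a local linearization of the network, which is what makes the attack cheap compared with optimization-based minimal-perturbation methods.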
Related papers
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - How many perturbations break this model? Evaluating robustness beyond
adversarial accuracy [28.934863462633636]
We introduce adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation.
We show that sparsity provides valuable insight into neural networks in multiple ways.
arXiv Detail & Related papers (2022-07-08T21:25:17Z) - Masking Adversarial Damage: Finding Adversarial Saliency for Robust and
Sparse Network [33.18197518590706]
Adversarial examples provoke weak reliability and potential security issues in deep neural networks.
We propose a novel adversarial pruning method, Masking Adversarial Damage (MAD) that employs second-order information of adversarial loss.
We show that MAD effectively prunes adversarially trained networks without losing adversarial robustness and shows better performance than previous adversarial pruning methods.
arXiv Detail & Related papers (2022-04-06T11:28:06Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns an optimizer for adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning models has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Exploring Misclassifications of Robust Neural Networks to Enhance
Adversarial Attacks [3.3248768737711045]
We analyze the classification decisions of 19 different state-of-the-art neural networks trained to be robust against adversarial attacks.
We propose a novel loss function for adversarial attacks that consistently improves attack success rate.
arXiv Detail & Related papers (2021-05-21T12:10:38Z) - Second Order Optimization for Adversarial Robustness and
Interpretability [6.700873164609009]
We propose a novel regularizer which incorporates first and second order information via a quadratic approximation to the adversarial loss.
It is shown that using only a single iteration in our regularizer achieves stronger robustness than prior gradient and curvature regularization schemes.
It retains the interesting facet of AT that networks learn features which are well-aligned with human perception.
arXiv Detail & Related papers (2020-09-10T15:05:14Z)
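The last entry above regularizes training with a quadratic approximation to the adversarial loss, l(x + d) ≈ l(x) + g·d + ½ dᵀHd. As a minimal sketch of that approximation idea (not the paper's regularizer), the snippet below estimates the loss increase along a perturbation using only gradient calls, with the Hessian-vector product taken by finite differences; the `loss_grad` callable and step size `h` are illustrative assumptions.

```python
import numpy as np

def quadratic_adv_loss(loss_grad, x, delta, h=1e-5):
    """Second-order estimate of l(x + delta) - l(x).

    Only gradients are needed: the Hessian-vector product is approximated by
    finite differences, H @ delta ≈ (grad(x + h*delta) - grad(x)) / h.
    """
    g = loss_grad(x)                       # first-order term: g . delta
    hvp = (loss_grad(x + h * delta) - g) / h
    return g @ delta + 0.5 * (delta @ hvp)
```

For a quadratic loss the estimate is exact, which makes the approximation easy to sanity-check on toy problems before plugging it into a training loop.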
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.