Sparse-PGD: A Unified Framework for Sparse Adversarial Perturbations Generation
- URL: http://arxiv.org/abs/2405.05075v3
- Date: Tue, 24 Dec 2024 05:55:43 GMT
- Title: Sparse-PGD: A Unified Framework for Sparse Adversarial Perturbations Generation
- Authors: Xuyang Zhong, Chen Liu,
- Abstract summary: This work studies sparse adversarial perturbations, including both unstructured and structured ones.
We propose a framework based on a white-box PGD-like attack method named Sparse-PGD to effectively and efficiently generate such perturbations.
We combine Sparse-PGD with a black-box attack to evaluate the models' robustness against unstructured and structured sparse adversarial perturbations.
- Score: 2.3371504588528635
- License:
- Abstract: This work studies sparse adversarial perturbations, including both unstructured and structured ones. We propose a framework based on a white-box PGD-like attack method named Sparse-PGD to effectively and efficiently generate such perturbations. Furthermore, we combine Sparse-PGD with a black-box attack to comprehensively and more reliably evaluate the models' robustness against unstructured and structured sparse adversarial perturbations. Moreover, the efficiency of Sparse-PGD enables us to conduct adversarial training to build robust models against various sparse perturbations. Extensive experiments demonstrate that our proposed attack algorithm exhibits strong performance in different scenarios. More importantly, compared with other robust models, our adversarially trained model demonstrates state-of-the-art robustness against various sparse attacks.
Related papers
- Defensive Dual Masking for Robust Adversarial Defense [5.932787778915417]
This paper introduces the Defensive Dual Masking (DDM) algorithm, a novel approach designed to enhance model robustness against such attacks.
DDM utilizes a unique adversarial training strategy where [MASK] tokens are strategically inserted into training samples to prepare the model to handle adversarial perturbations more effectively.
During inference, potentially adversarial tokens are dynamically replaced with [MASK] tokens to neutralize potential threats while preserving the core semantics of the input.
arXiv Detail & Related papers (2024-12-10T00:41:25Z) - Robust Feature Inference: A Test-time Defense Strategy using Spectral Projections [12.807619042576018]
We propose a novel test-time defense strategy called Robust Feature Inference (RFI)
RFI is easy to integrate with any existing (robust) training procedure without additional test-time computation.
We show that RFI improves robustness across adaptive and transfer attacks consistently.
arXiv Detail & Related papers (2023-07-21T16:18:58Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and
Boosting Segmentation Robustness [63.726895965125145]
Deep neural network-based image classifications are vulnerable to adversarial perturbations.
In this work, we propose an effective and efficient segmentation attack method, dubbed SegPGD.
Since SegPGD can create more effective adversarial examples, the adversarial training with our SegPGD can boost the robustness of segmentation models.
arXiv Detail & Related papers (2022-07-25T17:56:54Z) - Towards Compositional Adversarial Robustness: Generalizing Adversarial
Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from $ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the in adversarial attacks parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Generating Structured Adversarial Attacks Using Frank-Wolfe Method [7.84752424025677]
Constraining adversarial search with different norms results in disparately structured adversarial examples.
structured adversarial examples can be used for adversarial regularization of models to make models more robust or improve their performance on datasets which are structurally different.
arXiv Detail & Related papers (2021-02-15T06:36:50Z) - Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z) - Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.